Notes on Data Scientist Interviews
Trend of general data scientist interviews
- Earlier: more like software engineer interviews
  - heavy on coding
  - light on problem solving and probability
- Currently: more balance between coding and ML/statistics
  - coding still indispensable
    - 0-1 coding problems in phone interviews
    - 1-2 coding problems in onsite interviews
  - new requirements
    - deep understanding of algorithms and metrics
    - context/domain knowledge for problems

Example: why do decision trees overfit?
- minimum split size reduced
- nonlinear, bagging
- greedy method, criterion
- a leaf fit to one data point cannot generalize to all the data
- one data point per leaf means high variance; the more parameters, the less data per parameter
- controls: number of levels (depth), minimum node size
- random forest

Performance metrics
- imbalanced data
  - down-sampling to 1/10 (ten times fewer samples): AUC unchanged
- handling data imbalance: up-sampling, down-sampling, SMOTE, sampling packages

Overfitting
- why is method 1 better than method 2, and why are the parameters tuned this way?

limited area collection

Prepare
- Bayes rule
- debug
- bootstrap
- A/B test, p-value (phone interview)
- industrial blogs and papers ...
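
The remark in the notes that down-sampling to 1/10 leaves AUC unchanged can be checked with a small simulation. This is only a sketch under stated assumptions: the scores come from a fixed, hypothetical scorer (Gaussian scores, with positives shifted up by 1), and the function names `auc` and `simulate` are made up for illustration. It shows that AUC depends on how positives rank against negatives, not on the class ratio, so randomly dropping 90% of the negatives changes it only by sampling noise. It does not cover retraining a model on the down-sampled data.

```python
import random


def auc(labels, scores):
    """AUC = P(random positive outscores random negative), ties count half.

    Computed from the rank sum of the positives (Mann-Whitney U statistic).
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        # Find the block of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def simulate(seed=0, n_neg=20000, n_pos=2000, keep=0.1):
    """Compare AUC on the full data with AUC after down-sampling negatives."""
    rng = random.Random(seed)
    # Assumed score distributions for illustration only.
    neg = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    pos = [rng.gauss(1.0, 1.0) for _ in range(n_pos)]
    full = auc([0] * n_neg + [1] * n_pos, neg + pos)
    # Keep a random 1/10 of the negatives, all of the positives.
    kept = rng.sample(neg, int(n_neg * keep))
    down = auc([0] * len(kept) + [1] * n_pos, kept + pos)
    return full, down


auc_full, auc_down = simulate()
print(f"AUC full data:    {auc_full:.3f}")
print(f"AUC down-sampled: {auc_down:.3f}")
```

Metrics that depend on the class ratio, such as precision, would shift under the same down-sampling, which is one reason AUC is often quoted for imbalanced data.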