I have been preparing for my career transition from academia to data science for about a year, so I’ve compiled a list of the resources that were helpful to me. Hopefully it saves you some time. In general, if you want to become a data scientist, these five topics are must-learns.

  • Machine Learning
  • SQL
  • Statistics
  • Coding
  • Product Sense

To get some professional experience, I’m enrolled as a master’s student in Analytics at Georgia Tech (5 of 10 courses completed; I will write a blog post on the pros and cons of this program, so stay tuned!). I also completed a data science fellowship at Insight Data Science this summer (I will write a post on that program as well).

Let’s start with machine learning. A good-to-have list:

  • Machine Learning: Regression by University of Washington. Having a deep understanding of linear regression will make learning other ML methods much easier. In particular, focus on the bias-variance trade-off, why validation is needed, the cost function, the derivation of the least-squares solution of the cost function, the gradient descent approach, Ridge regression and Lasso regression. Once you finish this course, you can proceed to Andrew Ng’s course for other ML methods.

  • Machine learning course by Andrew Ng. It’s an intro-level course if you are new to this field. The notes are also good for reviewing concepts before interviews. The original course assignments were designed in MATLAB; if you want to finish the course using Python, here is an alternative way to submit the assignments in Python.

  • Chris Albon has a nice set of flash cards (not free). I find them useful for refreshing concepts before an interview. You can check his Twitter for a free version of the flash cards.

  • Learn ML solutions from Kaggle competition problems. I didn’t participate in any Kaggle contests, but I find the tutorials and the shared solutions in the discussions quite helpful. Try to absorb how people tackle a problem and how they choose a specific ML algorithm and a metric to measure performance.

  • Implement at least one data science project, whether it is a project from a Coursera class, a boot camp project, or a personal side project. “Describe a data science project you’ve worked on/you enjoyed most.” This question was asked a lot during my interviews, so it is very important to build your own data science project. If possible, try to collect the data from scratch, so you can learn how to do data cleaning.
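
To make the regression concepts from the first bullet a bit more concrete (cost function, least-squares solution, gradient descent, Ridge), here is a minimal sketch in plain Python. It is not taken from the University of Washington course: the toy data, the function names, the learning rate and the penalty strength are all made up for illustration. Lasso is left out because its absolute-value penalty needs a different kind of update.

```python
# Toy data, made up for illustration: y is roughly 2*x + 1 with small noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def cost(w, b, xs, ys, lam=0.0):
    """Mean squared error plus an optional ridge (L2) penalty on the slope."""
    n = len(xs)
    mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return mse + lam * w ** 2


def closed_form(xs, ys):
    """Least-squares slope and intercept, from setting the gradient to zero."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w = sxy / sxx
    return w, y_mean - w * x_mean


def gradient_descent(xs, ys, lam=0.0, lr=0.05, steps=5000):
    """Minimize cost() by repeatedly stepping against its gradient."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of cost() with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n + 2 * lam * w
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w_ols, b_ols = closed_form(xs, ys)
w_gd, b_gd = gradient_descent(xs, ys)
w_ridge, b_ridge = gradient_descent(xs, ys, lam=1.0)

print("closed form:     ", w_ols, b_ols)
print("gradient descent:", w_gd, b_gd)
print("ridge (lam=1.0): ", w_ridge, b_ridge)
```

On this data, gradient descent should land on essentially the same slope and intercept as the closed-form solution, while the ridge run gives a smaller slope because the penalty shrinks it toward zero. Being able to explain why that happens is the kind of understanding the regression course builds.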