People always ask me, “What should I put on my resume to get a job?” I’m asked this because, as Chief Scientist of one of the first companies using machine learning to predict who companies should hire, I’m assumed to know some secret language, like a Freemasons’ handshake for LinkedIn, that will guarantee people a new job. The very fact that you’re asking me this question tells me two things: (1) you are a 23-year-old, university-educated male, and (2) I shouldn’t hire you. We explore many of the predictive factors in this episode, and they are surprisingly hard to game. But along the way, I did find a lot of things that were negative predictors across 122M working professionals. Here are a few.

As you’ll read about in the episode, claiming standard skills, like C/C++ and Python for developers, is not predictive of your success. In fact, most “work skills” listed on LinkedIn, StackOverflow, or other professional social media sites were not predictive of anything. And it turns out that having more followers, upvotes, badges, or other social signals also predicted nothing about quality of code or other job-related ability. (Unsurprisingly, people accumulate high social scores from being adept with social media, not coding.) Those with high social scores gravitated to relatively basic questions, whereas the best coders tended to answer extremely challenging questions that few would understand (and therefore, few would ever read in order to upvote). In fact, a relatively simple predictor for Q&A sites like StackOverflow and Reddit was actually answering the fucking question.

We created a simple model that performed vastly better than social scores in predicting who wrote the best code. Starting from the basic assumption that the answerer knows more than the asker, we collected millions of asker-answerer relationships. We then applied a 1-d locally linear embedding (using hLLE, the Hessian variant) to rank the developers from these pairwise relationships. The ranking from our model closely matched that of human experts who ranked code samples from the same population, while social scores were nearly random by comparison (Spearman rank correlation with the human ranking: 0.73 for our model vs 0.22 for social scores).
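To make the idea concrete, here is a minimal Python sketch. It is not the hLLE model described above: it swaps in a much cruder stand-in score (answers given minus questions asked, divided by total interactions) just to show how a ranking can fall out of asker-answerer pairs and how it can then be checked against a human ranking with a Spearman correlation. The function names, the developer names, and all of the numbers are made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def net_answer_score(pairs):
    """Crude stand-in for the embedding: (answers - questions) / interactions.

    `pairs` is an iterable of (asker, answerer) tuples. The assumption is
    that the answerer knows more than the asker.
    """
    asked = defaultdict(int)
    answered = defaultdict(int)
    for asker, answerer in pairs:
        asked[asker] += 1
        answered[answerer] += 1
    devs = set(asked) | set(answered)
    return {d: (answered[d] - asked[d]) / (answered[d] + asked[d]) for d in devs}


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy data, entirely hypothetical.
pairs = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("bob", "dee"), ("cat", "dee")]
expert_score = {"ann": 1, "bob": 2, "cat": 3, "dee": 4}       # human ranking of code
social_score = {"ann": 900, "bob": 10, "cat": 500, "dee": 20}  # followers, upvotes, etc.

model_score = net_answer_score(pairs)
devs = sorted(expert_score)
model_rho = spearman([model_score[d] for d in devs], [expert_score[d] for d in devs])
social_rho = spearman([social_score[d] for d in devs], [expert_score[d] for d in devs])
print(f"model vs experts:  {model_rho:.2f}")
print(f"social vs experts: {social_rho:.2f}")
```

In this toy set the pairwise score lines up with the expert ranking while the social score does not, which is the shape of the comparison reported above. The real numbers came from millions of relationships, not five.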

So, work skills and social media are worth nothing? Well, consider the tweet: “Celery is awesome!” If this were about the plant, it would obviously be false–celery is disgusting and should be thrown in the compost. Anything that needs to be coated in peanut butter to be edible is not food. But this particular person was a developer in our database, and our model suggested his “celery” was the distributed task queue for Python. Although the tweeter never claimed to know Python, our model identified his nod to this complex skill as much more predictive than simply selecting “Python” from a LinkedIn menu of skills. What’s more, the sentiment behind the post mattered: tweeting passionately about parallel processing at 2am to your 8 followers might be a damning statement about your social life, but it is a strong predictor of your endogenous motivation for coding.

In addition to skills, recruiters love to look at schools, and yet our analysis showed that they vastly overweight university degrees. It’s true that a computer science degree from a top program–Stanford, Caltech, MIT, CMU, Berkeley–is a positive predictor of quality of code among programmers. But that same analysis (a heat diffusion model on a tripartite graph) showed that an undergraduate degree from the bottom ⅔ of universities was actually a negative predictor compared to no attendance at all.

This isn’t to say that people who attended the bottom ⅔ were necessarily bad at their jobs, but that the kind of person who never attended university to begin with and is still working as a professional developer usually writes better code. There are many possible explanations for this. Many people attend university for fun, for money, or because their parents pressure them into doing so. In contrast, if you pursue computer science and programming to the point of getting hired even without a formal education, it implies that this is a core part of who you are as a person. Individuals who are endogenously motivated, driven by something inside of them, simply have better career and life outcomes.

Another curious finding was that having a PhD in anything, from anywhere, was a stronger predictor of quality of code than an undergrad degree in CS from Stanford. That a working developer with a PhD in English creates better code on average than a developer who went to school expressly for that purpose is obviously counterintuitive under the traditional model of educational attainment that we hire for, but it turns out that what matters is not how the scholarship of English or History applies to programming. Rather, your success in a job is ultimately about who you are. What matters is that you were curious about something and went the extra distance to explore it, even if it wasn’t what you were trained for.

And don’t get me started on “works well with others”...