My thoughts on generative models like ChatGPT
There has been a lot of buzz lately around certain generative models like ChatGPT, Codex, DALL·E 2, BERT, Stable Diffusion, and others. What they seem to have in common is:
- They were trained on massive datasets
- They are “large” models in that they have hundreds of millions or even many billions of learned parameters
- They were made by super smart people with access to state-of-the-art GPU clusters
- They cost a lot of money to train
This seems to be at odds with the reality for the majority of the data science community: practitioners with far fewer resources, less knowledge, and more limited capabilities. In a way it’s a haves and have-nots situation. But then again, what have these models accomplished? Sure, there is novelty, some fun, and a ton of potential. However, there is growing awareness and fear of AI, and I feel this has sparked calls for regulation, transparency, and improved data ethics.
If you aren’t working for companies like OpenAI, Stability AI, Microsoft, or Google, there are still options for you. First, don’t get discouraged by the hype. If anything, embrace it. We are already seeing open sourcing of big models like BLOOM and Stable Diffusion, which brings large-scale transfer learning within reach. Specifically, if we look at timm or Hugging Face, we see a ton of pretrained models that are supported by a robust community of brilliant folks with a passion for open source. These might not be the hyped models like ChatGPT, but we can still build amazing things with other model architectures.
I think what is more important is to take advantage of the options provided by Hugging Face and others: the ability to test out pretrained models and see how they perform on your use case, on low-cost or free GPUs at a small scale. That is how everything starts, from an interesting idea, proof of concept, or passion. Start small, be happy you are living in an age of rapid AI development and community sharing, and focus on problems that you care about or that are beneficial to your situation.
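Testing a pretrained model at small scale really can be just a few lines. Here is a minimal sketch using the Hugging Face `transformers` library (this assumes `transformers` and a backend like PyTorch are installed, and that the default sentiment-analysis model can be downloaded; it is an illustration, not a recommendation of any particular model):

```python
# A quick, small-scale test of a pretrained model via the
# Hugging Face `transformers` pipeline API.
from transformers import pipeline

# With no model specified, the pipeline falls back to a default
# sentiment-analysis checkpoint, which it downloads on first use.
classifier = pipeline("sentiment-analysis")

result = classifier("I love how accessible pretrained models have become.")[0]
print(result["label"], round(result["score"], 3))
```

Swapping in a different task ("summarization", "text-generation", …) or a specific checkpoint is a one-line change, which is what makes this kind of small proof of concept so cheap to try.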
Will data scientists be replaced by automation and meta-models?
Honestly, I cannot say. The best defense anyone has against being replaced or made redundant is to continue learning, exploring, and innovating. If you aren’t creating new model architectures and are working as a data scientist or similar, you can still leverage these new technologies to create value for your company. The models themselves are really cool, but what is most important is business viability. Whether that means creating a company around an idea, or bringing new ideas to your company, the key is to keep looking for solutions to real-life problems.
I am hesitant to take a stance on this, but I will say that I have some anxiety around whether I have a future in data science. It’s all about perspective and finding a place for yourself. Take AutoML for classical ML problems. It is pretty cool in that you can efficiently test out many model architectures and hyperparameter spaces without having to write hundreds of lines of code with the scikit-learn API. But it also automates much of what we are taught in a machine learning course.
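At its core, the hyperparameter search that AutoML automates can be sketched in a few lines of plain Python. Everything below is a toy: the config space, the “sweet spot”, and the scoring function are made up to stand in for a real train-and-validate loop.

```python
import random

def train_and_score(config):
    """Toy stand-in for training a model and returning a validation score.

    A real search would fit a model per config; here we fake a score
    surface with an illustrative optimum near lr=0.1, depth=6.
    """
    lr, depth = config["lr"], config["depth"]
    return 1.0 - (lr - 0.1) ** 2 - 0.01 * (depth - 6) ** 2

def random_search(n_trials, seed=0):
    """Sample random configs and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {
            "lr": rng.uniform(0.001, 1.0),   # learning rate
            "depth": rng.randint(1, 12),     # e.g. tree or network depth
        }
        score = train_and_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = random_search(200)
print(best_config, round(best_score, 3))
```

AutoML tools wrap this loop (plus smarter search strategies, cross-validation, and model selection) behind one call, which is exactly why they replace so much of the boilerplate we used to write by hand.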
As a reminder to myself and others, I’ll leave you with this thought: Don’t be discouraged by the progress in research. Embrace it and know that your knowledge so far will help you leverage these tools better than those without a background in data science. Just because something new comes out that is a game changer doesn’t mean it’s going to replace you. Lastly, take advantage of tools that let you write less boilerplate code and focus on the problem at hand.