If you are an avid ChatGPT user (I assume many of you are), weren’t you fazed by recent news reports that the maker of the incredible chatbot could go bankrupt by the end of 2024? OpenAI, the maker of ChatGPT, is reportedly burning $700,000 (or Rs 5.8 crore) each day to keep the large language model (LLM) running. Despite this hefty operational expense, ChatGPT’s user base has been dwindling steadily over the past two months. It is Microsoft’s $10 billion funding and the money from other deep-pocketed investors that are keeping ChatGPT afloat.
Concerns are rife regarding the viability of LLMs like ChatGPT. But this is just one side of the story. The other concern being bandied about is the performance of these Generative AI models. A team of researchers from the University of California at Berkeley and Stanford University found that the performance of LLMs, especially GPT-4, the model behind ChatGPT, has dipped. In their study, the researchers compared the March and June versions on tasks such as solving math problems, answering sensitive questions, responding to opinion surveys, answering multi-hop knowledge-intensive questions, generating code, taking US Medical License exams, and completing visual reasoning tasks, and reported that GPT-4’s March version outperformed its June counterpart on several of them.
This phenomenon is startling because models like ChatGPT are supposed to get smarter and better with time and training. Why, then, is the model underperforming or delivering unanticipated outcomes? Experts have attributed it to AI drift. AI drift happens when an AI system's performance and behaviour change over time, often due to the changing nature of the data it learns from and interacts with. This can cause the AI system to deviate from its original design and purpose. Left unchecked, AI model drift can also introduce or amplify algorithmic bias, with unintended consequences.
But I think the concerns about AI drift are overplayed. Language models like ChatGPT are the creation of an ever-evolving human mind. When ChatGPT first exploded into public consciousness, it was hailed as an incredible chatbot, a phenomenon. It amassed more than 100 million users within two months of launch. Now that its user base is contracting and its performance is at times inconsistent, questions have surfaced about its capabilities. I believe the performance of ChatGPT-like models is being benchmarked against the swelling expectations of users. We know the array of things that ChatGPT can do: write a blog, formulate a GTM strategy or even compose a Shakespearean sonnet in a trice.
We were told that LLMs like ChatGPT would become insanely capable as time rolled on. What we weren’t told, or did not realize, is that even AI at its zenith can be fallible and vulnerable. It will be prone to errors and flaws, just like you and me. When a master pianist loses her touch, a violinist hits an occasional sour note or an adept writer produces a below-par piece, don’t you dismiss these as aberrations? Why, then, is the same yardstick not applied to Generative AI models? If humans can trip over their own feet, the machines modelled by humans can stagger too.
These are still nascent days for LLMs. They are in the cradle of innovation. Let us give them ample time to mature and train them with data that is free of bias and inaccuracies. The quality of the data you feed to any LLM will determine its output. That’s not something a machine can control; it is in the hands of the humans behind the model. Just as a sponge soaks up both clear water and murky contaminants, these models absorb accurate data as well as bias.
My point is not to debunk AI drift. I see it as a transient trend that will pass soon. Just because a puny segment of the internet peddles misinformation and propaganda, would you recommend a ban on the internet itself? There should not be any trust deficit among stakeholders about the capabilities of AI models. Going forward, they will be even more disruptive and transformational. The onus is on us to use Generative AI technology ethically and responsibly to enrich our lives. Here, developers should make sure that the LLM is updated regularly and trained on high-quality data. They should appraise the LLM's performance and take corrective action if needed. Feedback loops from users and stakeholders are needed to spot whether the AI model is straying from its intended purpose. No matter how capable an AI model is, human oversight is irreplaceable in critical decision-making processes to keep AI from drifting too far.
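For readers who want to see what "appraising the LLM's performance" could look like in practice, here is a minimal sketch in Python. It assumes a fixed set of questions with known answers that is re-run against each new version of a model, and it flags drift when accuracy falls noticeably below an earlier baseline. Everything in it is hypothetical and for illustration only: the function names, the two stand-in "snapshots", the sample questions and the 5-point tolerance are my assumptions, not how OpenAI or the researchers mentioned above measure drift.

```python
from typing import Callable, Sequence


def accuracy(model: Callable[[str], str],
             prompts: Sequence[str],
             expected: Sequence[str]) -> float:
    """Fraction of prompts the model answers exactly as expected."""
    correct = sum(1 for p, e in zip(prompts, expected) if model(p).strip() == e)
    return correct / len(prompts)


def has_drifted(baseline_acc: float, current_acc: float,
                tolerance: float = 0.05) -> bool:
    """Flag drift when accuracy falls more than `tolerance` below the baseline."""
    return (baseline_acc - current_acc) > tolerance


# Toy evaluation set (hypothetical; a real one would be far larger).
PROMPTS = ["Is 17 prime?", "Is 21 prime?", "Is 29 prime?", "Is 33 prime?"]
EXPECTED = ["yes", "no", "yes", "no"]


def earlier_snapshot(prompt: str) -> str:
    """Stand-in for an earlier model version: answers every question correctly."""
    return dict(zip(PROMPTS, EXPECTED))[prompt]


def later_snapshot(prompt: str) -> str:
    """Stand-in for a later model version that has degraded: always says 'no'."""
    return "no"


baseline = accuracy(earlier_snapshot, PROMPTS, EXPECTED)
current = accuracy(later_snapshot, PROMPTS, EXPECTED)

if has_drifted(baseline, current):
    print(f"Possible drift: accuracy fell from {baseline:.0%} to {current:.0%}")
```

A check like this only raises a flag. Deciding whether the change matters, and what corrective action to take, still rests with the humans behind the model.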
This blog was originally published on Priyadarshi Nanu Pany's LinkedIn account.