EXPERT INSIGHTS: Eric Greenwood on Large Language Models & GPT Technology
October 5, 2023
With the emergence of Large Language Models (LLMs), and more specifically Generative Pre-trained Transformers (GPTs) such as ChatGPT, and their adoption by major vendors (e.g. Microsoft Nuance Mix and Google CCAI Insights), the industry is on the cusp of a paradigm shift. Conventional IVRs developed in VXML are being replaced by low-code and no-code models, which are themselves evolving quickly. Common dialogue engines now underlie both voice and data paths. And the challenges of tuning Natural Language Processing (NLP) models are changing with the adoption of LLMs. As these models evolve, much of what was coded in the past (e.g. routing logic) will, in the near future, be dynamically determined by LLMs based on rules defined by non-technical business users.
Generative Pre-trained Transformers (GPTs) are a subset of Large Language Models (LLMs) and today represent the leading edge of Artificial Intelligence (AI). Introduced by OpenAI in 2018, GPT models use the Transformer Architecture and are pre-trained on vast unlabeled text datasets, enabling them to produce human-like content. By 2023, generative pre-training had become a standard feature of most LLMs, and the GPT label is now widely applied to such models collectively. Industry-specific GPT models are also emerging, such as Salesforce's EinsteinGPT for customer relationship management and Bloomberg's BloombergGPT for finance.
The Transformer Architecture is a deep learning architecture widely used for training LLMs on extensive datasets such as Wikipedia and Common Crawl. Input text is broken down into tokens (words or sub-word fragments). These tokens are converted into vectors using embeddings. Each token's context is then defined by its relation to the other tokens through a multi-head attention mechanism, which examines the data from several viewpoints in parallel, highlighting key tokens while downplaying less significant ones. The Transformer Architecture is applied across various fields, including natural language processing, computer vision, audio, and multi-modal processing.
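The token-to-vector-to-attention flow described above can be illustrated with a minimal sketch. It is a teaching aid under stated assumptions, not how any production model is implemented: the whitespace tokenizer, the hand-written four-dimensional `TOY_EMBEDDINGS` table, and the function names are all hypothetical, and the learned projection matrices that real Transformers apply for each head are omitted for brevity.

```python
import math

# Hypothetical toy embedding table: real models learn these vectors in training.
TOY_EMBEDDINGS = {
    "the":      [0.1, 0.0, 0.2, 0.1],
    "bank":     [0.9, 0.3, 0.1, 0.8],
    "approved": [0.2, 0.8, 0.7, 0.1],
    "loan":     [0.8, 0.4, 0.2, 0.9],
}

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token, then normalize the scores.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted blend of the value vectors.
        blended = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(len(values[0]))]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

def multi_head_attention(embeddings, num_heads):
    """Split each vector into num_heads slices, attend per slice, concatenate.

    Simplification: learned per-head projections are left out.
    """
    dim = len(embeddings[0])
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    per_head = []
    for h in range(num_heads):
        part = [e[h * head_dim:(h + 1) * head_dim] for e in embeddings]
        out, _ = attention(part, part, part)
        per_head.append(out)
    # Concatenate the head outputs back into one vector per token.
    return [sum((per_head[h][t] for h in range(num_heads)), [])
            for t in range(len(embeddings))]

if __name__ == "__main__":
    tokens = "the bank approved the loan".split()  # toy whitespace tokenizer
    vectors = [TOY_EMBEDDINGS[t] for t in tokens]
    for token, vec in zip(tokens, multi_head_attention(vectors, num_heads=2)):
        print(token, [round(x, 3) for x in vec])
```

Each head sees only its own slice of every vector, which is what allows different heads to weigh the same sentence differently before their outputs are recombined.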
The evolution of LLMs has been rapid and transformative over the past few years, signaling a new era in the field of AI. These models, which owe their exceptional capabilities to the Transformer Architecture, represent a blend of sheer computational might and intricate training methodologies. Google developed the BERT (Bidirectional Encoder Representations from Transformers) model in 2018 with 340 million parameters. The following year, 2019, OpenAI released GPT-2 with 1.5 billion parameters. And in November 2022, ChatGPT, based on GPT-3.5 (a successor to the 175-billion-parameter GPT-3), fundamentally changed everyone’s perception of AI and its potential.
GPT-4, released in 2023, is reported to have about 1.7 trillion parameters (OpenAI has not published the figure), while Google’s PaLM (Pathways Language Model) has 540 billion parameters. However, while the GPT-3 generation reached a pivotal point at 175 billion parameters, it is not yet clear that the massive increase in parameters attributed to GPT-4 has significantly improved the model. Google’s PaLM, with its more modest 540 billion parameters, may be more efficient and effective.
In 2023, the implications of Large Language Models (LLMs) have become increasingly pronounced and are widely discussed across various sectors. The advancements in LLMs have reached a point where distinguishing between human-written text and LLM-generated text is becoming nearly impossible.
LLMs are having, and will continue to have, a transformative impact on Natural Language Processing (NLP). They have accelerated NLP development, improved the quality of NLP applications, and opened new possibilities for natural language understanding and generation across a wide range of industries and use cases.
While LLMs also have the potential to reduce the need for extensive, rule-based routing logic (programming) by offering natural language understanding, intent recognition, and context awareness, the degree to which they eliminate this need depends on multiple factors. LLMs are trained on extensive but fixed datasets. Using them within a unique proprietary environment requires additional training on that environment's nomenclature, data and processes, as well as integration with existing conventional technology.
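As a rough illustration of how intent recognition might sit alongside conventional routing, the sketch below keeps the routes as plain data that a business user could edit and leaves the intent classifier pluggable. Everything in it is hypothetical: the `Route` class, the queue names, and `classify_intent`, which is a keyword-matching stand-in for a real LLM call rather than any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Route:
    intent: str  # intent label the classifier may return
    queue: str   # destination queue in the existing contact-center platform

# Business-defined rules kept as data (hypothetical names), not as code branches.
ROUTES = [
    Route("billing", "billing_queue"),
    Route("outage", "tech_support_queue"),
]
DEFAULT_QUEUE = "general_queue"

def classify_intent(utterance: str) -> str:
    """Stand-in for an LLM intent classifier (assumption: keyword matching only)."""
    text = utterance.lower()
    keywords = {
        "billing": ("bill", "invoice", "charge"),
        "outage": ("outage", "down", "not working"),
    }
    for intent, words in keywords.items():
        if any(word in text for word in words):
            return intent
    return "unknown"

def route(utterance: str, classifier=classify_intent) -> str:
    """Map an utterance to a queue; unrecognized intents fall back to the default."""
    intent = classifier(utterance)
    for rule in ROUTES:
        if rule.intent == intent:
            return rule.queue
    return DEFAULT_QUEUE

if __name__ == "__main__":
    for text in ("I was charged twice on my bill", "My internet is down", "Hello"):
        print(text, "->", route(text))
```

Passing the classifier in as a parameter means the stub could later be swapped for a model tuned on proprietary nomenclature without touching the routing table or the downstream queues.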
The economic potential of LLMs was recently highlighted by Goldman Sachs, which estimated that generative AI could raise global GDP by 7% over a ten-year period. However, amidst the dynamism and excitement, there lies a potential pitfall. The allure of these new tools can be misleading, possibly guiding us towards suboptimal decisions.
While it is important to harness the potential of these advancements, it is critical to do so without inadvertently adopting technologies or partnering with entities that may soon become obsolete. Understanding what the technology is, its potential and challenges, and where the industry is going is essential to ensuring that investments in LLM/GPT technology today will provide both short-term and long-term benefits. Developing an evolutionary approach that takes advantage of both existing conventional technology (investments) and new emerging technologies is critical.
Now that you are ready to harness the power of Large Language Models for lasting business agility, reach out to connect with a member of our Product Innovation & Transformation Team.