
Assistant

While not mandatory, assistant messages can be included to illustrate the preferred conduct or response patterns of the chatbot.

Now let’s look at an example. If you are developing an application for a fitness AI assistant, a well-rounded system message might look as follows:

import openai  # legacy openai SDK (< 1.0); ChatCompletion was removed in openai >= 1.0

openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a fitness AI assistant. Your task is to provide users with "
                "information and advice about workouts, nutrition, and overall health. "
                "You will help users by answering their questions to the best of your "
                "abilities and providing suggestions based on their input. Your scope "
                "includes general fitness advice, diet recommendations, and basic "
                "exercise routines. "
                "If a user asks about something out of scope, such as specific medical "
                "advice or complex exercise physiology, kindly suggest that they "
                "consult a relevant professional in that field, for instance, a doctor "
                "for medical issues or a certified personal trainer for detailed "
                "workout plans. "
                "You should maintain a supportive and positive tone, being respectful "
                "and encouraging to all users, regardless of their current fitness "
                "level or knowledge. Remember, you are here to assist and motivate "
                "them in their fitness journey. "
                "Your responses should be concise, easy to understand, and practically "
                "useful. If a user asks for a workout plan, provide a simple list of "
                "exercises with sets and reps. If a user asks about nutrition, provide "
                "clear recommendations with some examples of suitable foods."
            ),
        },
        {
            "role": "user",
            "content": "I'm new to fitness, can you suggest a simple workout plan to get started?",
        },
        {
            "role": "assistant",
            "content": (
                "Absolutely, I'm here to help you get started on your fitness journey. "
                "Here's a simple workout plan you can begin with:\n"
                "1. Warm-up: 5-10 minutes of brisk walking or light jogging\n"
                "2. Squats: 2 sets of 10 reps\n"
                "3. Push-ups: 2 sets of 10 reps (you can start with knee push-ups if "
                "full push-ups are too challenging)\n"
                "4. Lunges: 2 sets of 10 reps per leg\n"
                "5. Plank: Hold for 30 seconds, repeat 2 times\n"
                "6. Cool down: 5-10 minutes of stretching\n"
                "Remember, the key to a successful workout routine is consistency. "
                "Start with what you can manage and gradually increase the intensity "
                "as you get stronger. It's always important to listen to your body "
                "and adjust the workouts according to your comfort level. Lastly, "
                "don't forget to hydrate and eat a balanced diet to fuel your "
                "workouts. Please consult a fitness professional if you're unsure "
                "about the correct form for any of these exercises to avoid injury. "
                "Enjoy your workout!"
            ),
        },
    ],
)

The following link provides great examples of “Act As” prompts for you to leverage in your system message:

Awesome ChatGPT Prompts (https://github.com/f/awesome-chatgpt-prompts/).


Techniques for effective prompt engineering

In the past two years, a wide array of prompt engineering techniques has been developed. This section focuses on the essential ones, offering key strategies that you might find indispensable for daily interactions with ChatGPT and other LLM-based applications.

N-shot prompting

N-shot prompting is a term used in the context of large language models, particularly for zero-shot or few-shot learning tasks. It is also called in-context learning and refers to the technique of providing the model with example prompts and their corresponding responses within the prompt itself, steering the model’s behavior toward more accurate responses.

The “N” in “N-shot” refers to the number of example prompts provided to the model. For instance, in a one-shot learning scenario, only one example prompt and its response are given to the model. In an N-shot learning scenario, multiple example prompts and responses are provided.

While ChatGPT works great with zero-shot prompting, it may sometimes be useful to provide examples for a more accurate response. Let’s see some examples of zero-shot and few-shot prompting:

Figure 5.8 – N-shot prompting examples
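To make this concrete, here is a minimal few-shot (2-shot) sketch using the same legacy openai SDK as the earlier example; the sentiment task and review texts are illustrative:

import openai  # legacy SDK (< 1.0), as in the earlier example

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Classify the sentiment of each review as Positive or Negative."},
        # Shot 1: an example prompt paired with its desired response
        {"role": "user", "content": "Review: The battery lasts all day."},
        {"role": "assistant", "content": "Positive"},
        # Shot 2
        {"role": "user", "content": "Review: The screen cracked within a week."},
        {"role": "assistant", "content": "Negative"},
        # The actual query; the model follows the demonstrated pattern
        {"role": "user", "content": "Review: Setup was quick and painless."},
    ],
)
print(response["choices"][0]["message"]["content"])  # expected: Positive

Dropping the two example pairs turns the same call into a zero-shot prompt.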

Chain-of-thought (CoT) prompting

Chain-of-thought prompting presents a sequence of intermediate reasoning steps, significantly boosting the capability of large language models to tackle complex reasoning tasks. By including a few chain-of-thought demonstrations as examples in the prompt, models can proficiently handle intricate reasoning tasks:

Figure 5.9 – Chain-of-thought prompting examples

Figure sourced from https://arxiv.org/pdf/2201.11903.pdf.
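To illustrate the idea behind the figure, here is a classic chain-of-thought exemplar adapted from the cited paper; the worked answer in the first Q&A pair demonstrates the reasoning pattern the model should imitate:

Example:

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?

Given the exemplar, the model now responds with intermediate steps rather than a bare answer: 23 - 20 = 3, then 3 + 6 = 9, so the answer is 9.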

Program-aided language (PAL) models

Program-aided language (PAL) models, a technique also called program-of-thought (PoT) prompting, incorporate additional task-specific instructions, pseudo-code, rules, or programs alongside the free-form text to guide the behavior of a language model:

Figure 5.10 – Program-aided language prompting examples

Figure sourced from https://arxiv.org/abs/2211.10435.
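As a brief illustration of the approach shown in the figure, the prompt instructs the model to reason by emitting executable code, which an interpreter then runs to produce the final answer; the variable names below are illustrative:

Example:

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?

A (the model generates a program rather than a direct answer):

apples_initial = 23
apples_used = 20
apples_bought = 6
answer = apples_initial - apples_used + apples_bought

Running this program yields 9, which is returned as the final answer.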

Although we have not explored every prompt engineering technique in this section (only the most important ones), we want to convey to our readers that there are numerous variants of these techniques, as illustrated in the following figure from the research paper A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications (https://arxiv.org/pdf/2402.07927.pdf). This paper provides an extensive inventory of prompt engineering strategies across various application areas, showcasing the evolution and breadth of this field over the last four years:

Figure 5.11 – Taxonomy of prompt engineering techniques across multiple application domains


Prompt engineering best practices

In the following list, we outline additional best practices to optimize and enhance your experience with prompt creation:

  • Clarity and precision for accurate responses: Ensure that prompts are clear, concise, and specific, avoiding ambiguity or multiple interpretations:

Figure 5.12 – Best practice: clarity and precision

  • Be descriptive: Provide enough detail and context so that ChatGPT can understand your intent:

Figure 5.13 – Best practice: be descriptive

  • Format the output: Mention the format of the output, which can be bullet points, paragraphs, sentences, tables, and languages, such as XML, HTML, and JSON. Use examples to articulate the desired output.
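For instance, a prompt along the following lines pins down both the structure and the fields of the response; the JSON schema here is purely illustrative:

Example:

List three high-protein breakfast foods. Return the answer as JSON in this format:

{"foods": [{"name": "<food>", "protein_grams": <number>}]}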
  • Adjust the Temperature and Top_p parameters for creativity: As indicated in the parameters section, modifying Temperature and Top_p can significantly influence the variability of the model’s output. In scenarios that call for creativity and imagination, raising the temperature proves beneficial. On the other hand, applications such as legal ones that demand a reduction in hallucinations benefit from a lower temperature, for example:
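The following minimal sketch uses the same legacy openai SDK as the earlier example; the prompts and parameter values are illustrative:

import openai  # legacy SDK (< 1.0)

# Creative task: a higher temperature yields more varied, imaginative output
creative = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a tagline for a hiking app."}],
    temperature=1.2,
    top_p=1.0,
)

# Precision task: a lower temperature (and, optionally, top_p) reduces variability
precise = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the key obligations in the contract clause above."}],
    temperature=0.1,
    top_p=0.5,
)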
  • Use syntax as separators in prompts: For more effective output, use """ or ### to separate the instruction from the input data:

Example:

Convert the text below to Spanish

Text: “””

{text input here}

“””

  • Order of the prompt elements matters: It has been found, in certain instances, that giving an instruction before an example can improve the quality of your outputs. Additionally, the order of examples can affect the output of prompts.
  • Use guiding words: This helps steer the model toward a specific structure, such as the text highlighted in the following:

Example:

# Create a basic Python function that
# 1. Requests the user to enter a temperature in Celsius
# 2. Converts the Celsius temperature to Fahrenheit
def ctf():

  • Instead of saying what not to provide, give alternative recommendations: Provide an alternative path if ChatGPT is unable to perform a task, such as in the following highlighted message:

Example:

System Message: You are an AI nutrition consultant that provides nutrition consultations based on the health and wellness goals of the customer. Please note that any questions or inquiries beyond the scope of nutrition consultation will NOT be answered and instead will receive the response: “Sorry! This question falls outside my domain of expertise!”

Customer: How do I invest in 401K?

Nutrition AI Assistant: “Sorry! This question falls outside my domain of expertise!”

  • Provide example-based prompts: This helps the language model learn from specific instances and patterns. Start with zero-shot, then few-shot; if neither works, consider fine-tuning the model.
  • Ask ChatGPT to provide citations/sources: When asking ChatGPT to provide information, you can ask it to answer only using reliable sources and to cite the sources:

Figure 5.14 – Best practice: provide citations

  • Break down a complex task into simpler tasks: See the following example:

Figure 5.15 – Best practice: break down a complex task


Bonus tips and tricks

The following list provides some helpful bonus tips and tricks:

  • Use of tags: Tags, such as <begin>, <end>, and <|endofprompt|>, which mark the beginning and end of a prompt, can help separate the different elements of a prompt and thereby generate higher-quality output (see the example after this list).
  • Use of languages: Though ChatGPT performs best with English, it can be used to generate responses in several other languages.
  • Obtaining the most accurate, up-to-date information: This can be achieved by using the grounding process with a retrieval-augmented generation (RAG) architecture and plugins, as discussed in Chapter 4. This helps address the knowledge cutoff limitation of LLMs.
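As a quick illustration of the tags tip from this list (the task text is illustrative), tags can demarcate the prompt and its elements:

Example:

<begin>

Translate the following sentence to French.

Sentence: “Good morning, how are you?”

<end>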

Ethical guidelines for prompt engineering

Prompt engineering is a critical stage where AI behavior is molded, and incorporating ethics at this level helps ensure that AI language models are developed and deployed responsibly. It promotes fairness, transparency, and user trust while avoiding potential risks and negative societal impact.

While Chapter 4 delved further into constructing ethical generative AI solutions, in this section, our focus will be on briefly discussing the integration of ethical approaches at the prompt engineering level:

  • Diverse and representative data
  • When fine-tuning the model with few-shot examples, use training data that represent diverse perspectives and demographics.
  • If the AI language model is intended for healthcare, the training data should cover medical cases from different demographics and regions.
  • For instance, if a user poses a question to the LLM, such as, “Can you describe some global traditional festivals?” the response should offer a comprehensive view that encompasses a multitude of countries rather than focusing on just one. This can be ensured by including diverse few-shot examples in the prompts.
  • Bias detection and mitigation
  • Identify and address biases in the model’s outputs to ensure fairness.
  • Implement debiasing techniques to reduce gender or racial biases.
  • Ensure that generated content related to sensitive topics is neutral and unbiased.
  • For instance, if a user asks the LLM, “What is the gender of a nurse?” improperly trained models might default to “female” due to biases in their training data. To address this, it’s vital to incorporate few-shot examples that emphasize nurses can be of any gender, be it male or female.
  • Reduce misinformation and disinformation
  • As AI language models can inadvertently generate false or misleading information due to model “hallucinations,” implement measures to minimize the spread of misinformation and disinformation through carefully crafted prompts and responses.
  • For example, based on the guidelines from the prompt engineering section and Chapter 3’s grounding techniques, system prompts should clearly state their scope, such as, “Your scope is XYZ.” If a user asks about something outside this, such as ABC, the system should have a set response.


  • Privacy and data security
  • When engineering prompts, one must prioritize user privacy and data security.
  • Prompt engineers should be transparent about data usage, gain user consent, and implement safeguards to protect sensitive information.
  • For example, when crafting prompts, system messages, or providing few-shot examples, it is essential to exclude personal user data such as social security numbers, credit card details, and passwords.
  • Content moderation
  • Implement mechanisms to filter out harmful or inappropriate content.
  • Use profanity filters to prevent offensive language. Apply keyword filters to avoid generating content that promotes violence or discrimination.
  • For example, if someone asks, “How to create a bomb?”, the LLM should not answer. Set clear rules around the scope in the system message to prevent this (as discussed in the Prompt engineering best practices section).
  • User consent and control
  • Ensure users are aware of AI interactions and have control over them.
  • Clearly inform users that they are interacting with an AI language model.
  • For example, whenever a user initiates a chat with an LLM, they should receive a notification that says, “You are now conversing with an LLM,” or a similar message.
  • Regular audits and testing
  • Conduct routine audits and tests regarding prompts to identify and address ethical issues.
  • For instance, testers should try various versions of a prompt to verify that responses are diverse, user privacy is protected, and content moderation guidelines are followed. This is an essential aspect of operationalizing LLMs, also known as LLMOps.
  • Education and training
  • Train prompt engineers and developers in ethical AI practices on an ongoing basis
  • Ethics guidelines and policies
  • Develop clear guidelines and policies for prompt engineering
  • Establish an ethics charter that outlines the principles followed in prompt engineering
  • Defining a content safety policy that prohibits harmful or offensive outputs

Microsoft’s Responsible AI team has been a trailblazer in steering the AI revolution with ethical practices. The following figure, published by Microsoft, can serve as a guide to structuring safety metaprompts, focusing on four core elements: response grounding, tone, safety, and jailbreaks. This approach is instrumental in implementing a robust safety system within the application layer. In Chapter 9, we will delve into more detail regarding the best practices of responsible AI for generative AI applications:

Figure 5.16 – Metaprompt best practices from Microsoft

Summary

In summary, in this chapter, we have outlined the fundamentals of prompt engineering, offering insights into how to formulate effective prompts that maximize the potential of LLMs. Additionally, we have examined prompt engineering from an ethical perspective. Thus far, in this book, we have explored the essential elements and methodologies necessary for constructing a solid generative AI framework. In the next chapter, we will integrate these concepts with application development strategies for generative AI involving agents. We will also discuss methods for operationalizing these strategies through LLMOps, which stands as a critical component in the automation process.


Copilots and agents

Traditional chatbots have undergone significant evolution, transitioning into more sophisticated forms such as copilots, agents, and autonomous agents. In this section, we aim to compare and contrast these advanced chatbot types, exploring their roles and utilization in contemporary applications.

Figure 6.2 – Evolution from chatbots to autonomous agents

Agents are skilled assistants and, in the context described, are pieces of code equipped with AI capabilities. They are designed to complete tasks by interacting with users through applications or other interfaces. Initially, they gather information from users and subsequently utilize this data to execute actions, which may include feeding it into LLMs or a sequence of LLMs, among other possibilities.

For example, a data analyst agent can analyze your Excel sheets: it asks for your raw Excel file and any clarifying questions it may have, then intelligently generates its own plan of action, executes those actions, and provides you with final insights on your data.

Copilots are collaboration tools in the form of chatbots integrated into applications; they use LLMs to assist users in performing tasks specific to that application, delivering an instant productivity boost. They represent a specialized subset within the broader category of agents.

Copilots, such as GitHub Copilot and Power BI Copilot, are integrated into applications to assist users in completing tasks, such as generating code or offering troubleshooting recommendations based on natural language queries.

Microsoft employs copilots extensively, integrating them into their next-generation AI-integrated products, such as Microsoft 365 apps. These copilots combine LLMs with user data and other Microsoft applications using the Semantic Kernel (SK) framework that we discuss in the next section. Copilots work alongside users, providing AI-powered assistance in tasks such as drafting documents or generating code. Imagine them as helpful copilots in the cockpit of a plane, assisting the pilot. By using the Semantic Kernel framework, developers can access the same AI integration and orchestration patterns used by Microsoft’s copilots in their own applications. For more information on how Microsoft utilizes AI models and SK in Copilots, refer to Kevin Scott’s Microsoft Build 2023 talk, The Era of the AI Copilot.

Now, let’s understand how to convert these concepts (agents, copilots, and RAG, which was discussed in Chapter 4) into tactical solutions using frameworks such as Semantic Kernel, LangChain, and LlamaIndex.


Generative AI application development frameworks

In this section, we will focus on the popular generative AI-based app development frameworks used by developers today in their applications, as they add functionality and extensibility to LLMs.

But why do we need to modernize existing ML applications to use intelligent generative AI in the first place? Let’s compare and contrast application characteristics without generative AI and the modernized applications infused with generative AI.

Current ML applications commonly share the following limitations:

  • They are constrained in their interactions, especially with generative AI services.
  • They are hard-coded and usually rely on a fixed dataset. For example, one can leverage certain datasets to train certain ML models, and those models then remain fixed.
  • If developers want to change a model within an application, or change the dataset entirely, they need to retrain the model, which is a challenge because of increased costs and increased time to completion.
  • Adding enhancements or features likewise requires retraining the model, which is complex, time-consuming, and costly.

However, with intelligent generative AI applications that use the techniques described in this chapter, you can do the following:

  • Use natural language interactions. We have seen this in ChatGPT and other applications, where one can begin chatting as if there is an actual human or assistant. In addition to just using natural language to interact with generative AI applications, you can easily have your own personalized experiences based on human-like characteristics, such as personas and emotional tones, within an interactive session.
  • Generate data-driven and personalized experiences tailored to a user or set of users. Additionally, these applications can improve over time, autonomously using past experiences.
  • Instead of a longer, time-consuming process of traditional software development, you can quickly deliver new features and product enhancements.

As you can see, intelligent generative AI applications enable us to create solutions and address problems we never could before, and at a pace we have never seen before. Now let’s turn our attention to some modern application development frameworks that can help us implement these new and sophisticated features.

Semantic Kernel

However, let’s now take a step back and understand why we want to use SK to create natural language interfaces, chatbots, or natural language programming systems in the first place. Consider LLMs as the engine powering generative AI applications, while SK acts as the assembly line, integrating various generative AI services. For software developers, the reusability of code, be it functions or snippets, is crucial to streamline development processes. Furthermore, for expansive organizational applications, the efficient management of prompts, completions, and other agent-specific data is not just an operational preference but a fundamental business necessity. SK emerges as a pivotal framework, enabling the construction of durable and comprehensive generative AI applications by seamlessly integrating these essential facets.

Important note

For LLMs, the engine alone is not able to meet these business requirements any more than an engine without oil, gasoline, or electricity is able to meet a driver’s requirements of providing transportation. You need additional software code to provide a solution, not just the LLMs, and generative AI programming frameworks, such as SK, allow you to accomplish this. You are building around the engine to provide transportation, and you are building around LLMs to provide a generative AI solution.

For a real-world example, let’s use the company Microsoft. As mentioned earlier, Microsoft itself has embraced the SK framework across its organization, exemplifying its wide applicability and effectiveness. This integration is particularly evident in their next-generation AI-integrated offerings, called “Copilots.” These Copilots harness the capabilities of LLMs, alongside your data and other Microsoft applications, including the Microsoft 365 suite (Word, Excel, and more). All of these components are seamlessly integrated using the SK framework, showcasing a sophisticated and powerful example of AI-enhanced productivity tools.

Additionally, later in this chapter, we’ll show an actual use case of how a Fortune 500 company transformed their development team and, thus, their applications into state-of-the-art, modern, generative AI-ready applications and solutions using SK.

If you would like to see more details on SK, you can visit the repository, microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps, at https://github.com/microsoft/semantic-kernel.
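As a rough, illustrative sketch only, assuming an early 0.x release of the Semantic Kernel Python SDK (the API has changed considerably across versions, so treat the exact calls and import paths as assumptions), wiring an LLM into SK and reusing a prompt as a function looks roughly like this:

import semantic_kernel as sk
# Connector import path as in early 0.x releases of the SDK (an assumption)
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

kernel = sk.Kernel()

# Register a chat model with the kernel (key handling is illustrative)
kernel.add_chat_service("chat", OpenAIChatCompletion("gpt-3.5-turbo", api_key="sk-..."))

# Package a prompt template as a reusable "semantic function"
summarize = kernel.create_semantic_function("Summarize in one sentence: {{$input}}")

print(summarize("Semantic Kernel acts as an AI orchestrator between LLMs, plugins, and your application code."))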

Figure 6.3 provides a high-level visual demonstration of the role of SK as an AI orchestrator between LLMs, AI infrastructure, copilots, and plugins in the Microsoft Copilot system:

Figure 6.3 – Role of SK as an AI orchestrator in Microsoft Copilot system


Assistants API

The Assistants API (introduced by OpenAI in late 2023) allows you to build AI agents with minimal code on top of OpenAI’s chat completion models. This API will soon be integrated into Semantic Kernel to build agent-like experiences, as mentioned in a blog by Microsoft (https://devblogs.microsoft.com/semantic-kernel/assistants-the-future-of-semantic-kernel/).

This API helps developers build high-quality copilot-like experiences in their own applications. As discussed earlier, copilots are AI assistants integrated into applications to help address questions or provide instructional steps that help the user achieve more complex tasks.

Previously, creating custom AI assistants required a lot of work, even for skilled developers. The chat completions API from OpenAI is easy to use and powerful, but it is not stateful, which meant developers had to manage conversation state and chat threads, tool integrations, document retrieval, and indexes, all by running code manually. In OpenAI’s evolution, the Assistants API is the stateful version of the chat completions API, and it offers a solution to these problems.

It is now easier than ever to build customizable, specific generative AI applications and services that can search through data, propose solutions, and automate tasks. The Assistants API supports persistent and unlimited (infinitely long) threads. This means that, as a developer, you do not need to build a thread state management system or work around a model’s context window limitations: you simply add new messages to a thread as users reply (prompts and completions). The Assistants API can also access files in different formats, either when creating an assistant or as part of threads. Assistants can also access multiple tools as needed; see the sketch after this list. Some example tools include the following:

  • Function calling: The Assistants API can call an existing function or code subroutine. With the Assistants API, your assistant can learn what your app or external APIs do, choose the right time to call those functions, and use the function(s) in response to messages or other behavior.
  • Code interpreter: With the code interpreter tool from OpenAI/Azure OpenAI Service, assistants can write and execute code, such as Python code, in a sandboxed environment. You can use it for various purposes, such as solving difficult code and math problems step by step, performing advanced data analysis on user-uploaded files in different formats, and creating data visualizations such as reports, charts, and graphs. The Assistants API can invoke the code interpreter as needed or as directed.
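The following minimal sketch uses the beta Assistants API in the OpenAI Python SDK (>= 1.x); the assistant name, instructions, question, and model name are illustrative:

import time

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Create an assistant with the code interpreter tool enabled
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a math tutor. Write and run code to answer questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-turbo-preview",
)

# 2. Create a persistent thread and add a user message to it
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Solve 3x + 11 = 14 and show your work.",
)

# 3. Run the assistant on the thread and poll until the run reaches a terminal state
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# 4. Read the assistant's reply (messages are returned newest first)
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)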