<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Posts on Slobodan Stojanović</title>
        <link>https://slobodan.me/posts/</link>
        <description>Recent content in Posts on Slobodan Stojanović</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Wed, 17 Sep 2025 17:00:00 +0200</lastBuildDate>
        <atom:link href="https://slobodan.me/posts/index.xml" rel="self" type="application/rss+xml" />
        
        <item>
            <title>Employees, AI, and AI employees</title>
            <link>https://slobodan.me/posts/employees-ai-and-ai-employees/</link>
            <pubDate>Wed, 17 Sep 2025 17:00:00 +0200</pubDate>
            
            <guid>https://slobodan.me/posts/employees-ai-and-ai-employees/</guid>
            <description>&lt;p&gt;Two and a half years ago, my cofounder, Lav, and I tried to create an AI cofounder (CofounderGPT, as we called it), had a lot of fun, and failed. That&amp;rsquo;s not the only thing we&amp;rsquo;ve tried using AI; far from it. We use it daily for a variety of tasks, including (but not limited to) research, coding, marketing, managing and improving our product, creating new products, and for our hobbies. However, the AI cofounder project still runs somewhere in the back of our heads. Can we, or anyone, create it at some point in the future? And, what would be the impact on our products, employees, friends, and everyone if it gets created?&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Employees, AI, and AI employees&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/employees-and-ai-employees.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;This is not another AGI discussion (well, it&amp;rsquo;s a monologue). I do not believe the evil artificial intelligence will threaten us all in the near future. However, if you continue reading past this point, expect a long article with diagrams and (mostly wrong) opinions on AI, employees, and cofounders.&lt;/p&gt;
&lt;h2 id=&#34;a-tale-of-an-impossible-ai-cofounder&#34;&gt;A tale of an impossible AI cofounder&lt;/h2&gt;
&lt;p&gt;I do not understand things very well until I can visualize the way they work. In tech, the easiest way to understand how things work is to try to recreate them or use them to build something.&lt;/p&gt;
&lt;p&gt;From the first moment I tried using LLMs, it was clear that they are amazing and have huge potential. Lav and I kept sending each other amazing demos we saw on Twitter. Then we tried recreating them just for fun and learning purposes. Most ended up with facepalming and &amp;ldquo;of course, it does not work except for this specific case in the demo!&amp;rdquo; But we learned a lot.&lt;/p&gt;
&lt;p&gt;Then, one day, we decided to try building a product using AI (as if building one product were not hard enough). And to make things even more interesting, we decided to use AI as a cofounder and create something without any employees or external help (while still running and working on &lt;a href=&#34;https://vacationtracker.io/&#34;&gt;Vacation Tracker&lt;/a&gt; full-time). This decision set us on a journey of 36 working days with CofounderGPT (over 5 months), which was fully documented here: &lt;a href=&#34;https://knowlo.co/blog/day-1-startup-adventure-with-cofoundergpt/&#34;&gt;https://knowlo.co/blog/day-1-startup-adventure-with-cofoundergpt/&lt;/a&gt;. And then, we just stopped. From the very beginning, it was clear that this experiment would fail. It was impossible to have a real AI cofounder. However, we continued because it was fun and we learned a lot.&lt;/p&gt;
&lt;p&gt;But why was it impossible? Were we too early? Would it be different today with GPT 5 Pro, Claude Opus 4.1, Cursor, reasoning models, and all the other amazing tools we have access to? It would be different for sure. And, for a long time, we would have the illusion that our experiment might succeed.&lt;/p&gt;
&lt;p&gt;There were and still are many problems with the potential AI cofounder. Some are very obvious, such as the context window (the amount of data an AI can keep in its memory), reasoning (the ability to think through and analyze problems), and available tools (the things an AI cofounder can use to do its job). All these things are getting better and better. Just look at Cursor, Claude Code, OpenAI Codex, and other coding AI tools today. The improvement of AI coding abilities over the past few years is so big that it&amp;rsquo;s not even measurable anymore.&lt;/p&gt;
&lt;p&gt;However, a cofounder is more than a set of skills. Cofounders have visions, motivation, passion, and many other qualities that an AI cannot replicate. They make decisions (which is easy for an AI) and accept the potential consequences of these decisions (which is not even close to possible with LLMs and the current state of AI). And they have emotions and gut feelings.&lt;/p&gt;
&lt;p&gt;To be honest, a true AI cofounder would not be possible from a legal perspective at the moment. However, the focus was on cofounder skills, not legal matters (they are not fun anyway).&lt;/p&gt;
&lt;p&gt;Ok, let&amp;rsquo;s say that cofounders are a bit out of reach of the AI in its current stage. But what about employees? Will AI replace developers and other employees in the near future?&lt;/p&gt;
&lt;p&gt;To understand that, let&amp;rsquo;s rewind a bit and try to understand how people do things.&lt;/p&gt;
&lt;h2 id=&#34;a-case-of-ai-employees&#34;&gt;A case of AI employees&lt;/h2&gt;
&lt;p&gt;I enjoy using AI for coding, research, and product management. Writing code or exploring and building a feature prototype has never been easier. I feel a lot more productive. Everything moves so fast! New tools, new models, better practices. Does that mean that we&amp;rsquo;ll have full-time AI developers replacing actual people soon?&lt;/p&gt;
&lt;p&gt;If you compare data center vs. office space construction in the US, it seems so. The construction of space for GPUs and our future digital employees is rising rapidly.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Data Center vs Office Construction chart&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/data-centers-vs-office-construction.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Source: &lt;a href=&#34;https://x.com/JosephPolitano/status/1951740903925715126&#34;&gt;https://x.com/JosephPolitano/status/1951740903925715126&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But, of course, it&amp;rsquo;s not that simple. To understand why, we need to go back to the core goal of each company: to make money. Whatever a company does, it needs to generate revenue to survive. Even hot new startups that focus on market capture instead of profits need to make money through investments.&lt;/p&gt;
&lt;p&gt;How do companies make money? By selling products or services (providing value) to entities that can pay. So, companies hire employees to do things that lead to creating and selling products or services (directly, such as sales and programming, or indirectly, such as management and HR).&lt;/p&gt;
&lt;p&gt;So, our oversimplified value chain could look similar to the following.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Companies need to make money.&lt;/li&gt;
&lt;li&gt;Companies make money by creating and selling products or services (value) to entities capable of purchasing them.&lt;/li&gt;
&lt;li&gt;Companies need employees to create and sell products or services.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img alt=&#34;Oversimplified company value chain&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/company-value-chain.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Obviously, it&amp;rsquo;s more complex than that. But if we focus on the employees, we can simplify even more and say:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Companies need to make money.&lt;/li&gt;
&lt;li&gt;Companies make money by doing something (creating, selling, etc.).&lt;/li&gt;
&lt;li&gt;Companies need employees to be able to do that &amp;ldquo;something.&amp;rdquo;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;But what type of employees are we talking about here? It&amp;rsquo;s quite different when we talk about factory workers or product managers and software developers. Or is it? On a long enough timeline, all jobs will either become obsolete and disappear or become fully automated.&lt;/p&gt;
&lt;p&gt;But let&amp;rsquo;s focus on what we call &lt;em&gt;knowledge workers&lt;/em&gt;. &amp;ldquo;Knowledge workers are professionals whose primary asset is their mental skills and expertise, used for critical thinking, problem-solving, and generating new information rather than just performing manual tasks.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;How do knowledge workers do their jobs? The answer is obvious: it depends on the actual job. Their main asset is their mental skills and expertise, but they use various tools to accomplish more and amplify their skills.&lt;/p&gt;
&lt;p&gt;Initially, knowledge workers did everything manually, but eventually, they or someone else created a variety of tools to automate and simplify parts of their jobs, enabling them to complete more tasks and reduce the number of mistakes. For example, software developers have frameworks, linters, and many other tools. But we now have AI, too! Where do Cursor, Claude Code, OpenAI Codex, and other tools fit?&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s try to visualize this evolution. First, we&amp;rsquo;ll take our company value chain and add another axis to it, which represents an evolution of all these components (an X-axis). We&amp;rsquo;ll get something similar to the diagram below. Actually, this diagram is a special type of map, known as a &lt;a href=&#34;https://www.wardleymaps.com/&#34;&gt;Wardley Map&lt;/a&gt;. The Y-axis represents a value chain with a user (or an anchor) at the top (Company) and all of its needs (company needs, interaction channels, activities, etc.) below. Needs higher on the vertical axis are more visible and important in the context of the map than those lower in the value chain.&lt;/p&gt;
&lt;p&gt;Wardley Maps define the following stages of the evolution:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Genesis&lt;/strong&gt; - Novel, uncertain, constantly changing&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Custom Built&lt;/strong&gt; - Growing, becoming more defined&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Product&lt;/strong&gt; - Stable, well-defined, widely available&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Commodity&lt;/strong&gt; - Ubiquitous, standardized, utility-like&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It&amp;rsquo;s a bit hard to fit employees in these evolution stages. We do not think about human evolution in the same way. However, we could fit the work they do.&lt;/p&gt;
&lt;p&gt;Initially, we had knowledge workers doing everything manually, which was the &lt;em&gt;genesis&lt;/em&gt; stage, because it was novel. Then knowledge workers started using tools to amplify their skills, which was (and still is) the &lt;em&gt;custom built&lt;/em&gt; stage. This may be a bit confusing because the actual tools they (or we) use are often in the product or even commodity stage, but the way we use them is still custom, and it&amp;rsquo;s slowly becoming more defined. Finally, we see these amazing AI tools that are still not stable enough, but they are slowly moving knowledge workers to the &lt;em&gt;product&lt;/em&gt; stage: an employee + AI automation will become well-defined and widely available over time. But what&amp;rsquo;s next? What would be the actual commodity version of a knowledge worker? A fully automated (AI) employee, I guess.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;A Wardley map showing an evolution of employees&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/evolution-of-employees.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Scary, right?&lt;/p&gt;
&lt;p&gt;It is! Luckily, it&amp;rsquo;s as true as the Windows updater telling you it&amp;rsquo;s 99% done.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Update progress bar stuck at 99%&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/almost-updated.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, is AI coming for our jobs? And if it does, when?&lt;/p&gt;
&lt;h2 id=&#34;is-ai-coming-for-our-jobs&#34;&gt;Is AI coming for our jobs?&lt;/h2&gt;
&lt;p&gt;The only proper way to answer this question is to define what our job actually is. As I already said, if we look far enough in the future, all currently known jobs will eventually become either obsolete or fully automated. But we don&amp;rsquo;t really care about thousands of years from now. We care about the relatively near future (decades, I guess).&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s take software developers as an example. If you are a software developer and you think your job is to write code, then I have bad news.&lt;/p&gt;
&lt;p&gt;Luckily, businesses rarely care about code (yet some care even about GitHub stars, I know). Ok, but what are the actual responsibilities of a software developer, then? Companies need to make money, so it must be something related to earning more money or spending less money. They need to use their mental skills and expertise to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Intent &amp;amp; prioritization&lt;/strong&gt;: Decide which problems are worth solving, define success metrics (revenue, retention, service levels), identify constraints, and set sequencing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Boundaries &amp;amp; contracts&lt;/strong&gt;: Shape the domain model and API/data contracts to keep change cheap and risk contained.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code &amp;amp; implementation&lt;/strong&gt;: Turn intent into working software: write code and tests, integrate with existing systems, and document expected behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify &amp;amp; operate (reliability &amp;amp; maintenance)&lt;/strong&gt;: Verify changes using tests and controlled rollouts, run the system in production with monitoring, alerting, error tracking, incident response, rollbacks, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;: Define trust boundaries, access controls, and data‑handling rules; manage vulnerabilities, exceptions, and risk acceptance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delivery &amp;amp; pipelines&lt;/strong&gt;: Automate build/test/deploy/rollback, with quality and compliance checks, ensuring changes ship safely and predictably.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Long‑term strategy &amp;amp; evolution&lt;/strong&gt;: Choose a target architecture and evolve it in alignment with the business strategy; manage tech debt and total cost of ownership.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Obviously, these depend on the software engineer&amp;rsquo;s expertise. It&amp;rsquo;s not the same for frontend and backend engineers, juniors and seniors, etc. It also depends on the person who is preparing the list. Finally, it depends on the LLM and model we use to analyze the list of essential software engineer activities. Yes, of course, we use LLMs to assist us with almost everything today.&lt;/p&gt;
&lt;p&gt;AI affects most of these activities, but some have a high potential to be fully (or mostly) replaced by AI, and other activities will be augmented by AI. Let&amp;rsquo;s see activity by activity:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Intent &amp;amp; prioritization&lt;/strong&gt;: AI can help you find relevant information faster and help you evaluate everything you need to know (you can run potential scenarios through brainstorming-like sessions, etc.). It could help with KPIs, ROI, documentation, and many other things. But, it can&amp;rsquo;t replace your judgment, among other important things. AI could help, but it&amp;rsquo;s, again, up to you to use and apply that help and come up with something that makes sense in each specific case (so, it&amp;rsquo;s still in the custom built phase).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Boundaries &amp;amp; contracts&lt;/strong&gt;: API schemas, payloads, examples, scaffolds, and similar things are (or soon will be) a commodity with AI. These will be mostly automated. You&amp;rsquo;ll still need to define domain boundaries and understand the domain, but this will become more well-defined over time, so I think it&amp;rsquo;s closer to the product phase.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code &amp;amp; implementation&lt;/strong&gt;: This one seems obvious: AI already writes code well enough. Over time, you&amp;rsquo;ll write code only in specific cases, such as very novel logic, examples of tricky integrations, or in cases when you need to gain a very high level of performance in your area of expertise (and when I say expertise, I mean you are one of the best, not just good). So, this goes to the commodity stage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verify &amp;amp; operate (reliability &amp;amp; maintenance)&lt;/strong&gt;: You still need to make decisions, but everything gets easier to decide (AI does triage, correlates logs and traces, sends PRs, suggests decisions). I would say that this goes to the product phase and gradually shifts to the commodity phase over the years.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;: Something similar might happen to security. It&amp;rsquo;s still crucial for you to make a decision and have the final word, but applications are continually improving. You make decisions, and the app provides what you need. Well-defined, widely available =&amp;gt; product phase.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delivery &amp;amp; pipelines&lt;/strong&gt;: This is another area that could easily be dominated by AI in the relatively near future. You might still need to think about the release strategy and give final approvals in some cases, but CI/CD, versioning, changelogs, canaries, rollbacks, and similar tasks are strong candidates for full AI automation (shifts to the commodity stage).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Long‑term strategy &amp;amp; evolution&lt;/strong&gt;: Similar to intent and prioritization, AI will be able to give you all the critical information and assistance you might need. For example, codebase analytics, refactor opportunities (or even PRs), potential impact analysis, and different potential paths. However, it&amp;rsquo;s still up to you, your understanding of the goal, your judgment, and your team to use these and come up with something that makes sense for the current and desired state of the business you are working on (or for). This stays in the custom built phase.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here&amp;rsquo;s the updated map:&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Evolution of software developers&#34; src=&#34;https://slobodan.me/images/posts/employees-ai-and-ai-employees/software-developers-map.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, what do you say, is AI coming for software engineer jobs? AI has high potential to eliminate the typing aspect of the job really fast, but a software developer&amp;rsquo;s job is not typing code. It&amp;rsquo;s problem-solving through thinking, judgment, communication, and writing instructions for computers to follow.&lt;/p&gt;
&lt;p&gt;We are going in circles. Now that we can get the code faster, we started talking about better specifications, then we&amp;rsquo;ll focus on tests, and so on. The only important difference is that, as my good friend said recently, these circles are not actually circles; they are a big spiral, and everything goes faster and faster.&lt;/p&gt;
&lt;h2 id=&#34;focused-and-recharged-super-employee&#34;&gt;Focused and recharged super employee&lt;/h2&gt;
&lt;p&gt;If AI automates most of the typing aspect of the software engineering job, we can expect something similar for other knowledge workers sooner or later. However, that does not sound so scary if you remember that the main asset of knowledge workers is their mental skills and expertise, used for critical thinking, problem-solving, and generating new information rather than just performing manual tasks.&lt;/p&gt;
&lt;p&gt;We will have more time to devote to the essential things, such as thinking and judgment. Ok, that sounds a bit scary.&lt;/p&gt;
&lt;p&gt;However, there&amp;rsquo;s one problem with that: you can write code 14 hours per day, but you can&amp;rsquo;t really make good decisions and process large amounts of high-quality information that long.&lt;/p&gt;
&lt;p&gt;You need to be focused and rested.&lt;/p&gt;
&lt;p&gt;Are you a cofounder, software developer, marketing manager, product manager, or do you have some other role? It does not matter. What matters is that you have enough focus and mental power to review all generated code, prepare prototypes, write instructions (prompts), orchestrate all agents, and decide what matters. These activities burn your energy faster than AI burns tokens. You can&amp;rsquo;t do that and have 10 meetings a day. Or work long hours and weekends. You need your battery at 100%.&lt;/p&gt;
&lt;p&gt;How do you recharge and get focused? That&amp;rsquo;s different for each of us. However, skipping meetings that do not provide value, replying to non-urgent Slack messages or emails the next working day, closing your laptop after you finish your workday, and taking PTO are definitely helpful.&lt;/p&gt;
</description>
            <content type="html"><![CDATA[<p>Two and a half years ago, my cofounder, Lav, and I tried to create an AI cofounder (CofounderGPT, as we called it), had a lot of fun, and failed. That&rsquo;s not the only thing we&rsquo;ve tried using AI; far from it. We use it daily for a variety of tasks, including (but not limited to) research, coding, marketing, managing and improving our product, creating new products, and for our hobbies. However, the AI cofounder project still runs somewhere in the back of our heads. Can we, or anyone, create it at some point in the future? And, what would be the impact on our products, employees, friends, and everyone if it gets created?</p>
<p><img alt="Employees, AI, and AI employees" src="/images/posts/employees-ai-and-ai-employees/employees-and-ai-employees.png"></p>
<p>This is not another AGI discussion (well, it&rsquo;s a monologue). I do not believe the evil artificial intelligence will threaten us all in the near future. However, if you continue reading past this point, expect a long article with diagrams and (mostly wrong) opinions on AI, employees, and cofounders.</p>
<h2 id="a-tale-of-an-impossible-ai-cofounder">A tale of an impossible AI cofounder</h2>
<p>I do not understand things very well until I can visualize the way they work. In tech, the easiest way to understand how things work is to try to recreate them or use them to build something.</p>
<p>From the first moment I tried using LLMs, it was clear that they are amazing and have huge potential. Lav and I kept sending each other amazing demos we saw on Twitter. Then we tried recreating them just for fun and learning purposes. Most ended up with facepalming and &ldquo;of course, it does not work except for this specific case in the demo!&rdquo; But we learned a lot.</p>
<p>Then, one day, we decided to try building a product using AI (as if building one product were not hard enough). And to make things even more interesting, we decided to use AI as a cofounder and create something without any employees or external help (while still running and working on <a href="https://vacationtracker.io/">Vacation Tracker</a> full-time). This decision set us on a journey of 36 working days with CofounderGPT (over 5 months), which was fully documented here: <a href="https://knowlo.co/blog/day-1-startup-adventure-with-cofoundergpt/">https://knowlo.co/blog/day-1-startup-adventure-with-cofoundergpt/</a>. And then, we just stopped. From the very beginning, it was clear that this experiment would fail. It was impossible to have a real AI cofounder. However, we continued because it was fun and we learned a lot.</p>
<p>But why was it impossible? Were we too early? Would it be different today with GPT 5 Pro, Claude Opus 4.1, Cursor, reasoning models, and all the other amazing tools we have access to? It would be different for sure. And, for a long time, we would have the illusion that our experiment might succeed.</p>
<p>There were and still are many problems with the potential AI cofounder. Some are very obvious, such as the context window (the amount of data an AI can keep in its memory), reasoning (the ability to think through and analyze problems), and available tools (the things an AI cofounder can use to do its job). All these things are getting better and better. Just look at Cursor, Claude Code, OpenAI Codex, and other coding AI tools today. The improvement of AI coding abilities over the past few years is so big that it&rsquo;s not even measurable anymore.</p>
<p>However, a cofounder is more than a set of skills. Cofounders have visions, motivation, passion, and many other qualities that an AI cannot replicate. They make decisions (which is easy for an AI) and accept the potential consequences of these decisions (which is not even close to possible with LLMs and the current state of AI). And they have emotions and gut feelings.</p>
<p>To be honest, a true AI cofounder would not be possible from a legal perspective at the moment. However, the focus was on cofounder skills, not legal matters (they are not fun anyway).</p>
<p>Ok, let&rsquo;s say that cofounders are a bit out of reach of the AI in its current stage. But what about employees? Will AI replace developers and other employees in the near future?</p>
<p>To understand that, let&rsquo;s rewind a bit and try to understand how people do things.</p>
<h2 id="a-case-of-ai-employees">A case of AI employees</h2>
<p>I enjoy using AI for coding, research, and product management. Writing code or exploring and building a feature prototype has never been easier. I feel a lot more productive. Everything moves so fast! New tools, new models, better practices. Does that mean that we&rsquo;ll have full-time AI developers replacing actual people soon?</p>
<p>If you compare data center vs. office space construction in the US, it seems so. The construction of space for GPUs and our future digital employees is rising rapidly.</p>
<p><img alt="Data Center vs Office Construction chart" src="/images/posts/employees-ai-and-ai-employees/data-centers-vs-office-construction.png"></p>
<p>Source: <a href="https://x.com/JosephPolitano/status/1951740903925715126">https://x.com/JosephPolitano/status/1951740903925715126</a>.</p>
<p>But, of course, it&rsquo;s not that simple. To understand why, we need to go back to the core goal of each company: to make money. Whatever a company does, it needs to generate revenue to survive. Even hot new startups that focus on market capture instead of profits need to make money through investments.</p>
<p>How do companies make money? By selling products or services (providing value) to entities that can pay. So, companies hire employees to do things that lead to creating and selling products or services (directly, such as sales and programming, or indirectly, such as management and HR).</p>
<p>So, our oversimplified value chain could look similar to the following.</p>
<ol>
<li>Companies need to make money.</li>
<li>Companies make money by creating and selling products or services (value) to entities capable of purchasing them.</li>
<li>Companies need employees to create and sell products or services.</li>
</ol>
<p><img alt="Oversimplified company value chain" src="/images/posts/employees-ai-and-ai-employees/company-value-chain.png"></p>
<p>Obviously, it&rsquo;s more complex than that. But if we focus on the employees, we can simplify even more and say:</p>
<ol>
<li>Companies need to make money.</li>
<li>Companies make money by doing something (creating, selling, etc.).</li>
<li>Companies need employees to be able to do that &ldquo;something.&rdquo;</li>
</ol>
<p>But what type of employees are we talking about here? It&rsquo;s quite different when we talk about factory workers or product managers and software developers. Or is it? On a long enough timeline, all jobs will either become obsolete and disappear or become fully automated.</p>
<p>But let&rsquo;s focus on what we call <em>knowledge workers</em>. &ldquo;Knowledge workers are professionals whose primary asset is their mental skills and expertise, used for critical thinking, problem-solving, and generating new information rather than just performing manual tasks.&rdquo;</p>
<p>How do knowledge workers do their jobs? The answer is obvious: it depends on the actual job. Their main asset is their mental skills and expertise, but they use various tools to accomplish more and amplify their skills.</p>
<p>Initially, knowledge workers did everything manually, but eventually, they or someone else created a variety of tools to automate and simplify parts of their jobs, enabling them to complete more tasks and reduce the number of mistakes. For example, software developers have frameworks, linters, and many other tools. But we now have AI, too! Where do Cursor, Claude Code, OpenAI Codex, and other tools fit?</p>
<p>Let&rsquo;s try to visualize this evolution. First, we&rsquo;ll take our company value chain and add another axis to it, which represents an evolution of all these components (an X-axis). We&rsquo;ll get something similar to the diagram below. Actually, this diagram is a special type of map, known as a <a href="https://www.wardleymaps.com/">Wardley Map</a>. The Y-axis represents a value chain with a user (or an anchor) at the top (Company) and all of its needs (company needs, interaction channels, activities, etc.) below. Needs higher on the vertical axis are more visible and important in the context of the map than those lower in the value chain.</p>
<p>Wardley Maps define the following stages of the evolution:</p>
<ol>
<li><strong>Genesis</strong> - Novel, uncertain, constantly changing</li>
<li><strong>Custom Built</strong> - Growing, becoming more defined</li>
<li><strong>Product</strong> - Stable, well-defined, widely available</li>
<li><strong>Commodity</strong> - Ubiquitous, standardized, utility-like</li>
</ol>
<p>It&rsquo;s a bit hard to fit employees in these evolution stages. We do not think about human evolution in the same way. However, we could fit the work they do.</p>
<p>Initially, we had knowledge workers doing everything manually, which was the <em>genesis</em> stage, because it was novel. Then knowledge workers started using tools to amplify their skills, which was (and still is) the <em>custom built</em> stage. This may be a bit confusing because the actual tools they (or we) use are often in the product or even commodity stage, but the way we use them is still custom, and it&rsquo;s slowly becoming more defined. Finally, we see these amazing AI tools that are still not stable enough, but they are slowly moving knowledge workers to the <em>product</em> stage: an employee + AI automation will become well-defined and widely available over time. But what&rsquo;s next? What would be the actual commodity version of a knowledge worker? A fully automated (AI) employee, I guess.</p>
<p><img alt="A Wardley map showing an evolution of employees" src="/images/posts/employees-ai-and-ai-employees/evolution-of-employees.png"></p>
<p>Scary, right?</p>
<p>It is! Luckily, it&rsquo;s as true as the Windows updater telling you it&rsquo;s 99% done.</p>
<p><img alt="Update progress bar stuck at 99%" src="/images/posts/employees-ai-and-ai-employees/almost-updated.png"></p>
<p>So, is AI coming for our jobs? And if it does, when?</p>
<h2 id="is-ai-coming-for-our-jobs">Is AI coming for our jobs?</h2>
<p>The only proper way to answer this question is to define what our job actually is. As I already said, if we look far enough in the future, all currently known jobs will eventually become either obsolete or fully automated. But we don&rsquo;t really care about thousands of years from now. We care about the relatively near future (decades, I guess).</p>
<p>Let&rsquo;s take software developers as an example. If you are a software developer and you think your job is to write code, then I have bad news.</p>
<p>Luckily, businesses rarely care about code (yet some care even about GitHub stars, I know). Ok, but what are the actual responsibilities of a software developer, then? Companies need to make money, so it must be something related to earning more money or spending less money. Software developers need to use their mental skills and expertise to:</p>
<ul>
<li><strong>Intent &amp; prioritization</strong>: Decide which problems are worth solving, define success metrics (revenue, retention, service levels), identify constraints, and set sequencing.</li>
<li><strong>Boundaries &amp; contracts</strong>: Shape the domain model and API/data contracts to keep change cheap and risk contained.</li>
<li><strong>Code &amp; implementation</strong>: Turn intent into working software: write code and tests, integrate with existing systems, and document expected behavior.</li>
<li><strong>Verify &amp; operate (reliability &amp; maintenance)</strong>: Verify changes using tests and controlled rollouts, run the system in production with monitoring, alerting, error tracking, incident response, rollbacks, etc.</li>
<li><strong>Security</strong>: Define trust boundaries, access controls, and data‑handling rules; manage vulnerabilities, exceptions, and risk acceptance.</li>
<li><strong>Delivery &amp; pipelines</strong>: Automate build/test/deploy/rollback, with quality and compliance checks, ensuring changes ship safely and predictably.</li>
<li><strong>Long‑term strategy &amp; evolution</strong>: Choose a target architecture and evolve it in alignment with the business strategy; manage tech debt and total cost of ownership.</li>
</ul>
<p>Obviously, these depend on the software engineer&rsquo;s expertise. It&rsquo;s not the same for frontend and backend engineers, juniors and seniors, etc. It also depends on the person who is preparing the list. Finally, it depends on the LLM and model we use to analyze the list of essential software engineer activities. Yes, of course, we use LLMs to assist us with almost everything today.</p>
<p>AI affects most of these activities, but some have a high potential to be fully (or mostly) replaced by AI, and other activities will be augmented by AI. Let&rsquo;s see activity by activity:</p>
<ul>
<li><strong>Intent &amp; prioritization</strong>: AI can help you find relevant information faster and help you evaluate everything you need to know (you can run potential scenarios through brainstorming-like sessions, etc.). It could help with KPIs, ROI, documentation, and many other things. But it can&rsquo;t replace your judgment, among other important things. AI can help, but it&rsquo;s still up to you to apply that help and come up with something that makes sense in each specific case (so, it&rsquo;s still in the custom build phase).</li>
<li><strong>Boundaries &amp; contracts</strong>: API schemas, payloads, examples, scaffolds, and similar things are (or soon will be) a commodity with AI. These will be mostly automated. You&rsquo;ll still need to define domain boundaries and understand the domain, but this will get more well-defined over time, so I think it&rsquo;s closer to the product phase.</li>
<li><strong>Code &amp; implementation</strong>: This one seems obvious: AI already writes code well enough. Over time, you&rsquo;ll write code only in specific cases, such as very novel logic, examples of tricky integrations, or in cases when you need to gain a very high level of performance in your area of expertise (and when I say expertise, I mean you are one of the best, not just good). So, this goes to the commodity stage.</li>
<li><strong>Verify &amp; operate (reliability &amp; maintenance)</strong>: You still need to make decisions, but everything gets easier to decide (AI does triage, correlates logs and traces, sends PRs, suggests decisions). I would say that this goes to the product phase and gradually shifts to the commodity phase over the years.</li>
<li><strong>Security</strong>: Something similar might happen to security. It&rsquo;s still crucial for you to make a decision and have the final word, but applications are continually improving. You make decisions, and the app provides what you need. Well-defined, widely available =&gt; product phase.</li>
<li><strong>Delivery &amp; pipelines</strong>: This is another area that we can see easily dominated by AI in the relatively near future. You might still need to think about the release strategy and to give final approvals in some cases, but CI/CD, versioning, changelogs, canaries, rollbacks, and similar are strong candidates for the full AI automation (shifts to the commodity stage).</li>
<li><strong>Long‑term strategy &amp; evolution</strong>: Similarly to intent and prioritization, AI will be able to give you all the critical information and assistance you might need, for example, codebase analytics, refactoring opportunities (or even PRs), potential impact analysis, and different potential paths. However, it&rsquo;s still up to you, your understanding of the goal, your judgment, and your team to use these and come up with something that makes sense for the current and desired state of the business you are working on (or for). This stays in the custom build phase.</li>
</ul>
<p>Here&rsquo;s the updated map:</p>
<p><img alt="Evolution of software developers" src="/images/posts/employees-ai-and-ai-employees/software-developers-map.png"></p>
<p>So, what do you say, is AI coming for software engineer jobs? AI has high potential to eliminate the typing aspect of the job really fast, but a software developer&rsquo;s job is not typing code. It&rsquo;s problem-solving through thinking, judgment, communication, and writing instructions for computers to follow.</p>
<p>We are going in circles. Now that we can get the code faster, we started talking about better specifications, then we&rsquo;ll focus on tests, and so on. The only important difference is that, as my good friend said recently, these circles are not actually circles; they are a big spiral, and everything goes faster and faster.</p>
<h2 id="focused-and-recharged-super-employee">Focused and recharged super employee</h2>
<p>If AI automates most of the typing aspect of the software engineering job, we can expect something similar for other knowledge workers sooner or later. However, that does not sound so scary if you remember that the main asset of knowledge workers is their mental skills and expertise, used for critical thinking, problem-solving, and generating new information rather than just performing manual tasks.</p>
<p>We will have more time to devote to the essential things, such as thinking and judgment. Ok, that sounds a bit scary.</p>
<p>However, there&rsquo;s one problem with that: you can write code 14 hours per day, but you can&rsquo;t really make good decisions and process large amounts of high-quality information that long.</p>
<p>You need to be focused and rested.</p>
<p>Are you a cofounder, software developer, marketing manager, product manager, or do you have some other role? It does not matter. What matters is that you have enough focus and mental power to review all generated code, prepare prototypes, write instructions (prompts), orchestrate all agents, and decide what matters. These activities burn your energy faster than AI burns tokens. You can&rsquo;t do that and have 10 meetings a day. Or work long hours and weekends. You need your battery at 100%.</p>
<p>How do you recharge and get focused? That&rsquo;s different for each of us. However, skipping meetings that do not provide value, replying to non-urgent Slack messages or emails the next working day, closing your laptop after you finish your workday, and taking PTO are definitely helpful.</p>
]]></content>
        </item>
        
        <item>
            <title>How AI Agents work and how to build them</title>
            <link>https://slobodan.me/posts/ai-agents/</link>
            <pubDate>Wed, 23 Apr 2025 17:00:00 +0200</pubDate>
            
            <guid>https://slobodan.me/posts/ai-agents/</guid>
            <description>&lt;p&gt;Have you heard about AI Agents? Of course, you heard about them. These are the intelligent agents who will take our jobs in a few years!&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t want to scare you, but someone on Twitter said that &amp;ldquo;most jobs will become obsolete&amp;rdquo; in less than 10 years. McKinsey agrees (they say AI Agents will replace 70% of office work), and Goldman, too.&lt;/p&gt;
&lt;p&gt;So, I guess our clock is ticking. We don&amp;rsquo;t have much time. It&amp;rsquo;s probably better to take a woodworking course or something similar.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/clock-ticking-240.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;But I am not that good at woodworking. So, let&amp;rsquo;s try to understand how AI Agents work and if they are that scary.&lt;/p&gt;
&lt;p&gt;If you read Twitter or LinkedIn, AI Agents look like special agents that can do everything. The demos that people share look amazing.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/special-agent-400.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;However, these agents don&amp;rsquo;t feel that special when you use them. They are helpful for specific cases, just like travel agents. If you give enough details about your desired journey and budget to travel agents, they can find you the vacation you want and plan the entire trip. Just like Cursor! If you give enough details and specific instructions, vibe coding feels like magic (parts of the application start assembling in front of you). In other cases, Cursor feels as smart and useful as Alexa or Siri.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/travel-agent-800.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, AI Agents can be extremely helpful, especially if you understand how they work. But before we can understand AI Agents, we need to understand LLMs.&lt;/p&gt;
&lt;h2 id=&#34;how-llms-work&#34;&gt;How LLMs work&lt;/h2&gt;
&lt;p&gt;Large Language Models or LLMs are very good at predicting the next best set of words based on &lt;em&gt;your input&lt;/em&gt; (questions, part of the text they need to complete, or detailed instructions), their &lt;em&gt;training data&lt;/em&gt; (all the text that creators of the LLM you use were able to use for training, such as books, websites, your private data [just kidding, or am I?], and other datasets), &lt;em&gt;context&lt;/em&gt; (the previous conversation flow or documents you attached), and &lt;em&gt;specific configurations&lt;/em&gt; (such as weights, which prioritize certain word patterns, and settings like temperature, which control the randomness of predictions).&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s use the same example I used in the &amp;ldquo;&lt;a href=&#34;https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/&#34;&gt;5 Prompt Engineering Tips for Developers&lt;/a&gt;&amp;rdquo; article! If you ask an LLM to finish the following sentence: &amp;ldquo;I am speaking at,&amp;rdquo; it&amp;rsquo;ll probably say something such as &amp;ldquo;a business conference,&amp;rdquo; &amp;ldquo;a tech meetup,&amp;rdquo; or &amp;ldquo;a community forum.&amp;rdquo; There&amp;rsquo;s almost zero chance it would say &amp;ldquo;a Martian picnic&amp;rdquo; or &amp;ldquo;a space farmer&amp;rsquo;s market.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/simple-prompt.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;However, if we add a few sentences to the beginning of the instructions (or prompt, as we call it when talking to LLMs) that tell an LLM that it is a playful, chatty cartoon character named &amp;ldquo;Space Bunny,&amp;rdquo; the LLM would not finish the sentence with &amp;ldquo;a tech meetup&amp;rdquo; or similar, but with something closer to a Martian picnic.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/updated-simple-prompt.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;When you talk to an LLM, your question or a set of instructions is called a &lt;em&gt;prompt&lt;/em&gt;. So, prompts are just instructions. You tell an LLM what you want, and it tries to reply based on your input and context, as well as its training set and configuration. If your instructions are clear, there&amp;rsquo;s a higher chance you&amp;rsquo;ll get a helpful reply. However, an LLM will reply even if your instructions do not make sense. In that case and in some other cases, its replies might not be based on truth (we call that hallucinations). Everything related to hallucinations is improving fast, so whatever I write here will probably not be true in a few months.&lt;/p&gt;
&lt;p&gt;So, you give your instructions (or write your prompt if you want to sound smarter), the LLM takes these instructions, spins up some GPUs, burns a small forest, &amp;ldquo;eats&amp;rdquo; some of your tokens, and you get either unexpected wisdom or a hallucination. In the world of LLMs, tokens = money. You burn them like Monopoly money, but the key difference is that LLM tokens are connected to your credit card.&lt;/p&gt;
&lt;p&gt;But how does an LLM know how to reply to your prompt?&lt;/p&gt;
&lt;p&gt;Computers are not that great with words. They prefer numbers. So, an LLM will split your instructions into tokens (yeah, these are the tokens I mentioned above). A token is a set of characters that is sometimes equal to a word, sometimes to the part of the word, and sometimes to a set of letters and other characters such as spaces, dots, commas, etc. The exact number of tokens your instructions have depends on the algorithm the LLM uses. You can see the visualization of OpenAI&amp;rsquo;s tokenizer in the image below or here: &lt;a href=&#34;https://platform.openai.com/tokenizer&#34;&gt;https://platform.openai.com/tokenizer&lt;/a&gt;. You&amp;rsquo;ll get slightly different results based on the model you select.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/openai-tokenizer.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;But tokens are still pieces of text! An LLM represents each of these tokens as a set of numbers (so each token becomes an array of numbers). These numbers are vectors that can be placed in a multidimensional space. The entire training set of an LLM is also transformed into tokens and then vectors and put in the same multidimensional space. The major power of LLMs is their ability to put related words (based on their vast training sets) close to each other in this space.&lt;/p&gt;
&lt;p&gt;For a quick visual example, imagine that each token converts to an array of two numbers (two-dimensional space is easy to visualize). Then, we would be able to put our dots in this space similar to the following image:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/vectors.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Now that an LLM converts your instructions to a set of vectors (an array of arrays of numbers!) and puts them in its multidimensional space, it can use its algorithm to find the closest vectors that might be a good answer to your instructions. LLMs are &lt;strong&gt;Large&lt;/strong&gt; Language Models, meaning they are trained on a massive set of data, which helps them put these vectors in the correct places in the multidimensional space and offer a meaningful answer.&lt;/p&gt;
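&lt;p&gt;To make the two-dimensional picture a bit more concrete, here is a toy sketch. The words and coordinates are invented for illustration (real models use far more dimensions, and the numbers are learned during training), and &lt;code&gt;closestWords&lt;/code&gt; is a hypothetical helper, not something an LLM exposes:&lt;/p&gt;

```javascript
// Toy 2D "vectors". The coordinates are invented for illustration only.
const points = {
  conference: [0.9, 0.8],
  meetup: [0.8, 0.9],
  forum: [0.7, 0.7],
  picnic: [0.1, 0.3],
  Martian: [0.2, 0.1],
};

// Straight-line (Euclidean) distance between two 2D points.
function distance([x1, y1], [x2, y2]) {
  return Math.hypot(x1 - x2, y1 - y2);
}

// Return the `count` words that are closest to the given word.
function closestWords(word, count = 2) {
  return Object.keys(points)
    .filter((other) => other !== word)
    .sort((a, b) => distance(points[word], points[a]) - distance(points[word], points[b]))
    .slice(0, count);
}

console.log(closestWords("conference")); // [ 'meetup', 'forum' ]
```

&lt;p&gt;Words that appear in similar contexts end up near each other, which is why &amp;ldquo;conference&amp;rdquo; lands next to &amp;ldquo;meetup&amp;rdquo; and far from &amp;ldquo;picnic&amp;rdquo; in this made-up space.&lt;/p&gt;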
&lt;p&gt;Luckily, LLMs are products, and like other products, they evolve with user feedback and misusage. So we got many useful features that were not initially available, such as system prompts (parts of the prompts that are more important than the rest of the conversation with an LLM), better coding and JSON skills, etc.&lt;/p&gt;
&lt;h2 id=&#34;an-undercover-agent&#34;&gt;&lt;strong&gt;An undercover agent&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Meet my friend Claude. I ask it many weird questions every day. Claude is nice, so it tries to answer each question in detail politely.&lt;/p&gt;
&lt;p&gt;One day, I asked Claude what the weather was like in Belgrade. I ask both Claude and ChatGPT much weirder questions than that. But this question is special!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-01.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s special because Claude can&amp;rsquo;t answer it. It told me politely that it had no access to real-time weather information. Ah, I forgot that ChatGPT can search the internet, but Claude can&amp;rsquo;t do it yet!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-02.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;It makes sense that Claude does not have an answer to my question: it takes months to train an LLM, so its training data is always older than today&amp;rsquo;s weather. I could ask another LLM to answer my question or simply check my phone. But I like Claude! Can I do something to help it answer this kind of question?&lt;/p&gt;
&lt;p&gt;Can I do a quick Google search for Claude when it needs some real-time data?&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s a weird idea, but let&amp;rsquo;s try it! I&amp;rsquo;ll tell Claude that it should let me know when it needs me to search the internet. Claude can be a bit chatty, so I&amp;rsquo;ll make sure to tell it to provide an exact search phrase I should use. For example, telling me &lt;code&gt;Google:weather in Belgrade today&lt;/code&gt; would be ideal.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-03.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;It seems that my friend Claude likes this game. Let&amp;rsquo;s ask again, &amp;ldquo;What&amp;rsquo;s the weather like in Belgrade today?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-04.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;It worked! Claude provided an exact search query so I can do a Google search and provide a screenshot. The reply was more detailed than I needed it to be, but it did not matter; I understood my assignment.&lt;/p&gt;
&lt;p&gt;I copied the search phrase, opened my browser, and googled it. Then, I took a screenshot of the result and sent it to Claude. And Claude replied with useful information about the current weather in Belgrade!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-05.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Claude definitely liked this game.&lt;/p&gt;
&lt;p&gt;But while I did this just for fun, I accidentally did one more thing &amp;ndash; I just created an AI agent!&lt;/p&gt;
&lt;p&gt;I know it&amp;rsquo;s not a very useful agent, as I could just read the weather data from Google. But it&amp;rsquo;s still an agent.&lt;/p&gt;
&lt;p&gt;I also know ChatGPT can search the internet, so I could use it instead of Claude. But ChatGPT is also an agent! It&amp;rsquo;s just an undercover agent that looks like a plain old LLM. To be fair, Claude is also an agent. Just ask it to draw a diagram or create a webpage for you, and you&amp;rsquo;ll see some superpowers that LLMs do not have.&lt;/p&gt;
&lt;h2 id=&#34;how-ai-agents-work&#34;&gt;How AI Agents work&lt;/h2&gt;
&lt;p&gt;LLMs are amazing! They really are. But like many other tools, they are good at some things but not so good at others.&lt;/p&gt;
&lt;p&gt;For example, LLMs are excellent at picking the best set of tokens to continue the set of tokens we provided. Or, in a human-understandable language, they are very good at answering questions, completing sentences, writing text, etc.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/llms.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;You ask a question. LLM replies. Sometimes, it&amp;rsquo;s helpful information; sometimes, you must ask a follow-up question. But it always replies.&lt;/p&gt;
&lt;p&gt;However, not all of these replies are based on truth. Sometimes, an LLM replies with false information that we call hallucinations. That&amp;rsquo;s because it tries to find the closest set of tokens to your tokenized instructions (or your question) and always finds something.&lt;/p&gt;
&lt;p&gt;LLMs do not really care about the truth. They care about the closest tokens to your tokenized instructions, their training sets, their configuration, and some additional parameters.&lt;/p&gt;
&lt;p&gt;But what makes an LLM an agent?&lt;/p&gt;
&lt;p&gt;Agents are LLMs with something that provides missing information or capabilities to help LLMs answer our questions. If we call these things &amp;ldquo;tools,&amp;rdquo; agents are LLMs with tools.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/agents.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;However, to be an agent, the LLM must orchestrate these tools and decide when it has enough information to answer our questions or complete our tasks. If we orchestrate the tools with predefined code, the LLM is just a tool in our code, and our code is not an AI agent.&lt;/p&gt;
&lt;p&gt;The diagram above looks familiar. It looks like &lt;em&gt;a while loop&lt;/em&gt;!&lt;/p&gt;
&lt;p&gt;So, I guess we can say that an AI agent is like a &amp;ldquo;while loop&amp;rdquo; that keeps asking available tools to provide additional information or capability until it has all it needs to complete the task or answer the question.&lt;/p&gt;
&lt;p&gt;Anything can be a tool that provides missing capabilities or information to LLMs as long as LLMs have an easy way to use that tool.&lt;/p&gt;
&lt;p&gt;For example, I was a tool that my friend Claude used to find information about the current weather in Belgrade! But that made our &amp;ldquo;while loop&amp;rdquo; expensive because it used both LLM tokens and my time.&lt;/p&gt;
&lt;p&gt;These &amp;ldquo;while loops&amp;rdquo; are generally expensive. They are not expensive because of big O notation and code complexity, but because in each iteration, the LLM evaluates whether it can complete the task and uses tokens (and our money).&lt;/p&gt;
&lt;p&gt;Whether that cost is worth it depends on the value the agent provides, but it&amp;rsquo;s always a good idea to be careful. You can be careful by setting billing alarms and spending limits, making sure that the LLM does not iterate indefinitely (by limiting the number of iterations), picking the right model for your task (sometimes cheaper models can also complete your tasks), and configuring monitoring, error tracking, and alarms.&lt;/p&gt;
&lt;h2 id=&#34;while-loops-and-where-to-write-them&#34;&gt;&lt;strong&gt;While loops and where to write them&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;So, if agents are while loops with LLMs and some additional tools, where do we write these while loops to create an agent?&lt;/p&gt;
&lt;p&gt;The answer is almost anywhere. While creating an AI agent using pen and paper could be a thing, it&amp;rsquo;s not really a practical way of making an agent. Another cost-inefficient and unhelpful way of creating a while loop is using a person to act as one. However, you can write this &amp;ldquo;while loop&amp;rdquo; wherever you need it. For example, it can be inside the app you are working on, in your terminal, on a server (using any backend language you prefer), in a browser, etc. Just be careful not to leak your LLM secret key and other similar secrets.&lt;/p&gt;
&lt;p&gt;To write an agent &amp;ldquo;while loop,&amp;rdquo; you need to do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Choose an LLM model that fits your needs and your budget (which can be $0 or whatever other number).&lt;/li&gt;
&lt;li&gt;Define a system prompt with a clear explanation of all the tools you want to provide (including when and how to use them).&lt;/li&gt;
&lt;li&gt;Ask an LLM to reply in a strict JSON format or any other structure you prefer.&lt;/li&gt;
&lt;li&gt;Make sure you parse and validate the reply correctly.&lt;/li&gt;
&lt;li&gt;Handle errors and set the maximal number of iterations, billing budget, and alerts.&lt;/li&gt;
&lt;/ol&gt;
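&lt;p&gt;The steps above can be sketched as a small loop. This is a minimal sketch under assumptions, not a production agent: &lt;code&gt;fakeLlm&lt;/code&gt; stands in for a real LLM API call, and the &lt;code&gt;getWeather&lt;/code&gt; tool, the JSON reply shape, and the &lt;code&gt;MAX_ITERATIONS&lt;/code&gt; value are hypothetical choices made only for this example:&lt;/p&gt;

```javascript
const MAX_ITERATIONS = 5; // hard stop so the loop cannot run (and bill) forever

// Hypothetical tools the LLM may ask for. The weather is hard-coded fake data.
const tools = new Map([
  ["getWeather", async (city) => `sunny, 24 degrees in ${city}`],
]);

// Stand-in for a real LLM API call: it requests a tool first and
// answers once a tool result is present in the conversation.
async function fakeLlm(messages) {
  const toolResult = messages.find((message) => message.role === "tool");
  if (toolResult === undefined) {
    return JSON.stringify({ type: "tool", name: "getWeather", input: "Belgrade" });
  }
  return JSON.stringify({ type: "answer", text: `Weather: ${toolResult.content}` });
}

async function runAgent(question, llm = fakeLlm) {
  const messages = [
    { role: "system", content: "Reply with JSON only. Available tool: getWeather(city)." },
    { role: "user", content: question },
  ];

  let remaining = MAX_ITERATIONS;
  while (remaining > 0) {
    remaining -= 1;
    const raw = await llm(messages);

    // Parse the reply and reject anything that is not a JSON object.
    let reply;
    try {
      reply = JSON.parse(raw);
    } catch (error) {
      throw new Error("LLM reply was not valid JSON");
    }
    if (reply === null || typeof reply !== "object") {
      throw new Error("LLM reply was not a JSON object");
    }

    // The LLM decided it has enough information: leave the loop.
    if (reply.type === "answer") return String(reply.text);

    // Validate the request: only known tools may run.
    const tool = reply.type === "tool" ? tools.get(reply.name) : undefined;
    if (tool === undefined) throw new Error("LLM asked for an unknown action");

    // Run the tool and feed the result back into the conversation.
    const result = await tool(reply.input);
    messages.push({ role: "assistant", content: raw });
    messages.push({ role: "tool", content: result });
  }
  throw new Error("Agent stopped: iteration limit reached");
}

runAgent("What is the weather like in Belgrade today?").then(console.log);
```

&lt;p&gt;Note that the LLM, not our code, decides whether to call a tool or to answer. Our code only runs the requested tool, validates the reply, and enforces the iteration limit.&lt;/p&gt;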
&lt;p&gt;Remember, LLMs are good at talking to humans (actually imitating human interaction), but human language is not easy to parse in code. If you have worked with LLMs and tried to get a JSON reply and nothing else, I am sure that you, at least once, got a reply similar to: &amp;ldquo;Here&amp;rsquo;s your JSON: &lt;code&gt;{...}&lt;/code&gt;.&amp;rdquo; Yelling at an LLM and telling it to reply with JSON works sometimes, but in some cases, even 3 exclamation points at the end do not help. Even the simple agent we built inside Claude.ai replied with a sentence in front of the search phrase:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/claude-04.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;You can either pick a format that is easier to parse or use a simple trick explained in my previous article: provide the beginning of the reply and let an LLM continue. You can see the code example here: &lt;a href=&#34;https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/&#34;&gt;5 Prompt Engineering Tips for Developers&lt;/a&gt;.&lt;/p&gt;
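&lt;p&gt;A more defensive fallback, shown here only as a hypothetical sketch, is to cut the JSON out of a chatty reply before parsing it. The code below assumes the reply contains a single JSON object, and &lt;code&gt;extractJson&lt;/code&gt; is an illustrative name rather than a library function:&lt;/p&gt;

```javascript
// Pull the {...} part out of a reply such as "Here's your JSON: {...}".
// Returns the parsed object, or null when nothing parseable is found.
function extractJson(reply) {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || start > end) return null;
  try {
    return JSON.parse(reply.slice(start, end + 1));
  } catch (error) {
    return null;
  }
}

console.log(extractJson('Here is your JSON: {"search": "weather in Belgrade today"}'));
```

&lt;p&gt;This does not replace validation. It only removes the polite sentence around the payload, so you still need to check that the parsed object has the fields you expect.&lt;/p&gt;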
&lt;p&gt;But, while understanding how these LLM &amp;ldquo;while loops&amp;rdquo; work is good, you do not need to write your own while loops. There are many existing tools and frameworks you can use.&lt;/p&gt;
&lt;h2 id=&#34;ai-agent-tools-and-frameworks&#34;&gt;&lt;strong&gt;AI Agent tools and frameworks&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;AI Agent tools and frameworks are like JavaScript frameworks &amp;ndash; we get many new ones every day. Pick any word that comes to your mind. The chance to find a JavaScript package with that name in NPM and a .ai domain with that name is higher than the latest US-to-China tariff percentage.&lt;/p&gt;
&lt;p&gt;For example, LangChain was the AI Agent framework a while back. Today, we have LlamaIndex and many other popular tools besides it. Big players like Microsoft have their own open-source takes, such as AutoGen. And, of course, services such as Amazon Bedrock Agents. There are many other examples, from tools for non-coders and open-source tools to enterprise-grade tools.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s hard to pick the best one. If you want to check just one that works with JavaScript or TypeScript, you can &lt;a href=&#34;https://ts.llamaindex.ai/&#34;&gt;start with LlamaIndex&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;LlamaIndex sounds similar to the Meta Llama models. But it&amp;rsquo;s not the same. Actually, LlamaIndex supports the Meta Llama model and many others (including OpenAI models, Anthropic models, open source models, Amazon Bedrock, Azure OpenAI, etc.).&lt;/p&gt;
&lt;p&gt;Another interesting thing about LlamaIndex is that they treat AI Agent memory as an important problem to solve. If you have worked with AI Agents, you know what I am talking about. If not, read on.&lt;/p&gt;
&lt;h2 id=&#34;remember-remember-the-conversation-we-had-yesterday&#34;&gt;Remember, remember&amp;hellip; the conversation we had yesterday&lt;/h2&gt;
&lt;p&gt;As I already mentioned, LLMs are limited by their training set, configuration, your instructions, and a few other things. One of their most important limitations is their &lt;em&gt;context size&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The context size represents the maximum number of tokens an LLM can have in a single conversation (through the API, UI, or any other way you interact with it). It&amp;rsquo;s a hard stop. Once you fill the context with tokens, an LLM will explode. Well, not literally. But it&amp;rsquo;ll stop working. If you have used LLMs from the early days, you might remember that after a certain number of messages, an LLM seemed to forget what you were talking about. That&amp;rsquo;s because the context was filled, and an LLM removed the initial messages to make space for your new messages. Luckily, LLM providers then introduced system prompts, sticky parts of the conversation that always stay in context and allow you to provide the instructions.&lt;/p&gt;
&lt;p&gt;If you manage to fill the context, LLMs will most likely do one of the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Remove some messages from the beginning of your conversation (but keep the system prompt, so it still follows the instructions). This can cause an LLM to forget some parts of your conversation.&lt;/li&gt;
&lt;li&gt;Summarize some parts of the conversation and replace N messages with the summary (well, LLMs are good with summarization). The quality of the remaining conversation depends on the way an LLM summarizes the conversation.&lt;/li&gt;
&lt;li&gt;Block you from sending more messages (most likely if you are using an API).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/llm-context.png&#34;&gt;&lt;/p&gt;
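&lt;p&gt;The first strategy from the list can be sketched in a few lines. This is a simplified illustration, not how any specific provider implements it: &lt;code&gt;countTokens&lt;/code&gt; is a rough stand-in (about four characters per token, which is only a rule of thumb for English text), and &lt;code&gt;trimToContext&lt;/code&gt; and the message shape are hypothetical:&lt;/p&gt;

```javascript
// Very rough token estimate: about 4 characters per token (an assumption).
function countTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the system prompt and drop the oldest messages until the
// conversation fits into maxTokens. Assumes messages[0] is the system prompt.
function trimToContext(messages, maxTokens) {
  const [systemPrompt, ...history] = messages;
  const total = (list) =>
    list.reduce((sum, message) => sum + countTokens(message.content), 0);

  while (history.length > 0) {
    if (maxTokens >= total([systemPrompt, ...history])) break;
    history.shift(); // forget the oldest message
  }
  return [systemPrompt, ...history];
}

const conversation = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Tell me a long story about Belgrade." },
  { role: "assistant", content: "Once upon a time, on the banks of two rivers..." },
  { role: "user", content: "What is the weather like today?" },
];

console.log(trimToContext(conversation, 20).map((message) => message.role));
```

&lt;p&gt;With a small budget, only the system prompt and the latest question survive, which is exactly the &amp;ldquo;forgetting&amp;rdquo; effect described above.&lt;/p&gt;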
&lt;p&gt;Luckily, the context size is increasing fast (Claude has a 200k token context, Gemini 1M context, and Llama now has a context size with up to 10M tokens). However, a larger context size can decrease the ability of an LLM to find specific items in it. Also, we want to fit larger items in the context. We started with simple spreadsheets and PDFs, and now we want to embed whole knowledge bases, books, project documentation, etc.&lt;/p&gt;
&lt;p&gt;Again, luckily, many smart people work with LLMs, so they quickly came up with an effective way to make the most of the (at that time very) limited LLM context size. However, naming things is hard (ask OpenAI and Anthropic or simply read the names of their models), so they called this approach &lt;em&gt;Retrieval-Augmented Generation&lt;/em&gt; (RAG).&lt;/p&gt;
&lt;p&gt;While RAG sounds complicated and is still one of the most misunderstood terms related to LLMs, it is quite a simple but powerful concept.&lt;/p&gt;
&lt;p&gt;In short, instead of putting all documents in the system prompt, you can wait for the user&amp;rsquo;s question, convert it to a vector (an embedding) before replying, and do a vector search against your knowledge base to find a few closest matches. Then, you take these pieces and tell an LLM to respond to the user&amp;rsquo;s question in the context of the provided pieces of your knowledge base.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/rag.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Before you do a vector search, you need to split your knowledge base into reasonable chunks (e.g., articles, sections of the articles, or even paragraphs in some cases) and create vectors from each piece.&lt;/p&gt;
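&lt;p&gt;As an illustration of the chunking step, the following sketch splits a document into paragraph-based chunks. It assumes paragraphs are separated by blank lines and uses an arbitrary character limit. &lt;code&gt;splitIntoChunks&lt;/code&gt; is a hypothetical helper, and real projects often split in smarter ways (by headings, or with some overlap between chunks):&lt;/p&gt;

```javascript
// Split text into chunks of whole paragraphs, each at most maxChars long.
// A single paragraph longer than maxChars becomes its own oversized chunk.
function splitIntoChunks(text, maxChars = 500) {
  const paragraphs = text
    .split(/\n\s*\n/) // blank lines separate paragraphs (an assumption)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0);

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length > maxChars) {
      // The paragraph does not fit: close the current chunk and start a new one.
      if (current) chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const article = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph.";
console.log(splitIntoChunks(article, 40));
```

&lt;p&gt;Each chunk would then be converted to a vector and stored next to its original text, so the search can return the text and not just the numbers.&lt;/p&gt;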
&lt;p&gt;And when I say vector search, I mean something similar to the vector search that LLMs use under the hood. Remember the following image?&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/vectors.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;You can use a vector DB to do a vector search, but that&amp;rsquo;s not required, as you can do vector searches in some of the popular databases (such as PostgreSQL with the pgvector extension, Elasticsearch, etc.) or store vectors almost anywhere (Amazon S3, for example) and create your own vector search function.&lt;/p&gt;
&lt;p&gt;Writing your own vector search (or actually vector similarity) function also sounds complicated, but luckily, you can ask an LLM to write that function for you, and it can look similar to the following one:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Calculate cosine similarity between two vectors
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;cosineSimilarity&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;vector1&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;vector2&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Calculate the dot product of the two vectors
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;dotProduct&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;vector1&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;reduce&lt;/span&gt;((&lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;a&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;i&lt;/span&gt;) =&amp;gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;a&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;vector2&lt;/span&gt;[&lt;span style=&#34;color:#a6e22e&#34;&gt;i&lt;/span&gt;], &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Calculate the magnitude of both vectors
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;magnitude&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        Math.&lt;span style=&#34;color:#a6e22e&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;vector1&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;reduce&lt;/span&gt;((&lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt;) =&amp;gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; Math.&lt;span style=&#34;color:#a6e22e&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;vector2&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;reduce&lt;/span&gt;((&lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt;) =&amp;gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;sum&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;val&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Return the cosine similarity
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;dotProduct&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;/&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;magnitude&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This function returns a number between -1 and 1, where values closer to 1 mean the two vectors are more similar. You can run the same function against each of the knowledge base article vectors and pick the top two with a similarity above 0.8 or something similar.&lt;/p&gt;
&lt;p&gt;A simple function like the one I showed above would work fine for smaller databases. However, you should use a more efficient search for large data sets.&lt;/p&gt;
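&lt;p&gt;To make the &amp;ldquo;run it against each vector and pick the top two&amp;rdquo; step more concrete, here is a minimal sketch of a linear scan over a tiny in-memory knowledge base. The &lt;code&gt;findTopMatches&lt;/code&gt; function, its &lt;code&gt;limit&lt;/code&gt; and &lt;code&gt;minScore&lt;/code&gt; options, and the three-dimensional vectors are assumptions for illustration only. Real embeddings have hundreds or thousands of dimensions.&lt;/p&gt;

```javascript
// Cosine similarity, same idea as the function above,
// with a guard against dividing by zero for all-zero vectors
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0)
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0))
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0))
  return magA * magB === 0 ? 0 : dot / (magA * magB)
}

// Score every chunk, drop weak matches, and keep the best few
function findTopMatches(queryVector, chunks, { limit = 2, minScore = 0.8 } = {}) {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(queryVector, chunk.vector) }))
    .filter((match) => match.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}

// Made-up vectors; in practice they come from an embedding model
const knowledgeBase = [
  { id: 'request-leave', vector: [0.9, 0.1, 0.0] },
  { id: 'billing', vector: [0.0, 0.2, 0.9] },
  { id: 'leave-types', vector: [0.8, 0.3, 0.1] },
]

const matches = findTopMatches([1, 0.2, 0], knowledgeBase)
console.log(matches.map((match) => match.id))
```

&lt;p&gt;This scan compares the question with every stored vector, so its cost grows with the size of the knowledge base.&lt;/p&gt;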
&lt;p&gt;AI Agents use many tokens and often need vast knowledge bases. LlamaIndex helps with more efficient vector search and allows you to create agents that do not act as if they came straight out of the movie Memento.&lt;/p&gt;
&lt;p&gt;However, explaining RAG and LLM memory in detail requires more than a few paragraphs. All these explanations would convert this article into a short book. So, let&amp;rsquo;s leave that for another article and get back to AI Agents.&lt;/p&gt;
&lt;h2 id=&#34;lets-build-an-ai-agent&#34;&gt;Let&amp;rsquo;s build an AI Agent!&lt;/h2&gt;
&lt;p&gt;LLMs are products. Products evolve and add features based on users&amp;rsquo; feedback and usage patterns. Well, using tools is one of the important features that most LLMs now support natively. Some LLMs call this feature tools (for example, tools in Claude), some call it functions (for example, OpenAI functions), but it&amp;rsquo;s the same thing that allows us to build AI Agents.&lt;/p&gt;
&lt;p&gt;Built-in tools have a few clear benefits, such as replies in a strict, well-defined JSON format and easier streaming of responses via HTTP. They also have less surface area for errors because they are now built into the LLM itself. But, as always, there are many different standards, and if you want to switch to some other LLM, you&amp;rsquo;ll probably need to define tools in a slightly different format. However, a simple abstraction (or, even better, a hexagonal architecture) makes this problem easier to manage.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s build a simple agent using built-in tools! You can pick any LLM you like. I&amp;rsquo;ll use Claude Sonnet 3.7 on Amazon Bedrock. The example below would work fine with other models. I use Amazon Web Services (AWS) every day, so Bedrock is a natural choice (despite its limits, especially in European data centers).&lt;/p&gt;
&lt;p&gt;So, where do we start? With our &amp;ldquo;while loop,&amp;rdquo; of course! Remember this?&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/agents.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;I want to build a simple agent for my product, Vacation Tracker. It will be very simple because otherwise, I would need to write a book to show all the details. I want my AI Agent to be able to do the following 3 things only:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Help users to request leave, such as PTO.&lt;/li&gt;
&lt;li&gt;Let users see which coworkers are not working today and who will be off this or next week.&lt;/li&gt;
&lt;li&gt;Answer some basic questions about our product using our knowledge base.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With these 3 features, my &amp;ldquo;while loop&amp;rdquo; would look like the following diagram.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/vt-agent-diagram.png&#34;&gt;&lt;/p&gt;
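&lt;p&gt;Before we get to the AWS details, here is a provider-agnostic sketch of that &amp;ldquo;while loop&amp;rdquo; with a fake LLM, so you can run it locally. Everything in it is hypothetical: the &lt;code&gt;runAgent&lt;/code&gt; function, the &lt;code&gt;callLlm&lt;/code&gt; callback, the simplified reply shape, the tool names, and the stubbed tool results. It is not the format of any real LLM API.&lt;/p&gt;

```javascript
// callLlm is any async function that takes the conversation and returns either
// { type: 'tool', name, input } or { type: 'final', text } (a simplified, made-up shape)
async function runAgent({ callLlm, tools, userMessage, maxSteps = 5 }) {
  const conversation = [{ role: 'user', text: userMessage }]
  let stepsLeft = maxSteps

  while (stepsLeft > 0) {
    stepsLeft -= 1
    const reply = await callLlm(conversation)

    // No tool needed: the LLM produced the final answer
    if (reply.type === 'final') return reply.text

    // The LLM asked for a tool: run it and feed the result back into the conversation
    const tool = tools[reply.name]
    const result = tool ? await tool(reply.input) : { error: `Unknown tool: ${reply.name}` }
    conversation.push({ role: 'assistant', toolRequest: reply })
    conversation.push({ role: 'user', toolResult: result })
  }

  // Stop instead of looping forever if the LLM keeps asking for tools
  return 'Sorry, I could not finish this request.'
}

// Stubbed tools; the real ones would call the Vacation Tracker API and the knowledge base
const tools = {
  request_leave: async (input) => ({ status: 'submitted', ...input }),
  who_is_off: async () => ({ today: ['Ana'] }),
  search_knowledge_base: async () => ({ articles: [] }),
}

// A fake LLM for local runs: it asks for a tool first, then answers
const fakeLlm = async (conversation) => {
  const last = conversation[conversation.length - 1]
  return last.toolResult
    ? { type: 'final', text: `Leave request ${last.toolResult.status}.` }
    : { type: 'tool', name: 'request_leave', input: { startDate: '2025-06-30', endDate: '2025-07-04', leaveType: 'PTO' } }
}

runAgent({ callLlm: fakeLlm, tools, userMessage: 'I need a week off' }).then(console.log)
```

&lt;p&gt;The &lt;code&gt;maxSteps&lt;/code&gt; limit is a design choice I would keep in any real agent, because it puts a ceiling on both the cost and the response time of a single request.&lt;/p&gt;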
&lt;p&gt;There are so many different ways to build this AI Agent using AWS. For example, we could create a simple serverless solution like the diagram below, with the following components:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I would use an Amazon API Gateway to expose the API endpoint for my AI Agent.&lt;/li&gt;
&lt;li&gt;My AI Agent &amp;ldquo;while loop,&amp;rdquo; or business logic, would be in a Lambda function that defines the specification of the tool, invokes the LLM, and talks to the Vacation Tracker API and storage for our knowledge base and vectors.&lt;/li&gt;
&lt;li&gt;I would use the Claude Sonnet 3.7 model on Amazon Bedrock.&lt;/li&gt;
&lt;li&gt;I could store the vectors and parts of our knowledge base in an S3 bucket. This is not an ideal long-term solution, but it would work fine for the MVP version.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/aws-architecture-with-apigw.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;By using API Gateway, we get all the benefits this service offers, including easy setup for rate limits, Web Application Firewall (WAF), etc. However, as &lt;a href=&#34;https://www.linkedin.com/feed/update/urn:li:activity:7313466033313406976?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7313466033313406976%2C7316309968616333312%29&amp;dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287316309968616333312%2Curn%3Ali%3Aactivity%3A7313466033313406976%29&#34;&gt;Austen Collins suggested&lt;/a&gt;, API Gateway has some significant downsides when building AI Agents. For example, the API Gateway timeout is limited to 29 seconds by default (AWS allows us to change the timeout now, but changing the timeout can affect account-level throttle limits, etc.), which could be a serious limitation for more complex production-ready agents that can do some longer tasks. Also, we can&amp;rsquo;t stream the response from our Lambda function, so we need to wait for our agent to generate the whole long reply before we can start showing it to the user. Streaming would allow us to show the response as the LLM generates it. This is especially helpful for long responses, as our agent starts responding to the user faster and keeps adding text as the user reads (the effect is similar to typing).&lt;/p&gt;
&lt;p&gt;Luckily, there&amp;rsquo;s an alternative! AWS Lambda supports Lambda function URLs. It&amp;rsquo;s basically a simple HTTP endpoint in front of your Lambda function. The main benefits of the function URL over API Gateway are that it offers timeouts of up to 15 minutes (it&amp;rsquo;s a Lambda timeout, not an API Gateway timeout anymore) and support for streaming responses. Just what we needed!&lt;/p&gt;
&lt;p&gt;However, it comes with many downsides, too. You do not get all the features of the API Gateway, such as built-in rate limits, authorization support, etc. It does not support WAF, either. There are no custom domains for Lambda function URLs. However, you can mitigate some of these downsides by putting a CloudFront distribution in front of the Lambda function URL, as shown in the diagram below.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/aws-architecture-with-lambda-url.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Is this an ideal setup? It depends on your use case. It&amp;rsquo;s a good start. There are many other alternatives. For example, we could keep the initial setup, and instead of waiting for the reply with an open HTTP connection, we could send a message to a background job and tell the frontend application that the message is received and that we&amp;rsquo;ll send a reply via WebSockets. There&amp;rsquo;s no out-of-the-box streaming for this setup, either, but it can give you more flexibility and some benefits from both approaches.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://slobodan.me/images/posts/ai-agents/aws-architecture-with-websockets.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;In production, we would need to think about our use case, WAF, rate limiting (for our app, for the LLM we are using, and for other services we are using), securing the API endpoint (auth token, API key, etc.), monitoring, error logging and handling, storage for the conversation (we do not want to send the whole conversation from the frontend when a user sends a new message), and many other things.&lt;/p&gt;
&lt;p&gt;But let&amp;rsquo;s keep things simple.&lt;/p&gt;
&lt;h2 id=&#34;show-me-some-code&#34;&gt;Show me some code!&lt;/h2&gt;
&lt;p&gt;The initial idea of this article was to show the code. But here we are, 4000 words later, and I barely explained how agents work. I&amp;rsquo;ll show the most important parts of the code here. I might do a &amp;ldquo;part 2&amp;rdquo; article with a code deep dive if anyone reads the article past this point and thinks a complete coding example might be beneficial.&lt;/p&gt;
&lt;p&gt;The most important part of the code is the definition of the available tools. The format depends on the model or service you use, but for Claude Sonnet 3.7 on Amazon Bedrock using the &lt;a href=&#34;https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html&#34;&gt;Converse API&lt;/a&gt;, you can do something similar to the following:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;export&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;tools&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// We need a similar object for each of our tools
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// The Converse API expects each tool definition to be wrapped in a toolSpec object
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;toolSpec&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#75715e&#34;&gt;// The name and the description of our tool
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#75715e&#34;&gt;// A clear description is important because it helps an LLM to select the right tool
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#a6e22e&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;request_leave&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;Request leave (such as PTO, sick day, etc.). The leave request will be submitted to the Vacation Tracker application, and sent to your manager (approver).&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#a6e22e&#34;&gt;inputSchema&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;// It accepts JSON, I told you developers love JSON!
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;json&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;object&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a6e22e&#34;&gt;properties&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#75715e&#34;&gt;// And it expects the following properties (see the type and the description for each).
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;              &lt;span style=&#34;color:#75715e&#34;&gt;// For simplicity, I&amp;#39;ll show the most important parts only
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;startDate&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;string&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;The date when the leave starts. Format: YYYY-MM-DD&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;endDate&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;string&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;The date when the leave ends. Format: YYYY-MM-DD&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;leaveType&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;string&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;The type of leave. For example, &amp;#34;vacation&amp;#34;, &amp;#34;sick&amp;#34;, etc.&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;reason&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;string&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;The reason for the leave request. For example, &amp;#34;Vacation in Greece&amp;#34;, &amp;#34;I am not feeling well today,&amp;#34; etc.&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#a6e22e&#34;&gt;required&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;startDate&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;endDate&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;leaveType&amp;#39;&lt;/span&gt;],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// Define the other 2 tools here
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It&amp;rsquo;s important to provide clear descriptions as that would help an LLM pick the right tool when needed.&lt;/p&gt;
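&lt;p&gt;When the model decides to use a tool, the Converse API reply contains a &lt;code&gt;toolUse&lt;/code&gt; block, and our code needs to run the tool and send the result back as a &lt;code&gt;toolResult&lt;/code&gt; block in the next user message. Here is a minimal sketch of that step, based on my understanding of the Converse API message format. The &lt;code&gt;buildToolResultMessage&lt;/code&gt; function, the &lt;code&gt;handlers&lt;/code&gt; map, the sample assistant message, and the stubbed result are hypothetical. A real handler would call the Vacation Tracker API.&lt;/p&gt;

```javascript
// Stub handlers keyed by tool name (made up for this example)
const handlers = {
  request_leave: async ({ startDate, endDate, leaveType }) => ({
    status: 'pending_approval',
    startDate,
    endDate,
    leaveType,
  }),
}

// Find toolUse blocks in the assistant message and build the follow-up user message
async function buildToolResultMessage(assistantMessage) {
  const toolUses = assistantMessage.content
    .filter((block) => block.toolUse)
    .map((block) => block.toolUse)

  const content = []
  for (const { toolUseId, name, input } of toolUses) {
    try {
      if (!handlers[name]) throw new Error(`Unknown tool: ${name}`)
      const result = await handlers[name](input)
      content.push({ toolResult: { toolUseId, content: [{ json: result }], status: 'success' } })
    } catch (err) {
      // Report the failure to the model so it can explain it to the user
      content.push({ toolResult: { toolUseId, content: [{ text: err.message }], status: 'error' } })
    }
  }

  return { role: 'user', content }
}

// A hand-written example of what the assistant message could look like
const assistantMessage = {
  role: 'assistant',
  content: [
    { text: 'I will submit that request for you.' },
    {
      toolUse: {
        toolUseId: 'tool-1',
        name: 'request_leave',
        input: { startDate: '2025-06-30', endDate: '2025-07-04', leaveType: 'PTO' },
      },
    },
  ],
}

buildToolResultMessage(assistantMessage).then((message) => {
  console.log(JSON.stringify(message, null, 2))
})
```

&lt;p&gt;You would append both the assistant message and this new user message to the conversation and call the model again, which is one turn of the &amp;ldquo;while loop.&amp;rdquo;&lt;/p&gt;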
&lt;p&gt;I am a big fan of hexagonal architecture (or ports and adapters), and I would use it in production code. Our business logic does not care about the Claude Sonnet 3.7 model. It does not care about Amazon Bedrock, either. So, I would put the LLM logic in some kind of repository, initialize it, and use it to send a message when needed. That would make my code cleaner to read and easier to test, and it would allow me to try other models (e.g., OpenAI models, which are not available in Amazon Bedrock) without changing the business logic.&lt;/p&gt;
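&lt;p&gt;To give you a rough idea of what I mean, here is a tiny sketch of that separation. The &lt;code&gt;createAssistant&lt;/code&gt; function, the &lt;code&gt;sendMessage&lt;/code&gt; method, and the fake adapter are hypothetical names for this illustration, not code from our product.&lt;/p&gt;

```javascript
// Business logic depends only on a "port": any object with a sendMessage(messages) method
function createAssistant({ llm }) {
  return {
    async reply(userText) {
      const messages = [{ role: 'user', text: userText }]
      const answer = await llm.sendMessage(messages)
      return answer.text
    },
  }
}

// An adapter for tests. A Bedrock adapter would expose the same sendMessage method
// and hide the model ID, the region, and the Converse API details inside.
const fakeLlmAdapter = {
  async sendMessage(messages) {
    return { text: `You said: ${messages[messages.length - 1].text}` }
  },
}

const assistant = createAssistant({ llm: fakeLlmAdapter })
assistant.reply('Who is off today?').then(console.log)
```

&lt;p&gt;Swapping the model then means writing another adapter with the same &lt;code&gt;sendMessage&lt;/code&gt; method, while &lt;code&gt;createAssistant&lt;/code&gt; stays untouched.&lt;/p&gt;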
&lt;p&gt;However, to keep things simple, I&amp;rsquo;ll show a basic code example without hexagonal architecture.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Stripped down example of using the AWS Bedrock SDK to create a simple AI Agent
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;import&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt; } &lt;span style=&#34;color:#a6e22e&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;@aws-sdk/client-bedrock-runtime&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;import&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt; } &lt;span style=&#34;color:#a6e22e&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;./tool-config.js&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Initialize the Bedrock client and specify the region and a model ID
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;({ &lt;span style=&#34;color:#a6e22e&#34;&gt;region&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;us-east-1&amp;#39;&lt;/span&gt; })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;anthropic.claude-3-7-sonnet-20250219-v1:0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Start a conversation with the user message.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// The user message is hardcoded for this example, but in a real application, you would get it from the user input
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// Also, in practice, users would probably tell us that they want to go on vacation in the first week of July, and the agent would need to ask for the start date
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;I want to go on vacation on the first week of July. Full week, starting June 30th.&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Create a command with the model ID, the message, and a basic configuration
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt;({
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;messages&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;system&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// We would need to provide a more detailed system message in a real application
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;You are Vacation Tracker assistant and you help users to request leaves, see who else from their team is off, check their leave balance and learn about the Vacation Tracker app functionalities. When a user asks to go off for a longer period, assume whole week. Here is the list of available leave types: `PTO`, `SickDay`. Try assuming the correct leave type from the input.&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// We pass our tool configuration to Bedrock
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;})
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;async&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;run&lt;/span&gt;() {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;try&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// Send the command to the model and wait for the response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;Response:&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;JSON&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;stringify&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } &lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;ERROR&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// We wrap the call in the run function so we can run it from a terminal without deploying it
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// The actual code would define a Lambda function handler instead
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;run&lt;/span&gt;().&lt;span style=&#34;color:#a6e22e&#34;&gt;then&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;).&lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;error&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To run this code example, you&amp;rsquo;d need an AWS account with Claude Sonnet 3.7 enabled in Amazon Bedrock (in the &lt;code&gt;us-east-1&lt;/code&gt; region). When you run it, the response should be similar to the following JSON (the inline comments make this JSON invalid, but I added them to make it easier to follow):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Just metadata, you can ignore this part at the moment
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;$metadata&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;httpStatusCode&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;200&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;requestId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bc806712-b1b2-40eb-8488-f0085237ebcf&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;attempts&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;totalRetryDelay&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Metrics can be useful, but let&amp;#39;s ignore that too at the moment
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;metrics&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;latencyMs&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;4191&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Claude Sonnet 3.7 response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;output&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;// A message that we can show to our users if we want to
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Certainly! I&amp;#39;d be happy to help you request leave for your vacation during the first week of July. Based on the information you&amp;#39;ve provided, I&amp;#39;ll submit a leave request for you using the Vacation Tracker application. Let me go ahead and process that for you.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;// Claude Sonnet 3.7 tells us to use a tool!
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUse&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#75715e&#34;&gt;// Request params, as defined in the tool configuration
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;input&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;startDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-06-30&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;endDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-07-06&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;leaveType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;PTO&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;reason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Vacation for the first week of July&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#75715e&#34;&gt;// The name and ID of a tool we should use
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;name&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;request_leave&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUseId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tooluse_JaORLwrHSSGifTRMApUrGA&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// This message comes from the assistant; we need to send it back,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// together with the tool result, to continue the conversation
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;assistant&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Claude tells us that it stopped because it needs a tool to continue
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;stopReason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tool_use&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Useful metrics for the number of used tokens
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;usage&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;inputTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;630&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;outputTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;183&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;totalTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;813&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;As you can see in the JSON above, Claude tells us that it needs a tool before it can reply (&lt;code&gt;stopReason: &#39;tool_use&#39;&lt;/code&gt;). It also gives us a friendly message that we can display to our users if we want to, but keep in mind that the agent is still working at this point, so users should not send new messages yet. Finally, it gives us the name of the tool we need to use and the parameters we should pass to it.&lt;/p&gt;
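&lt;p&gt;As a minimal sketch, this is one way we could pull the tool request out of a response shaped like the JSON above. The &lt;code&gt;findToolUse&lt;/code&gt; helper is a hypothetical name, not part of the AWS SDK, and the sample object is a trimmed-down copy of the response:&lt;/p&gt;

```javascript
// Hypothetical helper: returns the first toolUse block when the model
// stopped to ask for a tool, or null otherwise
function findToolUse(response) {
  if (response.stopReason !== 'tool_use') {
    return null
  }
  const blocks = response.output.message.content
  const block = blocks.find((item) => item.toolUse !== undefined)
  return block ? block.toolUse : null
}

// A trimmed-down version of the response shown above
const sampleResponse = {
  output: {
    message: {
      role: 'assistant',
      content: [
        { text: 'Let me go ahead and process that for you.' },
        {
          toolUse: {
            name: 'request_leave',
            toolUseId: 'tooluse_JaORLwrHSSGifTRMApUrGA',
            input: { leaveType: 'PTO' },
          },
        },
      ],
    },
  },
  stopReason: 'tool_use',
}

const toolUse = findToolUse(sampleResponse)
console.log(toolUse.name, toolUse.toolUseId)
```

&lt;p&gt;Checking &lt;code&gt;stopReason&lt;/code&gt; first keeps the happy path simple: when the helper returns &lt;code&gt;null&lt;/code&gt;, we can show the text to the user and finish the turn.&lt;/p&gt;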
&lt;p&gt;After receiving this response, we should send a request to the Vacation Tracker API with the provided start and end dates, leave type, and reason. But before that, we probably need to validate the data and get the authentication token for the API.&lt;/p&gt;
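&lt;p&gt;The validation step could look something like the following sketch. The &lt;code&gt;validateLeaveInput&lt;/code&gt; function and its rules are assumptions for illustration; the real checks depend on what the Vacation Tracker API accepts:&lt;/p&gt;

```javascript
// Hypothetical validation for the request_leave tool input
// Returns a list of problems; an empty list means the input looks valid
function validateLeaveInput(input) {
  const problems = []
  // Format check only: it does not verify that the date exists in a calendar
  const isoDate = /^\d{4}-\d{2}-\d{2}$/
  for (const field of ['startDate', 'endDate']) {
    if (typeof input[field] !== 'string' || !isoDate.test(input[field])) {
      problems.push(`${field} must be a YYYY-MM-DD date`)
    }
  }
  if (typeof input.leaveType !== 'string' || input.leaveType.length === 0) {
    problems.push('leaveType is required')
  }
  // Dates in the same YYYY-MM-DD format compare correctly as strings
  if (problems.length === 0) {
    if (input.startDate > input.endDate) {
      problems.push('startDate must not be after endDate')
    }
  }
  return problems
}

console.log(validateLeaveInput({ startDate: '2025-06-30', endDate: '2025-07-06', leaveType: 'PTO' }))
console.log(validateLeaveInput({ startDate: '2025-07-06', endDate: '2025-06-30', leaveType: 'PTO' }))
```

&lt;p&gt;Returning a list of problems instead of throwing makes it easy to send the problems back to the model as a tool result, so it can ask the user for a correction.&lt;/p&gt;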
&lt;p&gt;Before we continue, let&amp;rsquo;s talk about one more thing: request duration. Remember the 29-second timeout in API Gateway? Well, this request took 4.2 seconds out of those 29. If we add a few hundred milliseconds for the Lambda overhead and a database request to load the previous messages in this conversation, we are probably at around 5 seconds.&lt;/p&gt;
&lt;p&gt;But that&amp;rsquo;s just the first part of this request. Before we reply, we need to call the Vacation Tracker API, parse the response, and call Amazon Bedrock again. If we are lucky, our agent will need just one tool to be able to reply, so we&amp;rsquo;ll be at 10 to 15 seconds, including saving the conversation to DynamoDB.&lt;/p&gt;
&lt;p&gt;What happens if we have a complex agent that could use multiple tools in one request? We can easily get close to, or go over, the 29-second timeout, which could break our agent.&lt;/p&gt;
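&lt;p&gt;One simple way to reason about this is to track a time budget before each step. The sketch below is only an illustration: the 2-second safety margin and both function names are assumptions, and the step estimates would come from your own measurements:&lt;/p&gt;

```javascript
// The 29-second API Gateway limit mentioned above, in milliseconds
const API_GATEWAY_TIMEOUT_MS = 29000
// Assumption: keep a small reserve for Lambda overhead and saving the conversation
const SAFETY_MARGIN_MS = 2000

// How much time is left for model and tool calls
function remainingBudgetMs(elapsedMs) {
  return Math.max(0, API_GATEWAY_TIMEOUT_MS - SAFETY_MARGIN_MS - elapsedMs)
}

// Whether a step with the given estimated duration still fits in the budget
function canAffordStep(elapsedMs, estimatedStepMs) {
  return remainingBudgetMs(elapsedMs) >= estimatedStepMs
}

// About 5 seconds in, a 9-second follow-up model call still fits
console.log(canAffordStep(5000, 9000)) // true
// About 20 seconds in, the same follow-up call no longer fits
console.log(canAffordStep(20000, 9000)) // false
```

&lt;p&gt;When a step no longer fits, the agent can stop calling tools and reply with what it has instead of running into the timeout.&lt;/p&gt;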
&lt;p&gt;OK, so we have the name of the tool we need to use, the ID of this tool request, and the parameters to send to the tool. In this case, the agent wants us to request leave. In production, we would ask the user to confirm the request details first, but for this simple example, we can send the request straight to the Vacation Tracker API.&lt;/p&gt;
&lt;p&gt;Once the Vacation Tracker API responds, we probably want to process the response to make it clearer for our LLM, because most API responses are not exactly LLM-friendly or human-friendly. For example, we can add a description, change the field names to be more descriptive, etc.&lt;/p&gt;
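&lt;p&gt;Here is a rough sketch of that processing step. The raw &lt;code&gt;apiResponse&lt;/code&gt; shape with terse field names is invented for illustration (it is not the real Vacation Tracker API format), and &lt;code&gt;buildToolResultMessage&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```javascript
// Hypothetical helper that wraps a cleaned-up payload in a tool result message
function buildToolResultMessage(toolUseId, payload) {
  return {
    // Tool results are sent back with the "user" role
    role: 'user',
    content: [
      {
        toolResult: {
          toolUseId,
          content: [{ json: payload }],
        },
      },
    ],
  }
}

// Invented example of a terse API response
const apiResponse = { ok: true, id: 'request_12345', st: 'pending', rem: 10 }

// Rename the fields so the model does not have to guess what they mean
const friendlyPayload = {
  success: apiResponse.ok,
  request: { id: apiResponse.id, status: apiResponse.st },
  quota: { remaining: apiResponse.rem, unit: 'days' },
}

const message = buildToolResultMessage('tooluse_JaORLwrHSSGifTRMApUrGA', friendlyPayload)
console.log(JSON.stringify(message, null, 2))
```

&lt;p&gt;The &lt;code&gt;toolUseId&lt;/code&gt; must match the one Claude sent, so the model can connect the result to its request.&lt;/p&gt;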
&lt;p&gt;We could send a new request to Amazon Bedrock with the following conversation:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;[
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// The initial user message
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;I want to go on vacation on the first week of July. Full week, starting June 30th.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// The unchanged assistant message from the previous example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;assistant&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Certainly! I&amp;#39;d be happy to help you request leave for your vacation during the first week of July. Based on the information you&amp;#39;ve provided, I&amp;#39;ll submit a leave request for you using the Vacation Tracker application. Let me go ahead and process that for you.&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUse&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;input&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;startDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-06-30&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;endDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-07-06&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;leaveType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;PTO&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;reason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Vacation for the first week of July&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;name&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;request_leave&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUseId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tooluse_JaORLwrHSSGifTRMApUrGA&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      }  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Tool response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// Role needs to be &amp;#34;user&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#75715e&#34;&gt;// We provide the tool result object
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolResult&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;// Tool ID
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUseId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tooluse_JaORLwrHSSGifTRMApUrGA&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#75715e&#34;&gt;// Response is a JSON object with a custom payload
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;              &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;json&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#75715e&#34;&gt;// For example, the API tells us that the request was successfully sent
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;                &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;success&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#66d9ef&#34;&gt;true&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#75715e&#34;&gt;// We have the request details
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;                &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;request&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;request_12345&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;status&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;pending&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;startDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-06-30&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;endDate&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2025-07-06&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;leaveType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;PTO&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;reason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Vacation for the first week of July&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#75715e&#34;&gt;// And also the remaining quota
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;                &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;quota&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;remaining&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;used&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;total&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;15&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;unit&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;days&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          ]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When we send this request to Claude, the reply will probably be similar to the following JSON:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Metadata, ignore
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;$metadata&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;httpStatusCode&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;200&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;requestId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;de4dad99-1d57-46bd-ad71-c69ebe853756&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;attempts&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;totalRetryDelay&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// It took almost 9 seconds to reply!
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;metrics&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;latencyMs&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;8985&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// The actual response object
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;output&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;// The text we want to show to the user
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;          &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Great news! I&amp;#39;ve successfully submitted your leave request for your vacation. Here are the details of your request:\n\n- Leave Type: PTO (Paid Time Off)\n- Start Date: June 30, 2025 (Monday)\n- End Date: July 6, 2025 (Sunday)\n- Reason: Vacation for the first week of July\n- Status: Pending approval\n \nYour request has been submitted to the Vacation Tracker application and sent to your manager for approval. The request ID is request_12345.\n\nAdditionally, I can see from the response that your current leave balance is as follows:\n- Total PTO: 15 days\n- Used: 5 days\n- Remaining: 10 days\n\nThis request will use 5 working days of your PTO (assuming a standard Monday to Friday work week). Please note that your request is currently pending approval from your manager. You should receive a notification once it&amp;#39;s been reviewed. \n\nIs there anything else you&amp;#39;d like to know about your leave request or the Vacation Tracker application?&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;assistant&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Claude says that the request was successfully fulfilled
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;stopReason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;end_turn&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Usage data
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;usage&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;inputTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;901&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;outputTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;236&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;totalTokens&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;1137&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To make it a bit more readable, here&amp;rsquo;s the AI Agent response:&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;Great news! I&amp;rsquo;ve successfully submitted your leave request for your vacation. Here are the details of your request:&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Leave Type: PTO (Paid Time Off)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Start Date: June 30, 2025 (Monday)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;End Date: July 6, 2025 (Sunday)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Reason: Vacation for the first week of July&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Status: Pending approval&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Your request has been submitted to the Vacation Tracker application and sent to your manager for approval. The request ID is request_12345.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Additionally, I can see from the response that your current leave balance is as follows:&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Total PTO: 15 days&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Used: 5 days&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Remaining: 10 days&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;This request will use 5 working days of your PTO (assuming a standard Monday to Friday work week). Please note that your request is currently pending approval from your manager. You should receive a notification once it&amp;rsquo;s been reviewed.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Is there anything else you&amp;rsquo;d like to know about your leave request or the Vacation Tracker application?&lt;/em&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;As you can see, our AI Agent is a bit more chatty and friendly. It made up some parts of the reply (for example, we never mentioned the notification), but the response generally looks good.&lt;/p&gt;
&lt;p&gt;In production, we probably want an LLM to reply in a structured way so we can display a predefined UI for the leave request along with a short, friendly message. Also, we need to be careful with dates and other numbers, as LLMs do not care about numbers. Numbers are close in the vector space. For an LLM, 4 is similar to 5 or even 42, but 4 or 5 used PTO days make a big difference for our users.&lt;/p&gt;
&lt;p&gt;And that&amp;rsquo;s it! We built an AI Agent again. This time with code. It&amp;rsquo;s a simple agent, more like a proof of concept. But it&amp;rsquo;s still an agent.&lt;/p&gt;
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;After a long article, I&amp;rsquo;ll keep the summary short.&lt;/p&gt;
&lt;p&gt;Before we do a quick summary, you should check 2 relatively fresh things related to AI Agents.&lt;/p&gt;
&lt;p&gt;First, my Twitter feed has been buzzing about MCP servers in the past few weeks. Model Context Protocol (MCP) is an open-source protocol that standardizes how applications provide context (and tool specification) to LLMs, and you can read more about it here: &lt;a href=&#34;https://docs.anthropic.com/en/docs/agents-and-tools/mcp&#34;&gt;https://docs.anthropic.com/en/docs/agents-and-tools/mcp&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Second, OpenAI has some new tools for building agents. Read more about these tools here: &lt;a href=&#34;https://openai.com/index/new-tools-for-building-agents/&#34;&gt;https://openai.com/index/new-tools-for-building-agents/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To end this long article, I just want to remind you that AI Agents sound very complicated, but they are actually like LLM &amp;ldquo;while loops&amp;rdquo; with tools.&lt;/p&gt;
&lt;p&gt;You have all the skills you need to build these tools. So, go, build tools and agents, and have fun!&lt;/p&gt;
</description>
            <content type="html"><![CDATA[<p>Have you heard about AI Agents? Of course, you&rsquo;ve heard about them. These are the intelligent agents who will take our jobs in a few years!</p>
<p>I don&rsquo;t want to scare you, but someone on Twitter said that &ldquo;most jobs will become obsolete&rdquo; in less than 10 years. McKinsey agrees (they say AI Agents will replace 70% of office work), and Goldman, too.</p>
<p>So, I guess our clock is ticking. We don&rsquo;t have much time. It&rsquo;s probably better to take a woodworking course or something similar.</p>
<p><img src="/images/posts/ai-agents/clock-ticking-240.png"></p>
<p>But I am not that good at woodworking. So, let&rsquo;s try to understand how AI Agents work and if they are that scary.</p>
<p>If you read Twitter or LinkedIn, AI Agents look like special agents that can do everything. The demos people share look amazing.</p>
<p><img src="/images/posts/ai-agents/special-agent-400.png"></p>
<p>However, these agents don&rsquo;t feel that special when you use them. They are helpful for specific cases, just like travel agents. If you give enough details about your desired journey and budget to travel agents, they can find you the vacation you want and plan the entire trip. Just like Cursor! If you give enough details and specific instructions, vibe coding feels like magic (parts of the application start assembling in front of you). In other cases, Cursor feels as smart and useful as Alexa or Siri.</p>
<p><img src="/images/posts/ai-agents/travel-agent-800.png"></p>
<p>So, AI Agents can be extremely helpful, especially if you understand how they work. But before we can understand AI Agents, we need to understand LLMs.</p>
<h2 id="how-llms-work">How LLMs work</h2>
<p>Large Language Models or LLMs are very good at predicting the next best set of words based on <em>your input</em> (questions, part of the text they need to complete, or detailed instructions), their <em>training data</em> (all the text that creators of the LLM you use were able to use for training, such as books, websites, your private data [just kidding, or am I?], and other datasets), <em>context</em> (the previous conversation flow or documents you attached), and <em>specific configurations</em> (such as weights, which prioritize certain word patterns, and settings like temperature, which control the randomness of predictions).</p>
<p>Let&rsquo;s use the same example I used in the &ldquo;<a href="https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/">5 Prompt Engineering Tips for Developers</a>&rdquo; article! If you ask an LLM to finish the following sentence: &ldquo;I am speaking at,&rdquo; it&rsquo;ll probably say something such as &ldquo;a business conference,&rdquo; &ldquo;a tech meetup,&rdquo; or &ldquo;a community forum.&rdquo; There&rsquo;s almost zero chance it would say &ldquo;a Martian picnic&rdquo; or &ldquo;a space farmer&rsquo;s market.&rdquo;</p>
<p><img src="/images/posts/ai-agents/simple-prompt.png"></p>
<p>However, if we add a few sentences to the beginning of the instructions (or prompt, as we call it when talking to LLMs) that tell an LLM that it is a playful, chatty cartoon character named &ldquo;Space Bunny,&rdquo; the LLM would not finish the sentence with &ldquo;a tech meetup&rdquo; or similar, but with something closer to a Martian picnic.</p>
<p><img src="/images/posts/ai-agents/updated-simple-prompt.png"></p>
<p>When you talk to an LLM, your question or a set of instructions is called a <em>prompt</em>. So, prompts are just instructions. You tell an LLM what you want, and it tries to reply based on your input and context, as well as its training set and configuration. If your instructions are clear, there&rsquo;s a higher chance you&rsquo;ll get a helpful reply. However, an LLM will reply even if your instructions do not make sense. In that case and in some other cases, its replies might not be based on truth (we call that hallucinations). Everything related to hallucinations is improving fast, so whatever I write here will probably not be true in a few months.</p>
<p>So, you give your instructions (or write your prompt if you want to sound smarter), the LLM takes these instructions, spins up some GPUs, burns a small forest, &ldquo;eats&rdquo; some of your tokens, and you get some unexpected wisdom or a hallucination. In the world of LLMs, tokens = money. You burn them like Monopoly money, but the key difference is that LLM tokens are connected to your credit card.</p>
<p>But how does an LLM know how to reply to your prompt?</p>
<p>Computers are not that great with words. They prefer numbers. So, an LLM will split your instructions into tokens (yeah, these are the tokens I mentioned above). A token is a set of characters that is sometimes equal to a word, sometimes to the part of the word, and sometimes to a set of letters and other characters such as spaces, dots, commas, etc. The exact number of tokens your instructions have depends on the algorithm the LLM uses. You can see the visualization of OpenAI&rsquo;s tokenizer in the image below or here: <a href="https://platform.openai.com/tokenizer">https://platform.openai.com/tokenizer</a>. You&rsquo;ll get slightly different results based on the model you select.</p>
<p><img src="/images/posts/ai-agents/openai-tokenizer.png"></p>
<p>But tokens are still words! So the model represents each of these tokens as a set of numbers (each token becomes an array of numbers). These numbers are vectors that can be placed in a multidimensional space. The entire training set of an LLM is also transformed into tokens and then vectors and put in the same multidimensional space. The major power of LLMs is their ability to put related words (based on their vast training sets) close to each other in this space.</p>
<p>For a quick visual example, imagine that each token converts to an array of two numbers (two-dimensional space is easy to visualize). Then, we would be able to put our dots in this space similar to the following image:</p>
<p><img src="/images/posts/ai-agents/vectors.png"></p>
<p>Now that an LLM converts your instructions to a set of vectors (an array of arrays of numbers!) and puts them in its multidimensional space, it can use its algorithm to find the closest vectors that might be a good answer to your instructions. LLMs are <strong>Large</strong> Language Models, meaning they are trained on a massive set of data, which helps them put these vectors in the correct places in the multidimensional space and offer a meaningful answer.</p>
<p>Luckily, LLMs are products, and like other products, they evolve with user feedback and misuse. So we got many useful features that were not initially available, such as system prompts (parts of the prompts that are more important than the rest of the conversation with an LLM), better coding and JSON skills, etc.</p>
<h2 id="an-undercover-agent"><strong>An undercover agent</strong></h2>
<p>Meet my friend Claude. I ask it many weird questions every day. Claude is nice, so it tries to answer each question in detail politely.</p>
<p>One day, I asked Claude what the weather was like in Belgrade. I ask way more weird questions to both Claude and ChatGPT. But this question is special!</p>
<p><img src="/images/posts/ai-agents/claude-01.png"></p>
<p>It&rsquo;s special because Claude can&rsquo;t answer it. It told me politely that it had no access to real-time weather information. Ah, I forgot that ChatGPT can search the internet, but Claude can&rsquo;t do it yet!</p>
<p><img src="/images/posts/ai-agents/claude-02.png"></p>
<p>It makes sense that Claude does not have an answer to my question. It takes months to train an LLM, so its training data ends long before I ask anything, and today&rsquo;s weather is simply not in it. I could ask another LLM to answer my question or simply check my phone. But I like Claude! Can I do something to help it answer this kind of question?</p>
<p>Can I do a quick Google search for Claude when it needs some real-time data?</p>
<p>It&rsquo;s a weird idea, but let&rsquo;s try it! I&rsquo;ll tell Claude that it should let me know when it needs me to search the internet. Claude can be a bit chatty, so I&rsquo;ll make sure to tell it to provide an exact search phrase I should use. For example, telling me <code>Google:weather in Belgrade today</code> would be ideal.</p>
<p><img src="/images/posts/ai-agents/claude-03.png"></p>
<p>It seems that my friend Claude likes this game. Let&rsquo;s ask again, &ldquo;What&rsquo;s the weather like in Belgrade today?&rdquo;</p>
<p><img src="/images/posts/ai-agents/claude-04.png"></p>
<p>It worked! Claude provided an exact search query so I can do a Google search and provide a screenshot. The reply was more detailed than I needed it to be, but it did not matter; I understood my assignment.</p>
<p>I copied the search phrase, opened my browser, and googled it. Then, I took a screenshot of the result and sent it to Claude. And Claude replied with useful information about the current weather in Belgrade!</p>
<p><img src="/images/posts/ai-agents/claude-05.png"></p>
<p>Claude definitely liked this game.</p>
<p>But while I did this just for fun, I accidentally did one more thing &ndash; I just created an AI agent!</p>
<p>I know it&rsquo;s not a very useful agent, as I could just read the weather data from Google. But it&rsquo;s still an agent.</p>
<p>I also know ChatGPT can search the internet, so I could use it instead of Claude. But ChatGPT is also an agent! It&rsquo;s just an undercover agent that looks like a plain old LLM. To be fair, Claude is also an agent. Just ask it to draw a diagram or create a webpage for you, and you&rsquo;ll see some superpowers that LLMs do not have.</p>
<h2 id="how-ai-agents-work">How AI Agents work</h2>
<p>LLMs are amazing! They really are. But like many other tools, they are good at some things but not so good at others.</p>
<p>For example, LLMs are excellent at picking the best set of tokens to continue the set of tokens we provided. Or, in a human-understandable language, they are very good at answering questions, completing sentences, writing text, etc.</p>
<p><img src="/images/posts/ai-agents/llms.png"></p>
<p>You ask a question. LLM replies. Sometimes, it&rsquo;s helpful information; sometimes, you must ask a follow-up question. But it always replies.</p>
<p>However, not all of these replies are based on truth. Sometimes, an LLM replies with false information that we call hallucinations. That&rsquo;s because it tries to find the closest set of tokens to your tokenized instructions (or your question) and always finds something.</p>
<p>LLMs do not really care about the truth. They care about the closest tokens to your tokenized instructions, their training sets, their configuration, and some additional parameters.</p>
<p>But what makes an LLM an agent?</p>
<p>Agents are LLMs with something that provides missing information or capabilities to help LLMs answer our questions. If we call these things &ldquo;tools,&rdquo; agents are LLMs with tools.</p>
<p><img src="/images/posts/ai-agents/agents.png"></p>
<p>However, to be an agent, the LLM must orchestrate these tools and decide when it has enough information to answer our questions or complete our tasks. If we orchestrate the tools with predefined code, the LLM is just a tool in our code, and our code is not an AI agent.</p>
<p>The diagram above looks familiar. It looks like <em>a while loop</em>!</p>
<p>So, I guess we can say that an AI agent is like a &ldquo;while loop&rdquo; that keeps asking available tools to provide additional information or capability until it has all it needs to complete the task or answer the question.</p>
<p>Anything can be a tool that provides missing capabilities or information to LLMs as long as LLMs have an easy way to use that tool.</p>
<p>For example, I was a tool that my friend Claude used to find information about the current weather in Belgrade! But that made our &ldquo;while loop&rdquo; expensive because it used both LLM tokens and my time.</p>
<p>These &ldquo;while loops&rdquo; are generally expensive. They are not expensive because of the big O notation and code complexity but because, in each iteration, the LLM evaluates whether it can complete the task and uses tokens (and our money).</p>
<p>Whether that is too expensive depends on the value it provides, but it&rsquo;s always a good idea to be careful. You can be careful by setting billing alarms and spending limits, making sure that the LLM does not iterate indefinitely (by limiting the number of iterations), picking the right model for your task (sometimes cheaper models can also complete your tasks), and configuring monitoring, error tracking, and alarms.</p>
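<p>Here&rsquo;s a minimal sketch of that &ldquo;while loop&rdquo; with an iteration limit. Everything in it is an assumption made for illustration: <code>fakeModel</code> is a stand-in for a real LLM call, <code>getWeather</code> is a hypothetical tool, and the <code>tool_call</code>/<code>final</code> reply shape is made up rather than any provider&rsquo;s actual format.</p>

```javascript
// Hypothetical stand-in for a real LLM call. It asks for the weather tool
// once, then answers as soon as a tool result is in the conversation.
async function fakeModel(messages) {
  const toolResult = messages.find((m) => m.role === "tool")
  if (!toolResult) {
    return { type: "tool_call", tool: "getWeather", input: { city: "Belgrade" } }
  }
  return { type: "final", text: `Weather report: ${toolResult.content}` }
}

// Hypothetical tools: plain functions the loop runs on the model's behalf
const tools = {
  getWeather: async ({ city }) => `sunny and 24 degrees in ${city}`,
}

async function runAgent(question, callModel, availableTools, maxIterations = 5) {
  const messages = [{ role: "user", content: question }]
  let iterations = 0

  // The "while loop": ask the model, run the requested tool, repeat
  while (iterations < maxIterations) {
    iterations++
    const reply = await callModel(messages)

    // The model decides when it has enough information to answer
    if (reply.type === "final") {
      return { text: reply.text, iterations }
    }

    const tool = availableTools[reply.tool]
    const content = tool ? await tool(reply.input) : `Unknown tool: ${reply.tool}`
    messages.push({ role: "tool", content })
  }

  // Stop instead of burning tokens forever
  throw new Error(`No answer after ${maxIterations} iterations`)
}

runAgent("What's the weather like in Belgrade today?", fakeModel, tools)
  .then((result) => console.log(result.text))
```

<p>Each pass through the loop is one paid LLM call, so the <code>maxIterations</code> cap is the simplest spending limit you can add in code.</p>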
<h2 id="while-loops-and-where-to-write-them"><strong>While loops and where to write them</strong></h2>
<p>So, if agents are while loops with LLMs and some additional tools, where do we write these while loops to create an agent?</p>
<p>The answer is almost anywhere. While creating an AI agent using pen and paper could be a thing, it&rsquo;s not really a practical way of making an agent. Another cost-inefficient and unhelpful way of creating a while loop is using a person to act as one. However, you can write this &ldquo;while loop&rdquo; wherever you need it. For example, it can be inside the app you are working on, in your terminal, on a server (using any backend language you prefer), in a browser, etc. Just be careful not to leak your LLM secret key and other similar secrets.</p>
<p>To write an agent &ldquo;while loop,&rdquo; you need to do the following:</p>
<ol>
<li>Choose an LLM model that fits your needs and your budget (which can be $0 or whatever other number).</li>
<li>Define a system prompt with a clear explanation of all the tools you want to provide (including when and how to use them).</li>
<li>Ask an LLM to reply in a strict JSON format or any other structure you prefer.</li>
<li>Make sure you parse and validate the reply correctly.</li>
<li>Handle errors and set the maximal number of iterations, billing budget, and alerts.</li>
</ol>
<p>Remember, LLMs are good at talking to humans (actually imitating human interaction), but human language is not easy to parse in code. If you worked with LLMs and tried to get a JSON reply and nothing else, I am sure you, at least once, got a reply similar to: &ldquo;Here&rsquo;s your JSON: <code>{...}</code>.&rdquo; Yelling at an LLM and telling it to reply with JSON works sometimes, but in some cases, even 3 exclamation points at the end do not help. Even the simple agent we built inside Claude.ai replied with a sentence in front of the search phrase:</p>
<p><img src="/images/posts/ai-agents/claude-04.png"></p>
<p>You can either pick a format that is easier to parse or use a simple trick explained in my previous article: provide the beginning of the reply and let an LLM continue. You can see the code example here: <a href="https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/">5 Prompt Engineering Tips for Developers</a>.</p>
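<p>If the reply still arrives wrapped in a friendly sentence, a defensive parser helps with the &ldquo;parse and validate&rdquo; step. The sketch below is one possible approach, not the trick from the linked article, and the <code>{ tool, input }</code> reply shape is a hypothetical format you would define in your own system prompt.</p>

```javascript
// Pull the outermost {...} block out of a reply that may contain extra text
function extractJson(reply) {
  const start = reply.indexOf("{")
  const end = reply.lastIndexOf("}")
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in the reply")
  }
  // JSON.parse throws on its own if the extracted text is not valid JSON
  return JSON.parse(reply.slice(start, end + 1))
}

// Validate against the (hypothetical) shape requested in the system prompt
function parseToolRequest(reply, allowedTools) {
  const data = extractJson(reply)
  if (typeof data.tool !== "string" || !allowedTools.includes(data.tool)) {
    throw new Error(`Unknown tool: ${data.tool}`)
  }
  if (typeof data.input !== "object" || data.input === null) {
    throw new Error("Tool input must be an object")
  }
  return data
}

const chattyReply =
  'Here\'s your JSON: {"tool": "search", "input": {"query": "weather in Belgrade today"}}'
console.log(parseToolRequest(chattyReply, ["search"]))
```

<p>Throwing on anything unexpected keeps the loop honest: your code can catch the error, tell the LLM what was wrong, and let it try again within the iteration limit.</p>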
<p>But, while understanding how these LLM &ldquo;while loops&rdquo; work is good, you do not need to write your own while loops. There are many existing tools and frameworks you can use.</p>
<h2 id="ai-agent-tools-and-frameworks"><strong>AI Agent tools and frameworks</strong></h2>
<p>AI Agent tools and frameworks are like JavaScript frameworks &ndash; we get many new ones every day. Pick any word that comes to your mind. The chance to find a JavaScript package with that name in NPM and a .ai domain with that name is higher than the latest US-to-China tariff percentage.</p>
<p>For example, LangChain was the AI Agent framework a while back. Today, we have LlamaIndex and many other popular tools besides it. Big players like Microsoft have their own open-source takes, such as AutoGen. And, of course, services such as Amazon Bedrock Agents. There are many other examples, from tools for non-coders and open-source tools to enterprise-grade tools.</p>
<p>It&rsquo;s hard to pick the best one. If you want to check just one that works with JavaScript or TypeScript, you can <a href="https://ts.llamaindex.ai/">start with LlamaIndex</a>.</p>
<p>LlamaIndex sounds similar to the Meta Llama models. But it&rsquo;s not the same. Actually, LlamaIndex supports the Meta Llama model and many others (including OpenAI models, Anthropic models, open source models, Amazon Bedrock, Azure OpenAI, etc.).</p>
<p>Another interesting thing about LlamaIndex is that they treat the AI Agent memory issue as an important problem to solve. If you have worked with AI Agents, you know what I am talking about. If not, read on.</p>
<h2 id="remember-remember-the-conversation-we-had-yesterday">Remember, remember&hellip; the conversation we had yesterday</h2>
<p>As I already mentioned, LLMs are limited by their training set, configuration, your instructions, and a few other things. One of their most important limitations is their <em>context size</em>.</p>
<p>The context size represents the maximum number of tokens an LLM can have in a single conversation (through the API, UI, or any other way you interact with it). It&rsquo;s a hard stop. Once you fill the context with tokens, an LLM will explode. Well, not literally. But it&rsquo;ll stop working. If you have used LLMs from the early days, you might remember that after a certain number of messages, the LLM seemed to forget what you were talking about. That&rsquo;s because the context was filled, and the initial messages were removed to make space for your new messages. Luckily, LLM creators then introduced system prompts, sticky parts of the conversation that always stay in context and allow you to provide the instructions.</p>
<p>If you manage to fill the context, LLMs will most likely do one of the following:</p>
<ol>
<li>Remove some messages from the beginning of your conversation (but keep the system prompt so it still follows the instructions). This can cause an LLM to forget some parts of your conversation.</li>
<li>Summarize some parts of the conversation and replace N messages with the summary (well, LLMs are good with summarization). The quality of the remaining conversation depends on the way an LLM summarizes the conversation.</li>
<li>Block you from sending more messages (most likely if you are using an API).</li>
</ol>
<p><img src="/images/posts/ai-agents/llm-context.png"></p>
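<p>The first strategy from the list above can be sketched in a few lines. This is an illustration only: the &ldquo;about 4 characters per token&rdquo; estimate is a rough assumption, and a real implementation would count tokens with the tokenizer of the model you use.</p>

```javascript
// Very rough token estimate (assumption: about 4 characters per token)
function estimateTokens(text) {
  return Math.ceil(text.length / 4)
}

// Keep the system prompt, then drop the oldest messages until the rest fits
function trimConversation(systemPrompt, messages, maxTokens) {
  const kept = [...messages]
  let total =
    estimateTokens(systemPrompt) +
    kept.reduce((sum, message) => sum + estimateTokens(message.content), 0)

  // Always keep at least the latest message, even if it is too long
  while (kept.length > 1 && total > maxTokens) {
    const removed = kept.shift()
    total -= estimateTokens(removed.content)
  }
  return kept
}

const conversation = [
  { role: "user", content: "I want to take the first week of July off." },
  { role: "assistant", content: "Sure! Which leave type should I use?" },
  { role: "user", content: "PTO, please." },
]
console.log(trimConversation("You are a leave assistant.", conversation, 20))
```

<p>The system prompt is never part of the array that gets trimmed, which is why the agent keeps following its instructions while forgetting the start of the conversation.</p>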
<p>Luckily, the context size is increasing fast (Claude has a 200k token context, Gemini 1M context, and Llama now has a context size with up to 10M tokens). However, a larger context size can decrease the ability of an LLM to find specific items in it. Also, we want to fit larger items in the context. We started with simple spreadsheets and PDFs, and now we want to embed whole knowledge bases, books, project documentation, etc.</p>
<p>Again, luckily, many smart people work with LLMs, so they quickly came up with an effective way to make the most of the (at that time very) limited LLM context size. However, naming things is hard (ask OpenAI and Anthropic or simply read the names of their models), so they called this approach <em>Retrieval-Augmented Generation</em> (RAG).</p>
<p>While RAG sounds complicated and is still one of the most misunderstood terms related to LLMs, it is quite a simple but powerful concept.</p>
<p>In short, instead of putting all documents in the system prompt, you can wait for the user question, convert it to a vector (an embedding) before replying, and do a vector search against your knowledge base to find a few closest matches. Then, you take these pieces and tell an LLM to respond to the user&rsquo;s question in the context of the provided pieces of your knowledge base.</p>
<p><img src="/images/posts/ai-agents/rag.png"></p>
<p>Before you do a vector search, you need to split your knowledge base into reasonable chunks (i.e., articles, sections of the articles, or even paragraphs in some cases) and create vectors from each piece.</p>
<p>And when I say vector search, I mean something similar to the vector search that LLMs use under the hood. Remember the following image?</p>
<p><img src="/images/posts/ai-agents/vectors.png"></p>
<p>You can use a vector DB to do a vector search, but that&rsquo;s not required. You can do vector searches in some of the popular databases (such as PostgreSQL with the pgvector extension, Elasticsearch, etc.) or store vectors almost anywhere (Amazon S3, for example) and write your own vector search function.</p>
<p>Writing your own vector search (or actually vector similarity) function also sounds complicated, but luckily, you can ask an LLM to write that function for you, and it can look similar to the following one:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#75715e">// Calculate cosine similarity between two vectors
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">function</span> <span style="color:#a6e22e">cosineSimilarity</span>(<span style="color:#a6e22e">vector1</span>, <span style="color:#a6e22e">vector2</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Calculate the dot product of the two vectors
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">dotProduct</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">vector1</span>.<span style="color:#a6e22e">reduce</span>((<span style="color:#a6e22e">sum</span>, <span style="color:#a6e22e">a</span>, <span style="color:#a6e22e">i</span>) =&gt; <span style="color:#a6e22e">sum</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">a</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">vector2</span>[<span style="color:#a6e22e">i</span>], <span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>  
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Calculate the magnitude of both vectors
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">magnitude</span> <span style="color:#f92672">=</span>
</span></span><span style="display:flex;"><span>        Math.<span style="color:#a6e22e">sqrt</span>(<span style="color:#a6e22e">vector1</span>.<span style="color:#a6e22e">reduce</span>((<span style="color:#a6e22e">sum</span>, <span style="color:#a6e22e">val</span>) =&gt; <span style="color:#a6e22e">sum</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">val</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">val</span>, <span style="color:#ae81ff">0</span>))
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">*</span> Math.<span style="color:#a6e22e">sqrt</span>(<span style="color:#a6e22e">vector2</span>.<span style="color:#a6e22e">reduce</span>((<span style="color:#a6e22e">sum</span>, <span style="color:#a6e22e">val</span>) =&gt; <span style="color:#a6e22e">sum</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">val</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">val</span>, <span style="color:#ae81ff">0</span>))
</span></span><span style="display:flex;"><span>  
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Return the cosine similarity
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">dotProduct</span> <span style="color:#f92672">/</span> <span style="color:#a6e22e">magnitude</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This function returns a number between -1 and 1; the closer it is to 1, the more similar the two vectors are. You can run the same function against each of the knowledge base article vectors and pick the top two with a similarity above 0.8 or something similar.</p>
<p>A simple function like the one I showed above would work fine for smaller databases. However, you should use a more efficient search for large data sets.</p>
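<p>To make the &ldquo;pick the top two&rdquo; step concrete, here is a small sketch built on the same cosine similarity function (repeated so the snippet runs on its own). The chunk IDs and the two-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions, and the 0.8 threshold is only a starting point.</p>

```javascript
// Same cosine similarity as above, repeated so this snippet is self-contained
function cosineSimilarity(vector1, vector2) {
  const dotProduct = vector1.reduce((sum, a, i) => sum + a * vector2[i], 0)
  const magnitude =
    Math.sqrt(vector1.reduce((sum, val) => sum + val * val, 0)) *
    Math.sqrt(vector2.reduce((sum, val) => sum + val * val, 0))
  return dotProduct / magnitude
}

// Score every chunk, drop weak matches, and keep the best few
function findClosestChunks(questionVector, chunks, topK = 2, minScore = 0.8) {
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(questionVector, chunk.vector) }))
    .filter((chunk) => chunk.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
}

// Hypothetical knowledge base chunks with made-up 2D vectors
const knowledgeBase = [
  { id: "request-leave", vector: [0.9, 0.1] },
  { id: "pricing", vector: [0.1, 0.9] },
  { id: "leave-policies", vector: [0.8, 0.3] },
]

const matches = findClosestChunks([1, 0.2], knowledgeBase)
console.log(matches.map((chunk) => chunk.id)) // [ 'request-leave', 'leave-policies' ]
```

<p>This loops over every chunk for every question, which is exactly why it only suits small knowledge bases.</p>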
<p>AI Agents use many tokens, and often need vast knowledge bases. LlamaIndex helps with more efficient vector search and allows you to create agents that are not like they came from the Memento movie.</p>
<p>However, explaining RAG and LLM memory in detail requires more than a few paragraphs. All these explanations would convert this article into a short book. So, let&rsquo;s leave that for another article and get back to AI Agents.</p>
<h2 id="lets-build-an-ai-agent">Let&rsquo;s build an AI Agent!</h2>
<p>LLMs are products. Products evolve and add features based on users&rsquo; feedback and usage patterns. Well, using tools is one of the important features that most LLMs now support natively. Some LLMs call this feature tools (for example, tools in Claude), some call it functions (for example, OpenAI functions), but it&rsquo;s the same thing that allows us to build AI Agents.</p>
<p>Built-in tools have a few clear benefits, such as replies in a strict, well-defined JSON format and easier streaming of responses via HTTP. They also have less surface area for errors because tool handling is now built into the LLM itself. But, as always, there are many different standards, and if you want to switch to some other LLM, you&rsquo;ll probably need to define tools in a slightly different format. However, a simple abstraction (or, even better, a hexagonal architecture) makes this problem easier to manage.</p>
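<p>One way to picture that simple abstraction: keep a single neutral tool definition in your own code and translate it per provider. The sketch below is an illustration under assumptions. The <code>request_leave</code> tool and its fields are hypothetical, and the three output shapes reflect my understanding of the Amazon Bedrock Converse, Anthropic Messages, and OpenAI Chat Completions tool formats at the time of writing, so check the current documentation before relying on them.</p>

```javascript
// One neutral (hypothetical) tool definition that our own code owns
const requestLeaveTool = {
  name: "request_leave",
  description: "Request a leave, such as PTO, for the current user.",
  parameters: {
    type: "object",
    properties: {
      leaveType: { type: "string", description: "For example, PTO" },
      startDate: { type: "string", description: "ISO date, e.g. 2025-06-30" },
      endDate: { type: "string", description: "ISO date, e.g. 2025-07-06" },
    },
    required: ["leaveType", "startDate", "endDate"],
  },
}

// Adapters translate the neutral definition into each provider's shape
const toolAdapters = {
  bedrockConverse: (tool) => ({
    toolSpec: {
      name: tool.name,
      description: tool.description,
      inputSchema: { json: tool.parameters },
    },
  }),
  anthropic: (tool) => ({
    name: tool.name,
    description: tool.description,
    input_schema: tool.parameters,
  }),
  openaiChat: (tool) => ({
    type: "function",
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.parameters,
    },
  }),
}

console.log(JSON.stringify(toolAdapters.bedrockConverse(requestLeaveTool), null, 2))
```

<p>Switching models then means picking a different adapter, while the tool itself and the code that runs it stay the same.</p>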
<p>Let&rsquo;s build a simple agent using built-in tools! You can pick any LLM you like. I&rsquo;ll use Claude Sonnet 3.7 on Amazon Bedrock. The example below would work fine with other models. I use Amazon Web Services (AWS) every day, so Bedrock is a natural choice (despite its limits, especially in European data centers).</p>
<p>So, where do we start? With our &ldquo;while loop,&rdquo; of course! Remember this?</p>
<p><img src="/images/posts/ai-agents/agents.png"></p>
<p>I want to build a simple agent for my product, Vacation Tracker. It will be very simple because otherwise, I would need to write a book to show all the details. I want my AI Agent to be able to do the following 3 things only:</p>
<ol>
<li>Help users to request leave, such as PTO.</li>
<li>Let users see which coworkers are not working today and who will be off this or next week.</li>
<li>Answer some basic questions about our product using our knowledge base.</li>
</ol>
<p>With these 3 features, my &ldquo;while loop&rdquo; would look like the following diagram.</p>
<p><img src="/images/posts/ai-agents/vt-agent-diagram.png"></p>
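<p>In code, that &ldquo;while loop&rdquo; could look roughly like the sketch below. It is a dry run, not the production agent: <code>runAgent</code>, <code>callModel</code>, the <code>who_is_off</code> tool, and the fake model are hypothetical names I made up, while the message shapes (<code>toolUse</code>, <code>toolResult</code>, <code>stopReason</code>) follow the Bedrock Converse API.</p>

```javascript
// callModel and the tools map are injected, so the loop stays provider-agnostic
async function runAgent({ callModel, tools, messages, maxSteps = 5 }) {
  const conversation = [...messages]
  for (let step = 0; step < maxSteps; step++) {
    const { message, stopReason } = await callModel(conversation)
    conversation.push(message)
    // Anything other than "tool_use" means the model is done
    if (stopReason !== 'tool_use') return conversation

    // Run every requested tool and send the results back as a "user" message
    const results = []
    for (const block of message.content) {
      if (!block.toolUse) continue
      const { name, toolUseId, input } = block.toolUse
      const tool = tools[name]
      try {
        if (!tool) throw new Error(`Unknown tool: ${name}`)
        const output = await tool(input)
        results.push({ toolResult: { toolUseId, content: [{ json: output }], status: 'success' } })
      } catch (err) {
        results.push({ toolResult: { toolUseId, content: [{ text: err.message }], status: 'error' } })
      }
    }
    conversation.push({ role: 'user', content: results })
  }
  throw new Error(`Agent did not finish within ${maxSteps} steps`)
}

// A fake model for a dry run: it asks for one tool, then answers
let calls = 0
const fakeModel = async () => {
  calls++
  return calls === 1
    ? { stopReason: 'tool_use', message: { role: 'assistant', content: [{ toolUse: { name: 'who_is_off', toolUseId: 'tool-1', input: { period: 'today' } } }] } }
    : { stopReason: 'end_turn', message: { role: 'assistant', content: [{ text: 'Ana is off today.' }] } }
}

const agentRun = runAgent({
  callModel: fakeModel,
  tools: { who_is_off: async () => ({ off: ['Ana'] }) },
  messages: [{ role: 'user', content: [{ text: 'Who is off today?' }] }],
})
agentRun.then((conversation) => console.log(conversation.length)) // → 4
```

<p>The <code>maxSteps</code> limit is there so a confused model cannot keep the loop (and the bill) running forever.</p>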
<p>There are so many different ways to build this AI Agent using AWS. For example, we could create a simple serverless solution like the diagram below, with the following components:</p>
<ul>
<li>I would use an Amazon API Gateway to expose the API endpoint for my AI Agent.</li>
<li>My AI Agent &ldquo;while loop,&rdquo; or business logic, would be in a Lambda function that defines the specification of the tool, invokes the LLM, and talks to the Vacation Tracker API and storage for our knowledge base and vectors.</li>
<li>I would use the Claude Sonnet 3.7 model on Amazon Bedrock.</li>
<li>I could store the vectors and parts of our knowledge base in an S3 bucket. This is not an ideal long-term solution, but it would work fine for the MVP version.</li>
</ul>
<p><img src="/images/posts/ai-agents/aws-architecture-with-apigw.png"></p>
<p>By using API Gateway, we get all the benefits this service offers, including easy setup for rate limits, Web Application Firewall (WAF), etc. However, as <a href="https://www.linkedin.com/feed/update/urn:li:activity:7313466033313406976?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7313466033313406976%2C7316309968616333312%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287316309968616333312%2Curn%3Ali%3Aactivity%3A7313466033313406976%29">Austen Collins suggested</a>, API Gateway has some significant downsides when building AI Agents. For example, the API Gateway timeout is limited to 29 seconds by default (AWS now lets us raise it, but doing so can affect account-level throttle limits, etc.), which could be a serious limitation for more complex production-ready agents that run longer tasks. Also, we can&rsquo;t stream the response from our Lambda function, so we need to wait for our agent to generate the whole long reply before we can start showing it to the user. Streaming would allow us to show the response as the LLM generates it. This is especially helpful for long responses, as our agent starts responding to the user faster and keeps adding text as the user reads (the effect is similar to typing).</p>
<p>Luckily, there&rsquo;s an alternative! AWS Lambda supports Lambda function URLs. It&rsquo;s basically a simple HTTP endpoint in front of your Lambda function. The main benefits of the function URL over API Gateway are that it offers timeouts of up to 15 minutes (it&rsquo;s a Lambda timeout, not an API Gateway timeout anymore) and support for streaming responses. Just what we needed!</p>
<p>However, it comes with many downsides, too. You do not get all the features of the API Gateway, such as built-in rate limits, authorizers (function URLs support IAM authorization only), etc. It does not support WAF, either. There are no custom domains for Lambda function URLs. However, you can mitigate some of these downsides by putting CloudFront in front of the Lambda function URL, as shown in the diagram below.</p>
<p><img src="/images/posts/ai-agents/aws-architecture-with-lambda-url.png"></p>
<p>Is this an ideal setup? It depends on your use case. It&rsquo;s a good start. There are many other alternatives. For example, we could keep the initial setup, and instead of waiting for the reply with an open HTTP connection, we could send a message to a background job and tell the frontend application that the message is received and that we&rsquo;ll send a reply via WebSockets. There&rsquo;s no out-of-the-box streaming for this setup, either, but it can give you more flexibility and some benefits from both approaches.</p>
<p><img src="/images/posts/ai-agents/aws-architecture-with-websockets.png"></p>
<p>In production, we would need to think about our use case, WAF, rate limiting (for our app, for the LLM we are using, and for other services we are using), securing the API endpoint (auth token, API key, etc.), monitoring, error logging and handling, storage for the conversation (we do not want to send the whole conversation from the frontend when a user sends a new message), and many other things.</p>
<p>But let&rsquo;s keep things simple.</p>
<h2 id="show-me-some-code">Show me some code!</h2>
<p>The initial idea of this article was to show the code. But here we are, 4000 words later, and I barely explained how agents work. I&rsquo;ll show the most important parts of the code here. I might do a &ldquo;part 2&rdquo; article with a code deep dive if anyone reads the article past this point and thinks a complete coding example might be beneficial.</p>
<p>The most important part of the code is the definition of the available tools. The format depends on the model or service you use, but for Claude Sonnet 3.7 on Amazon Bedrock using the <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html">Converse API</a>, you can do something similar to the following:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">toolConfig</span> <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> [
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// We need a similar object for each of our tools
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#75715e">// The Converse API expects each tool to be wrapped in a toolSpec object
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#a6e22e">toolSpec</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#75715e">// The name and the description of our tool
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#75715e">// A clear description is important because it helps an LLM to select the right tool
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;request_leave&#39;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;Request leave (such as PTO, sick day, etc.). The leave request will be submitted to the Vacation Tracker application, and sent to your manager (approver).&#39;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">inputSchema</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>          <span style="color:#75715e">// It accepts JSON, I told you developers love JSON!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>          <span style="color:#a6e22e">json</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;object&#39;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">properties</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>              <span style="color:#75715e">// And it expects the following properties (see the type and the description for each).
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>              <span style="color:#75715e">// For simplicity, I&#39;ll show the most important parts only
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>              <span style="color:#a6e22e">startDate</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;string&#39;</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;The date when the leave starts. Format: YYYY-MM-DD&#39;</span>,
</span></span><span style="display:flex;"><span>              },
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">endDate</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;string&#39;</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;The date when the leave ends. Format: YYYY-MM-DD&#39;</span>,
</span></span><span style="display:flex;"><span>              },
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">leaveType</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;string&#39;</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;The type of leave. For example, &#34;vacation&#34;, &#34;sick&#34;, etc.&#39;</span>,
</span></span><span style="display:flex;"><span>              },
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">reason</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;string&#39;</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;The reason for the leave request. For example, &#34;Vacation in Greece&#34;, &#34;I am not feeling well today,&#34; etc.&#39;</span>,
</span></span><span style="display:flex;"><span>              },
</span></span><span style="display:flex;"><span>            },
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">required</span><span style="color:#f92672">:</span> [<span style="color:#e6db74">&#39;startDate&#39;</span>, <span style="color:#e6db74">&#39;endDate&#39;</span>, <span style="color:#e6db74">&#39;leaveType&#39;</span>],
</span></span><span style="display:flex;"><span>          },
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Define the other 2 tools here
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  ],
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>It&rsquo;s important to provide clear descriptions as that would help an LLM pick the right tool when needed.</p>
<p>I am a big fan of hexagonal architecture (or ports and adapters), and I would use it in production code. Our business logic does not care about the Claude Sonnet 3.7 model. It does not care about Amazon Bedrock, either. So, I would put the LLM logic in some kind of repository, initialize it, and use it to send a message when needed. That would make my code cleaner to read, easier to test, and allow me to try other models (e.g., OpenAI models that are not available in Amazon Bedrock) without changing the business logic.</p>
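<p>For a rough idea of what that repository could look like, here is a minimal sketch. All names (<code>createBedrockLlmRepository</code>, <code>createAssistant</code>, <code>sendConverse</code>) are hypothetical, and the transport is injected and stubbed so the sketch runs without the AWS SDK.</p>

```javascript
// Port: the business logic only needs an object with sendMessage(conversation).
// Adapter: wraps a provider-specific call. sendConverse is injected, so this
// sketch runs without the AWS SDK; in real code it would call Amazon Bedrock.
function createBedrockLlmRepository({ sendConverse, modelId, toolConfig }) {
  return {
    async sendMessage(conversation) {
      const response = await sendConverse({ modelId, messages: conversation, toolConfig })
      // Return only what the business logic needs, in a provider-neutral shape
      return { message: response.output.message, stopReason: response.stopReason }
    },
  }
}

// Business logic depends on the port, not on Bedrock or a specific model
function createAssistant({ llm }) {
  return {
    async reply(text) {
      const conversation = [{ role: 'user', content: [{ text }] }]
      return llm.sendMessage(conversation)
    },
  }
}

// A stubbed transport for local runs and tests (the reply text is made up)
const fakeSendConverse = async (request) => ({
  output: { message: { role: 'assistant', content: [{ text: `Handled by ${request.modelId}` }] } },
  stopReason: 'end_turn',
})

const assistant = createAssistant({
  llm: createBedrockLlmRepository({
    sendConverse: fakeSendConverse,
    modelId: 'test-model',
    toolConfig: { tools: [] },
  }),
})
const assistantReply = assistant.reply('Who is off today?')
assistantReply.then((result) => console.log(result.stopReason)) // → end_turn
```

<p>In a real setup, <code>sendConverse</code> would be something like <code>(input) =&gt; client.send(new ConverseCommand(input))</code>, and a second adapter with the same <code>sendMessage</code> method could wrap another provider.</p>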
<p>However, I&rsquo;ll just show the simplest code example without hexagonal architecture to keep things simple.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#75715e">// Stripped down example of using the AWS Bedrock SDK to create a simple AI Agent
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">BedrockRuntimeClient</span>, <span style="color:#a6e22e">ConverseCommand</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#39;@aws-sdk/client-bedrock-runtime&#39;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">toolConfig</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#39;./tool-config.js&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Initialize the Bedrock client and specify the region and a model ID
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">BedrockRuntimeClient</span>({ <span style="color:#a6e22e">region</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;us-east-1&#39;</span> })
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">modelId</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;anthropic.claude-3-7-sonnet-20250219-v1:0&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Start a conversation with the user message.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">conversation</span> <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{
</span></span><span style="display:flex;"><span>      <span style="color:#75715e">// The user message is hardcoded for this example, but in a real application, you would get it from the user input
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#75715e">// Also, in practice, users would probably tell us that they want to go on vacation in the first week of July, and the agent would need to ask for the start date
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;I want to go on vacation on the first week of July. Full week, starting June 30th.&#39;</span>,
</span></span><span style="display:flex;"><span>    }],
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Create a command with the model ID, the message, and a basic configuration
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">command</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">ConverseCommand</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">modelId</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">conversation</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> [{
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// We would need to provide a more detailed system message in a real application
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;You are Vacation Tracker assistant and you help users to request leaves, see who else from their team is off, check their leave balance and learn about the Vacation Tracker app functionalities. When a user asks to go off for a longer period, assume whole week. Here is the list of available leave types: `PTO`, `SickDay`. Try assuming the correct leave type from the input.&#39;</span>,
</span></span><span style="display:flex;"><span>  }],
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// We pass our tool configuration to Bedrock
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#a6e22e">toolConfig</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">toolConfig</span>,
</span></span><span style="display:flex;"><span>})
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">run</span>() {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Send the command to the model and wait for the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">command</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;Response:&#39;</span>, <span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">stringify</span>(<span style="color:#a6e22e">response</span>))
</span></span><span style="display:flex;"><span>  } <span style="color:#66d9ef">catch</span> (<span style="color:#a6e22e">err</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;ERROR&#39;</span>, <span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// We wrap the call in the run function just to be able to use it in a terminal without the deployment
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// The actual code would define a Lambda function handler
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#a6e22e">run</span>()
</span></span></code></pre></div><p>To run this code example, you&rsquo;d need an AWS account with Claude Sonnet 3.7 enabled in Amazon Bedrock (in the <code>us-east-1</code> region). When you run it, the response should be similar to the following JSON (inline comments make this JSON invalid, but I added them for easier understanding):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Just metadata, you can ignore this part at the moment
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;$metadata&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;httpStatusCode&#34;</span>: <span style="color:#ae81ff">200</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;requestId&#34;</span>: <span style="color:#e6db74">&#34;bc806712-b1b2-40eb-8488-f0085237ebcf&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;attempts&#34;</span>: <span style="color:#ae81ff">1</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;totalRetryDelay&#34;</span>: <span style="color:#ae81ff">0</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Metrics can be useful, but let&#39;s ignore that too at the moment
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;metrics&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;latencyMs&#34;</span>: <span style="color:#ae81ff">4191</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Claude Sonnet 3.7 response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;output&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;message&#34;</span>: {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>        {
</span></span><span style="display:flex;"><span>          <span style="color:#75715e">// A message that we can show to our users if we want to
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>          <span style="color:#f92672">&#34;text&#34;</span>: <span style="color:#e6db74">&#34;Certainly! I&#39;d be happy to help you request leave for your vacation during the first week of July. Based on the information you&#39;ve provided, I&#39;ll submit a leave request for you using the Vacation Tracker application. Let me go ahead and process that for you.&#34;</span>
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>        {
</span></span><span style="display:flex;"><span>          <span style="color:#75715e">// Claude Sonnet 3.7 tells us to use a tool!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>          <span style="color:#f92672">&#34;toolUse&#34;</span>: {
</span></span><span style="display:flex;"><span>            <span style="color:#75715e">// Request params, as defined in the tool configuration
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>            <span style="color:#f92672">&#34;input&#34;</span>: {
</span></span><span style="display:flex;"><span>              <span style="color:#f92672">&#34;startDate&#34;</span>: <span style="color:#e6db74">&#34;2025-06-30&#34;</span>,
</span></span><span style="display:flex;"><span>              <span style="color:#f92672">&#34;endDate&#34;</span>: <span style="color:#e6db74">&#34;2025-07-06&#34;</span>,
</span></span><span style="display:flex;"><span>              <span style="color:#f92672">&#34;leaveType&#34;</span>: <span style="color:#e6db74">&#34;PTO&#34;</span>,
</span></span><span style="display:flex;"><span>              <span style="color:#f92672">&#34;reason&#34;</span>: <span style="color:#e6db74">&#34;Vacation for the first week of July&#34;</span>
</span></span><span style="display:flex;"><span>            },
</span></span><span style="display:flex;"><span>            <span style="color:#75715e">// The name and ID of a tool we should use
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>            <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;request_leave&#34;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;toolUseId&#34;</span>: <span style="color:#e6db74">&#34;tooluse_JaORLwrHSSGifTRMApUrGA&#34;</span>
</span></span><span style="display:flex;"><span>          }
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>      ],
</span></span><span style="display:flex;"><span>      <span style="color:#75715e">// This is a response from an agent, we need to pass this with the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#75715e">// from a tool to continue the conversation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;assistant&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Claude tells us that it stopped because it needs a tool to continue
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;stopReason&#34;</span>: <span style="color:#e6db74">&#34;tool_use&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Useful metrics for the number of used tokens
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;usage&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;inputTokens&#34;</span>: <span style="color:#ae81ff">630</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;outputTokens&#34;</span>: <span style="color:#ae81ff">183</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;totalTokens&#34;</span>: <span style="color:#ae81ff">813</span>
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>As you can see in the JSON above, Claude tells us that it needs a tool to be able to reply (<code>stopReason: 'tool_use'</code>). It gives us a nice message that we can display to our users if we want to, but keep in mind that the agent is still working at this point, so users should not send new messages yet. Finally, it gives us the details about the tool we need to use and the parameters we should send to our tool to get the response.</p>
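<p>Pulling those pieces out of the response takes only a few lines. The helper names below (<code>getToolRequests</code>, <code>getText</code>) are made up for this sketch, and the sample object is a trimmed copy of the response above.</p>

```javascript
// Returns every tool request from a Converse API response (empty when none)
function getToolRequests(response) {
  if (response.stopReason !== 'tool_use') return []
  return response.output.message.content
    .filter((block) => block.toolUse)
    .map((block) => block.toolUse)
}

// Returns the text parts, which we may show to the user while the tools run
function getText(response) {
  return response.output.message.content
    .filter((block) => typeof block.text === 'string')
    .map((block) => block.text)
    .join('\n')
}

// A trimmed copy of the response above
const sampleResponse = {
  output: {
    message: {
      role: 'assistant',
      content: [
        { text: 'Let me go ahead and process that for you.' },
        { toolUse: { name: 'request_leave', toolUseId: 'tooluse_JaORLwrHSSGifTRMApUrGA', input: { leaveType: 'PTO' } } },
      ],
    },
  },
  stopReason: 'tool_use',
}

console.log(getToolRequests(sampleResponse).map((tool) => tool.name)) // → [ 'request_leave' ]
console.log(getText(sampleResponse))
```

<p>A response can contain more than one <code>toolUse</code> block, which is why the helper returns an array.</p>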
<p>After receiving this response, we should send the API request to the Vacation Tracker API with the provided start and end dates, leave type, and reason. But before that, we probably need to validate the data and get the authentication token for the API.</p>
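<p>The validation step could be as simple as the sketch below. The rules and the <code>validateLeaveRequest</code> name are my assumptions; the real Vacation Tracker API has its own validation, and this is only a first line of defense against creative LLM output.</p>

```javascript
const ALLOWED_LEAVE_TYPES = ['PTO', 'SickDay'] // the leave types from the system prompt above
const DATE_FORMAT = /^\d{4}-\d{2}-\d{2}$/

// Checks the format and that the string is a real calendar date
function isValidDate(value) {
  if (typeof value !== 'string' || !DATE_FORMAT.test(value)) return false
  const date = new Date(`${value}T00:00:00Z`)
  return !Number.isNaN(date.getTime()) && date.toISOString().startsWith(value)
}

// Returns a list of problems; an empty list means the input is safe to send
function validateLeaveRequest(input) {
  const errors = []
  if (!isValidDate(input.startDate)) errors.push('startDate must be a valid YYYY-MM-DD date')
  if (!isValidDate(input.endDate)) errors.push('endDate must be a valid YYYY-MM-DD date')
  // ISO dates sort alphabetically, so a string comparison is enough here
  if (errors.length === 0 && input.endDate < input.startDate) {
    errors.push('endDate must not be before startDate')
  }
  if (!ALLOWED_LEAVE_TYPES.includes(input.leaveType)) {
    errors.push(`leaveType must be one of: ${ALLOWED_LEAVE_TYPES.join(', ')}`)
  }
  return errors
}

console.log(validateLeaveRequest({ startDate: '2025-06-30', endDate: '2025-07-06', leaveType: 'PTO' })) // → []
console.log(validateLeaveRequest({ startDate: '2025-07-06', endDate: '2025-06-30', leaveType: 'Holiday' }))
```

<p>When the list is not empty, one option is to send the errors back to the LLM as a failed tool result, so it can ask the user for the missing or corrected details.</p>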
<p>Before we continue, let&rsquo;s talk about one more thing: request duration. Remember the 29-second timeout in the API Gateway? Well, this request took 4.2 seconds out of these 29. If we add a few hundred milliseconds for the Lambda overhead and a request to a database to get the previous messages in this conversation, we are probably around 5 seconds.</p>
<p>But that&rsquo;s just the first part of this request. Before we reply, we need to call the Vacation Tracker API, parse the response, and call Amazon Bedrock again. If we are lucky, our agent will need just one tool to be able to reply, so we&rsquo;ll be at 10 or 15 seconds, including saving the conversation to DynamoDB.</p>
<p>What happens if we have a complex agent that could use multiple tools in one request? We can easily get close to or over the 29-second timeout, which could break our agent.</p>
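<p>One way to reason about this is a simple time budget that the loop checks before each new round. The sketch below is back-of-the-envelope math with a fake clock: the 2-second safety margin and the 1-second tool call are assumptions, and <code>createTimeBudget</code> is a made-up helper.</p>

```javascript
const API_GATEWAY_TIMEOUT_MS = 29000 // the API Gateway limit mentioned above
const SAFETY_MARGIN_MS = 2000 // assumed buffer for saving the conversation and replying

// Creates a small tracker for the time left in the current request
function createTimeBudget({ limitMs = API_GATEWAY_TIMEOUT_MS, now = Date.now } = {}) {
  const startedAt = now()
  const remainingMs = () => limitMs - (now() - startedAt)
  return {
    remainingMs,
    // True when a step with the given estimated duration still fits
    canAfford: (estimatedMs) => remainingMs() - SAFETY_MARGIN_MS >= estimatedMs,
  }
}

// Dry run with a fake clock: each model call takes about 4.2 seconds
// (as measured above) and each tool call takes an assumed 1 second
const ROUND_MS = 4200 + 1000
let fakeNow = 0
const budget = createTimeBudget({ now: () => fakeNow })
let completedRounds = 0
while (budget.canAfford(ROUND_MS)) {
  fakeNow += ROUND_MS
  completedRounds++
}
console.log(completedRounds, budget.remainingMs()) // → 5 3000
```

<p>With these optimistic numbers, we get about five model-plus-tool rounds before the agent has to stop and reply with whatever it has. Slower tools or longer replies shrink that quickly.</p>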
<p>Ok, so, we got the name and ID of the tool we need to use and the parameters to send to the tool. In this case, the agent wants us to request leave. In production, we would ask the user to confirm the request details first, but for this simple example, we can send the request straight to the Vacation Tracker API.</p>
<p>Once the Vacation Tracker API responds, we probably want to process the response to make it clear to our LLM because most API responses are not 100% LLM or human friendly. For example, we can add a description, change the field names to be more descriptive, etc.</p>
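<p>Here is a small sketch of that processing step. The API response and its terse field names are invented for this example (the real Vacation Tracker API looks different), and the helper names are hypothetical. The <code>toolResult</code> wrapper follows the Converse API format.</p>

```javascript
// A made-up API response; the real Vacation Tracker API fields will differ
const apiResponse = { id: 'lr_123', st: 'OPEN', appr: 'Ana', d: 5 }

// Renames terse fields and adds a short summary the LLM can relay to the user
function describeLeaveRequest(api) {
  return {
    leaveRequestId: api.id,
    status: api.st === 'OPEN' ? 'pending approval' : 'processed',
    approverName: api.appr,
    workingDaysRequested: api.d,
    summary: `Leave request ${api.id} was submitted and is waiting for ${api.appr} to approve it.`,
  }
}

// Wraps the result in a Converse API message; the role must be "user"
function toToolResultMessage(toolUseId, result) {
  return {
    role: 'user',
    content: [{ toolResult: { toolUseId, content: [{ json: result }], status: 'success' } }],
  }
}

const toolMessage = toToolResultMessage(
  'tooluse_JaORLwrHSSGifTRMApUrGA',
  describeLeaveRequest(apiResponse),
)
console.log(JSON.stringify(toolMessage, null, 2))
```

<p>The <code>toolUseId</code> must match the one from the assistant message, so the model can connect the result to its request.</p>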
<p>We could send a new request to Amazon Bedrock with the following conversation:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>[
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// The initial user message
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>      {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;text&#34;</span>: <span style="color:#e6db74">&#34;I want to go on vacation on the first week of July. Full week, starting June 30th.&#34;</span>
</span></span><span style="display:flex;"><span>      }
</span></span><span style="display:flex;"><span>    ]
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// The unchanged assistant message from the previous example
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;assistant&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>      {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;text&#34;</span>: <span style="color:#e6db74">&#34;Certainly! I&#39;d be happy to help you request leave for your vacation during the first week of July. Based on the information you&#39;ve provided, I&#39;ll submit a leave request for you using the Vacation Tracker application. Let me go ahead and process that for you.&#34;</span>
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>      {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;toolUse&#34;</span>: {
</span></span><span style="display:flex;"><span>          <span style="color:#f92672">&#34;input&#34;</span>: {
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;startDate&#34;</span>: <span style="color:#e6db74">&#34;2025-06-30&#34;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;endDate&#34;</span>: <span style="color:#e6db74">&#34;2025-07-06&#34;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;leaveType&#34;</span>: <span style="color:#e6db74">&#34;PTO&#34;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;reason&#34;</span>: <span style="color:#e6db74">&#34;Vacation for the first week of July&#34;</span>
</span></span><span style="display:flex;"><span>          },
</span></span><span style="display:flex;"><span>          <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;request_leave&#34;</span>,
</span></span><span style="display:flex;"><span>          <span style="color:#f92672">&#34;toolUseId&#34;</span>: <span style="color:#e6db74">&#34;tooluse_JaORLwrHSSGifTRMApUrGA&#34;</span>
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>      }  
</span></span><span style="display:flex;"><span>    ]
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Tool response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Role needs to be &#34;user&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>      {
</span></span><span style="display:flex;"><span>        <span style="color:#75715e">// We provide the tool result object
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#f92672">&#34;toolResult&#34;</span>: {
</span></span><span style="display:flex;"><span>          <span style="color:#75715e">// Tool ID
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>          <span style="color:#f92672">&#34;toolUseId&#34;</span>: <span style="color:#e6db74">&#34;tooluse_JaORLwrHSSGifTRMApUrGA&#34;</span>,
</span></span><span style="display:flex;"><span>          <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>            {
</span></span><span style="display:flex;"><span>              <span style="color:#75715e">// Response is a JSON object with a custom payload
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>              <span style="color:#f92672">&#34;json&#34;</span>: {
</span></span><span style="display:flex;"><span>                <span style="color:#75715e">// For example, the API tells us that the request was successfully sent
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>                <span style="color:#f92672">&#34;success&#34;</span>: <span style="color:#66d9ef">true</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#75715e">// We have the request details
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>                <span style="color:#f92672">&#34;request&#34;</span>: {
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;id&#34;</span>: <span style="color:#e6db74">&#34;request_12345&#34;</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;status&#34;</span>: <span style="color:#e6db74">&#34;pending&#34;</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;startDate&#34;</span>: <span style="color:#e6db74">&#34;2025-06-30&#34;</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;endDate&#34;</span>: <span style="color:#e6db74">&#34;2025-07-06&#34;</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;leaveType&#34;</span>: <span style="color:#e6db74">&#34;PTO&#34;</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;reason&#34;</span>: <span style="color:#e6db74">&#34;Vacation for the first week of July&#34;</span>
</span></span><span style="display:flex;"><span>                },
</span></span><span style="display:flex;"><span>                <span style="color:#75715e">// And also the remaining quota
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>                <span style="color:#f92672">&#34;quota&#34;</span>: {
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;remaining&#34;</span>: <span style="color:#ae81ff">10</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;used&#34;</span>: <span style="color:#ae81ff">5</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;total&#34;</span>: <span style="color:#ae81ff">15</span>,
</span></span><span style="display:flex;"><span>                  <span style="color:#f92672">&#34;unit&#34;</span>: <span style="color:#e6db74">&#34;days&#34;</span>
</span></span><span style="display:flex;"><span>                }
</span></span><span style="display:flex;"><span>              }
</span></span><span style="display:flex;"><span>            }
</span></span><span style="display:flex;"><span>          ]
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>      }
</span></span><span style="display:flex;"><span>    ]
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>]
</span></span></code></pre></div><p>When we send this request to Claude, the reply will probably be similar to the following JSON:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Metadata, ignore
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;$metadata&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;httpStatusCode&#34;</span>: <span style="color:#ae81ff">200</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;requestId&#34;</span> : <span style="color:#e6db74">&#34;de4dad99-1d57-46bd-ad71-c69ebe853756&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;attempts&#34;</span>: <span style="color:#ae81ff">1</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;totalRetryDelay&#34;</span>: <span style="color:#ae81ff">0</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// It took almost 9 seconds to reply!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;metrics&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;latencyMs&#34;</span>: <span style="color:#ae81ff">8985</span>
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// The actual response object
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;output&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;message&#34;</span>: {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>        {
</span></span><span style="display:flex;"><span>          <span style="color:#75715e">// The text we want to show to the user
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>          <span style="color:#f92672">&#34;text&#34;</span>: <span style="color:#e6db74">&#34;Great news! I&#39;ve successfully submitted your leave request for your vacation. Here are the details of your request:\n\n- Leave Type: PTO (Paid Time Off)\n- Start Date: June 30, 2025 (Monday)\n- End Date: July 6, 2025 (Sunday)\n- Reason: Vacation for the first week of July\n- Status: Pending approval\n \nYour request has been submitted to the Vacation Tracker application and sent to your manager for approval. The request ID is request_12345.\n\nAdditionally, I can see from the response that your current leave balance is as follows:\n- Total PTO: 15 days\n- Used: 5 days\n- Remaining: 10 days\n\nThis request will use 6 working days of your PTO (assuming a standard Monday to Friday work week). Please note that your request is currently pending approval from your manager. You should receive a notification once it&#39;s been reviewed. \n\nIs there anything else you&#39;d like to know about your leave request or the Vacation Tracker application?&#34;</span>
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>      ],
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;assistant&#34;</span>,
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Claude says that the request was successfully fulfilled
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;stopReason&#34;</span>: <span style="color:#e6db74">&#34;end_turn&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Usage data
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#f92672">&#34;usage&#34;</span>: {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;inputTokens&#34;</span>: <span style="color:#ae81ff">901</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;outputTokens&#34;</span>: <span style="color:#ae81ff">236</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;totalTokens&#34;</span>: <span style="color:#ae81ff">1137</span>
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>To make it a bit more readable, here&rsquo;s the AI Agent response:</p>
<hr>
<p><em>Great news! I&rsquo;ve successfully submitted your leave request for your vacation. Here are the details of your request:</em></p>
<ul>
<li><em>Leave Type: PTO (Paid Time Off)</em></li>
<li><em>Start Date: June 30, 2025 (Monday)</em></li>
<li><em>End Date: July 6, 2025 (Sunday)</em></li>
<li><em>Reason: Vacation for the first week of July</em></li>
<li><em>Status: Pending approval</em></li>
</ul>
<p><em>Your request has been submitted to the Vacation Tracker application and sent to your manager for approval. The request ID is request_12345.</em></p>
<p><em>Additionally, I can see from the response that your current leave balance is as follows:</em></p>
<ul>
<li><em>Total PTO: 15 days</em></li>
<li><em>Used: 5 days</em></li>
<li><em>Remaining: 10 days</em></li>
</ul>
<p><em>This request will use 5 working days of your PTO (assuming a standard Monday to Friday work week). Please note that your request is currently pending approval from your manager. You should receive a notification once it&rsquo;s been reviewed.</em></p>
<p><em>Is there anything else you&rsquo;d like to know about your leave request or the Vacation Tracker application?</em></p>
<hr>
<p>As you can see, our AI Agent is a bit more chatty and friendly. It made up some parts of the reply (for example, we never mentioned the notification), but the response generally looks good.</p>
<p>In production, we probably want the LLM to reply in a structured way so we can display a predefined UI for the leave request along with a short, friendly message. We also need to be careful with dates and other numbers, because LLMs treat numbers as tokens rather than quantities, and numbers sit close together in the vector space. For an LLM, 4 is similar to 5 or even 42, but 4 or 5 used PTO days make a big difference for our users.</p>
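<p>One way to stay careful with numbers is to recompute anything numeric in code instead of trusting the model&rsquo;s text. Here&rsquo;s a minimal sketch, assuming a Monday-to-Friday work week and ignoring public holidays; <code>countWorkingDays</code> is a hypothetical helper, not part of the Vacation Tracker API:</p>

```javascript
// Hypothetical helper: counts Monday-to-Friday days in an inclusive date range.
// Assumes ISO dates (YYYY-MM-DD) and ignores public holidays.
function countWorkingDays(startDate, endDate) {
  const day = new Date(`${startDate}T00:00:00Z`)
  const end = new Date(`${endDate}T00:00:00Z`)
  let workingDays = 0
  while (end >= day) {
    const isWeekend = [0, 6].includes(day.getUTCDay()) // 0 = Sunday, 6 = Saturday
    if (!isWeekend) workingDays += 1
    day.setUTCDate(day.getUTCDate() + 1)
  }
  return workingDays
}

// The dates from the leave request in the example above
const workingDays = countWorkingDays('2025-06-30', '2025-07-06')
console.log(workingDays)
```

<p>For the request above (June 30 to July 6), this prints 5, which we can compare with whatever number the LLM puts in its reply.</p>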
<p>And that&rsquo;s it! We built an AI Agent again. This time with code. It&rsquo;s a simple agent, more like a proof of concept. But it&rsquo;s still an agent.</p>
<h2 id="summary">Summary</h2>
<p>After a long article, I&rsquo;ll keep the summary short.</p>
<p>Before the quick summary, here are two relatively fresh things related to AI Agents that are worth checking out.</p>
<p>First, my Twitter feed has been buzzing about MCP servers in the past few weeks. Model Context Protocol (MCP) is an open-source protocol that standardizes how applications provide context (and tool specification) to LLMs, and you can read more about it here: <a href="https://docs.anthropic.com/en/docs/agents-and-tools/mcp">https://docs.anthropic.com/en/docs/agents-and-tools/mcp</a>.</p>
<p>Second, OpenAI has some new tools for building agents. Read more about these tools here: <a href="https://openai.com/index/new-tools-for-building-agents/">https://openai.com/index/new-tools-for-building-agents/</a>.</p>
<p>To end this long article, I just want to remind you that AI Agents sound very complicated, but they are actually like LLM &ldquo;while loops&rdquo; with tools.</p>
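<p>The &ldquo;while loop with tools&rdquo; idea can be sketched in a few lines. This is a simplified sketch, not a production agent: <code>callModel</code> stands in for a real LLM API call and is assumed to resolve to an object with <code>stopReason</code> and <code>message</code>, and <code>tools</code> is a hypothetical map of tool names to handler functions. The message shapes mirror the <code>toolUse</code> and <code>toolResult</code> blocks from the example above.</p>

```javascript
// A sketch of the agent loop. callModel and tools are hypothetical stand-ins:
// callModel(messages) should resolve to { stopReason, message }.
async function runAgent(callModel, tools, userText, maxTurns = 5) {
  const messages = [{ role: 'user', content: [{ text: userText }] }]
  let turn = 0
  // The "while loop": keep calling the model until it stops asking for tools
  while (maxTurns > turn) {
    turn += 1
    const { stopReason, message } = await callModel(messages)
    messages.push(message)
    if (stopReason !== 'tool_use') return messages
    // Run each requested tool and collect the results
    const results = []
    for (const block of message.content) {
      if (!block.toolUse) continue
      const { name, input, toolUseId } = block.toolUse
      const output = await tools[name](input)
      results.push({ toolResult: { toolUseId, content: [{ json: output }] } })
    }
    // Tool results go back to the model with the "user" role
    messages.push({ role: 'user', content: results })
  }
  throw new Error('The agent did not finish within maxTurns')
}

// Fake model for illustration: requests a tool once, then ends the turn
const fakeModel = async (messages) => {
  if (messages.length === 1) {
    return {
      stopReason: 'tool_use',
      message: {
        role: 'assistant',
        content: [{ toolUse: { name: 'request_leave', input: { leaveType: 'PTO' }, toolUseId: 'tooluse_1' } }],
      },
    }
  }
  return { stopReason: 'end_turn', message: { role: 'assistant', content: [{ text: 'Request sent!' }] } }
}

// Fake tool handler that pretends the API call succeeded
const fakeTools = { request_leave: async () => ({ success: true }) }

runAgent(fakeModel, fakeTools, 'I want a vacation').then((messages) => {
  console.log(messages.length)
})
```

<p>The <code>maxTurns</code> limit is a design choice for the sketch: it keeps a confused model from looping forever and burning tokens.</p>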
<p>You have all the skills you need to build these tools. So, go, build tools and agents, and have fun!</p>
]]></content>
        </item>
        
        <item>
            <title>5 Prompt Engineering Tips for Developers</title>
            <link>https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/</link>
            <pubDate>Tue, 05 Nov 2024 12:00:00 +0100</pubDate>
            
            <guid>https://slobodan.me/posts/5-prompt-engineering-tips-for-developers/</guid>
            <description>&lt;p&gt;How many times has someone asked you to do something without providing clear instructions? We often get mad in these situations because we can&amp;rsquo;t read someone&amp;rsquo;s mind.&lt;/p&gt;
&lt;p&gt;However, it&amp;rsquo;s hard to give clear instructions. Not just to you, your manager, or your friends. It&amp;rsquo;s hard to give clear instructions even if you are a person whose main job is to provide clear instructions.&lt;/p&gt;
&lt;h2 id=&#34;a-sentence-can-destroy-fleets&#34;&gt;A sentence can destroy Fleets&lt;/h2&gt;
&lt;p&gt;It was July, 1914. The beginning of WW1 was just a few days away. &lt;em&gt;SMS Goeben&lt;/em&gt;, an Imperial German Navy battlecruiser built for speed and power, dominated the Mediterranean Sea, accompanied by a light cruiser, SMS Breslau. Goeben was significantly larger and better armored than the British Navy battlecruisers, but it was experiencing issues with its boilers, which limited its speed and operational efficiency.&lt;/p&gt;
&lt;p&gt;SMS Goeben encountered the British battlecruisers HMS &lt;em&gt;Indefatigable&lt;/em&gt; and &lt;em&gt;Indomitable&lt;/em&gt; and four additional vessels.&lt;/p&gt;
&lt;p&gt;Winston Churchill, the political head of the British Royal Navy at that moment, ordered the British Navy to &amp;ldquo;shadow, not to engage&amp;rdquo; with SMS Goeben until the war was officially declared. And they did it.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;The British Navy chasing SMS Goeben&#34; src=&#34;https://slobodan.me/images/posts/5-tips/sms-goeben.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;The war was declared just a few days later. However, Churchill&amp;rsquo;s order contained the following sentence:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;… do not at this stage be brought to action against superior forces.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For Churchill, the superior forces were the Austrian Fleet. However, this part of unclear instructions allowed SMS Goeben and SMS Breslau to escape and reach Constantinople, where they eventually became part of the Ottoman Navy, playing a key role in drawing the Ottoman Empire into the war and leading naval operations in the Black Sea against Russia.&lt;/p&gt;
&lt;h2 id=&#34;giving-clear-instructions-to-computers&#34;&gt;Giving clear instructions to computers&lt;/h2&gt;
&lt;p&gt;Human interactions are complicated. While I hope your unclear instructions never had an impact as significant as the one described above, you probably have many stories.&lt;/p&gt;
&lt;p&gt;Luckily, talking to computers is easier, as they have more predictable behaviors and no emotions. Programming languages make communication with computers predictable and repeatable, which makes our job easier. However, &lt;em&gt;Large Language Models&lt;/em&gt; (LLMs) are still different beasts. Currently, there&amp;rsquo;s a lot of magic and mysticism involved in LLM prompting.&lt;/p&gt;
&lt;p&gt;But all the magic aside, we know enough about LLMs to get helpful information from them.&lt;/p&gt;
&lt;p&gt;Large Language Models are good at guessing the next best set of words based on &lt;em&gt;your input&lt;/em&gt; (instructions, questions, or part of the text), their &lt;em&gt;training data&lt;/em&gt; (text from many books, websites, and other datasets), &lt;em&gt;context&lt;/em&gt; (documents or previous conversation flow), and &lt;em&gt;specific configurations&lt;/em&gt; (such as weights, which prioritize certain word patterns, and settings like temperature, which control the randomness of predictions).&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s say you asked an LLM to finish the following sentence: &amp;ldquo;I am speaking at&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Will it continue with &amp;ldquo;a Martian picnic?&amp;rdquo; Or &amp;ldquo;a space farmer&amp;rsquo;s market?&amp;rdquo; Highly unlikely. You&amp;rsquo;ll get a reply similar to the following: &amp;ldquo;a business conference,&amp;rdquo; &amp;ldquo;a tech meetup,&amp;rdquo; or &amp;ldquo;a community forum.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;A simple prompt example&#34; src=&#34;https://slobodan.me/images/posts/5-tips/simple-prompt.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;But what happens when you change the context? Answers can now be the complete opposite!&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s say you add the following at the beginning of your prompt: &amp;ldquo;You are a playful, chatty cartoon character called &amp;ldquo;Space Bunny&amp;rdquo; selling sandwiches on Mars near the intergalactic highway.&amp;rdquo; It&amp;rsquo;s more likely that an LLM continues with &amp;ldquo;a Martian picnic&amp;rdquo; or &amp;ldquo;a space farmer&amp;rsquo;s market&amp;rdquo; now.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Updated simple prompt&#34; src=&#34;https://slobodan.me/images/posts/5-tips/updated-simple-prompt.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;I already said &amp;ldquo;prompt&amp;rdquo; a few times. But prompts are not magic. They are just instructions. You tell an LLM what you want, and it tries to reply based on its training, your instructions, and a few additional parameters and configurations. Simple!&lt;/p&gt;
&lt;p&gt;It works similarly to the instructions you give to humans. The better you explain what you want, the higher the chance you&amp;rsquo;ll get a useful reply. But the main difference is (or is it?) that an LLM always replies. Not all of its replies are based on truth.&lt;/p&gt;
&lt;p&gt;So, prompts are just instructions. But how do LLMs work?&lt;/p&gt;
&lt;p&gt;You give your instructions. An LLM eats a lot of tokens, and it spills some unexpected wisdom. Or hallucination.&lt;/p&gt;
&lt;p&gt;Wait, what are tokens?&lt;/p&gt;
&lt;p&gt;Money. Tokens are money. I am not kidding. Most of the time, you pay for using an LLM based on the number of tokens it eats and spills out. So, for you, tokens = money.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;An LLM eats tokens&#34; src=&#34;https://slobodan.me/images/posts/5-tips/llm-eats-tokens.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, fewer tokens = less money out of your pocket = better? Well, not really. There&amp;rsquo;s another side to that story.&lt;/p&gt;
&lt;p&gt;For you, tokens are money, but for an LLM, tokens are the only way to understand and follow your instructions. LLMs are computers, and you might think that computers are good with text because you interact with them using text and clicks, but they are not. They like numbers more (don&amp;rsquo;t tell anyone 🤫). When you enter a text, a computer converts that to numbers, does some operations, converts numbers back to text, and displays that text on the screen. It&amp;rsquo;s an oversimplification, of course.&lt;/p&gt;
&lt;p&gt;LLMs do the same. What a surprise, right?&lt;/p&gt;
&lt;p&gt;First, they convert your instructions to tokens. Tokens are words or parts of words that help LLMs match patterns. There&amp;rsquo;s no universal way for LLMs to convert words and sentences to tokens. There are many tokenizers. However, one generalization is still close to the truth: 750 words roughly equal 1,000 tokens.&lt;/p&gt;
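&lt;p&gt;To make that rule of thumb concrete, here&amp;rsquo;s a minimal sketch of a token estimator. The estimateTokens helper is hypothetical and only applies the 750-words-per-1,000-tokens ratio; real tokenizers give different counts per model, so treat the result as a ballpark figure.&lt;/p&gt;

```javascript
// Rough estimate only: assumes the rule of thumb of 1,000 tokens per 750 words.
// Real tokenizers differ per model, so this is not an exact count.
function estimateTokens(text) {
  const words = text.trim().split(/\s+/).filter(Boolean)
  return Math.ceil((words.length * 1000) / 750)
}

console.log(estimateTokens('I am speaking at a business conference'))
```

&lt;p&gt;An estimate like this is good enough for guessing costs up front; for exact numbers, use the tokenizer of the model you call.&lt;/p&gt;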
&lt;p&gt;There are also many visual explanations for tokenized prompts. For example, you can try OpenAI&amp;rsquo;s tokenizer here: &lt;a href=&#34;https://platform.openai.com/tokenizer&#34;&gt;https://platform.openai.com/tokenizer&lt;/a&gt;. You&amp;rsquo;ll get slightly different results based on the model you select.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;OpenAI&amp;rsquo;s tokenizer&#34; src=&#34;https://slobodan.me/images/posts/5-tips/openai-tokenizer.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, an LLM first splits your instructions into chunks we call tokens. And then (this is an oversimplification, again) it converts each of these tokens to a series of numbers (an array of numbers). These arrays of numbers are often called vectors. You can also think about them as dots in a multidimensional space. But best of all, these dots are not randomly put in the space; they are close to other dots with a similar meaning. For example, a vector representation of the word &amp;ldquo;potato&amp;rdquo; is close to a &amp;ldquo;tomato&amp;rdquo; but not close to a &amp;ldquo;business conference.&amp;rdquo; And that&amp;rsquo;s one of the most important sources of the LLMs&amp;rsquo; magic.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s imagine that each token converts to a simple array of two numbers (two-dimensional space) to simplify the visual representation.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Oversimplified visual representation of vectors in 2D space&#34; src=&#34;https://slobodan.me/images/posts/5-tips/vectors.png&#34;&gt;&lt;/p&gt;
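&lt;p&gt;The idea of dots being close to each other can be illustrated with cosine similarity, one common way to compare vectors. This is a toy sketch: the two-dimensional vectors below are made-up numbers for illustration, not real embeddings from any model.&lt;/p&gt;

```javascript
// Toy 2D vectors invented for illustration; real embeddings have hundreds
// or thousands of dimensions and come from the model itself.
const vectors = {
  potato: [0.9, 0.8],
  tomato: [0.8, 0.9],
  'business conference': [-0.7, 0.2],
}

// Cosine similarity: close to 1 means the vectors point in a similar direction
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, value, i) => sum + value * b[i], 0)
  const length = (v) => Math.hypot(...v)
  return dot / (length(a) * length(b))
}

console.log(cosineSimilarity(vectors.potato, vectors.tomato))
console.log(cosineSimilarity(vectors.potato, vectors['business conference']))
```

&lt;p&gt;With these made-up vectors, the potato and the tomato score close to 1, while the potato and the business conference score below zero.&lt;/p&gt;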
&lt;h2 id=&#34;anatomy-of-a-prompt&#34;&gt;Anatomy of a prompt&lt;/h2&gt;
&lt;p&gt;Remember when I said that prompts are just instructions? I lied. Instructions are the most useful subset of prompts, but in general, any text (or just a single letter or character) can be a prompt. If you type a single character, such as &amp;ldquo;$,&amp;rdquo; that&amp;rsquo;s a valid prompt. Just remember that the quality of a reply depends on the quality of your prompt.&lt;/p&gt;
&lt;p&gt;However, prompts are not just written words. With multimodal LLMs (a fancy name for LLMs that understand different types of input), you can use your voice or upload a document, an image, or even a song. You can go further and combine these inputs. These are all valid prompts. I often upload a spreadsheet or some other document and provide a few sentences of textual instructions.&lt;/p&gt;
&lt;p&gt;So, asking, &amp;ldquo;Who is faster: Godzilla or T-Rex?&amp;rdquo; is a valid prompt. Let me save you some tokens: the answer is T-Rex.&lt;/p&gt;
&lt;p&gt;Asking an LLM to &amp;ldquo;Write a 500-word article about the bad influence of Amazon&amp;rsquo;s RTO policy on Lambda cold starts&amp;rdquo; is also a valid prompt. It took ChatGPT 13 seconds to figure this one out. I was afraid to ask Claude. It&amp;rsquo;s hosted on AWS, and I don&amp;rsquo;t want it to be part of the next round of layoffs.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;Amazon RTO policy prompt result&#34; src=&#34;https://slobodan.me/images/posts/5-tips/amazon-rto-policy.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;So, almost anything could be a prompt. However, not all prompts are useful. Luckily, LLMs are products, and like with many other products, LLM users (we) did many crazy things (&amp;ldquo;lizards,&amp;rdquo; as my friend would &lt;a href=&#34;https://lizardoptimization.org/&#34;&gt;call us&lt;/a&gt;) and came up with some helpful tricks. Products evolve with user needs and feedback. LLMs are not different.&lt;/p&gt;
&lt;p&gt;One of the early tricks was telling an LLM to pretend to be an expert in the field related to the prompt. This worked incredibly well until you hit the context limit (the number of tokens or characters an LLM can keep in &amp;ldquo;memory&amp;rdquo;). Once you reach the context limit, the conversation often takes a weird direction. Imagine talking to Dory Fish on LSD.&lt;/p&gt;
&lt;p&gt;People behind LLMs figured out that some parts of our instructions are more important than others and that we often repeat that part across multiple LLM conversations. And system prompts were born. A system prompt is a repeatable part of the prompt with higher priority than the rest of your conversation, which always remains in the context (oversimplification, again).&lt;/p&gt;
&lt;p&gt;Repeatability and more control are often essential when you work with LLMs. System prompts help, but we (software developers, at least) already know how to talk to computers to get repeatable and controlled responses. We can get amazing results by combining code with LLM interaction through an API.&lt;/p&gt;
&lt;p&gt;How do we get these &amp;ldquo;amazing results?&amp;rdquo; Let&amp;rsquo;s see the 5 habits of highly effective prompters! Oops, I forgot this is not LinkedIn. I mean 5 simple tricks for better prompting results.&lt;/p&gt;
&lt;h2 id=&#34;five-simple-tricks-for-better-prompting-results&#34;&gt;Five simple tricks for better prompting results&lt;/h2&gt;
&lt;p&gt;This article is already too long, so there&amp;rsquo;s no need for a cliffhanger. Here are the tricks I want to tell you about:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Hint the beginning of the answer.&lt;/li&gt;
&lt;li&gt;Give examples.&lt;/li&gt;
&lt;li&gt;Think step by step.&lt;/li&gt;
&lt;li&gt;Use tools.&lt;/li&gt;
&lt;li&gt;Ask an LLM to improve the prompt.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let&amp;rsquo;s go through them (step by step).&lt;/p&gt;
&lt;h3 id=&#34;hint-the-beginning-of-the-answer&#34;&gt;Hint the beginning of the answer&lt;/h3&gt;
&lt;p&gt;I told you to combine code (JavaScript, Python, or anything else; I don&amp;rsquo;t judge) and an LLM for better responses. But to do so, you need to get a structured response from the LLM.&lt;/p&gt;
&lt;p&gt;While there are countless ways to structure a response, we really love JSON. So, we ask an LLM politely to answer with JSON and nothing else. But an LLM is also polite, so the conversation from time to time looks like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; &lt;em&gt;&amp;hellip;some super long and smart instructions&amp;hellip;&lt;/em&gt; Always answer with valid JSON and nothing else.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Assistant:&lt;/strong&gt; Here&amp;rsquo;s your JSON:&lt;/p&gt;
&lt;p&gt;```json&lt;/p&gt;
&lt;p&gt;{&lt;/p&gt;
&lt;p&gt;&amp;ldquo;some&amp;rdquo;: &amp;ldquo;valid&amp;rdquo;,&lt;/p&gt;
&lt;p&gt;&amp;ldquo;JSON&amp;rdquo;: true&lt;/p&gt;
&lt;p&gt;}&lt;/p&gt;
&lt;p&gt;```&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then your JSON parser starts yelling at you, and you do the same to the LLM. Replying, &amp;ldquo;I said JSON only!!!&amp;rdquo; in capslock works sometimes.&lt;/p&gt;
&lt;p&gt;But here&amp;rsquo;s one more trick you can try: write the beginning of the assistant&amp;rsquo;s reply in the API call!&lt;/p&gt;
&lt;p&gt;LLM APIs expect a conversation (user -&amp;gt; assistant -&amp;gt; user -&amp;gt; assistant, and so on). If you finish your API call with the user&amp;rsquo;s message, the LLM will reply with the assistant&amp;rsquo;s message. However, if you finish your API call with the assistant&amp;rsquo;s message, the LLM will treat it as the beginning of its reply and simply continue.&lt;/p&gt;
&lt;p&gt;Below is a code example for Amazon Bedrock using the AWS SDK for JavaScript and the Bedrock Converse API. Other LLM APIs support something similar.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;import&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt; } &lt;span style=&#34;color:#a6e22e&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;@aws-sdk/client-bedrock-runtime&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Create a Bedrock Runtime client in the AWS Region you want to use.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;({ &lt;span style=&#34;color:#a6e22e&#34;&gt;region&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;eu-central-1&amp;#34;&lt;/span&gt; })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Set the model ID, i.e., Claude 3.5 Sonnet
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;anthropic.claude-3-5-sonnet-20240620-v1:0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Start a conversation with the user message.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;userPrompt&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;Return the result of all games of the NBA 2023 finals as a short structured JSON.&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;userPrompt&lt;/span&gt; }],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;{&amp;#34;&amp;#39;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// &amp;lt;---- Assistant&amp;#39;s reply starts with `{&amp;#34;`
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Create a command with the model ID, the message, and a basic configuration.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt;({
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;messages&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;})
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;try&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Send the command to the model and wait for the response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Extract, parse and print the response text, but prepend `{ &amp;#34;`
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;responseText&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;{ &amp;#34;&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;].&lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;parsedJson&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;JSON&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;parse&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;responseText&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;parsedJson&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;} &lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;ERROR&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;give-examples&#34;&gt;Give examples&lt;/h3&gt;
&lt;p&gt;We can use a similar approach for one more trick.&lt;/p&gt;
&lt;p&gt;You probably use ChatGPT or Claude at least a few times per week. Do you always get a perfect reply after your initial prompt? I don&amp;rsquo;t. I often ask follow-up questions, point out mistakes, or give a few examples to get the answer I am looking for. I know many other people do the same.&lt;/p&gt;
&lt;p&gt;However, we often send just one detailed command via the API and expect LLMs to deliver a perfect response. If someone tries to look smart, they&amp;rsquo;ll probably refer to this as &amp;ldquo;zero-shot prompting.&amp;rdquo; It&amp;rsquo;s just a name for sending direct commands without examples.&lt;/p&gt;
&lt;p&gt;Why does it need a name? It doesn&amp;rsquo;t. But as you can expect, there&amp;rsquo;s a fancy name for the opposite, too. If you provide examples and simulate a conversation, you are using &amp;ldquo;few-shot prompting.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;For example, you can ask an LLM to label a review comment with a &amp;ldquo;positive&amp;rdquo; or &amp;ldquo;negative&amp;rdquo; label and return it in JSON format. You can do that by sending a prompt similar to the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Here&amp;rsquo;s a review comment. Return a JSON with the label &amp;ldquo;positive&amp;rdquo; or &amp;ldquo;negative&amp;rdquo; for this comment:&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Super helpful, worth it!&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This example is simple, so zero-shot prompting works perfectly fine. But let&amp;rsquo;s say you want to label neutral comments as &amp;ldquo;negative.&amp;rdquo; The LLM has no way of knowing that. You can improve your prompt or even add a detailed system prompt, which might work well. But you can also do the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Label review comments and return a JSON with the label &amp;ldquo;positive&amp;rdquo; or &amp;ldquo;negative.&amp;rdquo; Here are a few examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;Great product 10/10&amp;rdquo; =&amp;gt; &lt;code&gt;{ &amp;quot;label&amp;quot;: &amp;quot;positive&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Didn&amp;rsquo;t work very well&amp;rdquo; =&amp;gt; &lt;code&gt;{ &amp;quot;label&amp;quot;: &amp;quot;negative&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Not sure&amp;rdquo; =&amp;gt; &lt;code&gt;{ &amp;quot;label&amp;quot;: &amp;quot;negative&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Label the following comment based on the examples above:&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Super helpful, worth it!&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This technique improves your chances of getting more precise labels. It&amp;rsquo;s like a mini in-prompt training.&lt;/p&gt;
&lt;p&gt;But there&amp;rsquo;s a cleaner way to do this using the same approach we used in the first trick: simulate the conversation by passing multiple user and assistant messages.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s see the code example!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Same as above
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;import&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt; } &lt;span style=&#34;color:#a6e22e&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;@aws-sdk/client-bedrock-runtime&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;({ &lt;span style=&#34;color:#a6e22e&#34;&gt;region&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;eu-central-1&amp;#34;&lt;/span&gt; })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;anthropic.claude-3-5-sonnet-20240620-v1:0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Set the system prompt, then simulate a conversation with example messages.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;systemPrompt&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;Label review comments and return a JSON with the label &amp;#34;positive&amp;#34; or &amp;#34;negative.&amp;#34;&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Great product 10/10&amp;#34;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Rating example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;{&amp;#34;label&amp;#34;:&amp;#34;positive&amp;#34;}&amp;#39;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Label example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Didn&amp;#39;t work very well&amp;#34;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Rating example #2
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;{&amp;#34;label&amp;#34;:&amp;#34;negative&amp;#34;}&amp;#39;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Label example #2
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Not sure&amp;#34;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Rating example #3
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;assistant&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;{&amp;#34;label&amp;#34;:&amp;#34;negative&amp;#34;}&amp;#39;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// Label example #3
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Super helpful, worth it!&amp;#34;&lt;/span&gt; }], &lt;span style=&#34;color:#75715e&#34;&gt;// &amp;lt;---- ACTUAL REVIEW
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// You can even hint the beginning of the reply here
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Create a command with the model ID, the message, and a basic configuration.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt;({
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;system&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;systemPrompt&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;messages&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;})
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;try&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Send the command to the model and wait for the response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Extract, parse and print the response text
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;responseText&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;].&lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;parsedJson&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;JSON&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;parse&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;responseText&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;parsedJson&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;} &lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;ERROR&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Easy, right? Just remember that this approach eats more tokens. And tokens = money.&lt;/p&gt;
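&lt;p&gt;If you have more than a handful of examples, writing every message by hand gets tedious. Here is a minimal sketch of a helper that builds the same alternating messages from a list of examples. The name &lt;code&gt;buildFewShotMessages&lt;/code&gt; and the shape of the &lt;code&gt;examples&lt;/code&gt; array are my assumptions for illustration, not part of the AWS SDK.&lt;/p&gt;

```javascript
// Hypothetical helper (not part of the AWS SDK): turns labeled examples
// into alternating user/assistant messages in the Converse API format.
function buildFewShotMessages(examples, input, prefill) {
  const messages = []
  for (const example of examples) {
    // Each example becomes a user message followed by the ideal assistant reply.
    messages.push({ role: 'user', content: [{ text: example.input }] })
    messages.push({ role: 'assistant', content: [{ text: JSON.stringify(example.output) }] })
  }
  // The actual input always goes last, as a user message.
  messages.push({ role: 'user', content: [{ text: input }] })
  // Optionally hint the beginning of the reply, as in the first trick.
  if (prefill) {
    messages.push({ role: 'assistant', content: [{ text: prefill }] })
  }
  return messages
}

// Assumed example format: an input string and the JSON we expect back.
const examples = [
  { input: 'Great product 10/10', output: { label: 'positive' } },
  { input: "Didn't work very well", output: { label: 'negative' } },
  { input: 'Not sure', output: { label: 'negative' } },
]

const conversation = buildFewShotMessages(examples, 'Super helpful, worth it!')
console.log(conversation.length) // 7: three example pairs plus the actual review
```

&lt;p&gt;You can pass the resulting array as &lt;code&gt;messages&lt;/code&gt; to &lt;code&gt;ConverseCommand&lt;/code&gt;, the same way as in the example above.&lt;/p&gt;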
&lt;h3 id=&#34;think-step-by-step&#34;&gt;Think step by step&lt;/h3&gt;
&lt;p&gt;The previous trick works well with humans, too. If you give examples when you delegate a task, there&amp;rsquo;s a higher chance you&amp;rsquo;ll get the task done.&lt;/p&gt;
&lt;p&gt;But imagine you could use the following trick in conversation with humans, too! Let me give you an example.&lt;/p&gt;
&lt;p&gt;You might have a boss or a client. One of the superpowers of bosses and clients is that they can come up with completely unreasonable ideas they want you to do now. Trust me, I was on both sides of that process.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;A boss with an unreasonable request&#34; src=&#34;https://slobodan.me/images/posts/5-tips/boss.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Imagine you could ask them, &amp;ldquo;Let&amp;rsquo;s think step by step. How would that feature help our users?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;&amp;ldquo;Think step-by-step&amp;rdquo;&#34; src=&#34;https://slobodan.me/images/posts/5-tips/think-step-by-step.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;In practice, this scenario rarely works because people have complex emotions. Instead of them thinking through the process step by step and explaining every detail, you could end up with a new green LinkedIn badge around your profile photo.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;#opentowork&#34; src=&#34;https://slobodan.me/images/posts/5-tips/opentowork.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Luckily, LLMs have no emotions. Yet. So, adding &amp;ldquo;Let&amp;rsquo;s think step by step&amp;rdquo; or a similar phrase uses more tokens for the response, but it often improves the response.&lt;/p&gt;
&lt;p&gt;It actually works so well that OpenAI based its new o1 models on a similar concept. They said the following in &lt;a href=&#34;https://openai.com/index/introducing-openai-o1-preview/&#34;&gt;the announcement&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We&amp;rsquo;ve developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As the o1 models already use something similar under the hood, you probably don&amp;rsquo;t need this trick if you use them.&lt;/p&gt;
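&lt;p&gt;For models that do not reason on their own, here is a minimal sketch of how the step-by-step trick could look in code. Both helper functions and the &amp;ldquo;Final answer:&amp;rdquo; marker are my assumptions for illustration. The marker is just one way to find the result after the model finishes reasoning, and the sample reply is made up to show the parsing.&lt;/p&gt;

```javascript
// Hypothetical helpers, not part of any SDK.
// Appends the step-by-step phrase and asks for a final line we can find later.
function withStepByStep(prompt) {
  return [
    prompt,
    "Let's think step by step.",
    'When you are done, put the result on the last line as "Final answer: ...".',
  ].join('\n\n')
}

// Pulls the final answer out of the reasoning text; returns null if it is missing.
function extractFinalAnswer(responseText) {
  const marker = 'Final answer:'
  const lines = responseText.trim().split('\n')
  // Search from the end, because the answer should be on the last line.
  const line = lines.reverse().find((l) => l.trim().startsWith(marker))
  return line ? line.trim().slice(marker.length).trim() : null
}

// The message you would send, in the same format as the examples above.
const question = 'A train leaves at 9:40 and arrives at 11:05. How long is the trip?'
const messages = [{ role: 'user', content: [{ text: withStepByStep(question) }] }]

// A made-up reply, only to show the parsing; a real one comes from the model.
const sampleReply = [
  'From 9:40 to 10:40 is 60 minutes.',
  'From 10:40 to 11:05 is 25 minutes.',
  'Final answer: 85 minutes',
].join('\n')

console.log(extractFinalAnswer(sampleReply)) // 85 minutes
```

&lt;p&gt;Keep in mind that the reasoning text counts as output tokens, so you pay for every step the model writes.&lt;/p&gt;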
&lt;h3 id=&#34;use-tools&#34;&gt;Use tools&lt;/h3&gt;
&lt;p&gt;LLMs are powerful. We use them for everything. And, as Maslow said a long time ago, &amp;ldquo;it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;But LLMs suck at some specific tasks. These tasks depend on the model we use, but most models struggle with math, dates, and similar tasks. The previous tricks might help with these skill issues in some situations, but they are not real solutions.&lt;/p&gt;
&lt;p&gt;Luckily, all major LLMs now support tools. So, instead of using a hammer for everything, we can bring our tools for specific cases.&lt;/p&gt;
&lt;p&gt;But what are tools, and why are they useful? Here&amp;rsquo;s an example!&lt;/p&gt;
&lt;p&gt;I asked Claude: &amp;ldquo;What&amp;rsquo;s the weather like in Montreal today?&amp;rdquo; Even if you don&amp;rsquo;t have the current weather info, you can bet it&amp;rsquo;s cold in Montreal in November. But Claude tries to be helpful, and it answers the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I need to clarify something: I don&amp;rsquo;t have access to real-time weather data or current conditions. While I can discuss Montreal&amp;rsquo;s typical weather patterns and climate, I cannot tell you today&amp;rsquo;s specific weather conditions. For current weather information in Montreal, I&amp;rsquo;d recommend:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Checking a weather website or app&lt;/li&gt;
&lt;li&gt;Looking up Environment Canada&amp;rsquo;s weather service&lt;/li&gt;
&lt;li&gt;Checking local Montreal news stations&amp;rsquo; weather reports&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Would you like to know about Montreal&amp;rsquo;s typical weather patterns for this time of year instead?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Instead of giving up, we can give an LLM, in this case, Claude, the skill to check the current weather conditions in the specified location.&lt;/p&gt;
&lt;p&gt;How do we do that?&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s see the code example! I&amp;rsquo;ll use the Amazon Bedrock Converse API with Node.js and Claude 3.5 Sonnet as the model, but this is just an example. You can do something similar with any popular model.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Same as above
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;import&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt; } &lt;span style=&#34;color:#a6e22e&#34;&gt;from&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;@aws-sdk/client-bedrock-runtime&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;BedrockRuntimeClient&lt;/span&gt;({ &lt;span style=&#34;color:#a6e22e&#34;&gt;region&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;eu-central-1&amp;#34;&lt;/span&gt; })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;anthropic.claude-3-5-sonnet-20240620-v1:0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Start a conversation with the user message.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{ &lt;span style=&#34;color:#a6e22e&#34;&gt;text&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;What&amp;#39;s the weather like in Montreal?&amp;#34;&lt;/span&gt; }],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Define available tools
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;tools&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;toolSpec&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#75715e&#34;&gt;// The name and the description of our tool
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;name&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;get_weather&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;Get weather conditions for a specific location&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;inputSchema&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#75715e&#34;&gt;// It accepts JSON, I told you we love JSON!
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#a6e22e&#34;&gt;json&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;object&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;properties&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#75715e&#34;&gt;// And it expects the &amp;#34;location&amp;#34; property (see the type and the description)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;            &lt;span style=&#34;color:#75715e&#34;&gt;// For simplicity, we&amp;#39;ll skip the date part. Our tool will always answer with today&amp;#39;s weather
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;            &lt;span style=&#34;color:#a6e22e&#34;&gt;location&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;type&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;string&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;description&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;The location for which you want the weather conditions. Location can be any city, country, etc.&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;required&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [ &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;location&amp;#39;&lt;/span&gt; ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Create a command with the model ID, the message, and a basic configuration.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt;({
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;messages&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;, &lt;span style=&#34;color:#75715e&#34;&gt;// Pass the tool configuration to Bedrock
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;})
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;try&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Send the command to the model and wait for the response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// We&amp;#39;ll see this part in the next example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;} &lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;ERROR&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s the first part of the magic. Claude now knows it can use a tool. However, it still needs to learn how to answer the question. Let&amp;rsquo;s add the actual skill.&lt;/p&gt;
&lt;p&gt;The Bedrock Converse command will reply with a JSON object similar to the following:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;output&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;assistant&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;: [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUse&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;toolUseId&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tooluse_kZJMlvQmRJ6eAyJE5GIl7Q&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;name&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;get_weather&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;input&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                            &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;location&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Montreal&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            ]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;stopReason&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tool_use&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now that we know that, we can update the code with the following (this is a very simple example):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// Code from above
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;try&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Send the command to the model and wait for the response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;let&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;command&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Check if the stop reason is &amp;#34;tool_use&amp;#34; and if our tool was requested (?. avoids a crash when the first block is text)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;stopReason&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;===&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;tool_use&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;].&lt;span style=&#34;color:#a6e22e&#34;&gt;toolUse&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;name&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;===&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;get_weather&amp;#39;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// Extract location and toolUseId
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;location&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;].&lt;span style=&#34;color:#a6e22e&#34;&gt;toolUse&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;input&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;location&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolUseId&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;].&lt;span style=&#34;color:#a6e22e&#34;&gt;toolUse&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;toolUseId&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// And call a function that might call some 3rd party API to get the current weather conditions
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;weatherConditionsResponse&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;getWeatherConditions&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;location&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;// Then invoke the model again with the tool_use message and the reply
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;updatedCommand&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;ConverseCommand&lt;/span&gt;({
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;modelId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;messages&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        ...&lt;span style=&#34;color:#a6e22e&#34;&gt;conversation&lt;/span&gt;, &lt;span style=&#34;color:#75715e&#34;&gt;// Previous message
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;output&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;message&lt;/span&gt;, &lt;span style=&#34;color:#75715e&#34;&gt;// Assistant response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        &lt;span style=&#34;color:#75715e&#34;&gt;// Tool response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;        {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;role&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#a6e22e&#34;&gt;toolResult&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;toolUseId&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolUseId&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#a6e22e&#34;&gt;content&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; [
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                    &lt;span style=&#34;color:#75715e&#34;&gt;// Let&amp;#39;s assume that weatherConditionsResponse contains JSON an LLM can understand
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;                    &lt;span style=&#34;color:#a6e22e&#34;&gt;json&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;weatherConditionsResponse&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;toolConfig&lt;/span&gt;, &lt;span style=&#34;color:#75715e&#34;&gt;// Pass the tool configuration to Bedrock
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;    })
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;response&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;client&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;send&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;updatedCommand&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#75715e&#34;&gt;// Continue by parsing the final response
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;} &lt;span style=&#34;color:#66d9ef&#34;&gt;catch&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;console&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;log&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;ERROR&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;err&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And that&amp;rsquo;s it. You have a tool now! Remember that the tool can be any function in your preferred programming language (even Java, I don&amp;rsquo;t judge you).&lt;/p&gt;
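&lt;p&gt;The &lt;code&gt;getWeatherConditions&lt;/code&gt; function is never defined above, so here is a minimal sketch of what it could look like, together with a small helper that finds the tool request in the response. Both are assumptions for illustration: the returned fields are made up, and &lt;code&gt;findToolUse&lt;/code&gt; is a hypothetical helper, not part of the AWS SDK. The helper may be useful because a model can put a text block before the &lt;code&gt;toolUse&lt;/code&gt; block, so reading &lt;code&gt;content[0]&lt;/code&gt; alone might miss it.&lt;/p&gt;

```javascript
// Hypothetical stand-in for the getWeatherConditions function used above.
// A real version would call a weather API; this one returns canned data.
async function getWeatherConditions(location) {
  // Assumed response shape: plain JSON that an LLM can read.
  return { location, temperatureCelsius: 21, conditions: 'partly cloudy' }
}

// Hypothetical helper: return the first toolUse block for the given tool,
// wherever it sits in the content array, or null if there is none.
function findToolUse(message, toolName) {
  for (const item of message.content) {
    if (item.toolUse?.name === toolName) {
      return item.toolUse
    }
  }
  return null
}

// Example assistant message with a text block before the tool request.
const exampleMessage = {
  role: 'assistant',
  content: [
    { text: 'Let me check the weather.' },
    {
      toolUse: {
        toolUseId: 'tooluse_example', // made-up ID
        name: 'get_weather',
        input: { location: 'Montreal' },
      },
    },
  ],
}

const exampleToolUse = findToolUse(exampleMessage, 'get_weather')
console.log(exampleToolUse.input.location) // prints "Montreal"
```

&lt;p&gt;With a helper like this, the &lt;code&gt;if&lt;/code&gt; condition could check whether &lt;code&gt;findToolUse&lt;/code&gt; returned something instead of indexing into the first content block.&lt;/p&gt;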
&lt;p&gt;You probably heard people talking about AI agents. Sounds magical, right? But most of the time, agents are just 3 raccoons (pardon, tools) in a trench coat.&lt;/p&gt;
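&lt;p&gt;To make the trench coat a bit more concrete, here is a rough sketch of how several tools could be dispatched by name. Everything in it is an assumption for illustration: the &lt;code&gt;tools&lt;/code&gt; registry, the &lt;code&gt;get_time&lt;/code&gt; tool, the canned return values, and the &lt;code&gt;runTools&lt;/code&gt; helper are hypothetical. Only the &lt;code&gt;toolUse&lt;/code&gt; and &lt;code&gt;toolResult&lt;/code&gt; shapes follow the Converse examples above.&lt;/p&gt;

```javascript
// Hypothetical tool registry: each tool name maps to a plain function.
const tools = new Map([
  ['get_weather', ({ location }) => ({ location, conditions: 'sunny' })], // canned data
  ['get_time', ({ timezone }) => ({ timezone, time: '12:00' })], // canned data
])

// Run every toolUse block in an assistant message and build the user
// message with toolResult blocks to send back in the next Converse call.
function runTools(assistantMessage) {
  const results = []
  for (const item of assistantMessage.content) {
    if (!item.toolUse) continue // skip text blocks
    const { toolUseId, name, input } = item.toolUse
    const tool = tools.get(name)
    results.push({
      toolResult: tool
        ? { toolUseId, content: [{ json: tool(input) }] }
        : { toolUseId, content: [{ text: `Unknown tool: ${name}` }], status: 'error' },
    })
  }
  return { role: 'user', content: results }
}

// Example: the assistant asks for one tool we have and one we do not.
const toolReply = runTools({
  role: 'assistant',
  content: [
    { text: 'I need two things.' },
    { toolUse: { toolUseId: 'id-1', name: 'get_weather', input: { location: 'Montreal' } } },
    { toolUse: { toolUseId: 'id-2', name: 'get_stock_price', input: {} } },
  ],
})
console.log(JSON.stringify(toolReply, null, 2))
```

&lt;p&gt;An agent loop would then keep calling the model and &lt;code&gt;runTools&lt;/code&gt; in turn until the stop reason is no longer &lt;code&gt;tool_use&lt;/code&gt;.&lt;/p&gt;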
&lt;h3 id=&#34;ask-an-llm-to-improve-the-prompt&#34;&gt;Ask an LLM to improve the prompt&lt;/h3&gt;
&lt;p&gt;Can an LLM write a prompt? Remember, prompts are just commands. Often textual. And LLMs are good with text. So, in theory, yes, it can.&lt;/p&gt;
&lt;p&gt;But that&amp;rsquo;s theory. What about practice?&lt;/p&gt;
&lt;p&gt;You should try it! It&amp;rsquo;s simple. Just paste your initial prompt into ChatGPT or Claude and ask it to review and improve it. Guess what? The better the instructions you provide, the better the result you can expect.&lt;/p&gt;
&lt;p&gt;Both &lt;a href=&#34;https://platform.openai.com/docs/guides/prompt-generation&#34;&gt;OpenAI&lt;/a&gt; and &lt;a href=&#34;https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/prompt-generator&#34;&gt;Anthropic&lt;/a&gt; now have their own prompt generator tools. These generators are a good start.&lt;/p&gt;
&lt;p&gt;But we do not need to reinvent the wheel. Giving commands and instructions evolved with humans. Many people wrote about it. For example, the military has many techniques for giving commands and communicating intent.&lt;/p&gt;
&lt;p&gt;One of the useful concepts is the Commander&amp;rsquo;s Intent Statement, which aims to explain the desired end state. An effective Commander&amp;rsquo;s Intent statement should contain the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Purpose:&lt;/em&gt; Explain why personnel must complete the assignment.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Task:&lt;/em&gt; Explain the objective or goal.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;End state:&lt;/em&gt; Explain what the result should look like.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;An easy way to improve your current prompt is to paste it into ChatGPT or Claude and ask the LLM to rewrite it to follow the Commander&amp;rsquo;s Intent Statement. You can also tell the LLM to ask you if any important details are missing.&lt;/p&gt;
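&lt;p&gt;If you build prompts in code, the same three parts can be sketched as a tiny template. This is only an illustration: &lt;code&gt;buildIntentPrompt&lt;/code&gt; is a hypothetical helper, and the example values are made up.&lt;/p&gt;

```javascript
// Hypothetical helper: assemble a prompt from the three parts of a
// Commander's Intent Statement (purpose, task, end state).
function buildIntentPrompt({ purpose, task, endState }) {
  return [
    `Purpose: ${purpose}`,
    `Task: ${task}`,
    `End state: ${endState}`,
    'If any important detail is missing, ask me before you start.',
  ].join('\n')
}

// Made-up example values.
const intentPrompt = buildIntentPrompt({
  purpose: 'New users abandon signup because the welcome email is confusing.',
  task: 'Rewrite the welcome email.',
  endState: 'A plain-text email under 150 words with one clear next step.',
})
console.log(intentPrompt)
```

&lt;p&gt;The last line of the template mirrors the advice above: it invites the LLM to ask about missing details instead of guessing.&lt;/p&gt;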
&lt;p&gt;However, the Commander&amp;rsquo;s Intent Statement is just one example. I recently re-read an excellent book called &lt;a href=&#34;https://www.amazon.com/Sources-Power-20th-Anniversary-Decisions-ebook/dp/B08BT2GHXS/ref=sr_1_1&#34;&gt;&amp;ldquo;Sources of Power: How People Make Decisions&amp;rdquo;&lt;/a&gt; by Gary Klein. The book contains a lot of useful information. While the book does not talk about LLMs, some concepts apply to both human conversations and LLMs. For example, the &amp;ldquo;Considerations in Communicating intent&amp;rdquo; section presents a good &amp;ldquo;framework&amp;rdquo; to review the quality of the command or intent statement.&lt;/p&gt;
&lt;p&gt;Gary mentions that a person could present seven types of information to help the people receiving the request understand what to do. Here&amp;rsquo;s the list of all seven types:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The purpose of the task (the higher level goals)&lt;/li&gt;
&lt;li&gt;The objective of the task (an image of the desired outcome)&lt;/li&gt;
&lt;li&gt;The sequence of steps in the plan&lt;/li&gt;
&lt;li&gt;The rationale for the plan&lt;/li&gt;
&lt;li&gt;The key decisions that may have to be made&lt;/li&gt;
&lt;li&gt;Antigoals&lt;/li&gt;
&lt;li&gt;Constraints and other considerations&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You do not need to have all seven types of information for an effective prompt. For example, you need a strong sequence of steps only when the task&amp;rsquo;s purpose and objective are not strong enough. However, the list provides good questions to ask when writing a complex prompt.&lt;/p&gt;
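&lt;p&gt;One way to use the list as a set of questions is a small checklist that reports which of the seven types a draft prompt leaves empty. This is a sketch under my own assumptions: the field names and the &lt;code&gt;missingIntentFields&lt;/code&gt; helper are hypothetical and do not come from the book.&lt;/p&gt;

```javascript
// The seven types of information, as hypothetical field names.
const INTENT_FIELDS = [
  'purpose',
  'objective',
  'steps',
  'rationale',
  'keyDecisions',
  'antigoals',
  'constraints',
]

// Return the fields that are absent or blank in a draft.
// An empty result does not mean the prompt is good, and a non-empty
// one does not mean it is bad; it only shows what was left out.
function missingIntentFields(draft) {
  return INTENT_FIELDS.filter(
    (field) => !draft[field] || String(draft[field]).trim() === ''
  )
}

// Made-up draft that covers only the purpose and the objective.
const draftPrompt = {
  purpose: 'Reduce support tickets about billing.',
  objective: 'A one-page FAQ that answers the five most common questions.',
}
const gaps = missingIntentFields(draftPrompt)
console.log('Not covered yet:', gaps.join(', '))
```

&lt;p&gt;Since not every prompt needs all seven, treat the output as a list of things to consider, not as a list of errors.&lt;/p&gt;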
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s quickly recap the five tips mentioned above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hint at the beginning of an answer, and the LLM will continue from there.&lt;/li&gt;
&lt;li&gt;Give examples (or use the few-shot prompting technique if you want to sound smart).&lt;/li&gt;
&lt;li&gt;Ask an LLM to think step-by-step.&lt;/li&gt;
&lt;li&gt;Help an LLM with tools. Remember that tools are just functions.&lt;/li&gt;
&lt;li&gt;Ask an LLM to help you improve the prompt.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And, before you go, I just want to state the obvious: writing effective prompts requires thinking and writing. You&amp;rsquo;ll need to write a lot. I know! You hate writing. But you wrote thousands of lines of JavaScript or Python last month. That&amp;rsquo;s also just a bunch of text and instructions.&lt;/p&gt;
</description>
            <content type="html"><![CDATA[<p>How many times has someone asked you to do something without providing clear instructions? We often get mad in these situations because we can&rsquo;t read someone&rsquo;s mind.</p>
<p>However, it&rsquo;s hard to give clear instructions. Not just to you, your manager, or your friends. It&rsquo;s hard to give clear instructions even if you are a person whose main job is to provide clear instructions.</p>
<h2 id="a-sentence-can-destroy-fleets">A sentence can destroy Fleets</h2>
<p>It was July, 1914. The beginning of WW1 was just a few days away. <em>SMS Goeben</em>, an Imperial German Navy battlecruiser built for speed and power, dominated the Mediterranean Sea, accompanied by a light cruiser, SMS Breslau. Goeben was significantly larger and better armored than the British Navy battleships, but it was experiencing issues with its boilers, which limited its speed and operational efficiency.</p>
<p>SMS Goeben encountered the British battlecruisers HMS <em>Indefatigable</em> and <em>Indomitable</em> and four additional vessels.</p>
<p>Winston Churchill, the political head of the British Royal Navy at that moment, ordered the British Navy to &ldquo;shadow, not to engage&rdquo; with SMS Goeben until the war was officially declared. And they did it.</p>
<p><img alt="The British Navy chasing SMS Goeben" src="/images/posts/5-tips/sms-goeben.png"></p>
<p>The war was declared just a few days later. However, Churchill&rsquo;s order contained the following sentence:</p>
<blockquote>
<p>&ldquo;… do not at this stage be brought to action against superior forces.&rdquo;</p>
</blockquote>
<p>For Churchill, the superior forces were the Austrian Fleet. However, the sentence did not say so, and this unclear part of the instructions allowed SMS Goeben and SMS Breslau to escape and reach Constantinople, where they eventually became part of the Ottoman Navy, playing a key role in drawing the Ottoman Empire into the war and leading naval operations in the Black Sea against Russia.</p>
<h2 id="giving-clear-instructions-to-computers">Giving clear instructions to computers</h2>
<p>Human interactions are complicated. While I hope your unclear instructions never had an impact as significant as the one described above, you probably have many stories.</p>
<p>Luckily, talking to computers is easier, as they have more predictable behaviors and no emotions. Programming languages make communication with computers predictable and repeatable, which makes our job easier. However, <em>Large Language Models</em> (LLMs) are still different beasts. Currently, there&rsquo;s a lot of magic and mysticism involved in LLM prompting.</p>
<p>But all the magic aside, we know enough about LLMs to get helpful information from them.</p>
<p>Large Language Models are good at guessing the next best set of words based on <em>your input</em> (instructions, questions, or part of the text), their <em>training data</em> (text from many books, websites, and other datasets), <em>context</em> (documents or previous conversation flow), and <em>specific configurations</em> (such as weights, which prioritize certain word patterns, and settings like temperature, which control the randomness of predictions).</p>
<p>Let&rsquo;s say you asked an LLM to finish the following sentence: &ldquo;I am speaking at&rdquo;</p>
<p>Will it continue with &ldquo;a Martian picnic?&rdquo; Or &ldquo;a space farmer&rsquo;s market?&rdquo; Highly unlikely. You&rsquo;ll get a reply similar to the following: &ldquo;a business conference,&rdquo; &ldquo;a tech meetup,&rdquo; or &ldquo;a community forum.&rdquo;</p>
<p><img alt="A simple prompt example" src="/images/posts/5-tips/simple-prompt.png"></p>
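<p>To make the &ldquo;temperature&rdquo; setting mentioned above less abstract, here&rsquo;s a minimal sketch of how it is commonly applied: the model&rsquo;s scores for candidate continuations are divided by the temperature before they are turned into probabilities. The candidates and scores below are made up for illustration, and <code>toProbabilities</code> is a hypothetical helper name, not part of any LLM API.</p>

```javascript
// Made-up scores for three candidate continuations of "I am speaking at".
// A real model produces scores like these for every token in its vocabulary.
const candidates = ['a tech meetup', 'a community forum', 'a Martian picnic']
const scores = [4.0, 3.0, 0.5]

// Softmax with temperature: divide the scores by the temperature,
// then normalize them so they add up to 1.
function toProbabilities(scores, temperature) {
  const scaled = scores.map((score) => score / temperature)
  const max = Math.max(...scaled) // subtracting the max keeps Math.exp stable
  const exps = scaled.map((value) => Math.exp(value - max))
  const sum = exps.reduce((total, value) => total + value, 0)
  return exps.map((value) => value / sum)
}

// Low temperature: the top candidate dominates.
console.log(candidates, toProbabilities(scores, 0.5))
// High temperature: the probabilities flatten out.
console.log(candidates, toProbabilities(scores, 2.0))
```

<p>With a low temperature, the most likely candidate takes almost all of the probability. With a higher one, even the Martian picnic gets a slightly better chance.</p>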
<p>But what happens when you change the context? Answers can now be the complete opposite!</p>
<p>Let&rsquo;s say you add the following at the beginning of your prompt: &ldquo;You are a playful, chatty cartoon character called &ldquo;Space Bunny&rdquo; selling sandwiches on Mars near the intergalactic highway.&rdquo; It&rsquo;s more likely that an LLM continues with &ldquo;a Martian picnic&rdquo; or &ldquo;a space farmer&rsquo;s market&rdquo; now.</p>
<p><img alt="Updated simple prompt" src="/images/posts/5-tips/updated-simple-prompt.png"></p>
<p>I already said &ldquo;prompt&rdquo; a few times. But prompts are not magic. They are just instructions. You tell an LLM what you want, and it tries to reply based on its training, your instructions, and a few additional parameters and configurations. Simple!</p>
<p>It works similarly to the instructions you give to humans. The better you explain what you want, the higher the chance you&rsquo;ll get a useful reply. But the main difference is (or is it?) that an LLM always replies. Not all of its replies are based on truth.</p>
<p>So, prompts are just instructions. But how do LLMs work?</p>
<p>You give your instructions. An LLM eats a lot of tokens, and it spills some unexpected wisdom. Or hallucination.</p>
<p>Wait, what are tokens?</p>
<p>Money. Tokens are money. I am not kidding. Most of the time, you pay for using an LLM based on the number of tokens it eats and spills out. So, for you, tokens = money.</p>
<p><img alt="An LLM eats tokens" src="/images/posts/5-tips/llm-eats-tokens.png"></p>
<p>So, fewer tokens = less money out of your pocket = better? Well, not really. There&rsquo;s another side to that story.</p>
<p>For you, tokens are money, but for an LLM, tokens are the only way to understand and follow your instructions. LLMs are computers, and you might think that computers are good with text because you interact with them using text and clicks, but they are not. They like numbers more (don&rsquo;t tell anyone 🤫). When you enter text, a computer converts it to numbers, does some operations, converts the numbers back to text, and displays that text on the screen. It&rsquo;s an oversimplification, of course.</p>
<p>LLMs do the same. What a surprise, right?</p>
<p>First, they convert your instructions to tokens. Tokens are words or parts of words that help LLMs match patterns. There&rsquo;s no universal way for LLMs to convert words and sentences to tokens. There are many tokenizers. However, one generalization is still close to the truth for English text: 750 words equal roughly 1000 tokens.</p>
<p>There are also many visual explanations for tokenized prompts. For example, you can try OpenAI&rsquo;s tokenizer here: <a href="https://platform.openai.com/tokenizer">https://platform.openai.com/tokenizer</a>. You&rsquo;ll get slightly different results based on the model you select.</p>
<p><img alt="OpenAI&rsquo;s tokenizer" src="/images/posts/5-tips/openai-tokenizer.png"></p>
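<p>If you only need a rough number before you send a prompt, the rule of thumb above is easy to turn into code. Here&rsquo;s a minimal sketch. <code>estimateTokens</code> and <code>estimateCost</code> are hypothetical helper names, the price is a parameter you need to look up yourself, and a real tokenizer will give you different numbers.</p>

```javascript
// Rule of thumb from the text: about 750 words are about 1000 tokens.
// Real tokenizers differ per model, so treat this as a rough estimate only.
const TOKENS_PER_WORD = 1000 / 750

// Hypothetical helper: estimates the token count from the word count.
function estimateTokens(text) {
  const words = text.trim().split(/\s+/).filter(Boolean)
  return Math.ceil(words.length * TOKENS_PER_WORD)
}

// Hypothetical helper: pricePerMillionTokens is whatever your provider
// charges; no real price is assumed here.
function estimateCost(text, pricePerMillionTokens) {
  return (estimateTokens(text) / 1e6) * pricePerMillionTokens
}

const prompt = 'I am speaking at a business conference'
console.log(estimateTokens(prompt)) // 7 words, rounded up to 10 tokens
```

<p>The estimate is good enough for a quick sanity check of your bill. For anything precise, count tokens with the tokenizer of the model you actually use.</p>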
<p>So, an LLM first splits your instructions into chunks we call tokens. And then (this is an oversimplification, again) it converts each of these tokens to a series of numbers (an array). These arrays are often called vectors. You can also think about them as dots in a multidimensional space. But best of all, these dots are not randomly put in the space; they are close to other dots with a similar meaning. For example, a vector representation of the word &ldquo;potato&rdquo; is close to a &ldquo;tomato&rdquo; but not close to a &ldquo;business conference.&rdquo; And that&rsquo;s one of the most important sources of the LLMs&rsquo; magic.</p>
<p>Let&rsquo;s imagine that each token converts to a simple array of two numbers (two-dimensional space) to simplify the visual representation.</p>
<p><img alt="Oversimplified visual representation of vectors in 2D space" src="/images/posts/5-tips/vectors.png"></p>
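<p>The same idea fits in a few lines of code. This is a toy sketch with made-up 2D coordinates. Real vectors have hundreds or thousands of dimensions and come from the model, and <code>distance</code> is just a hypothetical helper that measures how far apart two dots are.</p>

```javascript
// Toy 2D vectors with made-up coordinates, for illustration only.
const vectors = {
  potato: [2.0, 1.0],
  tomato: [2.5, 1.5],
  'business conference': [9.0, 8.0],
}

// Euclidean distance between two points with the same number of dimensions.
function distance(a, b) {
  return Math.hypot(...a.map((value, i) => value - b[i]))
}

// A small distance means a similar meaning in this toy space.
console.log(distance(vectors.potato, vectors.tomato))
console.log(distance(vectors.potato, vectors['business conference']))
```

<p>The first distance is much smaller than the second one, which is the whole point: similar meanings end up close together. Real systems often compare vectors with cosine similarity instead of plain distance, but the intuition is the same.</p>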
<h2 id="anatomy-of-a-prompt">Anatomy of a prompt</h2>
<p>Remember when I said that prompts are just instructions? I lied. Instructions are the most useful subset of prompts, but in general, any text (or just a single letter or character) can be a prompt. If you type a single character, such as &ldquo;$,&rdquo; that&rsquo;s a valid prompt. Just remember that the quality of a reply depends on the quality of your prompt.</p>
<p>However, prompts are not just written words. With multimodal LLMs (a fancy name for LLMs that understand different types of input), you can use your voice or upload a document, an image, or even a song. You can go further and combine these inputs. These are all valid prompts. I often upload a spreadsheet or some other document and provide a few sentences of textual instructions.</p>
<p>So, asking, &ldquo;Who is faster: Godzilla or T-Rex?&rdquo; is a valid prompt. Let me save you some tokens: the answer is T-Rex.</p>
<p>Asking an LLM to &ldquo;Write a 500-word article about the bad influence of Amazon&rsquo;s RTO policy on Lambda cold starts&rdquo; is also a valid prompt. It took ChatGPT 13 seconds to figure this one out. I was afraid to ask Claude. It&rsquo;s hosted on AWS, and I don&rsquo;t want it to be part of the next round of layoffs.</p>
<p><img alt="Amazon RTO policy prompt result" src="/images/posts/5-tips/amazon-rto-policy.png"></p>
<p>So, almost anything could be a prompt. However, not all prompts are useful. Luckily, LLMs are products, and like with many other products, LLM users (we) did many crazy things (&ldquo;lizards,&rdquo; as my friend would <a href="https://lizardoptimization.org/">call us</a>) and came up with some helpful tricks. Products evolve with user needs and feedback. LLMs are not different.</p>
<p>One of the early tricks was telling an LLM to pretend to be an expert in the field related to the prompt. This worked incredibly well until you hit the context limit (the number of tokens or characters an LLM can keep in &ldquo;memory&rdquo;). Once you reach the context limit, the conversation often takes a weird direction. Imagine talking to Dory Fish on LSD.</p>
<p>People behind LLMs figured out that some parts of our instructions are more important than others and that we often repeat that part across multiple LLM conversations. And system prompts were born. A system prompt is a repeatable part of the prompt with higher priority than the rest of your conversation, which always remains in the context (oversimplification, again).</p>
<p>Repeatability and more control are often essential when you work with LLMs. System prompts help, but we (software developers, at least) already know how to talk to computers to get repeatable and controlled responses. We can get amazing results by combining code with LLM interaction through an API.</p>
<p>How do we get these &ldquo;amazing results?&rdquo; Let&rsquo;s see the 5 habits of highly effective prompters! Oops, I forgot this is not LinkedIn. I mean 5 simple tricks for better prompting results.</p>
<h2 id="five-simple-tricks-for-better-prompting-results">Five simple tricks for better prompting results</h2>
<p>This article is already too long, so there&rsquo;s no need for a cliffhanger. Here are the tricks I want to tell you about:</p>
<ol>
<li>Hint the beginning of the answer.</li>
<li>Give examples.</li>
<li>Think step by step.</li>
<li>Use tools.</li>
<li>Ask an LLM to improve the prompt.</li>
</ol>
<p>Let&rsquo;s go through them (step by step).</p>
<h3 id="hint-the-beginning-of-the-answer">Hint the beginning of the answer</h3>
<p>I told you to combine code (JavaScript, Python, or anything else; I don&rsquo;t judge) and an LLM for better responses. But to do so, you need to get a structured response from an LLM.</p>
<p>While there are countless ways to structure a response, we really love JSON. So, we ask an LLM politely to answer with JSON and nothing else. But an LLM is also polite, so the conversation from time to time looks like this:</p>
<blockquote>
<p><strong>User:</strong> <em>&hellip;some super long and smart instructions&hellip;</em> Always answer with valid JSON and nothing else.</p>
<p><strong>Assistant:</strong> Here&rsquo;s your JSON:</p>
<p>```json</p>
<p>{</p>
<p>&ldquo;some&rdquo;: &ldquo;valid&rdquo;,</p>
<p>&ldquo;JSON&rdquo;: true</p>
<p>}</p>
<p>```</p>
</blockquote>
<p>Then your JSON parser starts yelling at you, and you do the same to the LLM. Replying, &ldquo;I said JSON only!!!&rdquo; in capslock works sometimes.</p>
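<p>Before you start yelling, you can also make the parsing side more forgiving. The sketch below is a simple fallback, not something the LLM APIs provide: a hypothetical <code>extractJson</code> helper that keeps only the text between the first <code>{</code> and the last <code>}</code> and parses that. It won&rsquo;t repair broken JSON, and it assumes the reply contains a single JSON object.</p>

```javascript
// Hypothetical fallback: keep only the text between the first "{" and the last "}".
function extractJson(reply) {
  const start = reply.indexOf('{')
  const end = reply.lastIndexOf('}')
  if (start === -1 || start > end) {
    throw new Error('No JSON object found in the reply')
  }
  return JSON.parse(reply.slice(start, end + 1))
}

// Simulate a polite reply that wraps the JSON in a Markdown code fence.
const fence = '`'.repeat(3)
const politeReply = [
  "Here's your JSON:",
  fence + 'json',
  '{ "some": "valid", "JSON": true }',
  fence,
].join('\n')

console.log(extractJson(politeReply))
```

<p>This keeps your parser quiet when the reply is only wrapped in extra text. It does nothing for replies that are not valid JSON in the first place.</p>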
<p>But here&rsquo;s one more trick you can try: write the beginning of the assistant&rsquo;s reply in the API call!</p>
<p>LLM APIs expect a conversation (user -&gt; assistant -&gt; user -&gt; assistant, and so on). If you finish your API call with the user&rsquo;s message, the LLM will reply with the assistant&rsquo;s message. However, if you finish your API call with the assistant&rsquo;s message, the LLM will treat it as the beginning of its reply and simply continue.</p>
<p>Below is a code example for Amazon Bedrock that uses the AWS SDK for JavaScript and the Bedrock Converse API. Other LLM APIs support something similar.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">BedrockRuntimeClient</span>, <span style="color:#a6e22e">ConverseCommand</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#39;@aws-sdk/client-bedrock-runtime&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Create a Bedrock Runtime client in the AWS Region you want to use.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">BedrockRuntimeClient</span>({ <span style="color:#a6e22e">region</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;eu-central-1&#34;</span> })
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Set the model ID, i.e., Claude 3.5 Sonnet
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">modelId</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;anthropic.claude-3-5-sonnet-20240620-v1:0&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Start a conversation with the user message.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">userPrompt</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;Return the result of all games of the NBA 2023 finals as a short structured JSON.&#39;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">conversation</span> <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">userPrompt</span> }],
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;assistant&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;{&#34;&#39;</span> }], <span style="color:#75715e">// &lt;---- Assistant&#39;s reply starts with `{&#34;`
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Create a command with the model ID, the message, and a basic configuration.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">command</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">ConverseCommand</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">modelId</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">conversation</span>,
</span></span><span style="display:flex;"><span>})
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Send the command to the model and wait for the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">command</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Extract, parse, and print the response text, but prepend the hinted `{&#34;` first
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">responseText</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;{&#34;&#39;</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>.<span style="color:#a6e22e">content</span>[<span style="color:#ae81ff">0</span>].<span style="color:#a6e22e">text</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">parsedJson</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">parse</span>(<span style="color:#a6e22e">responseText</span>)
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">parsedJson</span>)
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">catch</span> (<span style="color:#a6e22e">err</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;ERROR&#39;</span>, <span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h3 id="give-examples">Give examples</h3>
<p>We can use a similar approach for one more trick.</p>
<p>You probably use ChatGPT or Claude at least a few times per week. Do you always get a perfect reply after your initial prompt? I don&rsquo;t. I often ask follow-up questions, point out mistakes, or give a few examples to get the answer I am looking for. I know many other people do the same.</p>
<p>However, we often send just one detailed command via the API and expect LLMs to deliver a perfect response. If someone tries to look smart, they&rsquo;ll probably refer to this as &ldquo;zero-shot prompting.&rdquo; It&rsquo;s just a name for sending direct commands without examples.</p>
<p>Why does it need the name? It doesn&rsquo;t. But as you can expect, there&rsquo;s a fancy name for the opposite, too. If you provide examples and simulate conversation, you are using &ldquo;few-shot prompting.&rdquo;</p>
<p>For example, you can ask an LLM to label a review comment with a &ldquo;positive&rdquo; or &ldquo;negative&rdquo; label and return it in JSON format. You can do that by sending a prompt similar to the following:</p>
<blockquote>
<p>Here&rsquo;s a review comment. Return a JSON with the label &ldquo;positive&rdquo; or &ldquo;negative&rdquo; for this comment:</p>
<p>&ldquo;Super helpful, worth it!&rdquo;</p>
</blockquote>
<p>This example is simple, so the &ldquo;zero-shot prompting&rdquo; works perfectly fine. But let&rsquo;s say you want to label neutral comments as &ldquo;negative.&rdquo; The LLM will not know that. You can improve your prompt or even add a detailed system prompt, which might work well. But you can also do the following:</p>
<blockquote>
<p>Label review comments and return a JSON with the label &ldquo;positive&rdquo; or &ldquo;negative.&rdquo; Here are a few examples:</p>
<ul>
<li>&ldquo;Great product 10/10&rdquo; =&gt; <code>{ &quot;label&quot;: &quot;positive&quot; }</code></li>
<li>&ldquo;Didn&rsquo;t work very well&rdquo; =&gt; <code>{ &quot;label&quot;: &quot;negative&quot; }</code></li>
<li>&ldquo;Not sure&rdquo; =&gt; <code>{ &quot;label&quot;: &quot;negative&quot; }</code></li>
</ul>
<p>Label the following comment based on the examples above:</p>
<p>&ldquo;Super helpful, worth it!&rdquo;</p>
</blockquote>
<p>This technique improves your chances of getting more precise labels. It&rsquo;s like a mini in-prompt training.</p>
<p>But there&rsquo;s a cleaner way to do this using the same approach we used in the first trick: simulate the conversation by passing multiple user and assistant messages.</p>
<p>Let&rsquo;s see the code example!</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#75715e">// Same as above
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">BedrockRuntimeClient</span>, <span style="color:#a6e22e">ConverseCommand</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#39;@aws-sdk/client-bedrock-runtime&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">BedrockRuntimeClient</span>({ <span style="color:#a6e22e">region</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;eu-central-1&#34;</span> })
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">modelId</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;anthropic.claude-3-5-sonnet-20240620-v1:0&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Define the system prompt with the labeling instructions.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">systemPrompt</span> <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;Label review comments and return a JSON with the label &#34;positive&#34; or &#34;negative.&#34;&#39;</span>,
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">conversation</span> <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Great product 10/10&#34;</span> }], <span style="color:#75715e">// Rating example
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;assistant&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;{&#34;label&#34;:&#34;positive&#34;}&#39;</span> }], <span style="color:#75715e">// Label example
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Didn&#39;t work very well&#34;</span> }], <span style="color:#75715e">// Rating example #2
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;assistant&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;{&#34;label&#34;:&#34;negative&#34;}&#39;</span> }], <span style="color:#75715e">// Label example #2
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Not sure&#34;</span> }], <span style="color:#75715e">// Rating example #3
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;assistant&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;{&#34;label&#34;:&#34;negative&#34;}&#39;</span> }], <span style="color:#75715e">// Label example #3
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Super helpful, worth it!&#34;</span> }], <span style="color:#75715e">// &lt;---- ACTUAL REVIEW
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// You can even hint the beginning of the reply here
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Create a command with the model ID, the message, and a basic configuration.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">command</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">ConverseCommand</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">modelId</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">systemPrompt</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">conversation</span>,
</span></span><span style="display:flex;"><span>})
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Send the command to the model and wait for the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">command</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Extract, parse and print the response text
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">responseText</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>.<span style="color:#a6e22e">content</span>[<span style="color:#ae81ff">0</span>].<span style="color:#a6e22e">text</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">parsedJson</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">parse</span>(<span style="color:#a6e22e">responseText</span>)
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">parsedJson</span>)
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">catch</span> (<span style="color:#a6e22e">err</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;ERROR&#39;</span>, <span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Easy, right? Just remember that this approach eats more tokens. And tokens = money.</p>
<h3 id="think-step-by-step">Think step by step</h3>
<p>The previous trick works well with humans, too. If you give examples when you delegate a task, there&rsquo;s a higher chance you&rsquo;ll get the task done.</p>
<p>But imagine you could use the following trick in conversation with humans, too! Let me give you an example.</p>
<p>You might have a boss or a client. One of the superpowers of bosses and clients is that they can come up with completely unreasonable ideas they want you to do now. Trust me, I was on both sides of that process.</p>
<p><img alt="A boss with an unreasonable request" src="/images/posts/5-tips/boss.png"></p>
<p>Imagine you could ask them, &ldquo;Let&rsquo;s think step by step. How would that feature help our users?&rdquo;</p>
<p><img alt="&ldquo;Think step-by-step&rdquo;" src="/images/posts/5-tips/think-step-by-step.png"></p>
<p>In practice, this scenario rarely works because people have complex emotions. Instead of them thinking through the process step by step and explaining every detail, you could end up with a new green LinkedIn badge around your profile photo.</p>
<p><img alt="#opentowork" src="/images/posts/5-tips/opentowork.png"></p>
<p>Luckily, LLMs have no emotions. Yet. So, adding &ldquo;Let&rsquo;s think step by step&rdquo; or a similar phrase uses more tokens for the response, but it often improves the response.</p>
<p>It actually works so well that OpenAI based its new o1 models on a similar concept. They said the following in <a href="https://openai.com/index/introducing-openai-o1-preview/">the announcement</a>:</p>
<blockquote>
<p>We&rsquo;ve developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math.</p>
</blockquote>
<p>As the o1 models already use something similar under the hood, you may not want to try this trick if you use them.</p>
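<p>If you call a model through an API, one way to try this is to tack the phrase onto the last user message. Here&rsquo;s a minimal sketch that assumes the same Bedrock Converse message format used in the other examples in this article. The <code>withStepByStep</code> helper and the sample question are made up for illustration; they are not part of any SDK.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript">// Hypothetical helper: appends a step-by-step nudge to the last user message
const STEP_BY_STEP = "Let's think step by step."

function withStepByStep(conversation, phrase = STEP_BY_STEP) {
  // Copy the messages so the original conversation stays untouched
  const messages = conversation.map((message) => ({
    ...message,
    content: [...message.content],
  }))

  // Do nothing if there is no message or the last one is not from the user
  const lastMessage = messages[messages.length - 1]
  if (!lastMessage || lastMessage.role !== 'user') {
    return messages
  }

  // Add the phrase as an extra text block at the end of the user message
  lastMessage.content.push({ text: phrase })
  return messages
}

// Sample conversation (made up for this sketch)
const conversation = [
  { role: 'user', content: [{ text: 'How would a dark mode help our users?' }] },
]

console.log(JSON.stringify(withStepByStep(conversation), null, 2))
</code></pre></div>
<p>You would then pass the returned array as <code>messages</code> to <code>ConverseCommand</code> instead of the original conversation.</p>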
<h3 id="use-tools">Use tools</h3>
<p>LLMs are powerful. We use them for everything. And, as Maslow said a long time ago, &ldquo;it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.&rdquo;</p>
<p>But LLMs suck at some specific tasks. Which tasks depends on the model we use, but most models struggle with math, working with dates, etc. The previous tricks might help with these skill issues in some situations, but they are not real solutions.</p>
<p>Luckily, all major LLMs now support tools. So, instead of using a hammer for everything, we can bring our tools for specific cases.</p>
<p>But what are tools, and why are they useful? Here&rsquo;s an example!</p>
<p>I asked Claude: &ldquo;What&rsquo;s the weather like in Montreal today?&rdquo; Even if you don&rsquo;t have the current weather info, you can bet it&rsquo;s cold in Montreal in November. But Claude tries to be helpful, and it answers the following:</p>
<blockquote>
<p>I need to clarify something: I don&rsquo;t have access to real-time weather data or current conditions. While I can discuss Montreal&rsquo;s typical weather patterns and climate, I cannot tell you today&rsquo;s specific weather conditions. For current weather information in Montreal, I&rsquo;d recommend:</p>
<ol>
<li>Checking a weather website or app</li>
<li>Looking up Environment Canada&rsquo;s weather service</li>
<li>Checking local Montreal news stations&rsquo; weather reports</li>
</ol>
<p>Would you like to know about Montreal&rsquo;s typical weather patterns for this time of year instead?</p>
</blockquote>
<p>Instead of giving up, we can give an LLM, in this case, Claude, the skill to check the current weather conditions in the specified location.</p>
<p>How do we do that?</p>
<p>Let&rsquo;s see the code example! I&rsquo;ll use the AWS Bedrock Converse API with Node.js and Claude 3.5 Sonnet as the model, but this is just an example. You can do something similar with any popular model.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#75715e">// Same as above
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">BedrockRuntimeClient</span>, <span style="color:#a6e22e">ConverseCommand</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#39;@aws-sdk/client-bedrock-runtime&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">BedrockRuntimeClient</span>({ <span style="color:#a6e22e">region</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;eu-central-1&#34;</span> })
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">modelId</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;anthropic.claude-3-5-sonnet-20240620-v1:0&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Start a conversation with the user message.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">conversation</span> <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>  {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">text</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;What&#39;s the weather like in Montreal?&#34;</span> }],
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Define available tools
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">toolConfig</span> <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">tools</span><span style="color:#f92672">:</span> [{
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">toolSpec</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>      <span style="color:#75715e">// The name and the description of our tool
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>      <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;get_weather&#39;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;Get weather conditions for a specific location&#39;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">inputSchema</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#75715e">// It accepts JSON, I told you we love JSON!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#a6e22e">json</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;object&#39;</span>,
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">properties</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>            <span style="color:#75715e">// And it expects the &#34;location&#34; property (see the type and the description)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>            <span style="color:#75715e">// For simplicity, we&#39;ll skip the date part. Our tool will always answer with today&#39;s weather
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>            <span style="color:#a6e22e">location</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;string&#39;</span>,
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">description</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;The location for which you want the weather conditions. Location can be any city, country, etc.&#39;</span>,
</span></span><span style="display:flex;"><span>            },
</span></span><span style="display:flex;"><span>          },
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">required</span><span style="color:#f92672">:</span> [ <span style="color:#e6db74">&#39;location&#39;</span> ],
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>      },
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>  }],
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Create a command with the model ID, the message, and a basic configuration.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">command</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">ConverseCommand</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">modelId</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">conversation</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">toolConfig</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">toolConfig</span>, <span style="color:#75715e">// Pass the tool configuration to Bedrock
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>})
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Send the command to the model and wait for the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">command</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// We&#39;ll see this part in the next example
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>} <span style="color:#66d9ef">catch</span> (<span style="color:#a6e22e">err</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;ERROR&#39;</span>, <span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>That&rsquo;s the first part of the magic. Claude now knows it can use a tool. However, it still needs to learn how to answer the question. Let&rsquo;s add the actual skill.</p>
<p>The AWS Bedrock Converse command will reply with a JSON object similar to the following:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;output&#34;</span>: {
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">&#34;message&#34;</span>: {
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;assistant&#34;</span>,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">&#34;content&#34;</span>: [
</span></span><span style="display:flex;"><span>                {
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">&#34;toolUse&#34;</span>: {
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">&#34;toolUseId&#34;</span>: <span style="color:#e6db74">&#34;tooluse_kZJMlvQmRJ6eAyJE5GIl7Q&#34;</span>,
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">&#34;name&#34;</span>: <span style="color:#e6db74">&#34;get_weather&#34;</span>,
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">&#34;input&#34;</span>: {
</span></span><span style="display:flex;"><span>                            <span style="color:#f92672">&#34;location&#34;</span>: <span style="color:#e6db74">&#34;Montreal&#34;</span>
</span></span><span style="display:flex;"><span>                        }
</span></span><span style="display:flex;"><span>                    }
</span></span><span style="display:flex;"><span>                }
</span></span><span style="display:flex;"><span>            ]
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;stopReason&#34;</span>: <span style="color:#e6db74">&#34;tool_use&#34;</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Now that we know that, we can update the code with the following (this is a very simple example):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#75715e">// Code from above
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">try</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Send the command to the model and wait for the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">let</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">command</span>)
</span></span><span style="display:flex;"><span>  
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Check if the stop signal is &#34;tool_use&#34; and if our tool was used
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">stopReason</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#39;tool_use&#39;</span> <span style="color:#f92672">&amp;&amp;</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>.<span style="color:#a6e22e">content</span>[<span style="color:#ae81ff">0</span>].<span style="color:#a6e22e">toolUse</span>.<span style="color:#a6e22e">name</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#39;get_weather&#39;</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Extract location and toolUseId
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">location</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>.<span style="color:#a6e22e">content</span>[<span style="color:#ae81ff">0</span>].<span style="color:#a6e22e">toolUse</span>.<span style="color:#a6e22e">input</span>.<span style="color:#a6e22e">location</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">toolUseId</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>.<span style="color:#a6e22e">content</span>[<span style="color:#ae81ff">0</span>].<span style="color:#a6e22e">toolUse</span>.<span style="color:#a6e22e">toolUseId</span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// And call a function that might call some 3rd party API to get the current weather conditions
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">weatherConditionsResponse</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">getWeatherConditions</span>(<span style="color:#a6e22e">location</span>)
</span></span><span style="display:flex;"><span>    
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Then invoke the model again with the tool_use message and the reply
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">updatedCommand</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">ConverseCommand</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">modelId</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> [
</span></span><span style="display:flex;"><span>        ...<span style="color:#a6e22e">conversation</span>, <span style="color:#75715e">// Previous message
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">output</span>.<span style="color:#a6e22e">message</span>, <span style="color:#75715e">// Assistant response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        <span style="color:#75715e">// Tool response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>        {
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;user&#39;</span>,
</span></span><span style="display:flex;"><span>          <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [
</span></span><span style="display:flex;"><span>            {
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">toolResult</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">toolUseId</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">toolUseId</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> [
</span></span><span style="display:flex;"><span>                  {
</span></span><span style="display:flex;"><span>                    <span style="color:#75715e">// Let&#39;s assume that weatherConditionsResponse contains JSON an LLM can understand
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>                    <span style="color:#a6e22e">json</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">weatherConditionsResponse</span>,
</span></span><span style="display:flex;"><span>                  },
</span></span><span style="display:flex;"><span>                ],
</span></span><span style="display:flex;"><span>              },
</span></span><span style="display:flex;"><span>            },
</span></span><span style="display:flex;"><span>          ],
</span></span><span style="display:flex;"><span>        },
</span></span><span style="display:flex;"><span>      ],
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">toolConfig</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">toolConfig</span>, <span style="color:#75715e">// Pass the tool configuration to Bedrock
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    })
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">send</span>(<span style="color:#a6e22e">updatedCommand</span>)
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Continue parsing the response
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>} <span style="color:#66d9ef">catch</span> (<span style="color:#a6e22e">err</span>) {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#39;ERROR&#39;</span>, <span style="color:#a6e22e">err</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>And that&rsquo;s it. You have a tool now! Remember that the tool can be any function in your preferred programming language (even Java, I don&rsquo;t judge you).</p>
<p>You&rsquo;ve probably heard people talking about AI agents. Sounds magical, right? But most of the time, agents are just 3 raccoons (pardon, tools) in a trench coat.</p>
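<p>The example above assumes the tool call is the first item in the response content. A model can also send a short text block before the tool call, or ask for more than one tool at once. Here&rsquo;s a slightly more defensive sketch of that step. It only uses plain objects shaped like the Converse response shown earlier, so it runs without AWS. The <code>toolHandlers</code> map, the <code>buildToolResultMessage</code> helper, and the stubbed <code>getWeatherConditions</code> with its fake weather data are all hypothetical.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript">// Hypothetical stand-in for a real weather API call (the data is fake)
async function getWeatherConditions(location) {
  return { location, temperatureCelsius: -3, conditions: 'snow' }
}

// Map tool names (as declared in toolConfig) to plain functions
const toolHandlers = {
  get_weather: (input) => getWeatherConditions(input.location),
}

// Build the "user" message with one toolResult for every toolUse block
async function buildToolResultMessage(assistantMessage, handlers = toolHandlers) {
  // Pick all toolUse blocks, wherever they are in the content array
  const toolUses = assistantMessage.content
    .filter((block) => block.toolUse)
    .map((block) => block.toolUse)

  const content = await Promise.all(toolUses.map(async (toolUse) => {
    const handler = handlers[toolUse.name]
    if (!handler) {
      // Tell the model that we do not know this tool
      return {
        toolResult: {
          toolUseId: toolUse.toolUseId,
          content: [{ text: `Unknown tool: ${toolUse.name}` }],
          status: 'error',
        },
      }
    }
    return {
      toolResult: {
        toolUseId: toolUse.toolUseId,
        content: [{ json: await handler(toolUse.input) }],
      },
    }
  }))

  return { role: 'user', content }
}

// A made-up assistant message with a text block before the tool call
const assistantMessage = {
  role: 'assistant',
  content: [
    { text: 'Let me check the weather.' },
    { toolUse: { toolUseId: 'tooluse_example', name: 'get_weather', input: { location: 'Montreal' } } },
  ],
}

buildToolResultMessage(assistantMessage).then((message) => {
  console.log(JSON.stringify(message, null, 2))
})
</code></pre></div>
<p>The returned message goes into the <code>messages</code> array right after the assistant response, the same way the tool response does in the example above. Adding another tool then means adding one more entry to the map. More raccoons, same trench coat.</p>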
<h3 id="ask-an-llm-to-improve-the-prompt">Ask an LLM to improve the prompt</h3>
<p>Can an LLM write a prompt? Remember, prompts are just commands. Often textual. And LLMs are good with text. So, in theory, yes, it can.</p>
<p>But that&rsquo;s theory. What about practice?</p>
<p>You should try it! It&rsquo;s simple. Just paste your initial prompt into ChatGPT or Claude and ask it to review and improve it. Guess what? The better the instructions you provide, the better the result you can expect.</p>
<p>Both <a href="https://platform.openai.com/docs/guides/prompt-generation">OpenAI</a> and <a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/prompt-generator">Anthropic</a> now offer prompt generator tools. These generators are a good start.</p>
<p>But we do not need to reinvent the wheel. Giving commands and instructions evolved with humans. Many people wrote about it. For example, the military has many techniques for giving commands and communicating intent.</p>
<p>One of the useful concepts is the Commander&rsquo;s Intent Statement, which aims to explain the desired end state. An effective Commander&rsquo;s Intent statement should contain the following:</p>
<ul>
<li><em>Purpose:</em> Explain why personnel must complete the assignment.</li>
<li><em>Task:</em> Explain the objective or goal.</li>
<li><em>End state:</em> Explain what the result should look like.</li>
</ul>
<p>An easy way to improve your current prompt is to paste it into ChatGPT or Claude and ask the LLM to rewrite it to follow the Commander&rsquo;s Intent Statement. You can also tell the LLM to ask you if any important details are missing.</p>
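<p>If you do this often, you can wrap the request in a small function. This is only a sketch: <code>buildIntentRewritePrompt</code> is a made-up name, and the wording of the instructions is mine, so adjust it to your taste.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript">// Hypothetical helper: wraps an existing prompt in a "rewrite this" request
function buildIntentRewritePrompt(originalPrompt) {
  return [
    "Rewrite the prompt below so it follows the Commander's Intent Statement format:",
    '- Purpose: why the task must be completed',
    '- Task: the objective or goal',
    '- End state: what the result should look like',
    'If any important details are missing, ask me about them before rewriting.',
    '',
    'PROMPT:',
    originalPrompt.trim(),
  ].join('\n')
}

// Sample prompt (made up for this sketch)
console.log(buildIntentRewritePrompt('Summarize this support ticket.'))
</code></pre></div>
<p>The result is plain text, so you can paste it into a chat window or send it as the <code>text</code> of a user message.</p>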
<p>However, the Commander&rsquo;s Intent Statement is just one example. I recently re-read an excellent book called <a href="https://www.amazon.com/Sources-Power-20th-Anniversary-Decisions-ebook/dp/B08BT2GHXS/ref=sr_1_1">&ldquo;Sources of Power: How People Make Decisions&rdquo;</a> by Gary Klein. The book contains a lot of useful information. While the book does not talk about LLMs, some concepts apply to both human conversations and LLMs. For example, the &ldquo;Considerations in Communicating intent&rdquo; section presents a good &ldquo;framework&rdquo; to review the quality of the command or intent statement.</p>
<p>Gary mentions that a person could present seven types of information to help the people receiving the request understand what to do. Here&rsquo;s the list of all seven types:</p>
<ol>
<li>The purpose of the task (the higher level goals)</li>
<li>The objective of the task (an image of the desired outcome)</li>
<li>The sequence of steps in the plan</li>
<li>The rationale for the plan</li>
<li>The key decisions that may have to be made</li>
<li>Antigoals</li>
<li>Constraints and other considerations</li>
</ol>
<p>You do not need to have all seven types of information for an effective prompt. For example, you need a strong sequence of steps only when the task&rsquo;s purpose and objective are not strong enough. However, the list provides good questions to ask when writing a complex prompt.</p>
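<p>You can also turn the list into a quick checklist for a prompt draft. The sketch below is my own illustration, not something from the book. The property names (<code>purpose</code>, <code>objective</code>, and so on) and the <code>missingIntentInfo</code> function are hypothetical.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript">// The seven types of information, paired with made-up property names
const INTENT_CHECKLIST = [
  ['purpose', 'The purpose of the task (the higher level goals)'],
  ['objective', 'The objective of the task (an image of the desired outcome)'],
  ['steps', 'The sequence of steps in the plan'],
  ['rationale', 'The rationale for the plan'],
  ['keyDecisions', 'The key decisions that may have to be made'],
  ['antigoals', 'Antigoals'],
  ['constraints', 'Constraints and other considerations'],
]

// Return the types of information that the draft does not cover yet
function missingIntentInfo(draft) {
  return INTENT_CHECKLIST
    .filter(([key]) => !String(draft[key] || '').trim())
    .map(([, label]) => label)
}

// Sample draft (made up for this sketch)
const draft = {
  purpose: 'Reduce the load on our support team',
  objective: 'A short FAQ answer in plain English',
}

console.log(missingIntentInfo(draft))
</code></pre></div>
<p>A missing item is not an error. It&rsquo;s just a question to ask yourself before you send the prompt.</p>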
<h2 id="summary">Summary</h2>
<p>Let&rsquo;s quickly recap the five tips mentioned above:</p>
<ul>
<li>Hint at the beginning of an answer, and the LLM will continue.</li>
<li>Give examples (or use the few-shot prompting technique if you want to sound smart).</li>
<li>Ask an LLM to think step-by-step.</li>
<li>Help an LLM with tools. Remember that tools are just functions.</li>
<li>Ask an LLM to help you improve the prompt.</li>
</ul>
<p>And, before you go, I just want to state the obvious: writing effective prompts requires thinking and writing. You&rsquo;ll need to write a lot. I know! You hate writing. But you wrote thousands of lines of JavaScript or Python last month. That&rsquo;s also just a bunch of text and instructions.</p>
]]></content>
        </item>
        
    </channel>
</rss>
