Prompt Guide
1. Introduction
Prompt engineering is the process of designing and refining prompts to guide Large Language Models (LLMs) toward producing desired outputs. Whether you are a college student exploring AI for the first time or an aspiring AI engineer who needs to integrate LLMs into real-world applications, a solid grounding in prompt engineering is crucial.
In this guide, we'll walk through key concepts, parameters, and prompting techniques. We'll also discuss best practices and common pitfalls. By the end, you should understand how to harness LLMs more effectively, whether for text summarization, classification, code generation, or more creative tasks.
For end-to-end AI workflows, check out our AI Agents guide.
Why Prompt Engineering Matters
As AI becomes increasingly integrated into our daily lives and work processes, the ability to effectively communicate with these systems becomes a critical skill. Consider these real-world applications:
- Content Creation: Marketing teams using AI to generate blog posts, product descriptions, and social media content that maintains brand voice
- Code Development: Developers leveraging AI assistants to debug code, generate tests, or implement features based on specifications
- Data Analysis: Analysts using LLMs to interpret complex datasets, generate SQL queries, or summarize findings for stakeholders
- Customer Support: Companies implementing AI chatbots that can accurately understand and respond to customer inquiries
- Education: Teachers creating personalized learning materials or assessment questions tailored to specific learning objectives
In each case, the difference between a mediocre and an exceptional AI interaction often comes down to how well the human has engineered their prompts. This guide will equip you with the techniques to craft prompts that yield precise, relevant, and useful outputs across these diverse applications.
2. LLM Output Configuration
Before diving into specific prompt styles, it's important to understand the configurations that affect LLM outputs. These parameters significantly influence how the model responds and can be adjusted to match your specific needs.
Core Parameters
- Output length (Max tokens): Controls how many tokens the LLM can generate. Generating more tokens can produce more detailed responses but is more expensive and potentially slower. Example: for summarizing a long document, you might set max_tokens=150 for a brief overview or max_tokens=500 for a more comprehensive summary.
- Temperature: Governs the level of randomness in the model's output. A temperature of 0 is deterministic (greedy decoding), while higher temperatures lead to more diverse or unexpected results. Example: generating SQL queries (temperature=0.1) vs. creative writing (temperature=0.8). Try this yourself by asking for a tagline for a coffee shop at different temperature settings.
- Top-K / Top-P (Nucleus Sampling): Determines how many tokens are considered at each step (top-K) or what cumulative probability mass the model can sample from (top-P). Both can be combined for finer control. Example: top_p=0.9 means the model will only consider tokens whose cumulative probability adds up to 90%, effectively filtering out less likely options.
- Frequency/Presence Penalty: Reduces the likelihood of repetition by penalizing tokens that have already appeared (frequency) or appeared at all (presence). Example: in long-form content generation, setting frequency_penalty=0.8 can help prevent the model from repeating the same phrases. A minimal sketch of how these parameters are passed in code follows this list.
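The sketch below assumes the OpenAI Python SDK (openai v1.x) and an illustrative model name; other providers expose the same knobs under similar names, so treat it as a pattern rather than a definitive setup.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Low temperature plus a modest max_tokens: suited to a short, factual summary.
response = client.chat.completions.create(
    model="gpt-4o-mini",      # illustrative model name
    messages=[
        {"role": "user", "content": "Summarize the attached report in three sentences."},
    ],
    temperature=0.2,          # low randomness for factual output
    top_p=0.9,                # nucleus sampling: keep the top 90% probability mass
    max_tokens=150,           # cap output length (and cost)
    frequency_penalty=0.3,    # discourage repeated phrases
)

print(response.choices[0].message.content)
```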
Optimal Settings for Common Tasks
| Task Type | Temperature | Top-P | Frequency Penalty | Typical Max Tokens |
|---|---|---|---|---|
| Code Generation | 0.1-0.3 | 0.9 | 0.1 | 1000-2000 |
| Factual Q&A | 0-0.2 | 0.95 | 0 | 150-300 |
| Creative Writing | 0.7-0.9 | 0.95-1.0 | 0.5-0.8 | 500-2000 |
| Summarization | 0.3-0.5 | 0.9 | 0.3 | 150-400 |
| Chatbot/Conversation | 0.5-0.7 | 0.9-0.95 | 0.5 | 100-300 |
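If it helps to keep these settings consistent across an application, the table can be mirrored as a small presets dictionary. The values below are illustrative starting points taken from the table, not fixed rules.

```python
# Starting-point presets mirroring the table above; tune for your application.
GENERATION_PRESETS = {
    "code_generation":  {"temperature": 0.2, "top_p": 0.9,  "frequency_penalty": 0.1, "max_tokens": 1500},
    "factual_qa":       {"temperature": 0.1, "top_p": 0.95, "frequency_penalty": 0.0, "max_tokens": 300},
    "creative_writing": {"temperature": 0.8, "top_p": 1.0,  "frequency_penalty": 0.6, "max_tokens": 1500},
    "summarization":    {"temperature": 0.4, "top_p": 0.9,  "frequency_penalty": 0.3, "max_tokens": 300},
    "chatbot":          {"temperature": 0.6, "top_p": 0.9,  "frequency_penalty": 0.5, "max_tokens": 300},
}

def settings_for(task_type: str) -> dict:
    """Return keyword arguments for a completion call, e.g. settings_for("summarization")."""
    return GENERATION_PRESETS[task_type]
```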
Practical Examples
Example 1: Deterministic vs. Creative Settings
Prompt: "Generate a tagline for a new coffee shop called 'Byte Brew' that caters to programmers."
Temperature = 0, Top-P = 0.9
"Code Better, Coffee Faster – Byte Brew: Where Programmers Refuel."
Note how the output is functional, predictable, and straightforward.
Temperature = 0.9, Top-P = 1.0
"Compile Your Morning: Where Every Sip Debugs Your Day! { Java.brew() }"
The higher temperature allows for more creative, playful outputs.
Example 2: Max Tokens Impact
Prompt: "Explain quantum computing to me."
Max Tokens = 50
"Quantum computing uses quantum bits or 'qubits' that can exist in multiple states simultaneously, unlike classical bits. This enables quantum computers to perform certain calculations exponentially faster by..."
The explanation is cut off due to token limitations.
Max Tokens = 200
"Quantum computing uses quantum bits or 'qubits' that can exist in multiple states simultaneously, unlike classical bits. This enables quantum computers to perform certain calculations exponentially faster by exploring many solutions at once. They leverage quantum properties like superposition and entanglement to solve complex problems in cryptography, optimization, and simulation that would be practically impossible for classical computers to tackle efficiently."
The longer token limit allows for a complete explanation.
The key to mastering LLM configuration is experimentation. Start with the recommended settings for your task type, then gradually adjust parameters based on the outputs you receive. Keep a log of what works best for different scenarios in your specific application domain.
3. Prompting Techniques
3.1 Zero-Shot Prompting
Zero-shot prompting provides no examples, just an instruction or question. This is the simplest approach: you issue your request and see whether the LLM can handle it. While straightforward, crafting effective zero-shot prompts still requires precision and clarity.
Real-World Applications
- Quick content classification
- Simple data extraction tasks
- Answering factual or definitional questions
- Initial draft generation
- Simple language translations
Basic Zero-Shot Example
Prompt:
"Classify the following movie review as POSITIVE, NEGATIVE, or NEUTRAL: 'I loved the cinematography but hated the storyline.'"
Response:
NEUTRAL
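Issued programmatically, the same zero-shot classification is a single instruction with no examples. A minimal sketch, again assuming the OpenAI Python SDK and an illustrative model name:

```python
from openai import OpenAI

client = OpenAI()

review = "I loved the cinematography but hated the storyline."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{
        "role": "user",
        "content": (
            "Classify the following movie review as POSITIVE, NEGATIVE, or NEUTRAL. "
            "Reply with the label only.\n\n"
            f"Review: {review}"
        ),
    }],
    temperature=0,   # deterministic output for classification
    max_tokens=5,    # a single label needs very few tokens
)

print(response.choices[0].message.content.strip())  # e.g. "NEUTRAL"
```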
Advanced Zero-Shot Example: Product Feature Extraction
Prompt:
"Extract the product specifications from this product description as a structured list of features with their values: 'The XDR-590 Smart Monitor features a 32-inch 4K UHD display with HDR10+ support. It has a 144Hz refresh rate and 1ms response time, perfect for gaming or professional work. Connectivity includes 2 HDMI 2.1 ports, 1 DisplayPort 1.4, and a USB-C hub with 90W power delivery. The monitor comes with built-in 10W stereo speakers and is VESA mount compatible (200x200mm).'"
Response:
Product Specifications:
- Model: XDR-590 Smart Monitor
- Display Size: 32-inch
- Resolution: 4K UHD
- HDR Support: HDR10+
- Refresh Rate: 144Hz
- Response Time: 1ms
- Connectivity:
- 2 HDMI 2.1 ports
- 1 DisplayPort 1.4
- USB-C hub with 90W power delivery
- Audio: Built-in 10W stereo speakers
- VESA Mount: Compatible (200x200mm)
Zero-Shot Prompting Best Practices
- Be specific about the format you want the response in
- Clearly define the task boundaries and constraints
- When extracting information, be precise about exactly what you need
- Use clear, unambiguous language
- Consider adding "think step by step" for complex tasks even in zero-shot mode
While zero-shot can be powerful, it may fail for more nuanced or complex tasks. If zero-shot performance is insufficient, try providing examples (one-shot or few-shot).
3.2 One-Shot & Few-Shot Prompting
Providing one or more examples in your prompt can significantly improve the quality of the response. One-shot prompting uses exactly one example, while few-shot prompting includes multiple examples. This approach effectively "teaches" the model what you want by demonstration, rather than just description.
When to Use Few-Shot Prompting
- Complex classification or categorization tasks
- Specific formatting or stylistic requirements
- Custom data extraction patterns
- Domain-specific tasks where the model might lack expertise
- When consistency across multiple outputs is critical
Basic One-Shot Example: Pizza Order Parsing
Prompt:
Parse a customer's pizza order into JSON.

EXAMPLE:
"I want a small pizza with cheese and pepperoni."
JSON:
{ "size": "small", "type": "normal", "ingredients": ["cheese", "pepperoni"] }

Now parse: "I'd like a large pizza with tomato sauce, ham, and pineapple."
Response:
{ "size": "large", "type": "normal", "ingredients": ["tomato sauce", "ham", "pineapple"] }
Advanced Few-Shot Example: Custom Email Classification
Prompt:
Classify each customer email into one of these categories: BILLING_QUESTION, TECHNICAL_ISSUE, FEATURE_REQUEST, ACCOUNT_CHANGE, or GENERAL_INQUIRY.

Example 1:
Email: "I'm trying to reset my password but the link in the email isn't working. Can you help?"
Category: TECHNICAL_ISSUE

Example 2:
Email: "I'd like to upgrade from the basic plan to the premium plan. What do I need to do?"
Category: ACCOUNT_CHANGE

Example 3:
Email: "My credit card was charged twice this month. Please explain why."
Category: BILLING_QUESTION

Now classify this email: "It would be really helpful if your app could integrate with Google Calendar. Is this something you're planning to add?"
Response:
FEATURE_REQUEST
Advanced Example: Few-Shot for Structured App Review Analysis
Prompt:
Analyze app reviews and extract the following information in JSON format:
1. Overall sentiment (positive, negative, or neutral)
2. Mentioned features (list)
3. Bugs or issues reported (list)
4. Feature requests (list)
5. User experience rating (1-5 based on context)

Example 1:
Review: "This app is great for tracking my workouts! The interface is intuitive and I love the progress charts. Would be nice if it could sync with my Apple Watch though."
Analysis:
{ "sentiment": "positive", "features_mentioned": ["workout tracking", "intuitive interface", "progress charts"], "bugs_reported": [], "feature_requests": ["Apple Watch sync"], "user_experience_rating": 4 }

Example 2:
Review: "Constant crashes when trying to save my data. Also, the ads are way too intrusive. The concept is good but the execution is terrible."
Analysis:
{ "sentiment": "negative", "features_mentioned": [], "bugs_reported": ["crashes when saving data", "intrusive ads"], "feature_requests": [], "user_experience_rating": 2 }

Now analyze this review: "I've been using this app for about 2 months now. The meditation sessions are fantastic and I've noticed my sleep has improved. However, sometimes the audio cuts out in the middle of a session which is frustrating. It would also be helpful to have offline downloads for when I'm traveling."
Response:
{ "sentiment": "positive", "features_mentioned": ["meditation sessions"], "bugs_reported": ["audio cuts out mid-session"], "feature_requests": ["offline downloads"], "user_experience_rating": 4 }
Few-Shot Prompting Best Practices
- Use examples that cover the range of responses you want
- Order matters: put the most representative examples first
- For classification tasks, include examples of each class
- Make sure your examples are formatted consistently
- 3-5 examples are typically sufficient; more isn't always better
- If possible, include examples that demonstrate edge cases
Few-shot prompting is one of the most effective techniques in your prompt engineering toolkit. The examples act as implicit instructions, helping the LLM understand the precise format, style, and type of response you're looking for.
3.3 System Prompting
System prompts define overarching behavior or rules for the LLM. They might require the model to return only JSON, limit the style of the response, or adhere to certain constraints. Example:
"You are a helpful assistant. Classify the following review only as 'POSITIVE', 'NEGATIVE', or 'NEUTRAL', and provide no explanation."
This instructs the LLM on how to structure output and remain consistent across multiple queries.
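Most chat APIs accept the system prompt as a separate message with the `system` role, which keeps the standing rules apart from each user query. A minimal sketch, assuming the same OpenAI-style client as earlier examples:

```python
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a helpful assistant. Classify the following review only as "
    "'POSITIVE', 'NEGATIVE', or 'NEUTRAL', and provide no explanation."
)

def classify(review: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},  # rules applied to every query
            {"role": "user", "content": review},           # the per-query input
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(classify("The battery dies in two hours, but the screen is gorgeous."))
```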
3.4 Role Prompting
Role prompting assigns a specific persona or perspective to the model. For example:
"Act as a travel agent focusing on family-friendly experiences in London. Suggest three day-trip ideas that are ideal for kids under 7."
Switching the role to "technical interviewer" or "motivational speaker" can drastically alter the style of the LLM's output.
3.5 Contextual Prompting
Here, you supply additional context (e.g., previous conversation text, user preferences, or background info) to guide the LLM. This helps the model focus on relevant details and produce more accurate or tailored results. For instance, letting the model know a user's location or the style they prefer.
3.6 Step-Back Prompting
This approach has the model consider a general or broader question before addressing the main question. By "stepping back," the LLM can gather contextual or conceptual knowledge, then apply it more effectively when it finally addresses the focal question.
3.7 Chain of Thought (CoT)
Chain of Thought prompting involves asking the model to reveal its intermediate reasoning steps rather than simply providing the final answer. This can dramatically improve performance on multi-step reasoning tasks (e.g., math problems).
Example:
"When I was 3, my partner was 3 times my age. Now I am 20. How old is my partner? Let's think step by step."
With CoT, the LLM explains each stage of its thought process and arrives at a more accurate final answer.
3.8 Self-Consistency
Self-consistency improves upon CoT by sampling multiple reasoning paths (using a higher temperature), then selecting the most common final answer. This is effectively "majority voting" over multiple independent reasoning chains, which can reduce errors caused by single-pass mistakes.
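A minimal sketch of the voting loop: sample several reasoning chains at a non-zero temperature, pull out each final answer, and keep the majority. The `extract_final_answer` helper is a hypothetical convention (it assumes the model is asked to end with a line like "Answer: ..."), not a standard API.

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()

def extract_final_answer(text: str) -> str:
    """Hypothetical helper: assumes the response ends with a line like 'Answer: 42'."""
    for line in reversed(text.splitlines()):
        if line.lower().startswith("answer:"):
            return line.split(":", 1)[1].strip()
    lines = text.strip().splitlines()
    return lines[-1] if lines else ""

def self_consistent_answer(question: str, n_samples: int = 5) -> str:
    prompt = f"{question}\nLet's think step by step, then end with 'Answer: <result>'."
    answers = []
    for _ in range(n_samples):
        response = client.chat.completions.create(
            model="gpt-4o-mini",   # illustrative model name
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,       # diversity across reasoning chains
        )
        answers.append(extract_final_answer(response.choices[0].message.content))
    # Majority vote over the sampled final answers.
    return Counter(answers).most_common(1)[0][0]
```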
3.9 Tree of Thoughts (ToT)
Tree of Thoughts generalizes CoT by branching the reasoning into multiple paths, exploring them, and choosing the best approach. This can help with especially complex tasks that require branching exploration rather than a single linear chain of thought.
3.10 ReAct (Reason & Act)
ReAct is a paradigm where the LLM is given the ability to take actions (e.g., searching the web or calling APIs) between reasoning steps. This merges chain-of-thought reasoning with external tool usage. After each reasoning segment, the LLM can "act" by querying a service or performing an operation, then continuing its reasoning with the new information.
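A stripped-down sketch of that loop: the model alternates Thought/Action/Observation until it emits a final answer. The `search` tool and the text protocol here are illustrative assumptions rather than any specific library's API; agent frameworks implement this pattern more robustly.

```python
import re
from openai import OpenAI

client = OpenAI()

def search(query: str) -> str:
    """Hypothetical tool: replace with a real search or database call."""
    return f"(stub result for '{query}')"

REACT_SYSTEM = (
    "Answer the question. You may use the tool search[query].\n"
    "Use this format:\nThought: ...\nAction: search[...] or finish[answer]"
)

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": REACT_SYSTEM},
                {"role": "user", "content": transcript},
            ],
            temperature=0,
        )
        step = response.choices[0].message.content
        transcript += "\n" + step
        done = re.search(r"finish\[(.+?)\]", step)
        if done:
            return done.group(1)                   # the model decided it has the answer
        action = re.search(r"search\[(.+?)\]", step)
        if action:
            observation = search(action.group(1))  # act, then feed the result back in
            transcript += f"\nObservation: {observation}"
    return "No answer within step limit."
```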
3.11 Automatic Prompt Engineering
Automatic Prompt Engineering (APE) involves letting the LLM generate multiple prompt candidates, evaluating them (often via automated metrics like BLEU or ROUGE), and then choosing or refining the best prompt. This can speed up the time-consuming trial-and-error stage of prompt design.
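A minimal sketch of the idea, swapping the automated text metrics for plain accuracy on a tiny labeled dev set (often the easier starting point): ask the model for several candidate instructions, score each, and keep the best. All names and data below are illustrative.

```python
from openai import OpenAI

client = OpenAI()

DEV_SET = [  # tiny labeled dev set (illustrative)
    ("I loved every minute of it.", "POSITIVE"),
    ("Total waste of money.", "NEGATIVE"),
]

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

def generate_candidate_prompts(task: str, n: int = 5) -> list[str]:
    text = ask(
        f"Write {n} different one-line instructions for this task, "
        f"one per line with no numbering: {task}"
    )
    return [line.strip() for line in text.splitlines() if line.strip()][:n]

def score(prompt_template: str) -> float:
    correct = 0
    for review, label in DEV_SET:
        if ask(f"{prompt_template}\n\nReview: {review}") == label:
            correct += 1
    return correct / len(DEV_SET)

candidates = generate_candidate_prompts(
    "Classify a movie review as POSITIVE, NEGATIVE, or NEUTRAL; reply with the label only."
)
best = max(candidates, key=score)
print("Best prompt:", best)
```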
4. Code Prompting
LLMs can generate, translate, and debug code across various programming languages. Mastering code-specific prompts can significantly enhance developer productivity and help solve complex programming challenges.
Common Code-Related Tasks
Code Generation
- Creating functions with specific requirements
- Building API endpoints
- Scaffolding project structures
- Implementing algorithms
- Generating test cases
Code Transformation
- Converting between programming languages
- Refactoring legacy code
- Modernizing syntax
- Optimizing for performance
- Adding type annotations
Debugging & Analysis
- Finding and fixing bugs
- Explaining complex code
- Security vulnerability detection
- Code review and suggestions
- Performance bottleneck identification
Documentation
- Writing function and class documentation
- Generating README files
- Creating tutorials and examples
- API documentation
- Explaining complex algorithms
Effective Code Prompting Examples
Example 1: Function Creation with Chain of Thought
Prompt:
Write a JavaScript function that calculates the Fibonacci sequence up to the nth term using memoization for efficiency. Think through the solution step by step, explaining each component:
1. First explain the Fibonacci sequence and why memoization helps
2. Then implement the function
3. Finally, provide an example of how to call the function

Make sure to include proper error handling and comments.
Example 2: Debugging with Context
Prompt:
Debug the following Python code that's supposed to merge two sorted lists into a single sorted list, but it's not working correctly. The current output contains duplicate values and misses some elements.

```python
def merge_sorted_lists(list1, list2):
    result = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            i += 1
    # Add remaining elements
    result.extend(list1[i:])
    result.extend(list2[j:])
    return result

# Test case
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]
print(merge_sorted_lists(list1, list2))
```

First identify the bug, then explain why it's happening, and finally provide the corrected code.
Example 3: Language Conversion with Requirements
Prompt:
Convert this Java code to TypeScript while improving its readability and implementing modern TypeScript features like:
- Proper type annotations
- Optional chaining
- Nullish coalescing
- Destructuring where appropriate
- Using ES6+ syntax

```java
public class UserManager {
    private Map<String, User> users = new HashMap<>();

    public User getUser(String id) {
        return users.getOrDefault(id, null);
    }

    public boolean addUser(User user) {
        if (user == null || user.getId() == null) {
            return false;
        }
        if (users.containsKey(user.getId())) {
            return false;
        }
        users.put(user.getId(), user);
        return true;
    }

    public List<String> getUserNames() {
        List<String> names = new ArrayList<>();
        for (User user : users.values()) {
            if (user.getName() != null) {
                names.add(user.getName());
            }
        }
        return names;
    }
}

public class User {
    private String id;
    private String name;
    private int age;
    private String email;

    public User(String id, String name, int age, String email) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.email = email;
    }

    public String getId() { return id; }
    public String getName() { return name; }
    public int getAge() { return age; }
    public String getEmail() { return email; }
}
```

Please add comments explaining the TypeScript-specific improvements you've made.
Code Prompting Best Practices
- Be Specific About Requirements: Clearly outline functionality, edge cases, and performance considerations.
- Provide Context and Constraints: Mention target environment, framework versions, or any specific libraries to use or avoid.
- Request Explanations: Ask the LLM to explain its approach or reasoning behind certain implementation choices.
- Use Input/Output Examples: Include sample inputs and expected outputs to clarify requirements.
- Iterative Refinement: Start with a basic implementation and then iteratively ask for improvements or optimizations.
- Always Review Generated Code: Never blindly use AI-generated code without understanding it and testing thoroughly.
Remember that AI-generated code should be a starting point or assistant, not a replacement for software engineering expertise. Always validate the code's correctness, security, and performance before incorporating it into production systems.
5. Best Practices
- Provide Examples: Use one-shot or few-shot prompts whenever possible to anchor the model's output.
- Design with Simplicity: Keep prompts clear and concise. Overly verbose prompts can confuse LLMs.
- Be Specific About the Output: Indicate the desired format (JSON, bullet points, paragraphs, etc.) and style (formal, humorous, etc.).
- Use Instructions Over Constraints: Positive instructions are often more effective than long lists of "do not" constraints.
- Control Max Tokens: Either configure maximum tokens or explicitly ask for "tweet-length" or "3-sentence" responses.
- Use Variables in Prompts: Parameterize aspects (e.g., city name) to reuse the same prompt shape across different inputs; see the small template sketch after this list.
- Experiment with Input & Output Formats: Try different ways of structuring your prompt and the answer you want.
- Document Attempts: Keep a record of prompt versions, model configs, successes, and failures.
- Leverage Chain of Thought & Reasoning: For more complex tasks, step-by-step reasoning or self-consistency can greatly improve results.
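A small sketch of the variables-in-prompts practice mentioned above: keep the prompt as a template and fill it per request. The template text reuses the travel-agent role prompt from Section 3.4 and is purely illustrative.

```python
# Parameterized prompt: the same shape reused across different inputs (illustrative).
TRAVEL_PROMPT = (
    "Act as a travel agent focusing on family-friendly experiences in {city}. "
    "Suggest {n} day-trip ideas that are ideal for kids under {max_age}."
)

def build_prompt(city: str, n: int = 3, max_age: int = 7) -> str:
    """Fill the template with per-request values."""
    return TRAVEL_PROMPT.format(city=city, n=n, max_age=max_age)

print(build_prompt("London"))
print(build_prompt("Tokyo", n=5, max_age=10))
```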
6. JSON & Schemas
Returning JSON (or other structured formats) can improve reliability and limit hallucinations. Consider including a schema in your prompt to structure the output. For instance:
"Return valid JSON following this schema: { "movie_reviews": [ { "sentiment": "POSITIVE|NEGATIVE|NEUTRAL", "name": "string" } ] }"
However, generating long JSON can exceed token limits and risk truncation. Tools like json-repair can help fix malformed outputs; a parsing sketch follows below.
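A sketch of the full round trip: put the schema in the prompt, ask for JSON only, parse defensively, and fall back to json-repair if parsing fails. The SDK usage mirrors earlier examples, and the json_repair import path is an assumption to verify against that library's documentation.

```python
import json
from openai import OpenAI

client = OpenAI()

SCHEMA = '{ "movie_reviews": [ { "sentiment": "POSITIVE|NEGATIVE|NEUTRAL", "name": "string" } ] }'

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{
        "role": "user",
        "content": (
            "Return valid JSON following this schema, with no extra text:\n"
            f"{SCHEMA}\n\n"
            "Reviews: 'Her was a haunting, beautiful film.' / 'The sequel was a mess.'"
        ),
    }],
    temperature=0,
)

raw = response.choices[0].message.content
try:
    data = json.loads(raw)
except json.JSONDecodeError:
    # Fall back to the json-repair package for truncated or malformed output
    # (import path assumed; check the library's docs).
    from json_repair import repair_json
    data = json.loads(repair_json(raw))

print(data["movie_reviews"])
```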
7. Summary
Prompt Engineering is an essential skill for anyone looking to effectively utilize Large Language Models. From zero-shot tasks to advanced methods like Chain of Thought, ReAct, and Tree of Thoughts, the techniques here can help you systematically refine your interactions with LLMs.
Key Takeaways
- Match Technique to Task: Different prompting methods excel at different types of tasks; learn which to apply when
- Configuration Matters: Parameters like temperature and max tokens significantly affect output quality
- Examples Are Powerful: Few-shot prompting can dramatically improve performance on specialized tasks
- Structure Improves Reliability: Using JSON schemas or other structured formats increases consistency
- Iteration Is Essential: Great prompts usually evolve through systematic testing and refinement
Practical Applications
Content Creation
- Blog post drafting
- Email campaign writing
- Product description generation
- Social media content creation
- Translation and localization
Development
- Code completion and generation
- Automated testing and QA
- API documentation generation
- Bug identification and fixing
- Data transformation pipelines
Business Intelligence
- Customer feedback analysis
- Competitive research
- Data visualization scripting
- Report summarization
- Trend identification
Continued Learning
The field of prompt engineering is rapidly evolving. To stay current:
- Experiment with different models to understand their unique capabilities
- Build a personal library of effective prompts for common tasks
- Document your approaches and learnings from each project
- Participate in prompt engineering communities and forums
- Stay updated on new research papers and techniques
Final Reminder
Remember that prompt engineering is both a science and an art. While there are principles and patterns to follow, there's also tremendous room for creativity and experimentation. The most effective prompt engineers combine technical knowledge with curiosity and persistence.
By applying these techniques and best practices, you can guide LLMs more reliably and build powerful AI-driven applications across various domains.