Building AI Agents with LangChain Tutorial: Complete Guide from Setup to Production
Build production-ready AI agents with LangChain. Complete tutorial covering tools, memory, RAG, multi-agent systems, deployment, and monitoring.

LangChain has become the de facto framework for building AI agents—and for good reason. It provides abstractions that handle the messy parts (prompt management, memory, tool calling, chains) while giving you flexibility to customize everything.
But most LangChain agent tutorials stop at toy examples: "Here's how to build a chatbot that tells jokes!" What you actually need is a production-grade guide covering real architecture patterns, error handling, monitoring, and deployment.
This tutorial takes you from zero to a working AI agent that can actually be deployed in production. You'll learn LangChain fundamentals, build a real customer support agent, and understand the patterns that scale.
Why LangChain for AI Agents?
LangChain solves problems you'd otherwise build yourself:
1. LLM provider abstraction: Swap between OpenAI, Anthropic, Cohere, or local models without rewriting code.
2. Memory management: Built-in conversation history, vector store integration, and context handling.
3. Tool/function calling: Standard interface for giving agents access to APIs, databases, and external systems.
4. Chains and agents: Compose multi-step workflows, decision trees, and autonomous behaviors.
5. Ecosystem: Integrations with vector databases (Pinecone, Weaviate), observability tools (LangSmith), and deployment platforms.
Alternative frameworks exist (AutoGPT, Microsoft Semantic Kernel, Haystack), but LangChain has the largest community, the most integrations, and the best documentation. See our guide comparing AI agent frameworks (2026) for detailed comparisons.
Prerequisites
You need:
- Python 3.9+
- OpenAI API key (or Anthropic/other LLM provider)
- Basic understanding of async programming (optional but helpful)
Install dependencies:
pip install langchain langchain-openai langchain-community python-dotenv
Set up environment variables:
# .env file
OPENAI_API_KEY=sk-...
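The python-dotenv package installed above loads this file into the process environment — call load_dotenv() once at startup, before constructing any models. As a rough illustration of what that call does, here is a minimal stdlib sketch (the real library also handles quoting, multiline values, and other edge cases):

```python
import os
from pathlib import Path

def load_env_file(path: str = ".env") -> None:
    """Minimal sketch of what python-dotenv's load_dotenv() does."""
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Variables already set in the real environment take precedence
        os.environ.setdefault(key.strip(), value.strip())

load_env_file()
```

In your own code, prefer the real thing: `from dotenv import load_dotenv; load_dotenv()`.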
LangChain Fundamentals: Core Concepts
Before building agents, understand the building blocks.
1. LLMs and Chat Models
LLMs: Text-in, text-out models (legacy, less common now).
Chat Models: Message-in, message-out models (GPT-4, Claude, Gemini).
from langchain_openai import ChatOpenAI
# Initialize chat model
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.7,
    max_tokens=1000
)
# Simple invocation
response = llm.invoke("What is LangChain?")
print(response.content)
2. Prompts and Templates
Manage prompts as reusable templates:
from langchain.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant specializing in {domain}."),
    ("human", "{user_input}")
])
# Format and invoke
formatted_prompt = prompt.format_messages(
    domain="technical support",
    user_input="How do I reset my password?"
)
response = llm.invoke(formatted_prompt)
3. Chains
Compose multiple steps into a pipeline:
# LCEL: compose the prompt and model with the | operator
chain = prompt | llm
# Invoke chain
response = chain.invoke({
    "domain": "customer support",
    "user_input": "What's your refund policy?"
})
4. Memory
Maintain conversation context across turns:
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(return_messages=True)
# Store conversation
memory.save_context(
    {"input": "Hi, I need help with my order"},
    {"output": "I'd be happy to help! Can you provide your order number?"}
)
# Retrieve context
history = memory.load_memory_variables({})
print(history["history"])
5. Tools
Give agents access to external functions:
from langchain_core.tools import tool

@tool
def search_knowledge_base(query: str) -> str:
    """Search the company knowledge base for information."""
    # Placeholder: implement actual search logic against your own
    # knowledge_base client
    results = knowledge_base.search(query)
    return results
# Agent can now call this tool when needed
Building Your First Agent: Customer Support Assistant
Let's build a real agent that can:
- Answer questions from a knowledge base
- Look up order status
- Create support tickets
Step 1: Define Tools
from langchain_core.tools import tool

# Mock database (replace with real DB in production)
orders_db = {
    "ORD-12345": {"status": "shipped", "tracking": "1Z999AA10123456784"},
    "ORD-12346": {"status": "processing", "tracking": None}
}
tickets_db = []

@tool
def search_docs(query: str) -> str:
    """Search company documentation and knowledge base."""
    # Mock implementation - replace with real vector search
    docs = {
        "refund": "Refunds are processed within 5-7 business days. Contact support@company.com.",
        "shipping": "Standard shipping takes 3-5 days. Express shipping available.",
        "returns": "Returns accepted within 30 days with original receipt."
    }
    for key, value in docs.items():
        if key in query.lower():
            return value
    return "No relevant documentation found. Please contact support."

@tool
def lookup_order(order_id: str) -> str:
    """Look up order status by order ID."""
    order = orders_db.get(order_id.upper())
    if not order:
        return f"Order {order_id} not found."
    status = f"Order {order_id}: Status = {order['status']}"
    if order['tracking']:
        status += f", Tracking = {order['tracking']}"
    return status

@tool
def create_ticket(issue_description: str) -> str:
    """Create a support ticket for complex issues requiring human review."""
    ticket_id = f"TKT-{len(tickets_db) + 1:05d}"
    tickets_db.append({
        "id": ticket_id,
        "description": issue_description,
        "status": "open"
    })
    return f"Support ticket {ticket_id} created. Our team will respond within 24 hours."
# Tool list
tools = [search_docs, lookup_order, create_ticket]
Step 2: Create the Agent
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)
# Agent prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful customer support agent for TechCorp.
Your goal is to resolve customer issues efficiently using available tools:
- Search documentation for policy/product questions
- Look up order status when given an order ID
- Create support tickets for complex issues you cannot resolve
Be polite, concise, and helpful. If you don't know something, say so and create a ticket."""),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])
# Create agent
agent = create_openai_tools_agent(llm, tools, prompt)
# Create executor (runs the agent loop)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,               # Show reasoning steps
    max_iterations=5,           # Prevent infinite loops
    handle_parsing_errors=True
)
Step 3: Add Conversation Memory
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Run agent with memory
def chat(user_input):
    response = agent_executor.invoke({
        "input": user_input,
        "chat_history": memory.load_memory_variables({})["chat_history"]
    })
    # Save to memory
    memory.save_context(
        {"input": user_input},
        {"output": response["output"]}
    )
    return response["output"]
# Test conversations
print(chat("Hi! What's your refund policy?"))
print(chat("Can you check order ORD-12345?"))
print(chat("I received a damaged item, can you help?"))
Step 4: Error Handling and Validation
Production agents need robust error handling:
import time

def safe_chat(user_input, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = agent_executor.invoke({
                "input": user_input,
                "chat_history": memory.load_memory_variables({})["chat_history"]
            })
            # Validate response
            if not response or "output" not in response:
                raise ValueError("Invalid agent response")
            # Save to memory
            memory.save_context(
                {"input": user_input},
                {"output": response["output"]}
            )
            return response["output"]
        except Exception as e:
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                # Final fallback
                return ("I'm experiencing technical difficulties. A support ticket "
                        "has been created, and our team will contact you shortly.")
            # Retry with exponential backoff
            time.sleep(2 ** attempt)

Advanced Patterns
Multi-Agent Systems
Build agents that delegate to specialized sub-agents:
# Specialist agents (assume factory functions that build each sub-agent)
billing_agent = create_billing_agent()
technical_agent = create_technical_agent()

@tool
def route_to_billing(query: str) -> str:
    """Route billing questions to the billing specialist."""
    return billing_agent.invoke(query)

@tool
def route_to_technical(query: str) -> str:
    """Route technical issues to the technical specialist."""
    return technical_agent.invoke(query)

router_tools = [route_to_billing, route_to_technical]

# The router prompt must be a ChatPromptTemplate with an agent_scratchpad
# placeholder, not a plain string
router_prompt = ChatPromptTemplate.from_messages([
    ("system", """Route customer queries to the appropriate specialist:
- Billing issues -> route_to_billing
- Technical problems -> route_to_technical
- General questions -> answer directly"""),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

router_agent = create_openai_tools_agent(llm, router_tools, router_prompt)
Learn more about multi-agent orchestration patterns for complex systems.
Custom Tool Schemas
Provide detailed schemas to help agents use tools correctly:
from langchain.tools import Tool
lookup_order_tool = Tool(
    name="lookup_order",
    func=lookup_order,
    description="""Look up order status by order ID.
Input should be a valid order ID in format ORD-XXXXX (e.g., ORD-12345).
Returns order status and tracking information if available.
Example: lookup_order("ORD-12345") -> "Order ORD-12345: Status = shipped, Tracking = 1Z999AA10123456784"
""",
)
Streaming Responses
Provide real-time feedback for better UX:
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# Enable streaming
llm_streaming = ChatOpenAI(
    model="gpt-4o",
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

agent_streaming = create_openai_tools_agent(llm_streaming, tools, prompt)
executor_streaming = AgentExecutor(agent=agent_streaming, tools=tools, verbose=True)

# Stream response (AgentExecutor.stream yields dicts of actions and output chunks)
for chunk in executor_streaming.stream({"input": "What's your shipping policy?"}):
    print(chunk, end="", flush=True)
RAG (Retrieval-Augmented Generation)
Integrate vector search for knowledge-intensive tasks:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
# Load and chunk documents
loader = TextLoader("company_docs.txt")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)
# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
# Create retrieval tool
@tool
def search_knowledge_base(query: str) -> str:
    """Search company knowledge base using semantic similarity."""
    results = vectorstore.similarity_search(query, k=3)
    context = "\n\n".join([doc.page_content for doc in results])
    return context
# Add to agent tools
tools.append(search_knowledge_base)
Monitoring and Observability
Production agents need logging and tracing:
LangSmith Integration
import os
# Enable LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls-..."
os.environ["LANGCHAIN_PROJECT"] = "customer-support-agent"
# All agent calls now automatically traced in LangSmith dashboard
response = agent_executor.invoke({"input": user_query})
Custom Logging
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def logged_chat(user_input):
    start_time = datetime.now(timezone.utc)
    try:
        response = agent_executor.invoke({"input": user_input})
        logger.info({
            "timestamp": start_time.isoformat(),
            "user_input": user_input,
            "agent_output": response["output"],
            # intermediate_steps holds (AgentAction, observation) tuples when
            # return_intermediate_steps=True is set on the executor
            "tools_used": [step[0].tool for step in response.get("intermediate_steps", [])],
            "latency_ms": (datetime.now(timezone.utc) - start_time).total_seconds() * 1000,
            "status": "success"
        })
        return response["output"]
    except Exception as e:
        logger.error({
            "timestamp": start_time.isoformat(),
            "user_input": user_input,
            "error": str(e),
            "status": "error"
        })
        raise
Deployment Patterns
REST API with FastAPI
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
app = FastAPI()
class ChatRequest(BaseModel):
    message: str
    session_id: str

@app.post("/chat")
async def chat_endpoint(request: ChatRequest):
    try:
        # Get or create session memory
        memory = get_or_create_memory(request.session_id)
        # Run agent
        response = agent_executor.invoke({
            "input": request.message,
            "chat_history": memory.load_memory_variables({})["chat_history"]
        })
        # Save to memory
        memory.save_context(
            {"input": request.message},
            {"output": response["output"]}
        )
        return {"response": response["output"]}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
# Run: uvicorn app:app --host 0.0.0.0 --port 8000
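The endpoint above calls get_or_create_memory, which is never defined; the name and behavior are assumptions. Here is one minimal in-process sketch keyed by session_id — in production you would hold a ConversationBufferMemory per session and back the store with Redis or a database, since this dict vanishes on restart:

```python
class SessionMemory:
    """Tiny stand-in exposing the two methods the endpoint uses.
    ConversationBufferMemory provides the real implementation."""

    def __init__(self):
        self.messages = []

    def load_memory_variables(self, _inputs):
        # Return a copy so callers can't mutate stored history
        return {"chat_history": list(self.messages)}

    def save_context(self, inputs, outputs):
        self.messages.append(("human", inputs["input"]))
        self.messages.append(("ai", outputs["output"]))

# One memory object per session; lives only as long as the process
_sessions = {}

def get_or_create_memory(session_id: str) -> SessionMemory:
    if session_id not in _sessions:
        _sessions[session_id] = SessionMemory()
    return _sessions[session_id]
```

The key property is that two requests with the same session_id share history while different sessions stay isolated.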
Async Processing for Scale
import asyncio

async def async_chat(user_input):
    response = await agent_executor.ainvoke({"input": user_input})
    return response["output"]

# Process multiple requests concurrently
async def handle_batch(requests):
    tasks = [async_chat(req) for req in requests]
    responses = await asyncio.gather(*tasks)
    return responses
Docker Deployment
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
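The Dockerfile copies a requirements.txt; one matching the packages used in this tutorial would look like the fragment below (unpinned here for illustration — pin the exact versions you test against before deploying):

```
langchain
langchain-openai
langchain-community
python-dotenv
fastapi
uvicorn
chromadb
```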
Testing AI Agents
Unit test tools and chains:
import pytest
def test_search_docs():
    # @tool-decorated functions are tools, so call them via .invoke()
    result = search_docs.invoke("refund policy")
    assert "5-7 business days" in result

def test_lookup_order():
    result = lookup_order.invoke("ORD-12345")
    assert "shipped" in result.lower()

def test_agent_response():
    response = agent_executor.invoke({"input": "What's your shipping policy?"})
    assert response["output"]
    assert len(response["output"]) > 0
Integration tests with LLM calls:
@pytest.mark.integration
def test_agent_end_to_end():
    # Test full conversation flow
    response1 = chat("Hi, I need help")
    assert "help" in response1.lower()

    response2 = chat("Check order ORD-12345")
    assert "shipped" in response2.lower()
Performance Optimization
See our detailed guide on AI agent cost optimization strategies for cost-effective agent design.
Caching
from langchain.cache import InMemoryCache
from langchain.globals import set_llm_cache
# Enable caching
set_llm_cache(InMemoryCache())
# Identical prompts return cached results (near-instant, $0 cost)
Prompt Optimization
# Minimize tokens
short_prompt = ChatPromptTemplate.from_messages([
    ("system", "Customer support agent. Be concise."),
    ("human", "{input}")
])
# vs verbose prompt (3x more tokens, 3x more cost)
Common Pitfalls and Solutions
1. Agent loops infinitely
- Set max_iterations on AgentExecutor
- Improve tool descriptions so the agent knows when to stop
2. Agent ignores tools
- Use function calling models (GPT-4, Claude 3+)
- Provide clear, detailed tool descriptions with examples
3. Context window exceeded
- Use conversation summarization
- Implement context pruning (keep only recent messages)
4. High latency
- Enable async processing
- Use streaming for perceived responsiveness
- Cache common queries
5. Inconsistent behavior
- Lower temperature (0-0.3 for deterministic behavior)
- Add output validation and retry logic
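For pitfall 3, context pruning can be as simple as keeping the system message plus the most recent turns. A minimal sketch — the function name and message representation are illustrative, and a production version would budget by token count rather than message count (or summarize older turns instead of dropping them):

```python
def prune_history(messages, max_turns=10):
    """Keep any leading system message plus the most recent messages.

    `messages` is a list of (role, content) tuples. Older non-system
    messages are dropped once the history exceeds max_turns.
    """
    system = [m for m in messages if m[0] == "system"][:1]
    rest = [m for m in messages if m[0] != "system"]
    return system + rest[-max_turns:]
```

Call this on the chat history before each agent invocation so long conversations never exceed the model's context window.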
Conclusion
Building AI agents with LangChain gives you production-ready abstractions, ecosystem integrations, and flexibility to customize everything. You've learned to:
- Set up LangChain and understand core concepts
- Build a functional customer support agent with tools and memory
- Add error handling, logging, and monitoring
- Deploy agents as APIs with FastAPI
- Optimize performance and cost
From here, explore advanced patterns like multi-agent systems, RAG pipelines, and custom tool development. The LangChain documentation and community are excellent resources as you build more complex agents.
Build AI That Works For Your Business
At AI Agents Plus, we help companies move from AI experiments to production systems that deliver real ROI. Whether you need:
- Custom AI Agents — Autonomous systems that handle complex workflows, from customer service to operations
- Rapid AI Prototyping — Go from idea to working demo in days using vibe coding and modern AI frameworks
- Voice AI Solutions — Natural conversational interfaces for your products and services
We've built AI systems for startups and enterprises across Africa and beyond.
Ready to explore what AI can do for your business? Let's talk →
About AI Agents Plus Editorial
AI automation expert and thought leader in business transformation through artificial intelligence.



