Claude Code is an incredible AI development tool, but what if you could use its powerful capabilities as a headless LLM API in your own applications? I built a production-ready API wrapper that runs Claude Code in headless mode, exposing it as a RESTful API with authentication, rate limiting, and enterprise features. Here's how you can convert Claude Code into your own LLM API.
Why Build Your Own Claude Code API?
Running Claude Code as an API gives you several advantages over direct usage:
- Headless operation: No interactive terminal needed
- RESTful interface: Integrate with any application
- Authentication & security: Control who can access your API
- Rate limiting: Prevent abuse and manage costs
- Production-ready: Error handling, logging, and monitoring
- Database integration: Store prompts, responses, and usage metrics
Architecture Overview
The system consists of three main components:
- Go API Server: Handles HTTP requests, authentication, and routing
- Claude Service: Manages Claude Code execution in headless mode
- Database Layer: Stores usage data, prompts, and responses
The Claude Service Implementation
The core of this system is the Claude service that executes Claude Code commands. Here's the Go implementation:
package services

import (
    "context"
    "fmt"
    "os"
    "os/exec"
    "time"
)

// MaxClaudeTimeout bounds how long a single prompt may run.
const MaxClaudeTimeout = 5 * time.Minute

type ClaudeService struct {
    claudePath string
}

func NewClaudeService() *ClaudeService {
    // Allow overriding the binary location via CLAUDE_PATH.
    claudePath := "claude"
    if cp := os.Getenv("CLAUDE_PATH"); cp != "" {
        claudePath = cp
    }
    return &ClaudeService{claudePath: claudePath}
}

func (s *ClaudeService) SendPrompt(ctx context.Context, prompt string) (string, error) {
    ctx, cancel := context.WithTimeout(ctx, MaxClaudeTimeout)
    defer cancel()

    args := []string{"--dangerously-skip-permissions", "-p", prompt}
    cmd := exec.CommandContext(ctx, s.claudePath, args...)
    cmd.Env = append(os.Environ(),
        "HEADLESS=1",
        "TERM=xterm-256color",
    )

    // CombinedOutput captures both stdout and stderr; cmd.Stderr must
    // be left unset (nil) or CombinedOutput returns an error.
    output, err := cmd.CombinedOutput()
    if err != nil {
        return "", fmt.Errorf("claude command failed: %w", err)
    }

    outputStr := stripAnsiCodes(string(output))
    return trimOutput(outputStr), nil
}
Key Implementation Details
- HEADLESS=1: Runs Claude Code without interactive terminal
- --dangerously-skip-permissions: Bypasses permission prompts for API usage
- Context timeout: Prevents hanging requests (5 minute max)
- Output sanitization: Strips ANSI codes and trims whitespace
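The stripAnsiCodes and trimOutput helpers referenced in the service aren't shown in the post; a minimal version could look like this (the regex and the 100 MB cap, matching the truncation limit mentioned later, are my assumptions):

```go
package main

import (
	"regexp"
	"strings"
)

// ansiRe matches ANSI escape sequences (colors, cursor movement)
// that terminal tools emit even when TERM is set.
var ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[a-zA-Z]`)

// maxOutputBytes caps responses (the post mentions a 100 MB cap).
const maxOutputBytes = 100 * 1024 * 1024

func stripAnsiCodes(s string) string {
	return ansiRe.ReplaceAllString(s, "")
}

func trimOutput(s string) string {
	if len(s) > maxOutputBytes {
		s = s[:maxOutputBytes]
	}
	return strings.TrimSpace(s)
}
```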
API Handler with Authentication
The HTTP handler wraps the Claude service with proper request validation and authentication:
type ClaudeRequest struct {
    Prompt string `json:"prompt" binding:"required"`
}

type ClaudeHandler struct {
    claudeService *services.ClaudeService
}

func (h *ClaudeHandler) SendPrompt(c *gin.Context) {
    var req ClaudeRequest
    if err := c.ShouldBindJSON(&req); err != nil {
        utils.ValidationError(c, "prompt is required")
        return
    }

    // Reject oversized prompts before spawning the Claude process.
    if len(req.Prompt) > 100000 {
        utils.ValidationError(c, "prompt too large (max 100,000 chars)")
        return
    }

    response, err := h.claudeService.SendPrompt(c.Request.Context(), req.Prompt)
    if err != nil {
        utils.InternalErrorResponse(c, "Failed: "+err.Error())
        return
    }

    utils.SuccessResponse(c, http.StatusOK, gin.H{
        "response": response,
    })
}
API Integration and Routing
Add the Claude endpoint to your existing Go API with authentication middleware:
protected := api.Group("")
protected.Use(middleware.AuthMiddleware(cfg))
{
    protected.POST("/claude", claudeHandler.SendPrompt)
}
Usage Examples
Once your API is running, here's how to use it:
cURL Example
curl -X POST http://localhost:3007/api/claude \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"prompt": "Explain quantum computing in simple terms"}'
JavaScript Example
const response = await fetch('http://localhost:3007/api/claude', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_TOKEN'
  },
  body: JSON.stringify({
    prompt: 'Write a function to sort an array'
  })
});

const data = await response.json();
console.log(data.response);
Production Considerations
Security
- JWT Authentication: Only authenticated users can access the API
- Rate limiting: Implement per-user request limits
- Input validation: Restrict prompt size and content
- Environment variables: Store sensitive data securely
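Per-user rate limiting can be sketched as an in-memory token bucket keyed by user ID. This is an illustration only; production deployments often reach for golang.org/x/time/rate or a shared store like Redis instead:

```go
package main

import (
	"sync"
	"time"
)

// bucket holds a user's remaining tokens and last refill time.
type bucket struct {
	tokens float64
	last   time.Time
}

type RateLimiter struct {
	mu       sync.Mutex
	buckets  map[string]*bucket
	capacity float64 // max burst size
	rate     float64 // tokens refilled per second
}

func NewRateLimiter(capacity, rate float64) *RateLimiter {
	return &RateLimiter{buckets: map[string]*bucket{}, capacity: capacity, rate: rate}
}

// Allow reports whether the given user may make a request now.
func (rl *RateLimiter) Allow(user string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()
	now := time.Now()
	b, ok := rl.buckets[user]
	if !ok {
		b = &bucket{tokens: rl.capacity, last: now}
		rl.buckets[user] = b
	}
	// Refill proportionally to elapsed time, capped at capacity.
	b.tokens += now.Sub(b.last).Seconds() * rl.rate
	if b.tokens > rl.capacity {
		b.tokens = rl.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

A middleware would call Allow with the user ID extracted from the JWT and return 429 when it reports false.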
Performance
- Request timeout: 5-minute limit prevents resource exhaustion
- Output truncation: Cap responses at 100MB
- Logging: Track usage and monitor performance
- Database storage: Log prompts and responses for analytics
Scalability
- Connection pooling: Reuse database connections
- Graceful shutdown: Handle interrupts properly
- Health checks: Monitor API status
- Load balancing: Deploy multiple instances behind a load balancer
Environment Setup
Set these environment variables for production:
# Claude Code configuration
CLAUDE_PATH=/usr/local/bin/claude
HEADLESS=1
# API configuration
PORT=3007
JWT_SECRET=your-secret-key
MONGODB_URI=mongodb://localhost:27017
Extending the API
This basic implementation can be extended with:
- Streaming responses: Real-time output delivery
- Prompt templates: Pre-defined prompt patterns
- Usage analytics: Track token consumption and costs
- Multi-model support: Switch between different Claude models
- Batch processing: Handle multiple prompts concurrently
- Webhook integration: Send responses to external systems
Benefits Over Direct Claude Code Usage
Converting Claude Code to an API provides several advantages:
- Integration: Use Claude Code in any application regardless of language
- Control: Implement custom authentication, rate limiting, and access policies
- Monitoring: Track usage, costs, and performance metrics
- Scalability: Deploy as a microservice with load balancing
- Consistency: Standardized interface for all applications
Real-World Use Cases
This API wrapper enables powerful applications:
- Automation platforms: Schedule and execute Claude Code tasks
- Chatbots: Build AI assistants using Claude Code as the backend
- Content generation: Automated article writing and code generation
- Code review systems: Integrate AI code review into CI/CD pipelines
- Data analysis: Process and analyze data using natural language
Turning Claude Code into a headless LLM API transforms it from a development tool into a production-ready service that can power any application. The Go implementation provides excellent performance, strong typing, and easy deployment, making it perfect for enterprise integrations.
I've successfully deployed this API in my automation platform, handling everything from simple code generation to complex data analysis tasks. The combination of Claude Code's AI capabilities and Go's performance creates a powerful API that can scale to meet production demands.
If you're interested in extending this further, check out my guide on building custom MCP servers or learn more about Claude Code best practices.