How to Automate Your Work with AI Agents
The automation Gold Rush is here. While everyone debates AGI timelines, the real opportunity is right now: building agents that do real work while you sleep.
This is practical. This works. This is how you win.
1. Core principle
Agents are disposable workers.
They think briefly, do a job, write results, then die.
If your system cannot be restarted at any moment with zero loss of correctness, it is broken.
2. Mental model
Think in layers, not personalities.
- Tools handle reality
- Skills shape thinking
- Agents execute tasks
- Hooks clean up messes
- Commands define workflows
Bundle by lifespan, not by feature.
3. Components
3.1 Tools via MCP
Tools connect agents to the outside world.
Rules:
- One MCP per domain
- Minimal surface area
- Read and write separated
- Mounted only when needed
Think of tools as infrastructure, not teammates.
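A minimal sketch of what "one MCP per domain, read-only" can look like, using the official MCP Python SDK's FastMCP helper. The tool body here is a placeholder; in practice it calls the Google Ads API with a read-only credential, and the write tool lives in a separate server.

```python
# Read-only Google Ads MCP server (sketch; returned metrics are placeholders).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("google-ads-read")  # one server, one domain, reads only

@mcp.tool()
def get_campaign_stats(campaign_id: str, days: int = 7) -> dict:
    """Return recent performance metrics for one campaign."""
    return {"campaign_id": campaign_id, "days": days, "clicks": 0, "cost": 0.0}

if __name__ == "__main__":
    mcp.run()  # stdio transport; mount only for agents that need reads
```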
3.2 Skills
Skills are small instruction blocks that teach how to think.
Examples:
- How to analyze ad performance
- How to propose budget changes
- How to execute changes safely
Rules:
- Short
- Reusable
- Loaded explicitly per step
Skills build habits. They do not store memories.
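One way to make that concrete, assuming skills live as small markdown files on disk: load them explicitly per step and inject them into the prompt. The file names are illustrative.

```python
from pathlib import Path

SKILLS_DIR = Path("skills")  # e.g. skills/analyze_ads.md

def load_skill(name: str) -> str:
    """Read one short instruction block from disk."""
    return (SKILLS_DIR / f"{name}.md").read_text()

def build_system_prompt(role: str, skill_names: list[str]) -> str:
    # Habits, not memories: skills are re-read on every run, never persisted.
    skills = "\n\n".join(load_skill(n) for n in skill_names)
    return f"You are the {role}.\n\n{skills}"
```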
3.3 Agents
Agents are role + skills + temporary context.
Rules:
- Short-lived
- Single responsibility
- No long-term memory
- No tool access beyond their job
Common agent types:
- Thinker analyzes and proposes
- Executor applies changes
- Reviewer validates outputs
Agents are interns with amnesia.
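Here is one illustrative shape for "role + skills + temporary context" as plain data. It is a sketch, not any specific framework's API.

```python
from dataclasses import dataclass, field

@dataclass
class AgentSpec:
    role: str                     # "thinker", "executor", or "reviewer"
    skills: list[str]             # instruction blocks loaded per run
    tools: list[str]              # nothing beyond what the job requires
    context: dict = field(default_factory=dict)  # temporary, per-run only

def retire(agent: AgentSpec) -> None:
    agent.context.clear()  # interns with amnesia: nothing survives the run
```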
3.4 Hooks
Hooks run before or after steps. They do boring but critical work.
Examples:
- Trim input data
- Enforce safety limits
- Summarize outputs
- Clear context
- Write logs
Hooks prevent systems from drifting into chaos.
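A hedged sketch of hooks as plain functions wrapped around a step; the dict payload shape is an assumption for illustration.

```python
from typing import Callable

Hook = Callable[[dict], dict]

def run_step(step: Callable[[dict], dict],
             pre: list[Hook], post: list[Hook], payload: dict) -> dict:
    for hook in pre:       # trim input data, enforce safety limits
        payload = hook(payload)
    result = step(payload)
    for hook in post:      # summarize outputs, write logs, clear context
        result = hook(result)
    return result
```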
3.5 Commands
Commands are the only interface humans touch.
A command bundles:
- Which agents to spawn
- Which skills to load
- Which tools to mount
- Which hooks to run
Commands define workflows, not conversations.
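As a sketch, a command can be nothing more than a declarative bundle. Every name below is illustrative.

```python
# The only interface a human touches: invoke a command, get a workflow.
COMMANDS = {
    "optimize-ads": {
        "agents": ["thinker", "reviewer", "executor"],
        "skills": ["analyze_ads", "propose_budget_changes", "execute_safely"],
        "tools": ["google_ads_read", "google_ads_write"],
        "hooks": ["trim_window", "enforce_caps", "log_run"],
    },
}
```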
4. Google Ads agent (example)
Goal: Automatically analyze Google Ads performance, propose improvements, apply safe changes, repeat on schedule.
4.1 Define the loop
The loop shape:
- Observe
- Analyze
- Plan
- Validate
- Execute
- Log
- Exit
No skipping steps.
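The loop, sketched with stub steps so the shape is explicit. Every run walks all seven steps in order; a failed validation aborts before any write happens.

```python
def observe() -> dict:            return {"campaigns": []}  # read Ads data
def analyze(data: dict) -> dict:  return {"findings": []}   # thinker agent
def make_plan(f: dict) -> list:   return []                 # proposals only
def validate(plan: list) -> None: pass  # hard checks; raises on violation
def execute(plan: list) -> None:  pass  # executor agent, write tool only
def log(plan: list) -> None:      pass  # persist inputs, plan, diffs

def run_once() -> None:
    plan = make_plan(analyze(observe()))
    validate(plan)
    execute(plan)
    log(plan)
    # exit: the process ends here, and all context dies with it

if __name__ == "__main__":
    run_once()
```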
4.2 Thinker agent
Responsibilities:
- Read recent Ads data
- Analyze performance
- Propose changes only
Permissions:
- Google Ads read tool
Output format:
Use structured fields, not prose.
- CampaignID = string
- ChangeType = budget, bid, or pause
- Delta = percentage or amount
- Reason = short sentence
- ExpectedOutcome = short sentence
- RiskLevel = low, medium, or high
If the output does not match the format, it is rejected.
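Rejection is easiest to enforce with a strict parser. Using pydantic here is an assumption; any schema validator works.

```python
from typing import Literal
from pydantic import BaseModel, ValidationError

class RejectedOutput(Exception):
    pass

class ProposedChange(BaseModel):
    CampaignID: str
    ChangeType: Literal["budget", "bid", "pause"]
    Delta: float                  # percentage or absolute amount
    Reason: str
    ExpectedOutcome: str
    RiskLevel: Literal["low", "medium", "high"]

def parse_proposal(raw: str) -> ProposedChange:
    try:
        return ProposedChange.model_validate_json(raw)
    except ValidationError as e:
        raise RejectedOutput("thinker output does not match the format") from e
```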
4.3 Analysis hooks
Pre-analysis hook:
- Limit data window
- Normalize metrics
- Remove outliers
Post-analysis hook:
- Compress reasoning
- Enforce max changes per run
- Strip raw metrics
Hooks handle data manipulation, never judgment.
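A pre-analysis hook in this spirit is pure data manipulation. The row shape (a date and a cpa per row) is an assumption for illustration.

```python
from datetime import date, timedelta
from statistics import mean, stdev

def pre_analysis(rows: list[dict], window_days: int = 30) -> list[dict]:
    cutoff = date.today() - timedelta(days=window_days)
    rows = [r for r in rows if r["date"] >= cutoff]      # limit data window
    if len(rows) >= 3:                                   # remove outliers
        m, s = mean(r["cpa"] for r in rows), stdev(r["cpa"] for r in rows)
        rows = [r for r in rows if abs(r["cpa"] - m) <= 3 * s]
    return rows
```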
4.4 Validation
Before execution, validate:
- Budget change under cap
- Max campaigns touched
- No high-risk changes without approval
Put validation in code, not prompts.
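Taken literally, that means the checks are plain functions that fail hard. The caps below are illustrative; in practice they come from the versioned config repo (section 5). This reuses ProposedChange and RejectedOutput from the sketch in 4.2.

```python
MAX_BUDGET_DELTA_PCT = 10
MAX_CAMPAIGNS_PER_RUN = 5

def validate_plan(changes: list[ProposedChange]) -> None:
    if len({c.CampaignID for c in changes}) > MAX_CAMPAIGNS_PER_RUN:
        raise RejectedOutput("too many campaigns touched in one run")
    for c in changes:
        if c.ChangeType == "budget" and abs(c.Delta) > MAX_BUDGET_DELTA_PCT:
            raise RejectedOutput(f"budget delta over cap: {c.CampaignID}")
        if c.RiskLevel == "high":
            raise RejectedOutput("high-risk change needs manual approval")
```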
4.5 Executor agent
Responsibilities:
- Apply approved changes
- Nothing else
Permissions:
- Google Ads write tool
Executor never analyzes. Executor never decides.
4.6 Execution hooks
Pre-execute hook:
- Check kill switch
- Reconfirm limits
Post-execute hook:
- Log diffs
- Snapshot metrics
- Clear agent context
No memory survives the run.
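A kill switch can be as dumb as a file on disk (an assumption here; a DB flag or environment variable works just as well). The pre-execute hook checks it and reconfirms limits, reusing validate_plan from 4.4.

```python
from pathlib import Path

KILL_SWITCH = Path("KILL_SWITCH")  # touch this file to halt all runs

def pre_execute(plan: list) -> list:
    if KILL_SWITCH.exists():
        raise SystemExit("kill switch engaged; aborting before any writes")
    validate_plan(plan)            # reconfirm limits right before writing
    return plan
```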
4.7 Persistence
All state lives outside the model.
Persist:
- Input metrics snapshot
- Proposed plans
- Applied changes
- Timestamps
You must be able to replay history without an LLM.
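A sketch of persistence as an append-only JSONL file; the path and field names are illustrative. The only hard requirement is that replay needs no model.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

LOG = Path("runs.jsonl")

def persist_run(metrics: dict, plan: list, applied: list) -> None:
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "metrics_snapshot": metrics,
        "proposed_plan": plan,
        "applied_changes": applied,
    }
    with LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")

def replay() -> list[dict]:
    """Reconstruct the full run history from disk; no LLM involved."""
    if not LOG.exists():
        return []
    return [json.loads(line) for line in LOG.read_text().splitlines()]
```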
5. Tech Stack
This is the stack that makes it work:
Agents SDK - Orchestrates the agent lifecycle, skill loading, and state management. Handles the think-act loop so you don’t have to.
MCP servers - Model Context Protocol connectors for your tools:
- google_ads - Campaign reads and writes
- analytics - Performance data extraction
- storage - S3 or blob storage for logs and outputs
Each MCP is an isolated domain with minimal surface area. Read and write access are separate servers for safety.
Scheduler - Cron for simple schedules. Queue-based triggers (SQS, Pub/Sub, RabbitMQ) for event-driven automation.
DB - PostgreSQL or similar relational tables for structured state. S3-style JSONL logs for audit trails. All state lives outside the model.
Config repo - Versioned YAML or JSON defining budgets, limits, and safety rules. This is your safety system. Keep safety in code, not prompts.
This stack is boring. That’s the point.
6. Scheduling
Use a scheduler. Agent memory is unreliable.
Options:
- Cron every 24 hours
- Queue-based triggers
Each run is stateless. A failure means a retry with the same inputs.
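Stateless plus retry can look like this sketch: cron invokes a fresh process, and a thin wrapper retries the identical command on failure. The path and retry counts are illustrative.

```python
# crontab entry (one option): 0 6 * * * /usr/bin/python3 /opt/agents/run_once.py
import subprocess
import time

def run_with_retry(cmd: list[str], attempts: int = 3) -> None:
    for i in range(attempts):
        if subprocess.run(cmd).returncode == 0:
            return
        time.sleep(60 * (i + 1))  # same inputs, fresh process, zero memory
    raise RuntimeError("run failed after retries; check the logs, not the model")
```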
7. Safety defaults
Hard rules:
- Max budget delta per run
- Max number of campaigns
- Manual approval for high-risk changes
- Global kill switch
If safety lives in prompts, it will fail.
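Safety-as-code can be as simple as a versioned config file loaded at startup. The keys below are illustrative, and pyyaml is the only dependency.

```python
import yaml

SAFETY = yaml.safe_load("""
max_budget_delta_pct: 10
max_campaigns_per_run: 5
require_approval_for: high    # risk level that forces manual approval
kill_switch: false
""")

assert not SAFETY["kill_switch"], "global kill switch engaged; refusing to run"
```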
8. Scaling the system
To scale:
- Add more commands
- Add more agents
- Reuse skills
- Reuse hooks
Do not:
- Make agents smarter
- Increase context size
- Add persistent memory
Scale through repetition, not through cognition.
9. Final rule
If an agent surprises you, that is a bug.
Good agent systems feel boring, predictable, and slightly underwhelming.
That is how you know they are working.
Tools & Resources
- Agents SDK - OpenAI Agents SDK, or similar orchestration frameworks
- MCP servers - Model Context Protocol for tool integration
- Scheduler - Cron, Airflow, or cloud queue services (SQS, Pub/Sub)
- Storage - S3, Google Cloud Storage, or PostgreSQL
- Config - Versioned YAML/JSON files in Git
TL;DR
- Agents are disposable workers: think briefly, do work, write results, die
- Think in layers: tools, skills, agents, hooks, commands
- Skills are reusable habits, not memories
- Agents are interns with amnesia - short-lived, single responsibility
- Validation in code, not prompts
- All state outside the model - replay history without LLMs
- Safety in code: kill switches, limits, caps (not prompts)
- Scale through repetition, not cognition
Build boring systems that work.