Part 1 of 4 in Using AI

The four levels of AI integration

When people imagine working with AI, they picture Tony Stark and Jarvis. A seamless dialogue with an omniscient assistant. "Jarvis, build me a new suit." Done.

I've spent two years coding alongside AI. The reality looks nothing like the movies.

Here's what I've learned: the Jarvis fantasy isn't wrong, it's incomplete.

The mistake most developers make is treating AI as a single tool. You either "use AI" or you don't. But the question isn't whether, it's how much and when.

I've come to see AI integration as a spectrum with four levels:

  • 1. Assistance: AI predicts your next move. You execute.
  • 2. Direction: AI implements your specification. You guide each step.
  • 3. Collaboration: AI explores and builds. You set direction, intervene, iterate.
  • 4. Integration: AI becomes part of the product. You design hybrid systems.

Each level shifts the bottleneck. Each requires different skills and different levels of trust. Get it wrong, and you'll either micromanage what could run autonomously, or blindly trust what needed supervision.

This series maps that territory, not from theory, but from experience.


Level 1: The predictive accelerator

This is where most of us started. GitHub Copilot, ChatGPT in a browser tab. The AI watches what you're doing and predicts your next move.

The first time Copilot completed an entire function from a comment, I felt like I'd discovered a cheat code. Write // validate email format and watch twenty lines of regex appear. It felt like the machine was reading my mind.
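That completion might look something like this (a reconstructed sketch, not Copilot's literal output):

// validate email format
function isValidEmail(email: string): boolean {
  // A typical completion: a pragmatic regex check,
  // not a full RFC 5322 validator.
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}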

The workflow doesn't change, the friction disappears. You're still in control. The AI just handles the typing. The gap between intention and implementation shrinks.

But here's the trap I fell into.

The "Tab-Tab-Tab" syndrome. You start accepting suggestions without reading them. The code looks right. The tests pass. Ship it. Weeks later, you're debugging something and realize you have no idea how half your codebase works because you didn't actually write it, you just approved it.

And the suggestions are generic: the most common implementation, not the best one for your codebase.

Level 1 is powerful, but it's a trap if you mistake speed for quality. The goal isn't to type faster, it's to free mental energy for decisions that matter.


Level 2: The director

Level 2 is where you start telling agents exactly what to do.

Select some code, hit Cmd+K: "Refactor this to use async/await." "Add rate limiting with a 100 req/min cap." "Handle the case where user is null." Surgical strikes. You're specific, detailed, step-by-step.
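The async/await prompt, for instance, turns a promise chain into linear code. A minimal before-and-after sketch, with hypothetical stubs standing in for real project code:

// Hypothetical stubs standing in for real project code.
declare function fetchUser(id: string): Promise<{ name: string }>;
declare function fetchPosts(user: { name: string }): Promise<string[]>;

// Before: nested .then() chains.
function loadPostsBefore(id: string) {
  return fetchUser(id).then((user) => fetchPosts(user));
}

// After "Refactor this to use async/await":
async function loadPostsAfter(id: string) {
  const user = await fetchUser(id);
  return fetchPosts(user);
}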

Directing removes the mechanical layer, not the thinking. You still decide the architecture and discover edge cases. The agent handles the typing. The building stays with you.

At first, my prompts were vague and results were garbage. Over time, I learned the agent's patterns, when to chunk tasks, when to show examples, when to constrain ("don't use external libraries"), when to iterate.

The best specification I've found is tests. Write a test that describes what should happen, then tell the agent: "Make this pass, following our patterns." The test is unambiguous. The agent handles the mechanics. You review the logic.
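A test like this, reusing the rate-limiting task from above (the RateLimiter module and its API are hypothetical), leaves the agent no room for interpretation:

import { describe, it, expect } from 'vitest';
import { RateLimiter } from './rate-limiter'; // hypothetical module under test

describe('RateLimiter', () => {
  it('rejects the 101st request within one minute', () => {
    const limiter = new RateLimiter({ maxPerMinute: 100 });
    for (let i = 0; i < 100; i++) {
      expect(limiter.allow('user-1')).toBe(true);
    }
    // The cap is hard: request 101 must be refused.
    expect(limiter.allow('user-1')).toBe(false);
  });
});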

The bottleneck moves. It's no longer "how fast can I type?" It's "how clearly can I express intent?" Clarity of thought becomes more valuable than typing speed.


Level 3: The collaborator

Level 2 is about precise, contained changes. Level 3 is about working with the AI on bigger problems, iterating toward a solution together.

"Add CSV export to the analytics dashboard. Follow our existing PDF export patterns."

The AI explores, proposes an approach, starts implementing. But you're not just waiting for a final result. You're in the loop, watching, redirecting, iterating. "That's the wrong pattern, look at how we did PDF." "Good, but handle the error case differently." "Now add tests."

It's a conversation toward a solution, not a single prompt and a review.

Your role shifts from guiding steps to shaping direction. You set the vision, watch where it's heading, and course-correct. You learn its tendencies, when it overcomplicates, when it misses edge cases, when it nails it. You develop a rhythm: prompt, watch, adjust, prompt again.

The tradeoff: collaboration requires more trust. You're not checking "is this line correct?" but "is this the right approach?" You need judgment to catch when the AI is heading the wrong way.

I use direction for surgical precision. Collaboration for building features where I know what I want but the path has many steps.


Level 4: The systems designer

Levels 1-3 treat AI as a tool that helps you build software. Level 4 is different. AI becomes a component of the software itself.

This is where things get interesting. You're no longer just writing code, you're building systems where predictable logic and unpredictable AI work together.

A concrete example: I built a news API that ingests articles, processes them, and serves them to users. The processing pipeline is where AI lives.

The pipeline runs every two hours:

  1. Ingest: Fetch news, filter by significance
  2. Deduplicate: Compare against recent reports to avoid repetition
  3. Classify: Assign tier (General/Niche/Off-topic) and traits
  4. Publish: Transform reports into readable articles
  5. Challenge: Generate quiz questions and fabricated article variants for a "spot the fake" game
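The orchestration itself stays deterministic; only some steps call a model internally. A minimal sketch, where the step functions are hypothetical placeholders for the real implementations:

interface Article { title: string; body: string }
interface Report { article: Article; tier: 'GENERAL' | 'NICHE' | 'OFF_TOPIC' }

// Hypothetical step functions; each hides its own mix of code and AI.
declare function ingest(): Promise<Article[]>;
declare function deduplicate(articles: Article[]): Promise<Article[]>;
declare function classify(articles: Article[]): Promise<Report[]>;
declare function publish(reports: Report[]): Promise<void>;
declare function generateChallenges(reports: Report[]): Promise<void>;

// Triggered every two hours, e.g. by a cron job.
async function runPipeline(): Promise<void> {
  const fresh = await ingest();
  const unique = await deduplicate(fresh);
  const reports = await classify(unique);
  await publish(reports);
  await generateChallenges(reports);
}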

An if/else approach can't handle semantic deduplication: is "Fed raises rates" the same story as "Federal Reserve increases interest rates amid inflation concerns"? Classification requires understanding context that no rule engine can capture.
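Semantic similarity is exactly the kind of judgment a model handles well. A minimal sketch of embedding-based deduplication, where embed() is a hypothetical call to an embeddings model and the 0.9 threshold is illustrative:

// embed() is a hypothetical call to an embeddings model.
declare function embed(text: string): Promise<number[]>;

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function isDuplicate(headline: string, recent: string[]): Promise<boolean> {
  const candidate = await embed(headline);
  for (const prev of recent) {
    // Two headlines above the threshold count as the same story.
    if (cosineSimilarity(candidate, await embed(prev)) > 0.9) return true;
  }
  return false;
}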

But a pure AI approach would be chaos. What if it hallucinates a classification that doesn't exist? What if it generates a quiz with no correct answer?

The answer is hybrid architecture. Code handles the constraints; AI handles the ambiguity.

Here's what this looks like in practice:

import { z } from 'zod';

// Code: defines the ONLY valid outputs.
// (ReportClassifier is an illustrative wrapper class name.)
class ReportClassifier {
  static readonly SCHEMA = z.object({
    classification: z.enum(['GENERAL', 'NICHE', 'OFF_TOPIC']),
    reason: z.string(),
    traits: z.object({
      essential: z.boolean(),
      positive: z.boolean(),
    }),
  });
}

The AI can reason about whether an article is mainstream or niche. But it cannot return anything outside this schema. Validation ensures the AI's output fits my constraints.
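Enforcement is a single parse step. A sketch, assuming the model's raw answer has already been JSON-parsed:

// result.success is false whenever the model strays outside the schema.
function toClassification(raw: unknown) {
  const result = ReportClassifier.SCHEMA.safeParse(raw);
  if (!result.success) {
    // A malformed answer is a reasoning error: degrade gracefully
    // instead of crashing the pipeline.
    return null; // the caller decides: retry, default tier, or skip
  }
  return result.data; // fully typed and guaranteed in-schema
}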

It's a sandwich: Code → AI → Code

  • Code prepares the context (fetch the report data, format the prompt)
  • AI performs the reasoning (classify, generate, analyze)
  • Code validates the output (schema enforcement, domain value objects, database storage)
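Put together, the sandwich is one short function, reusing toClassification from above. buildPrompt, callModel, and save are hypothetical stand-ins for the real plumbing:

// Hypothetical stand-ins for the real plumbing.
declare function buildPrompt(article: { title: string; body: string }): string;
declare function callModel(prompt: string): Promise<unknown>;
declare function save(result: object): Promise<void>;

async function classifyArticle(article: { title: string; body: string }): Promise<void> {
  const prompt = buildPrompt(article);   // Code: prepare the context
  const raw = await callModel(prompt);   // AI: perform the reasoning
  const result = toClassification(raw);  // Code: validate the output
  if (result) {
    await save(result);                  // only in-schema data is stored
  }
  // On validation failure: retry, fall back, or drop; never crash.
}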

The mental model shifts. Some components are now unpredictable. You need validation layers, fallbacks, and graceful degradation. You're not just debugging logic errors, you're debugging reasoning errors.

Your role evolves from writing logic to orchestrating intelligence.


The map

After two years of navigating this landscape, here's how I think about it:

  • Level 1, Assistance: AI predicts → Execute faster
  • Level 2, Direction: AI implements → Guide each step
  • Level 3, Collaboration: AI explores → Set direction, iterate
  • Level 4, Integration: AI reasons → Design hybrid systems

Each level has its place. The skill is recognizing which one fits the task.

The Jarvis fantasy wasn't wrong, it was one point on a spectrum. The real skill is knowing which type of AI assistance the moment requires.


This article is the first in a series exploring each level in depth. Next: directing agents, the art of precise, surgical strikes.
