
Edition 346 | April 19, 2026 | The Dyslexic AI Newsletter by LM Lab AI

What You'll Learn Today

  • Why building an AI evaluation tool means I have to evaluate the AI tools I build with first

  • Why this recursive problem is weirder and more important than it sounds

  • The six builders I am actually considering right now (and what each one does well)

  • The evaluation criteria that matter when you are choosing a builder, not just using one

  • Why "cognitive fit for my own brain" might be the deciding factor

  • How any dyslexic thinker with an idea and a laptop can run this same decision process

Reading Time: 10 minutes | Listening Time: 14 minutes

Happy Sunday.

In Edition 345 ("We Have Been Asking the Wrong Question About AI"), I left you with a cliffhanger.

I told you that if I am going to build a software product that helps businesses and families evaluate AI tools, I also need a way to evaluate the AI tools I use to build that software.

I called it meta. I promised a whole edition of its own.

This is that edition.

And I am going to tell you right up front: this is weirder and more important than it sounds.

Not because the concept is hard. It is actually pretty simple once you see it.

Because the people who understand this recursion early are going to have a huge advantage in the AI economy. And most people are still not thinking about it at all.

Let me show you what I mean.

The Problem in One Sentence

Here it is, as simply as I can put it.

Before I can build a tool that helps people choose the right AI tool for their needs, I have to choose the right AI builder for building that tool.

Yes, that is a tongue twister.

No, I did not plan it that way.

It just happens to be the honest truth about where I am sitting this morning.

Because the evaluation framework I laid out in Edition 345 requires software. Dashboards. Scoring engines. Weighted criteria. Custom templates. Export capabilities. User onboarding. Mobile responsiveness. A clean interface that does not make neurodivergent users want to throw their laptop across the room.

I cannot build that with a prompt. I need a builder.

And in April 2026, there are a lot of builders to choose from.

The Landscape Right Now

Here is what I am actually looking at. Not a theoretical list. The tools I am actively considering for the Cognitive Partner OS stack I talked about in Edition 344.

Claude Code. Anthropic's terminal-based agentic coding tool. This is my daily driver for almost everything I have been vibe coding over the last six months. It is powerful, it handles complex codebases well, and it remembers context across sessions better than most. The shipping pace right now is insane. Just in the last two weeks they have added parallel sessions, session recaps, and interactive /powerup lessons, as we covered in Edition 343.

Cursor. The AI-native IDE built on VS Code. Many developers consider it the most polished AI coding experience right now. Multi-model support. Inline completions. Agent mode. For someone doing serious engineering work, this is often the answer.

Codex. OpenAI's coding agent. I was using this yesterday when I built one of the two prototypes I mentioned in Edition 344. It got the JSON ingestion working and has solid reasoning for complex tasks.

Google AI Studio. This is where I built the other prototype yesterday. It has some interesting advantages for quick iteration and working with Gemini models directly. Not traditionally thought of as a "builder" in the same sense as the others, but for certain kinds of prototyping it is fast and accessible.

Bolt.new. I have been using Bolt since it launched, and it was my go-to tool before I got more familiar with Claude Code. It is still part of my stack. Bolt was instrumental in building most of the websites I am currently running. When I need to go from an idea to a working site fast, Bolt has been the reliable answer.

Lovable. Specifically built for rapid prototyping and stakeholder demos. If I wanted to show a family or a business what the evaluation framework looks like before building the real version, Lovable could probably get me there faster than anything else on this list.

Six builders. Six different strengths. Six different fits for different brains, different projects, different stages.

And I have to pick.

Or more accurately, I have to pick which one to start with, because most serious projects end up using more than one.

Why This Is Not Just a Technical Question

Here is where it gets interesting.

When most people write about AI coding tools, they compare them on the wrong axes.

Speed. Cost. Model support. Autocomplete quality. Enterprise features.

Those things matter. But they are not the most important thing for me.

Because I am not a traditional developer. I am a dyslexic thinker with a recreation degree who talks to his computer.

So when I evaluate a builder, the questions I ask are different.

Does it tolerate my voice-first workflow? Can I dump messy thoughts into it without being punished for imprecise syntax? Will it ask me clarifying questions when I am ambiguous, or will it just guess wrong and force me to debug its assumptions?

Does it keep context across sessions? Because if I have to re-explain the project every time I open the tool, that is cognitive load I cannot afford to pay over and over. This is the jagged frontier problem from Edition 343 applied to builders instead of end-user models.

Does it show me what it is thinking? When something breaks, can I understand why? Or does it feel like a black box that either works or does not work with no way to debug?

Does it tolerate errors well? When I describe what I want in nonlinear, loop-heavy, voice-note language, does it still get the gist? Or do I have to clean myself up before the tool will work?

Does it let me iterate fast? Because the way my brain works, the first version is never right. The second version is closer. The fifth version is usually good. I need a builder that does not punish me for needing more cycles.

Does it fit my cognitive style? Not the abstract idea of fit. The real thing. When I use it for three hours, do I feel energized or drained?

This is the cognitive fit framework from Edition 345, applied to the tool I am choosing to build with, not just the tools I am building for others.

Every one of those questions matters more to me than speed or model size or enterprise features.

And my hunch is that they matter more to a lot of other people too. Especially neurodivergent people. Especially people who have been told their whole lives that their brains are the problem.

They are not the problem. The tools have been the problem. Now we can choose tools that fit us instead of forcing ourselves to fit the tools.

The Three-Layer Stack

Here is how I am thinking about this right now.

Layer 1. Evaluate the builders I am considering using. Claude Code. Cursor. Codex. Google AI Studio. Bolt.new. Lovable. Score them against my own cognitive fit criteria.

Layer 2. Use the winning builder (or builder combination) to create the evaluation platform itself. The Cognitive Partner OS engine for dyslexic thinkers. The business version. The family version.

Layer 3. Use that platform to help others evaluate AI tools for their own needs. Individuals. Families. Homeschoolers. Businesses. Teams. Eventually schools.

That may sound recursive. It is. But it is also practical.

Because the quality of Layer 3 depends on the quality of Layer 2. And the quality of Layer 2 depends on whether I picked the right tool at Layer 1.

Get the first layer wrong and everything above it gets harder.

This is why I keep coming back to the point that evaluation is not just a skill for end users. It is a skill for builders. For founders. For anyone who is trying to go from idea to working software in an AI-first world.

The Honest Truth About Where I Am

I want to be straight with you.

I have not made the final decision yet. I am in the middle of Layer 1.

Yesterday I got both Google AI Studio and Codex to ingest my ChatGPT JSON export. That was the first proof of concept for Engine 1 of the Cognitive Partner OS. Both of them worked. Both of them felt different. Both of them have strengths I am still figuring out.

Claude Code is probably going to be in the stack somewhere because it is already my daily driver and the new April features (parallel sessions, session recaps, team onboarding commands) are specifically designed for the kind of long-running project I am building.

Cursor is tempting for the IDE experience but I do not need a full engineering environment for most of what I do.

Bolt.new and Lovable are both interesting for the parts of this project that are user-facing. Bolt is already where most of my current websites live, so it has a built-in advantage for continuity. Lovable is the faster path for a demo-grade dashboard. If I need to ship something a family can actually use on a phone or a tablet, one of those two is probably the right answer for that layer specifically.

What I am probably going to end up with is a combination. Different tools for different layers of the same project. Claude Code for the backend logic and the evaluation engine. Bolt or Lovable for the user-facing dashboard. Google AI Studio or Codex for specific prototyping tasks where I want a different model's perspective.

The answer is not "one tool to rule them all." The answer is "the right tool for the right layer."

And I can only figure that out by running the evaluation.

This is exactly the point I made in Edition 345. The people who win in this era are not the ones chasing the hottest single tool. They are the ones who know how to evaluate fit across options and build a stack that actually works for their specific situation.

Why This Matters for Everyone (Not Just Builders)

You might be reading this thinking "Matt, I am not building software. Why should I care about this?"

Here is why.

The same recursion shows up in every important AI decision right now.

If you are picking an AI tool for your business, you are also picking the vendor ecosystem that will surround it. The integrations. The data policies. The future trajectory. The cultural fit with your team. The tool is not just the tool. It is a whole stack of choices.

If you are picking an AI tool for your homeschool, you are also picking the implicit values baked into that tool. How it handles errors. How it encourages or discourages certain kinds of thinking. How it treats a kid who does not fit the default mold.

If you are picking an AI tool for yourself, you are picking a cognitive partner that is going to shape how you think, what you focus on, and what you end up building. That is not a small decision.

Every AI tool choice is actually a stack of choices. The meta layer is real whether you are aware of it or not.

The only question is whether you are going to evaluate it deliberately, or just hope you picked right.

OK But What Do I Actually Do With This?

Three things. This week.

1. Name the Tools in Your Stack

List every AI tool you use regularly. Not just the chatbots. The ones built into your email. Your calendar. Your writing apps. Your design tools. The ones you might not think of as "AI tools" but definitely are.

You probably have more than you think. And most of them were not chosen deliberately.

2. Score Them Against Cognitive Fit

Pick the three most important ones. Score each one from one to ten on questions like: Does it fit how I actually think? Does it reduce cognitive load or add to it? Does it support voice? Does it handle my mistakes gracefully? Do I feel energized or drained after using it?

You might find out that a tool you have been using for six months is not actually the right fit. That is useful information.
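If you want to make the scoring concrete, a weighted average is all the math involved. Here is a minimal Python sketch of that exercise. The criteria names and weights are illustrative assumptions, not an official rubric from this newsletter; adjust both to your own priorities:

```python
# Minimal cognitive-fit scorer: a weighted average of 1-10 ratings.
# Criteria and weights below are illustrative, not a fixed rubric.

CRITERIA_WEIGHTS = {
    "fits_how_i_think": 3,
    "reduces_cognitive_load": 3,
    "supports_voice": 2,
    "handles_mistakes_gracefully": 2,
    "energized_after_use": 3,
}

def cognitive_fit_score(ratings: dict[str, int]) -> float:
    """Weighted average of 1-10 ratings, rounded to one decimal."""
    total_weight = sum(CRITERIA_WEIGHTS.values())
    weighted = sum(CRITERIA_WEIGHTS[c] * ratings[c] for c in CRITERIA_WEIGHTS)
    return round(weighted / total_weight, 1)

# Example: scoring one tool on the five questions above.
my_ratings = {
    "fits_how_i_think": 8,
    "reduces_cognitive_load": 6,
    "supports_voice": 9,
    "handles_mistakes_gracefully": 7,
    "energized_after_use": 8,
}
print(cognitive_fit_score(my_ratings))  # prints 7.5
```

The point of the weights is that not every question matters equally. If voice support is make-or-break for you, raise its weight and the rankings will shift accordingly.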

3. Ask Yourself What the Next Layer Looks Like

If you are building anything (content, a business, a family system, a homeschool curriculum, a project), ask the meta question. What am I using to build the thing I am building? Is that tool the right fit? Have I evaluated it deliberately, or did I just grab the first one that worked?

That question is worth 15 minutes a week. Minimum.

What This Means for You Right Now

We are at a strange moment in AI history.

The tools are accelerating faster than anyone can track. Edition 343 covered the Stanford AI Index showing capability jumps that broke the scale. Edition 341 covered Claude's shipping pace and the sheer volume of new options launching every week. Edition 344 was me building two prototypes in one morning.

The pace is not slowing down. It is speeding up.

Which means the people who can evaluate tools well, at every layer of their stack, are going to compound advantage every month. And the people who just grab whatever is trending are going to find themselves repeatedly migrating, repeatedly frustrated, repeatedly starting over.

Sticktoitness is not just about discipline. It is about conditions, as Tobin Trevarthen said in the piece I linked in Edition 342. The tools you choose are part of your conditions. Choose badly and you are fighting your environment. Choose well and the environment supports you.

I am choosing right now. In the open. With you.

And I will tell you what I land on, why, and what I learn in the process. That is the deal.

Previously

  • Edition 345: "We Have Been Asking the Wrong Question About AI" (evaluation framework manifesto, one engine two tracks, the cliffhanger)

  • Edition 344: "I Woke Up at 4AM With a Random AI Idea" (Cognitive Partner OS, the first prototypes)

  • Edition 343: "Stanford Just Measured Everything About AI. They Forgot to Measure Us." (AI Index, jagged frontier, Stanford dyslexia research)

  • Edition 342: "The Weight in My Chest" (sticktoitness, conditions, autonomy)

  • Edition 341: "I Have Never Seen Anything Like This Before" (state of AI, building evaluation tools)

  • Edition 340: "I Have Four of the Five Layers. Time to Close the Loop." (self-improving loop)

  • Edition 339: "Your AI Just Forgot Everything. Again." (Karpathy, five-layer stack)

Matt "Coach" Ivey Founder, LM Lab AI | Creator, The Dyslexic AI Newsletter

Dictated, not typed. Obviously.

TL;DR: For My Fellow Skimmers

🔁 Before I can build a tool that helps others evaluate AI, I have to evaluate the AI tools I use to build that tool. Meta, but real.

🛠️ Six builders in the running: Claude Code, Cursor, Codex, Google AI Studio, Bolt.new, Lovable. Each strong at different things. None of them wins on every axis.

🧠 For dyslexic thinkers, the evaluation criteria are different. Not speed or model size. Cognitive fit. Voice tolerance. Error handling. Iteration speed. Whether three hours with the tool leaves you energized or drained.

🏗️ The three-layer stack: evaluate the builders, pick the best fit, use the winning tool to build the platform that helps others evaluate AI tools for their own needs. Get Layer 1 wrong and everything above it gets harder.

🤝 The answer is usually not one tool. It is a combination. Different tools for different layers of the same project.

🧩 Every AI tool choice is actually a stack of choices. The meta layer is real whether you are aware of it or not. The only question is whether you evaluate deliberately or hope you picked right.

🔧 Three things to do this week: name all the AI tools in your stack, score the top three on cognitive fit, and ask what the next layer looks like for whatever you are building.
