Ten Days, Three Practice Runs: Passing the Claude Certified Architect Exam

Ten days of prep, three practice runs, and the one wrong answer that would have sunk the score.

7 min read

The first study tool I ever built was a set of flashcards in code. I was in college, more interested in programming than in memorizing dates for American History, so I wrote an app to drill myself. After graduation I built StudyStack, a flashcard-sharing site, mostly to prove to myself that I could ship a real website. Decades later, preparing for Anthropic’s first technical certification, I reached for the same instinct. I built my own practice exam.

I passed the Claude Certified Architect – Foundations exam on May 30th with a 738 score. By my own math, one more wrong answer would have put me under. This is what the ten days before that score looked like.

Why I sat it

VGV is part of the Claude Partner Network, and the company is supporting engineers getting certified to strengthen the partnership. That was the nudge. The reason it stuck is more personal.

I am one of the most seasoned engineers at VGV. I never want to be the person who stopped keeping pace, and AI is the fastest-moving thing I have worked alongside in a long career. I already use Claude Code every day building Flutter apps, I had watched hours of Anthropic’s course videos, and I had spent time in the Claude Console writing about what I found. Sitting a proctored exam that tried to measure architectural judgment sounded like a good way to find out whether daily use had turned into real understanding. Those are not the same thing, and the exam is built to tell them apart.

Starting with the practice exam

I went straight at Anthropic’s official practice exam before reading much of anything. I wanted to know what I was walking into. I was not sure whether you could retake the practice exam more than once, so I guessed you could and hoped for the best. You can, and that mattered more than I expected.

The first run taught me two things. The practice exam grades each question the moment you answer, with an explanation of why the right answer is right. I screenshotted every question I got wrong, along with the ones I felt lucky to have gotten right. The second lesson was about pace. About a third of the way through, I realized that at the speed I was reading, I was going to run out of time. Sixty questions in 120 minutes leaves you under two minutes each, and the scenarios are dense enough that two minutes goes fast. Something glitched at the end of that first run and I never saw a final score, but I think I would have landed somewhere in the 700s.

After that I had Claude quiz me. I handed it the exam guide and asked it to drill me on the domains, an hour or two a night for about a week. I cross-checked myself against another model too, to see whether they agreed on the hard ones. I also took the practice exam a couple more times, early in the morning when I knew nothing would interrupt me, screenshotting the questions that tripped me up so I could study them before the next attempt. My scores climbed. Around 850 the second time, around 930 the last.

Here is the trap in that climb. The practice exam reuses its question pool, so studying my screenshots was partly teaching me the practice answers rather than the underlying ideas. The rising numbers felt like progress. Some of it was memorization wearing the costume of mastery. The AI quizzing had the same flavor of comfort. Both Claude and the other model had me convinced I was ready. The real exam disagreed.

Exam morning

I woke up early on a Saturday, cleared my desk, and disconnected my extra monitors. A proctored exam was going to want to confirm I was not cheating, and I would rather remove the question than answer it. Signing in meant installing the proctoring software and granting it the camera and microphone. That took a couple of minutes.

I read the first question and knew immediately that the practice exam and the real exam share no questions. Nothing looked familiar. One habit carried over well and badly at the same time. The practice runs had taught me to flag hard questions and return to them, so I did. Ten questions in, I had flagged about seven of them. The timer kept moving and the fear of failing climbed with the flag count. I took a breath and kept going.

The questions come in four blocks of fifteen, each built on its own production scenario. The guide notes there are six possible scenarios and your exam draws four of them at random. My first block was the one that rattled me most. The middle two were fine. The last was manageable. I reached the final question with eighteen minutes left, went back to everything I had flagged, and found the second read far less intimidating than the first. I changed about half of those answers. The clock ran out as I was reviewing the last one.

Failing carries a real cost. If you do not pass, you cannot retake it for six months, which is most of a lifetime at the speed this field moves. That stake was sitting on every flagged question.

The hardest part was the waiting

Unlike the practice exam, the real one shows you nothing when you finish. No score, no breakdown. One Anthropic email said results within seven days, another said seven to ten, and online I had read about people getting theirs in two days or in several weeks.

The exam had been harder than I expected, and the gap between my practice scores and how the real thing felt had knocked out my confidence. I put my odds at roughly even. I tried to get myself ready to take a failure well. I checked my email and my Anthropic profile several times a day. At one point I dreamed that the results had come in and I had passed, which was a wonderful feeling until I woke up and realized I still did not know.

My best guess for the delay is that the scoring is scaled, so Anthropic is likely rotating new questions through the exam and equating each score against how everyone performed on the items you happened to get. That is speculation. What I know is that the Friday night after the exam, the email arrived. I had passed with a 738.

What the score report told me

The detailed report breaks performance down by domain, and mine read like an honest map of where I was thin. I got 44 of 60 right, a 73 percent raw accuracy that the scaling lifted to 738. Agentic Architecture and Orchestration was my strongest at 78 percent. Tool Design and MCP Integration came in at 70. Context Management and Reliability sat at 73. My weakest domain, the one the report literally flagged with “focus here,” was Claude Code Configuration and Workflows at 69 percent.

That last number is the one I keep thinking about. Claude Code is the tool I use most, every working day. Daily use had given me fluency with the parts I reach for and blind spots around the parts I do not. The exam found the blind spots. That is the whole value of a test like this. It does not reward the work you already do well.

What I would tell the next person

Build something to study, do not only read. Writing my own questions and screenshotting my wrong answers taught me more than any video, and that is the same thing the flashcard app taught me in college. Just know the limit. If your practice tool reuses its questions, your rising scores are measuring recall of that pool as much as real understanding.

Treat the AI tutors as sparring partners, not judges. Claude is good at drilling you on a topic and explaining what you missed. It is also encouraging by nature, and encouragement is not the same as readiness. Let the practice exam under real time pressure be your honesty check, not the model’s verdict.

Watch the clock from the first question, since under two minutes each is tighter than it sounds. Flag and move rather than grinding, but do not let a pile of flags rattle you early. And go in expecting the scenarios to be unfamiliar, because they will be.

One last thing surprised me more than anything on the exam itself. The certification expires six months after you pass. Pass or fail, if you want the credential you are sitting this again before the year is out. At first that felt absurd. The longer I sat with it, the more it fit the subject. The tools this exam covers change on a quarterly rhythm, and a credential that claimed to certify production judgment for three years would be lying by month four.

So I am back where I started, building study tools to keep up. The flashcards were paper-thin compared to a multi-agent CI scenario, but the instinct is the same one I have had since college. The certificate is the part that expires. The habit of learning the thing well enough to teach it back is what keeps you current.

About the Author

John Weidner

Engineering

John has been a software developer for 35 years and has built Flutter apps for 6 of them. He works for Very Good Ventures, where he has contributed to projects for Divine, Trackhouse Racing, and Scooters. On the side he runs UserBob, a remote user-testing service, and StudyStack, an educational flashcard-sharing platform.

Frequently Asked Questions

What is the Claude Certified Architect – Foundations exam?

Anthropic's first technical certification. It runs 60 questions in 120 minutes, organized into four blocks of fifteen, each built on its own production scenario drawn at random from a pool of six.

What score do you need to pass, and how long do results take?

The cutoff is 720 out of 1000; I passed with a 738. Scoring is scaled rather than a raw percentage, and the real exam shows you nothing when you finish — results arrived within about a week.

How should you prepare for the exam?

Getting above 900 on the practice exam is the bare minimum to being able to get a passing grade.

Does the Claude Certified Architect certification expire?

Yes. The credential expires six months after you pass, so keeping it current means sitting the exam again — fitting for a subject whose tooling changes on a quarterly rhythm.