#016 April 16, 2026 · 7 min read

The Pre-Launch Audit: 3 CRITICAL + 5 HIGH Findings Stella Review Caught

Hours before launch, my framework flagged three CRITICAL security issues. Here's what they were, why they mattered, and why 'vibes check' reviews fail exactly when you need them.

stella-protocol stella-review buster-call security house-of-riddle

The short version

House of Riddle was feature-complete and staged clean. Stella Review still produced 3 CRITICAL + 5 HIGH + 11 MEDIUM + 7 LOW findings. All three CRITICALs would have bitten within a week of launch: leaked answer field, missing RLS policy, committed service-role key. Not because I'm clever at security — because the review is a *named phase*, not a vibes check.


House of Riddle v1.0 was feature-complete. 21 routes across landing, auth, tower browsing, riddle play, leaderboard, and admin. 4 SQL migrations. Full auth with email, Google OAuth, and password reset. Server-side answer verification so answers never touch the client. 14 achievement badges auto-awarded via Postgres triggers. Staged on Vercel preview, running clean.

By any honest read, it was ready to ship.

Then I ran Stella Review — the mandatory gate before the CLOSE phase of Stella Protocol (my AI-PM methodology). The review has two sub-skills: Lilith Red (security audit) and Lilith Blue (quality and performance audit). Together they produced 3 CRITICAL and 5 HIGH findings. All three CRITICALs would have bitten me within a week of public launch.

What you'll learn

  1. Three common security bugs that slip through feature-complete staging — and the exact fixes for each.
  2. How to classify findings — CRITICAL vs HIGH vs MEDIUM — so some ship with deadlines and some never ship.
  3. Why named review phases survive the conditions (tired, deadline-pressured) that kill informal "I'll look it over" checks.

The 3 CRITICAL findings

1. Server-side answer verification gap on community riddles

What it was. Admin-created riddles worked correctly — the answer stayed server-side, the client submitted a guess via POST, and the server returned a boolean. Clean.

Community riddles, a stretch feature in v1.0, had a subtle bug. The GET response included the full riddle object — including the answer field. The component did not render it, but it was in the network payload. Open DevTools, read the answer.

Why critical. Breaks the core product promise. Leaderboard integrity and achievement legitimacy depend on answers being unguessable by inspection. Trivially exploitable — no auth bypass, just DevTools.

Fix. Answer moved to a separate column, only fetched by the submission endpoint. Client-facing GET returns question text and metadata. Submission POST verifies server-side and returns boolean + XP delta.
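The verification step can be sketched as a pure function — a minimal illustration, not House of Riddle's actual code, with hypothetical names (verifyGuess, GuessResult). The point is that the stored answer is only ever an input to a server-side comparison; the client sees only the boolean and the XP delta.

```typescript
// Hypothetical sketch of server-side answer verification. The stored
// answer never appears in any client-facing response; only the result does.
interface GuessResult {
  correct: boolean;
  xpDelta: number;
}

// Normalize so "  The Moon" and "the moon" compare equal.
function normalize(s: string): string {
  return s.trim().toLowerCase().replace(/\s+/g, " ");
}

function verifyGuess(
  guess: string,
  storedAnswer: string,
  xpReward: number
): GuessResult {
  const correct = normalize(guess) === normalize(storedAnswer);
  return { correct, xpDelta: correct ? xpReward : 0 };
}
```

The submission endpoint fetches the answer column server-side, calls something like this, and returns only the result object.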

2. RLS policy missing on community_riddles table

What it was. RLS was correctly configured on submissions and profiles. The migration that added community_riddles shipped without RLS policies. Any authenticated user could hit the Supabase REST API and DELETE or UPDATE any riddle.

Why critical. One malicious user could wipe the community library. RLS is silent when not enabled — nothing fails, it just works too well.

Fix. Four RLS policies on community_riddles:

alter table community_riddles enable row level security;

create policy "public can read published"
  on community_riddles for select
  using (status = 'published');

create policy "author can update own"
  on community_riddles for update
  using (auth.uid() = author_id);

create policy "author can delete own"
  on community_riddles for delete
  using (auth.uid() = author_id);

create policy "authenticated can insert as self"
  on community_riddles for insert
  with check (auth.uid() = author_id);

SELECT public for published riddles. UPDATE and DELETE scoped to author. INSERT forced to stamp the current user.

3. Environment secret in committed seed file

What it was. A dev seed script. I hardcoded the Supabase service role key for local convenience. “I will move this to env before I commit.” I did not. Four commits back.

Why critical. Service role key bypasses RLS — root access to the database. Public repo exposure equals full DB compromise.

Fix. Rotate the key via Supabase dashboard. Rewrite git history with git filter-repo and force push. Update seed to require SUPABASE_SERVICE_ROLE_KEY from env. Add a pre-commit hook that greps for common key prefixes (sbp_, eyJ, sk_).

The hook matters most. Rotation fixes this incident. The hook prevents the next one.
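The scanning half of that hook is simple enough to sketch. This is an illustrative version, not the exact hook I use — the patterns below are common key shapes (Supabase tokens start with sbp_, service role keys are JWTs starting with eyJ, Stripe-style keys start with sk_), and the staged-file plumbing around it is omitted.

```typescript
// Hypothetical sketch of the pre-commit secret scan: check file contents
// against common secret-key prefixes before allowing a commit.
const SECRET_PATTERNS: RegExp[] = [
  /sbp_[A-Za-z0-9]{20,}/,            // Supabase access tokens
  /eyJ[A-Za-z0-9_-]{20,}/,           // JWT-shaped keys (service role keys are JWTs)
  /sk_(live|test)_[A-Za-z0-9]{20,}/, // Stripe-style secret keys
];

function findSecrets(content: string): string[] {
  const hits: string[] = [];
  for (const line of content.split("\n")) {
    for (const pattern of SECRET_PATTERNS) {
      const match = line.match(pattern);
      if (match) hits.push(match[0]);
    }
  }
  return hits;
}
```

A real hook would run this over `git diff --cached` output and exit non-zero on any hit.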

The 5 HIGH findings

Documented, shipped with timing plan (fix within 2 weeks of public launch):

  • Rate limiting absent on auth endpoints. Sign-in and password reset had no throttle. Added Vercel middleware-level rate limit.
  • N+1 query in leaderboard. Fine at 100 users, 30 seconds at 10k. Fixed with a single SQL query using row_number() over (order by xp desc).
  • No input length limit on community riddle text. DB bloat risk, XSS surface if rendering ever shifts. Added 2000-char limit client + server.
  • Missing CSRF token on profile update. Low exploit value, but CSRF is a cheap check with a known pattern.
  • Admin routes not gated by admin middleware. UI did not link them for non-admins, but URLs were reachable if guessed. Added isAdmin check in proxy.ts for /admin/*.
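For the rate-limit fix, the core idea is a fixed-window counter keyed by client IP. Here is a minimal in-memory sketch — not the actual middleware, and note that on a serverless platform like Vercel the counter would need shared storage (e.g. a hosted Redis), since instances do not share memory.

```typescript
// Hypothetical sketch: fixed-window rate limiter, keyed by e.g. client IP.
// In-memory only; a serverless deployment needs a shared store instead.
class RateLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if this request is allowed within the current window.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

Middleware then calls allow() with the request IP and returns a 429 when it comes back false.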

Buster Call classification

Stella Protocol has a named skill — Buster Call — for how to handle review findings. The rule is explicit:

  • CRITICAL. Blocks ship. Fix before deploy. No exceptions, no “we will follow up.” A CRITICAL that ships undocumented is a regression in framework discipline.
  • HIGH. Ships with a timing plan. Logged in brain/vivre-cards.md — the append-only decision log — with owner and deadline (default: 2 weeks post-launch). Acceptable to ship because impact is bounded, but the clock starts.
  • MEDIUM. Documented for P1 or next release. Not a blocker individually, but stacked MEDIUMs (say, three on the same surface area) get HIGH treatment.
  • LOW. Noted in review. Optional. Often cleanup or style.
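The ship-gate half of that rule is mechanical enough to express as code. This is an illustrative sketch, not part of the framework's actual tooling: CRITICAL blocks the deploy outright, and each HIGH gets a default deadline two weeks after launch.

```typescript
// Illustrative sketch of the Buster Call ship gate (not framework code).
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;
  severity: Severity;
}

// CRITICAL blocks ship; everything else is allowed through.
function canShip(findings: Finding[]): boolean {
  return findings.every((f) => f.severity !== "CRITICAL");
}

// Each HIGH gets a deadline: default two weeks after public launch.
function highDeadlines(findings: Finding[], launch: Date): Map<string, Date> {
  const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;
  const out = new Map<string, Date>();
  for (const f of findings) {
    if (f.severity === "HIGH") {
      out.set(f.id, new Date(launch.getTime() + TWO_WEEKS_MS));
    }
  }
  return out;
}
```

The value of writing the rule down this plainly is that there is no judgment call left at ship time.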

House of Riddle v1.0 shipped with 3 CRITICAL fixed (4 hours of work, all same day), 5 HIGH documented with deadlines, 11 MEDIUM queued for P1, 7 LOW noted.

What Stella Review does not cover

Stella Review is not a complete security bar. Scope is defined:

  • Manual pentest. Social engineering, chained exploits, crypto review — not in scope. For products handling real money or PII, bring in a human.
  • Deep accessibility. I run axe and Lighthouse, but manual keyboard and screen-reader passes need human time. P1 list.
  • Load testing. Separate concern. Leaderboard tested at 10k synthetic rows after the N+1 fix; not 100k concurrent sessions.
  • User research. Stella is maker-side validation, not user validation. Different gate.

The real point

This is not an “I am great at security” story. Every finding is catchable by any dev with a slow afternoon and a checklist. Hardcoded secrets, missing RLS, leaky payloads — known failure modes with known fixes.

The story is consistency. Stella Review is a named, mandatory phase on every project, at the same point, with the same checklist. I do not decide whether to audit. I audit because the phase is on the timeline.

Informal “I will look it over before launch” reviews fail exactly when I am tired or deadline-pressured. Named phases do not fail for those reasons. They run, or the phase does not complete.

Lesson

Security as a named phase beats security as a vibes check. If “audit” is not on your timeline, it does not happen.


Key Takeaways

  1. Feature-complete does not mean ship-ready. The bugs that bite in week one — leaked fields, missing RLS, committed secrets — are invisible to “does it work?” testing. They surface only against a structured review checklist.
  2. Classify findings so priorities are explicit, not emotional. CRITICAL blocks ship. HIGH ships with a clock. MEDIUM queues for P1. LOW is optional. Without the tiers, every finding either becomes “must-fix” and you never ship, or “we’ll get to it” and you never do.
  3. Make the review a named phase, not a vibes check. Informal reviews fail under the exact conditions you need them most — late, tired, deadline-pressured. A named phase runs or the phase isn’t done. That’s the whole trick.

Satellite: Lilith Red (security audit) · Lilith Blue (quality audit) · Morgans (this post) · Pipeline: AUDIT — Stella Review → Morgans