Why Playwright fixtures stop scaling at three layers deep

DATE 2026-05-04 READ 3 min WORDS 625 REVISIONS 0

I built a fixture graph in a hurry, four layers deep, and could not unwind it for a sprint and a half. The lesson was structural, not stylistic — and it is the kind of lesson Playwright’s docs cannot give you, because they teach the mechanism, not the cost.

The Playwright fixtures API is a cleanly designed abstraction. You declare a fixture, you depend on it from another fixture, you compose, you reuse. Two layers feels great. Three layers still feels great. At four layers, something gives.[1]

This is a post about why that happens, what the cost looks like in practice, and the refactor I landed on after deleting most of the graph and starting over.

When the graph was still tractable

Before the refactor, three Playwright API surfaces could plausibly carry a fixture graph this size. I tried two of them. The table is what I wrote on the whiteboard before picking test.extend.

API                       | Composition style             | Teardown                  | Verdict
--------------------------|-------------------------------|---------------------------|--------
test.extend               | declarative · auto DI by name | fixture body after use()  | chosen — feels native, but the cost was hidden
base.extend + project.use | inheritance · per-project override | override at config layer | rejected — couples test to project config
worker fixtures           | singleton per worker          | manual at end-of-worker   | rejected for our case — auth state is per-test, not per-worker
# table 1 · three ways to compose Playwright fixtures, weighed at the start
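The worker-fixture verdict in the last row is worth making concrete. A worker-scoped fixture is a singleton shared by every test that worker runs, while the auth state here has to be fresh per test. A minimal sketch of that distinction, with illustrative names that are not from the real suite:

```typescript
// Hypothetical sketch of the worker-fixture verdict: a worker-scoped
// value is created once and then shared by every test in the worker,
// while a test-scoped value is rebuilt for each test.
let workerSession: string | null = null;

function workerScoped(): string {
  // created once, then reused — fine for a browser, stale for auth state
  return (workerSession ??= "session-worker-0");
}

function testScoped(testId: number): string {
  return `session-test-${testId}`; // fresh per test
}

console.log(workerScoped() === workerScoped()); // true — shared
console.log(testScoped(1) === testScoped(2));   // false — isolated
```

Sharing is exactly what you want for the browser process, and exactly what you do not want for a login session.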

Here is the fixture from the project that ran fine for about ten months. Two layers: an authedPage that depends on a browser, and a checkoutPage that depends on the authedPage. Nothing surprising.

# fixtures.ts · 17 lines · view on git
TS
import { test as base, type Page } from "@playwright/test";
import { login } from "./helpers/auth";

// Two layers: browser → authedPage → checkoutPage. Tractable.
export const test = base.extend<{ authedPage: Page; checkoutPage: Page }>({
  authedPage: async ({ browser }, use) => {
    const ctx = await browser.newContext();
    const page = await ctx.newPage();
    await login(page);
    await use(page);
    await ctx.close(); // teardown runs after use() returns
  }, // ~600ms cold start, predictable
  checkoutPage: async ({ authedPage }, use) => {
    await authedPage.goto("/checkout");
    await use(authedPage);
  },
});

The line I want you to look at is the comment at the end of authedPage — ~600ms cold start, predictable. That comment is the only thing in this file that has anything to say about cost, and it is wrong, but wrong in a useful way: the cost was already non-linear, I just couldn’t see it from inside two layers.[2]

What the graph actually looks like

Four-layer fixture graph: browser, authedPage, flow.* fan-out at layer 3, three leaf page-objects at layer 4.
# fig 1 · four layers, sketched — fan-out at L3 is the dominant cost

I had a mental model of the suite. I instrumented the run. The two are different enough that I want to put them next to each other before any diagram.

A · MENTAL MODEL
  • Three layers, one chain.
  • Five fixture nodes. Five edges.
  • Cost is roughly the cold-start of browser.newContext().
  • Adding a test is free.
  • Total edge time: ~2.4s.

B · MEASURED
  • Four layers, branching at the third.
  • Nine nodes. Eleven edges.
  • Cost is dominated by the fan-out at flow.*, not the page object setup.
  • Adding a test costs +2.4s.
  • Total edge time: ~14.7s.
# compare 1 · what I told myself vs what the test runner reported
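The mechanism behind the discrepancy can be sketched as a toy cost model. Test-scoped fixtures are rebuilt for every test, so each test pays the setup cost of every fixture reachable from the one it uses; a fan-out at layer three means every leaf re-pays the chain underneath it. The numbers and fixture names below are illustrative, not the measured ones from my suite:

```typescript
// Hypothetical cost model: a fixture graph as an adjacency list.
// A test pays the setup cost of every fixture reachable from its leaf,
// each fixture counted once per test (Playwright shares a test-scoped
// fixture within one test, but rebuilds it for the next test).
type Graph = Record<string, { deps: string[]; costMs: number }>;

function setupCost(graph: Graph, leaf: string): number {
  const seen = new Set<string>();
  const visit = (name: string): number => {
    if (seen.has(name)) return 0;
    seen.add(name);
    const node = graph[name];
    return node.deps.reduce((sum, dep) => sum + visit(dep), node.costMs);
  };
  return visit(leaf);
}

// Chain (mental model): one path.
const chain: Graph = {
  browser:      { deps: [], costMs: 1800 },
  authedPage:   { deps: ["browser"], costMs: 600 },
  checkoutPage: { deps: ["authedPage"], costMs: 0 },
};

// Fan-out (measured shape): flow.* fixtures branch off authedPage,
// so each test re-pays the whole chain underneath its branch.
const fanOut: Graph = {
  ...chain,
  flowCart:    { deps: ["authedPage"], costMs: 400 },
  flowPayment: { deps: ["authedPage"], costMs: 400 },
  flowReceipt: { deps: ["authedPage"], costMs: 400 },
};

const perTestChain = setupCost(chain, "checkoutPage");
const threeFanOutTests =
  setupCost(fanOut, "flowCart") +
  setupCost(fanOut, "flowPayment") +
  setupCost(fanOut, "flowReceipt");

console.log(perTestChain);      // 2400 — one chain, paid once
console.log(threeFanOutTests);  // 8400 — the chain re-paid per leaf
```

Nothing in the fixture declarations hints at this: the per-test cost scales with the number of leaves times the depth of the shared chain, which is why it only becomes visible once you instrument the run.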

The discrepancy is the whole article.

What I landed on after deleting most of it

The rewrite collapses the four-layer graph to two layers plus a context bag. The context bag is not a fixture — it’s a plain object — and that is the entire trick.

# rewrite.ts · 9 lines · view on git
TS
import type { Page } from "@playwright/test";
// Assumes ./fixtures exports the two-layer authed test shown earlier.
import { test as base } from "./fixtures";

// Two layers + a plain context bag.
// The bag is NOT a fixture. That is the whole trick.
type Ctx = { ctx: { page: Page; order: unknown } };

export const test = base.extend<Ctx>({
  ctx: async ({ authedPage }, use) => {
    const bag = { page: authedPage, order: null };
    await use(bag);
  },
});

The graph is now flat at the test level. Composition happens in plain function calls inside the test body, not inside the fixture chain. Cost is now linear in fixtures, not in test count.
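What “composition in plain function calls” buys you can be sketched without Playwright at all. Each flow step below is an ordinary function over the bag, so adding a step adds zero fixture nodes and zero edges; the helper names and the Bag shape are illustrative, not the real suite’s:

```typescript
// Hypothetical flow helpers over a plain context bag. Adding a step is
// adding a function call, not a fixture layer — the graph stays flat.
type Bag = { url: string; order: string | null };

const goto = (bag: Bag, url: string): Bag => ({ ...bag, url });
const placeOrder = (bag: Bag, id: string): Bag => ({ ...bag, order: id });

// In a real test body this would start from the ctx fixture;
// here it is plain data to keep the sketch self-contained.
let bag: Bag = { url: "/", order: null };
bag = goto(bag, "/checkout");
bag = placeOrder(bag, "A-1");
console.log(bag); // { url: "/checkout", order: "A-1" }
```

The runner never sees these steps, which is the point: they cost whatever the function body costs, once, in the one test that calls them.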