Free · Top 25 questions

Playwright (JavaScript) interview questions

The 25 Playwright JavaScript questions that decide modern E2E interviews — auto-waiting, locators, fixtures and the framework comparisons.

Each answer has three parts — Concept (the real difference), Interview Strategy (how to say it in the room), and Code (a quick glimpse). This is the same format as my paid kits.

Playwright vs Selenium — what is the real difference?

Mid

Concept

The headline difference is architecture. Selenium speaks the W3C WebDriver protocol, historically an HTTP request per command, which adds latency and is why you lean on explicit waits. Playwright keeps a persistent connection to the browser (CDP for Chromium, dedicated protocols for Firefox and WebKit) and bakes auto-waiting into every action through actionability checks, which removes a whole class of flakiness. Playwright also bundles its own browser builds, ships tracing, codegen and network interception in the box, and parallelism is free. Where Selenium still wins: a far larger ecosystem and Grid maturity, the widest language support, and it drives real installed browsers, which some compliance setups require.

Interview Strategy

Lead with architecture: Playwright's persistent connection with built-in auto-waiting versus Selenium's per-command WebDriver round-trips. That one difference explains the flakiness gap, and everything else follows from it. Then be scrupulously fair about where Selenium still wins: the broadest language and grid ecosystem and mature real-device clouds. The senior signal is naming Selenium 4's BiDi protocol, which shows you know the gap has narrowed and you are comparing today's tools, not a stale caricature.

How to phrase it

The real difference is architectural, not cosmetic. Selenium speaks the W3C WebDriver protocol, which historically meant an HTTP request per command, and that latency plus the lack of built-in waiting is why Selenium suites lean so heavily on explicit waits. Playwright keeps a persistent connection to the browser and bakes auto-waiting into every action. That single design choice removes a whole category of flakiness you fight constantly in Selenium. Playwright also bundles its own browser builds, ships tracing, codegen and network interception, and gives you parallelism for free. But I am careful to be fair, because pretending Selenium is obsolete is a junior tell. Selenium still wins on ecosystem and Grid maturity, on the widest language support, and it drives the real installed browser, which some compliance environments require. And Selenium 4 added BiDi and relative locators, so the gap is narrower than people assume. My honest landing is that for greenfield JavaScript or TypeScript web testing, Playwright is the default, but Selenium remains the right call for a polyglot org with a mature Grid investment.

Key points to hit

Selenium: W3C WebDriver, historically HTTP-per-command — hence heavy explicit waits.
Playwright: persistent connection plus built-in auto-waiting removes a class of flakiness.
Playwright bundles browsers and ships tracing, codegen and interception; parallelism is free.
Selenium still wins: ecosystem, Grid maturity, widest language support, real installed browsers.
Selenium 4's BiDi narrows the gap, but Playwright is the greenfield JS or TS default.

Code

// Playwright: auto-waits for the element to be actionable
await page.getByRole("button", { name: "Submit" }).click();

// Selenium (Java) equivalent: the explicit wait is on you
// new WebDriverWait(driver, Duration.ofSeconds(10))
//   .until(ExpectedConditions.elementToBeClickable(
//     By.cssSelector("button[type=submit]"))).click();
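
The bundled browser engines and built-in tracing described above are switched on from a single config file. The following is a minimal sketch, assuming a CommonJS playwright.config.js; the project names and retry counts are illustrative choices, not values taken from the answer above.

```javascript
// playwright.config.js (sketch; project names and retry counts are assumptions)
/** @type {import('@playwright/test').PlaywrightTestConfig} */
const config = {
  // Retry only on CI so local failures stay loud.
  retries: process.env.CI ? 2 : 0,
  // Record a trace the first time a failed test is retried.
  use: { trace: "on-first-retry" },
  // One project per bundled browser engine.
  projects: [
    { name: "chromium", use: { browserName: "chromium" } },
    { name: "firefox", use: { browserName: "firefox" } },
    { name: "webkit", use: { browserName: "webkit" } },
  ],
};
module.exports = config;
```

With a config like this, `npx playwright test --project=webkit` runs only that engine, and omitting the flag runs all three projects.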

Playwright vs Cypress — which would you pick for E2E?

Mid

Concept

The dividing line is where the test runs. Cypress runs inside the browser, in the same event loop as the app, which gives a superb time-travel debugger and direct app-state access. Playwright runs out of process over a protocol, which is what lets it do things Cypress structurally struggles with: true multi-tab and multi-origin flows, multiple browser contexts, and full browser control. Three concrete Playwright wins: first-class WebKit support (the engine behind Safari), which Cypress offers only as an experimental feature; free built-in parallelism, where Cypress's optimal parallel runs are tied to paid Cypress Cloud; and multi-language bindings versus Cypress being JavaScript and TypeScript only. Cypress keeps a real edge in interactive debugging and component-testing experience.

Interview Strategy

Anchor on the architectural split — Playwright drives the browser out-of-process while Cypress runs inside it — because nearly every practical difference follows from that one fact. Then give the three concrete Playwright wins: real WebKit and Safari coverage, free parallelism, and multiple language bindings. The senior signal is conceding Cypress's genuine strengths — its time-travelling debugger and developer experience are excellent — so you sound like an engineer weighing tradeoffs rather than a fan.

How to phrase it

The split that explains everything else is where the test runs. Cypress runs inside the browser, in the same event loop as the application, which is exactly why its time-travel debugger is so good and why it has such direct access to app state. Playwright runs out of process over a protocol, and that is what lets it do the things Cypress structurally finds hard: true multi-tab flows, multi-origin scenarios like an OAuth redirect, multiple browser contexts for multi-user tests. I give three concrete advantages. First, WebKit: Playwright tests the Safari engine as a first-class target, while Cypress's WebKit support is still experimental, which matters for any product with real Safari traffic. Second, parallelism is free and built in, whereas Cypress's good parallel story is tied to paid Cypress Cloud, so the cost compounds on large suites. Third, Playwright has bindings in several languages while Cypress is JavaScript and TypeScript only. I do concede Cypress's strengths, because they are real: the interactive debugging and component testing are genuinely lovely. But my honest landing is that Cypress is delightful for a Chrome-only single-page app, and Playwright is the safer default the moment you need cross-browser, especially Safari.

Key points to hit

Cypress runs in-browser (great debugger, app-state access); Playwright runs out-of-process.
Out-of-process enables true multi-tab, multi-origin and multiple browser contexts.
Playwright tests WebKit, the Safari engine, as a first-class target; Cypress's WebKit support is still experimental.
Playwright parallelism is free; Cypress's optimal parallel runs need paid Cypress Cloud.
Cypress keeps an edge in interactive debugging and component-testing experience.

Code

// Playwright: a multi-origin OAuth popup is structurally easy out-of-process
// Start listening for the popup before the click that opens it.
const popupPromise = page.waitForEvent("popup");
await page.getByRole("button", { name: "Sign in with Google" }).click();
const authPopup = await popupPromise;
await authPopup.getByLabel("Email").fill(process.env.OAUTH_USER);
await authPopup.getByRole("button", { name: "Next" }).click();
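
The multi-user case mentioned above relies on the same out-of-process control: one browser can host several isolated contexts. A minimal sketch, assuming `browser` is Playwright's Browser object (for example the `browser` fixture); the helper name and the user names are hypothetical.

```javascript
// Sketch: two isolated users in one test. `browser` is assumed to be a
// Playwright Browser; openTwoUsers and the user names are illustrative.
async function openTwoUsers(browser) {
  // Each context has its own cookies and storage, like a separate profile.
  const aliceContext = await browser.newContext();
  const bobContext = await browser.newContext();
  const alicePage = await aliceContext.newPage();
  const bobPage = await bobContext.newPage();
  // Hand back a cleanup function so the caller can close both contexts.
  const close = async () => {
    await aliceContext.close();
    await bobContext.close();
  };
  return { alicePage, bobPage, close };
}
```

A chat or approval flow can then act as one user on `alicePage` and assert what the other sees on `bobPage`, without either session leaking into the other.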

Playwright vs Puppeteer — aren't they the same team?

Mid

Concept

They share DNA — the Playwright team at Microsoft includes engineers who built Puppeteer at Google — so the APIs feel familiar. The difference is about scope. Puppeteer is a browser-automation library, Chromium-first with only limited Firefox support, and it has no built-in test runner — you bring Jest or Mocha and assemble the harness yourself. Playwright is a full testing framework: cross-browser across Chromium, Firefox and WebKit, with its own test runner, fixtures, parallelism and retries, plus auto-waiting, tracing and codegen. The practical rule is by job: reach for Puppeteer for scraping, PDF generation or single-browser scripting, and reach for Playwright for cross-browser end-to-end testing.

Interview Strategy

Note the shared lineage up front — the same team, a common ancestry — so you sound informed rather than dismissive, then frame the real difference as scope: Puppeteer is a browser-automation library, Playwright is a full testing framework with a runner, fixtures and assertions. The senior signal is the use-case rule that shows judgement — Puppeteer for scraping, PDF generation and Chrome-centric scripting, Playwright for a cross-browser end-to-end suite that needs structure.

How to phrase it

They share DNA — the Playwright team at Microsoft includes engineers who built Puppeteer at Google — so the two APIs feel familiar if you have used either. The difference is really about scope. Puppeteer is a browser-automation library, Chromium-first with only limited Firefox support, and it has no built-in test runner. If you want to write tests with it, you bring your own — Jest or Mocha — and assemble the harness, the waiting and the parallelism yourself. Playwright is a full testing framework. It is cross-browser across Chromium, Firefox and WebKit; it ships its own test runner with fixtures, parallelism and retries; and it has auto-waiting, tracing and codegen built in. So the practical rule I give is about the job. If I want a lightweight Chromium library for scraping, generating PDFs or a one-off single-browser script, Puppeteer is a perfectly good, focused choice. If I am building a cross-browser end-to-end suite, Playwright is the natural pick because everything a test suite needs is in the box rather than bolted on afterwards. The senior framing is that this is not better versus worse, it is library versus framework, and you choose by what you are actually building.

Key points to hit

Shared lineage — the Playwright team includes ex-Puppeteer engineers, so the APIs feel similar.
Puppeteer: a Chromium-first library, limited Firefox, no built-in test runner.
Playwright: a full framework — cross-browser, own runner, fixtures, tracing, codegen.
Use Puppeteer for scraping, PDF and single-browser scripting.
Use Playwright for cross-browser end-to-end testing — runner and waiting are built in.

Code

// Playwright Test: runner plus assertions in one
import { test, expect } from "@playwright/test";

test("home loads", async ({ page }) => {
  await page.goto("/");
  await expect(page).toHaveTitle(/Home/);
});

JavaScript vs TypeScript for Playwright — does it matter?

Junior

Concept

JavaScript is the language Playwright shipped with first and the native language of the web. The practical advantage is that there is no compilation step: tests run directly under Node, with no tsc stage to slow the feedback loop, and the whole team can read them. The honest tradeoff against TypeScript is the loss of compile-time type checking, which catches typos in locators, fixture names and config before the test runs. Playwright runs .ts files directly via its own transpilation, so 'TypeScript is harder to set up' is no longer true. That transpilation only strips the types and does not type-check, so a TypeScript suite still needs a tsc --noEmit step, typically in CI, to actually get those checks. In 2026, TypeScript is the greenfield default; JavaScript still wins for mixed-skill teams, small suites, or codebases with no build tooling yet.

Interview Strategy

Lead with the two things interviewers actually reward — JavaScript is the native language of the web and there is zero build step between writing a test and running it. Then volunteer the TypeScript tradeoff openly rather than pretending JavaScript is strictly better; that honesty is the senior signal. Avoid the lazy 'JavaScript is easier' line, and frame it as the path of least resistance for a mixed front-end and QA team sharing one language and one toolchain.

How to phrase it

JavaScript is the language Playwright was built for first, and the language of the web, so it is a frictionless choice. The biggest day-to-day advantage is that there is no compilation step — I run tests straight under Node, there is no tsc stage in the pipeline, and the feedback loop stays fast. On a team where front-end developers also contribute tests, JavaScript is the path of least resistance because everyone already knows it. The honest tradeoff is that I lose compile-time type checking, and I do not pretend that away — I compensate with JSDoc annotations on every public method, ESLint with the Playwright rules, and proper code review of the page-object interfaces. I would also correct one stale myth: Playwright runs TypeScript files directly through its own transpilation, so the old 'TypeScript needs a build step' objection is gone. The 2026 reality is that TypeScript has become the default for greenfield Playwright projects, so I am clear about when I would still pick plain JavaScript: a mixed-skill team, a small suite, or a codebase with no build tooling yet.

Key points to hit

JavaScript is Playwright's first-class language and needs no compilation step.
TypeScript adds compile-time checks that catch typos in locators, fixtures and config.
Playwright runs .ts directly via its own transpilation, with no separate build to set up; it does not type-check, so run tsc --noEmit alongside.
Offset JavaScript's missing types with JSDoc, ESLint and code review.
2026 reality: TypeScript is the greenfield default; JavaScript wins for mixed teams and zero-build setups.

Code

// Plain JS still gets typed autocomplete via JSDoc
/** @type {import('@playwright/test').PlaywrightTestConfig} */
const config = { use: { baseURL: "https://example.com" } };
module.exports = config;
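
The JSDoc compensation described above extends naturally to page objects. A minimal sketch, assuming a hypothetical LoginPage with "Email", "Password" and "Sign in" labels; none of these names come from a real application.

```javascript
// @ts-check
// Sketch of a JSDoc-typed page object; LoginPage and its labels are hypothetical.
class LoginPage {
  /** @param {import('@playwright/test').Page} page */
  constructor(page) {
    this.page = page;
    this.email = page.getByLabel("Email");
    this.password = page.getByLabel("Password");
    this.submit = page.getByRole("button", { name: "Sign in" });
  }

  /**
   * Fill the form and submit it.
   * @param {string} email
   * @param {string} password
   * @returns {Promise<void>}
   */
  async login(email, password) {
    await this.email.fill(email);
    await this.password.fill(password);
    await this.submit.click();
  }
}
module.exports = { LoginPage };
```

With the `// @ts-check` comment on the first line of the file, editors backed by the TypeScript language service should flag a call such as `login(42, "x")` even though the file stays plain JavaScript.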

locator vs ElementHandle (page.$) — which should you use?

Mid

Concept

An ElementHandle (from page.$ or page.$$) is a reference to a specific DOM node captured at one moment. A Locator (from getByRole, locator() and friends) is a lazy description of how to find an element, re-resolved every time you use it. That difference is everything: because a handle points at a node that may be re-rendered, it goes stale — the React-style re-render that swaps the node out is exactly the Selenium stale-element problem. A locator re-queries on each action, so it is immune to staleness and participates in auto-waiting and strict mode. Playwright explicitly discourages ElementHandle; the only times you touch a handle are rare low-level needs.

Interview Strategy

Lead with the one-line distinction — a handle is a captured node, a locator is a lazy, re-resolved description — because everything else follows from it, then land the payoff: locators are immune to the stale-element problem that plagued Selenium. The senior signal is connecting locators to auto-waiting and strict mode, and naming that Playwright actively discourages ElementHandle. The follow-up to pre-empt is whether a handle is ever justified — concede the rare low-level case while keeping it an escape hatch, not a default.

How to phrase it

An ElementHandle, from page-dollar or page-dollar-dollar, is a reference to a specific DOM node captured at one moment in time. A Locator, from getByRole or the locator method, is a lazy description of how to find an element, and it gets re-resolved every single time I use it. That difference is the whole story. Because a handle points at a node that may be re-rendered, it goes stale — and the React-style re-render that swaps the node out is exactly the stale-element exception that plagued Selenium suites. A locator re-queries on each action, so it is simply immune to that, and it also participates in auto-waiting and strict mode, which a raw handle does not. Playwright explicitly discourages ElementHandle in favour of locators, so I never reach for page-dollar by habit. To pre-empt the follow-up about whether a handle is ever justified: yes, rarely, for a genuinely low-level need where I have to operate on a captured node directly, and even then it is an escape hatch I reach for deliberately, not a default. The line I land for the interviewer is that locators are instructions, not references, and that single design choice is why Playwright removed an entire category of flakiness.

Key points to hit

ElementHandle (page.$ or page.$$) is a captured DOM node; a Locator is a lazy, re-resolved description.
Handles go stale on re-render — the classic Selenium stale-element problem.
Locators re-query each use, so they are staleness-immune and join auto-waiting and strict mode.
Playwright actively discourages ElementHandle in favour of locators.
A handle is justified only for rare low-level needs — an escape hatch, not a default.

Code

// ElementHandle — captured node, can go stale
// const el = await page.$("#total"); // discouraged

// Locator — lazy, re-resolved every use, auto-waiting, strict
const total = page.locator("#total");
await expect(total).toHaveText("4000");
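
The captured-node versus lazy-description distinction can be shown without a browser at all. This is a toy model only: a Map stands in for the DOM, and nothing here reflects how Playwright is actually implemented.

```javascript
// Toy model only: a Map stands in for the DOM. Not Playwright's implementation.
const dom = new Map([["total", { text: "0" }]]);

const handle = dom.get("total");        // captured node, like page.$("#total")
const locator = () => dom.get("total"); // lazy description, re-resolved on every use

// A re-render replaces the node with a new object.
dom.set("total", { text: "4000" });

const staleText = handle.text;    // still reads the old, detached node
const freshText = locator().text; // finds the node that is in the "DOM" now
console.log(staleText, freshText); // prints: 0 4000
```

The handle keeps pointing at the object it captured, while the locator function looks the node up again each time it is called, which is the property that makes real locators immune to staleness.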

getByRole vs CSS or XPath locators?

Mid

Concept

CSS and XPath target the DOM structure, so they break when markup or class names change. getByRole queries the accessibility tree — the role and accessible name a user actually perceives — so it is more stable and doubles as an accessibility check. Playwright's recommended priority is getByRole, then getByLabel, getByPlaceholder, getByText, getByTestId, and CSS or XPath only as a last resort. Leading with getByRole is deliberate: if getByRole cannot find the control, a screen reader cannot either, so the locator strategy and accessibility testing turn out to be the same activity.

Interview Strategy

Lead with the priority ladder as a deliberate order — getByRole first, then the other user-facing locators, then getByTestId, and CSS or XPath only as a last resort — because reciting it in order signals you have internalised the guidance. The trap to avoid is presenting all locators as equal options; name the brittle-CSS-first habit as the anti-pattern. The senior signal is the why: role locators survive markup churn and double as an accessibility check.

How to phrase it

I treat Playwright's locators as a priority ladder, not a menu. At the top is getByRole, which I use for almost everything interactive, because it finds the element by its accessible role and name — the same way a screen reader would — and it survives the styling and markup churn that constantly breaks a CSS path. Below that come the other user-facing locators, getByLabel especially for form fields, then getByText and getByPlaceholder. Only when none of those gives a clean handle do I reach for getByTestId, and CSS or XPath is the genuine last resort. The reason for the order is resilience and meaning. As a bonus, getByRole doubles as a small accessibility check, because if Playwright cannot find the element by role, a screen-reader user cannot either — that is a real bug, not just a flaky selector. The anti-pattern I avoid is reaching for a brittle CSS chain first out of Selenium habit. To pre-empt the obvious follow-up about when CSS is acceptable: it is fine when there is genuinely no accessible handle and the selector is short and attribute-based, but I prefer getByTestId as the escape hatch because a test id is an explicit contract rather than an accident of structure.

Key points to hit

Priority: getByRole, getByLabel, getByPlaceholder, getByText, getByTestId, then CSS or XPath last.
getByRole queries the accessibility tree, so it survives markup and styling churn.
A getByRole failure is a real accessibility bug, not just a flaky selector.
Trap to avoid: reaching for a brittle CSS chain first out of Selenium habit.
When CSS is unavoidable, keep it short and attribute-based; prefer getByTestId as the escape hatch.

Code

// Brittle — couples to DOM structure
// page.locator("div.btn-primary > span");
// Stable, user-facing
await page.getByRole("button", { name: "Submit" }).click();
await page.getByLabel("Email").fill("a@test.invalid");

Web-first assertions vs manual waits before asserting?

Mid

Concept

Web-first assertions like expect(locator).toBeVisible() and toHaveText() retry automatically, re-checking roughly every 100 milliseconds until the condition holds or the expect timeout (5 seconds by default) expires. Immediate assertions like expect(value).toBe() evaluate once with no retry. The rule is simple: assert on a UI element with a web-first assertion, and assert on a value you already hold with an immediate one. The classic mistake is calling textContent() into a variable and asserting on the string, which snapshots one moment and races the UI, whereas toHaveText() retries until the text matches.

Interview Strategy

State the rule in one line — UI state that may still be settling gets a web-first assertion, an already-retrieved value gets an immediate one — then show the textContent-then-assert race that proves you have hit it. The senior signal is calling that race the single most common assertion mistake you see in code review, because it explains a huge share of flaky tests in one concrete, recognisable pattern.

How to phrase it

Playwright has two assertion families and the difference is retry. Web-first assertions like toBeVisible, toHaveText and toHaveURL retry automatically — they re-check about every hundred milliseconds until the condition is met or the expect timeout runs out. Immediate assertions like toBe and toHaveLength evaluate once with no retry. My rule is simple: if I am asserting on a UI element, I use a web-first assertion, and if I am asserting on a value I already have in hand — a response status, a string, a number — I use an immediate one, because there is nothing to wait for. The most common assertion mistake I catch in review is mixing those up: someone calls textContent and then asserts on the returned string, which is a race condition, because if the element updates a hundred milliseconds later the assertion has already run and failed. The fix is to use toHaveText, which retries until the text matches. So the heuristic is: live UI state retries, captured values do not.

Key points to hit

Web-first assertions (toBeVisible, toHaveText, toHaveURL) retry every ~100ms until the timeout.
Immediate assertions (toBe, toHaveLength) evaluate once, no retry.
Assert UI elements web-first; assert already-retrieved values immediately.
textContent() then assert is a race condition — use toHaveText() instead.
Heuristic: live UI state retries, captured values do not.

Code

// WRONG — races the UI by snapshotting one moment
// const t = await page.locator("#status").textContent();
// expect(t).toBe("Done");

// CORRECT — web-first, retries until it matches
await expect(page.locator("#status")).toHaveText("Done");
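
To make the retry behaviour concrete, here is a simplified model of a retrying assertion in plain Node. The helper name, its defaults and the polling loop are assumptions for illustration; Playwright's real implementation differs.

```javascript
// Simplified model of a retrying assertion. The helper name, defaults and
// polling loop are assumptions for illustration, not Playwright's internals.
async function expectEventually(read, expected, { timeout = 5000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout;
  for (;;) {
    const actual = await read();
    if (actual === expected) return; // condition met: stop waiting immediately
    if (Date.now() >= deadline) {
      throw new Error(`Expected "${expected}" but last saw "${actual}"`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// A fake UI whose status settles 250 ms after the "click".
let status = "Loading";
setTimeout(() => { status = "Done"; }, 250);

const snapshot = status; // one-shot read: captures "Loading" and never updates
const retried = expectEventually(() => status, "Done", { timeout: 2000 });
```

Here `snapshot` plays the role of the textContent() result, frozen at the moment it was read, while `retried` resolves once the status has settled, which is the behaviour toHaveText() gives you against a real page.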

Auto-waiting vs explicit waits — does Playwright still need waits?

Junior

Concept

Before every action Playwright runs actionability checks — the element must be attached, visible, stable and enabled — so it auto-waits without you writing anything, and web-first assertions retry the same way. Explicit waits are for the rare cases the action API cannot infer: waitForResponse to sync on a specific API call, waitForLoadState for a navigation, or locator.waitFor to gate on a state change like a spinner becoming hidden. The one thing you never use is a hard sleep, because it is both slow and flaky.

Interview Strategy

Say plainly that roughly ninety percent of waits disappear because click, fill and assertions wait by default, then narrow the scope: explicit waits are for the gap. Name the legitimate ones — waitForResponse, waitForLoadState — so you show you know when auto-waiting is not enough. Stress that you never use a hard sleep, which is the line that separates you from candidates who 'just add a sleep'.

How to phrase it

Most of the time I write no wait at all, because Playwright runs actionability checks before every action — the element has to be attached, visible, stable and enabled — and web-first assertions retry until their condition holds. So the click waits for the button to be clickable, the assertion waits for the text to appear, and roughly ninety percent of the waits you would write in Selenium just vanish. The explicit waits I do keep are for things the action API cannot infer. waitForResponse when I need to synchronise on a specific API call completing. waitForLoadState when I am gating on a navigation. And locator dot waitFor when I need an element to reach a particular state — visible, hidden, attached or detached — without acting on it, like waiting for a loading spinner to become hidden before I continue. The one thing I never do is a fixed sleep, because it is both slow, since I always pay the full wait, and flaky, since the element may still not be ready. So my framing is: auto-waiting and assertions by default, named explicit waits only for the genuine gaps, and never a hard sleep.

Key points to hit

Actionability checks (attached, visible, stable, enabled) mean actions and assertions auto-wait.
Roughly 90 percent of Selenium-style waits disappear.
Legitimate explicit waits: waitForResponse, waitForLoadState, locator.waitFor for a state change.
locator.waitFor gates on visible, hidden, attached or detached without acting on the element.
Never use a hard sleep — it is both slow and flaky.

Code

// No manual wait needed — click and assert both wait
await page.getByRole("button", { name: "Load" }).click();
await expect(page.getByText("Loaded")).toBeVisible();
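// Explicit only when you must sync on the network.
// Start waiting before the action that triggers the request, or the response can be missed.
const ordersResponse = page.waitForResponse("**/api/orders");
await page.getByRole("button", { name: "Refresh orders" }).click(); // hypothetical button
await ordersResponse;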
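
The spinner case mentioned above is the typical use of locator.waitFor. A minimal sketch, assuming the app exposes its loading indicator with the "progressbar" role and that "/dashboard" is a valid path; both are assumptions about a hypothetical app.

```javascript
// Sketch: gate on a state change without acting on the element.
// The "progressbar" role and the "/dashboard" path are assumptions about the app.
async function openDashboard(page) {
  await page.goto("/dashboard");
  // Resolves once the spinner is hidden or no longer in the DOM.
  await page.getByRole("progressbar").waitFor({ state: "hidden" });
}
```

One caveat: waiting for "hidden" also resolves immediately if the spinner has not rendered yet, so in most tests a web-first assertion on the content that replaces the spinner is the stronger gate.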

waitForTimeout (sleep) vs auto-waiting — why is sleep an anti-pattern?

Junior

Concept

page.waitForTimeout() pauses for a fixed time regardless of page state, so it is both slow — you always pay the full wait — and flaky, because the element may still not be ready when the timer fires. Playwright's auto-waiting continues the instant the element is actionable, up to a timeout, so a deterministic assertion waits exactly as long as needed and no longer. A hard sleep is the single biggest cause of slow, flaky suites.

Interview Strategy

Call waitForTimeout what it is — a hard sleep — and say you only ever use it to debug locally, never in committed tests. Explain that it is the biggest cause of slow, flaky suites, because it is both too long when the page is ready and too short when it is not. That framing alone separates you from candidates who reach for a sleep by reflex.

How to phrase it

waitForTimeout is just a hard sleep, and I treat it as an anti-pattern in committed tests. The problem is that it pauses for a fixed time regardless of what the page is actually doing, which makes it wrong in both directions. It is slow, because I always pay the full wait even when the element was ready a hundred milliseconds in. And it is flaky, because if the page is having a bad day, the element still is not ready when my timer fires, and the test fails anyway. So a fixed sleep is the worst of both worlds, and it is the single biggest cause of slow, flaky suites I see. The correct approach is to let Playwright's auto-waiting do the work: a web-first assertion like toBeVisible waits exactly as long as the element needs, up to the timeout, and continues the instant it is ready. The only place I will type waitForTimeout is locally, for a second, when I am eyeballing what a page is doing during debugging — and it never gets committed.

Key points to hit

waitForTimeout pauses for a fixed time regardless of page state.
It is slow — you always pay the full wait — and flaky — the element may still not be ready.
Auto-waiting continues the instant the element is actionable, up to a timeout.
A hard sleep is the biggest cause of slow, flaky suites.
Only acceptable for local debugging, never in committed tests.

Code

// Flaky and slow
// await page.waitForTimeout(5000);
// Deterministic — waits exactly as long as needed
await expect(page.getByText("Payment successful")).toBeVisible();

page.route mocking vs hitting the real backend?

Mid

Concept

Hitting the real backend tests the full integration but is slow and flaky, and you cannot easily force error or edge cases. page.route intercepts requests so you can stub responses with route.fulfill — fast, deterministic, and able to simulate a 500, a 401, a 429 or an empty list on demand, all states that real infrastructure cannot trigger reliably. The senior view is balance: mock for determinism and hard-to-reach states, but keep a smaller set of true end-to-end tests against a real backend so a fully-mocked suite cannot go green while the real integration is broken.

Interview Strategy

Frame this as a balance question, because the seniority signal is refusing both extremes. Lead with the legitimate reasons to mock — determinism, error and empty states real infra cannot produce on demand, speed, front-end isolation — then make over-mocking the centrepiece: a fully-mocked suite can be green while the real integration is broken. The follow-up to pre-empt is how you catch the mock drifting from reality — a smaller real-backend layer and contract tests.

How to phrase it

I treat mocking as a tool with a real cost, so the answer is about balance. I mock with page dot route when I need determinism — a flaky or rate-limited third party — or when I need to reach a state I cannot easily produce, like a five-hundred, a four-oh-one session expiry, a four-twenty-nine rate limit, or an empty result. Those error states are exactly the ones real infrastructure cannot fail on demand, and route dot fulfill makes each one instant and repeatable. The danger I am always conscious of is over-mocking. If every test mocks the backend, the whole suite can be green while the real integration is quietly broken, because the mock encodes what I think the API returns, and when the real API drifts, the mock just keeps lying convincingly. So the mitigation is a layered strategy: many fast tests with mocks for breadth and front-end logic, plus a smaller set of true end-to-end tests against a real backend that would actually catch contract drift, and ideally contract tests verifying the mock's shape still matches reality. The discipline is to mock deliberately, for a stated reason, never reflexively, and always keep enough unmocked coverage that a breaking backend change cannot sail through green.

Key points to hit

Mock for determinism, hard-to-reach error states, speed and front-end isolation.
route.fulfill simulates 500, 401, 429 and empty states real infra cannot trigger on demand.
Over-mocking risk: a fully-mocked suite can be green while the real integration is broken.
Mitigation: layered strategy — many mocked tests plus a smaller real-backend E2E set.
Mock deliberately for a reason, never reflexively; keep enough unmocked coverage to catch drift.

Code

await page.route("**/api/cart", (route) =>
  route.fulfill({ json: { items: [], total: 0 } })
);
await page.goto("/cart");
await expect(page.getByText("Your cart is empty")).toBeVisible();
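
The same route.fulfill call covers the error states that real infrastructure cannot produce on demand. A minimal sketch; the helper name and the error body are hypothetical, and "**/api/cart" simply reuses the glob from the example above.

```javascript
// Sketch: force an error status on demand. The helper name and the error
// body are assumptions; "**/api/cart" reuses the glob from the example above.
async function mockCartStatus(page, status) {
  await page.route("**/api/cart", (route) =>
    route.fulfill({ status, json: { error: `simulated ${status}` } })
  );
}
```

A test would call `await mockCartStatus(page, 500)` before `page.goto("/cart")` and then assert on whatever error UI the app renders; 401 and 429 follow the same shape with a different status.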

Fixtures vs beforeEach — how do you set up tests?

Senior

Concept

A fixture is the right tool for providing a dependency the test uses (a logged-in page, a seeded user, a configured client) because it returns a value, composes with other fixtures, instantiates lazily (only when requested), and keeps its teardown next to its setup. Everything after the use() call is teardown, and it runs even when the test fails. A hook like beforeEach is for imperative lifecycle steps tied to a group of tests that do not produce a value the test consumes, such as a shared starting navigation or resetting state. beforeEach always runs, cannot return a value cleanly, and splits setup from cleanup across two hooks that drift.

Interview Strategy

Lead with the clean division — fixtures provide dependencies tests consume, hooks do imperative lifecycle steps that return nothing — because the choice is the whole question. The senior signal is the teardown-after-use detail and naming auto fixtures as the bridge when you want hook-like ubiquity plus fixture cleanup. The follow-up to pre-empt is whether you ever use beforeEach — yes, for ambient actions like a shared starting navigation.

How to phrase it

Both run setup, but they suit different jobs and I am deliberate about which. A fixture is for providing a dependency the test actually uses — a logged-in page, a seeded user, a configured client — because it returns a value, it composes with other fixtures, it instantiates lazily so it only runs when a test requests it, and its teardown lives right with its setup so they cannot drift apart. The detail I always point to is teardown: whatever comes after the use call in a fixture runs after the test even when the test fails, so that is where I close a context or clean up, and nothing leaks from a broken test into the next one. A hook like beforeEach is for imperative lifecycle steps tied to a group of tests that do not produce a value the test consumes — navigating to a starting page, resetting some state. So my preference is fixtures for things tests depend on, hooks for ambient actions, with auto fixtures bridging the gap when I want hook-like ubiquity but still want the guaranteed cleanup. To pre-empt the follow-up about whether I ever use beforeEach: yes, for exactly those ambient actions — a shared starting navigation is cleaner as a beforeEach than as a fixture. The anti-pattern I push back on is a tangle of beforeEach hooks doing setup that should really be fixtures, because they cannot return values and they split setup from cleanup across two hooks that drift.

Key points to hit

Fixtures provide dependencies tests use — return a value, compose, lazy, teardown lives with setup.
Code after use() is teardown and runs even when the test fails — guaranteed cleanup.
Hooks (beforeEach, afterEach) do imperative lifecycle steps that return no consumed value.
Prefer fixtures for things tests depend on, hooks for ambient actions like a shared navigation.
Auto fixtures bridge the gap: hook-like ubiquity plus fixture cleanup.

Code

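import { test as base } from "@playwright/test";
import { CartPage } from "./cart-page"; // hypothetical page object module
export const test = base.extend({
  cartPage: async ({ page }, use) => {
    await page.goto("/cart");
    await use(new CartPage(page));
    // teardown here runs even if the test failed
  },
});
// test("...", async ({ cartPage }) => { ... });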
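
To make the teardown guarantee concrete, here is a minimal plain-JavaScript model of a fixture runner. It is a sketch, not Playwright's implementation: runWithFixture, cartFixture and the events array are hypothetical names used only for illustration. It shows why code placed after use() still runs when the test body throws.

```javascript
// Plain-JavaScript model of a fixture runner; NOT Playwright's implementation.
async function runWithFixture(fixture, testBody) {
  let testError = null;
  // The runner hands `use` to the fixture; calling it runs the test body.
  await fixture(async (value) => {
    try {
      await testBody(value);
    } catch (err) {
      testError = err; // record the failure, then let the fixture resume
    }
  });
  return testError === null ? "passed" : "failed";
}

const events = [];

// Hypothetical fixture: set up, hand a value to the test, then tear down.
async function cartFixture(use) {
  events.push("setup");
  await use({ items: ["kit"] });
  events.push("teardown"); // reached even though the test body threw
}

// A test body that fails after receiving the fixture value.
const outcome = runWithFixture(cartFixture, async (cart) => {
  events.push(`test saw ${cart.items.length} item`);
  throw new Error("assertion failed");
});
```

Because the runner catches the test failure before resuming the fixture, setup and cleanup stay in one function and cannot drift apart the way a beforeEach and afterEach pair can.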

test.describe.serial vs parallel — when do you go serial?

Mid

Concept

By default Playwright runs files in parallel and, with full parallelism, tests within a file too — each in its own isolated browser context. describe.serial forces tests in a block to run in order and stops the rest if one fails, because they share state and depend on each other. Parallel is the goal: isolated tests are faster and more reliable. Serial is for tests that genuinely must share state, like a multi-step wizard you cannot reset between steps — and even then it is a smell to refactor when you can, since one failure cascades.

Interview Strategy

Say parallel is the default and the goal, because isolated tests are faster and more reliable. Use serial only when tests genuinely must share state — a multi-step wizard you cannot reset between steps. The senior signal is treating a serial block as a smell to refactor when you can, since one failure cascades through every test after it.

How to phrase it

Parallel is both the default and what I aim for. Playwright runs files in parallel, and with full parallelism it runs the tests inside a file in parallel too, each in its own isolated browser context, which is what makes them fast and reliable — no test can leak state into another. describe dot serial changes that: it forces the tests in a block to run in order, and if one fails it skips the rest, because the tests share state and each one depends on the previous step having succeeded. I use serial only when tests genuinely must share state and cannot be isolated — the classic case is a multi-step wizard where step two depends on step one and there is no clean way to reset between them. But I am honest that a serial block is a smell. One failure cascades and skips everything after it, so I get less signal, and the block cannot be parallelised. So where I can, I refactor: set up the precondition through the API so each step becomes an independent, isolated test rather than a chain. The senior framing is that serial is an occasionally necessary exception, not a convenience I reach for to avoid proper isolation.

Key points to hit

Parallel is the default — files in parallel, and with full parallelism, tests within a file too.
Each parallel test runs in its own isolated browser context.
describe.serial runs tests in order and skips the rest on the first failure.
Use serial only for tests that genuinely share state, like a multi-step wizard.
Treat a serial block as a smell — one failure cascades; refactor to isolated tests where you can.

Code

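test.describe.serial("checkout wizard", () => {
  let page; // shared by every step; the built-in page fixture is recreated per test
  test.beforeAll(async ({ browser }) => { page = await browser.newPage(); });
  test.afterAll(async () => { await page.close(); });
  test("step 1: address", async () => { /* ... */ });
  test("step 2: payment", async () => { /* ... */ });
});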
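
The cascade is easier to see in a small model. The sketch below is plain JavaScript that imitates the serial semantics described above (run in order, skip everything after the first failure); runSerial and the step names are hypothetical and this is not how Playwright is implemented internally.

```javascript
// Model of serial-mode semantics: run in order, skip everything after the first failure.
function runSerial(steps) {
  const results = {};
  let failed = false;
  for (const [name, body] of steps) {
    if (failed) {
      results[name] = "skipped"; // later steps never run
      continue;
    }
    try {
      body();
      results[name] = "passed";
    } catch {
      failed = true;
      results[name] = "failed";
    }
  }
  return results;
}

// Four dependent steps; the second one fails.
const wizard = runSerial([
  ["step 1: address", () => {}],
  ["step 2: payment", () => { throw new Error("card declined"); }],
  ["step 3: review", () => {}],
  ["step 4: confirm", () => {}],
]);
```

One failure costs the signal from two further steps, which is the argument for seeding each step's precondition through the API and keeping the tests independent. In current Playwright versions the same mode can also be set with test.describe.configure({ mode: "serial" }).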

Workers vs shards — how do you scale the run?

Senior

Concept

Workers are parallel processes on one machine — raise the worker count to use more cores locally or in a single CI job. Shards split the whole test set across multiple machines or CI jobs, each running a fraction, then you merge the blob reports into one HTML report. The two are complementary axes: workers scale you up within a machine, shards scale you out across machines. On CI you combine both — several jobs each with --shard, each using multiple workers — and merge the reports at the end.

Interview Strategy

Say workers scale you up within a machine and shards scale you out across machines — two complementary axes, not alternatives. The senior signal is naming the merge step: several jobs each with --shard, each using multiple workers, then merge-reports to combine the blob reports into one HTML report. Naming that merge shows you have actually wired this into a pipeline rather than read about it.

How to phrase it

Workers and shards are two different axes of scaling, and on a real pipeline I use both. Workers are parallel processes on a single machine — when I bump the worker count, I am using more of that machine's cores, whether that is locally or inside one CI job. Sharding is different: it splits the entire test set across multiple machines or CI jobs, so each shard runs only its fraction of the tests. So the mental model is that workers scale me up within a machine, and shards scale me out across machines. On CI I combine them: I set up a matrix of, say, four jobs, each running with shard equals one-of-four, two-of-four, and so on, and each of those jobs also runs with multiple workers internally. The step people forget, and the one that shows I have done this for real, is the merge at the end: each shard emits a blob report, and I run playwright merge-reports across all the blob reports to produce a single, unified HTML report for the whole run. Without that merge you get four partial reports and no overall picture.

Key points to hit

Workers are parallel processes on one machine — scale up by using more cores.
Shards split the whole test set across machines or CI jobs — scale out.
They are complementary axes, not alternatives.
On CI combine both: a matrix of --shard jobs, each with multiple workers.
Merge the per-shard blob reports with merge-reports into one HTML report.

Code

// One CI matrix job of four (blob reporter so the shards can be merged)
// npx playwright test --shard=2/4 --workers=4 --reporter=blob
// Then, with every shard's blob report collected into one directory:
// npx playwright merge-reports --reporter=html ./blob-report
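
The same setup can live in the config instead of on the command line. This is a minimal sketch written as a plain CommonJS object so it has no imports; the worker count of 4 is an illustrative assumption, not a recommendation.

```javascript
// playwright.config.js (sketch) as a plain object, so no imports are needed.
const config = {
  // Let tests inside a file run in parallel too, not just separate files.
  fullyParallel: true,
  // Workers scale up within one machine; 4 is an illustrative number.
  workers: process.env.CI ? 4 : undefined,
  // On CI each shard writes a blob report so merge-reports can combine them;
  // locally the HTML report is produced directly.
  reporter: process.env.CI ? "blob" : "html",
};

module.exports = config;
```

With this in place each matrix job only needs its own --shard=N/4 flag, and the final job runs merge-reports over the collected blob reports.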

Trace vs video vs screenshot — what do you capture on failure?

Mid

Concept

A screenshot is a single image at one moment. A video is a recording of the run. A trace is the richest: a step-by-step timeline with DOM snapshots, network, console and before-and-after views you open in the Trace Viewer and step through. The standard configuration is trace on-first-retry (or retain-on-failure when retries are off), screenshot only-on-failure, and video retain-on-failure, so passing runs keep no artifacts but failures ship with everything. Note that on-first-retry records only when a failed test is retried, so it needs retries enabled. The trace is what you actually debug with, because it lets you replay a failure without re-running it.

Interview Strategy

Lead with the configuration that signals real CI experience: trace on-first-retry, screenshot only-on-failure, video retain-on-failure, so passing runs keep no artifacts but failures give you everything. The senior signal is stressing that the trace is what you actually debug with: it lets you replay the failure in the Trace Viewer without re-running it, which is decisive for a CI-only failure you cannot reproduce locally.

How to phrase it

These three differ in richness, and I capture them differently. A screenshot is a single image at the failing moment: cheap, but it only tells me the end state. A video is a recording of the whole run, useful for seeing the sequence but awkward to inspect. A trace is by far the richest: it is a step-by-step timeline with DOM snapshots, network activity, the console, and before-and-after views for every action, and I open it in the Trace Viewer and step through it. So my standard config is trace set to on-first-retry, screenshot set to only-on-failure, and video set to retain-on-failure. That way a passing run keeps no artifacts, but when something fails I get a screenshot and a video from the failing attempt and a full trace from the retry. If retries are off, I switch the trace to retain-on-failure so the first failure is still traced. The part I emphasise is that the trace is what I genuinely debug with, because it lets me replay the failure exactly as it happened without re-running the test. That matters most for a CI-only failure I cannot reproduce locally: the trace ships with the DOM snapshot at the failing moment, which is usually enough to diagnose it from the report alone.

Key points to hit

Screenshot is one image; video is a recording; trace is a step-by-step timeline with DOM, network and console.
Config: trace on-first-retry (needs retries enabled), screenshot only-on-failure, video retain-on-failure.
Passing runs stay cheap; failures ship with everything.
The trace is what you actually debug with — replay the failure without re-running it.
Decisive for a CI-only failure you cannot reproduce locally.

Code

// playwright.config.js
use: {
  trace: "on-first-retry",
  screenshot: "only-on-failure",
  video: "retain-on-failure",
}
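
Because on-first-retry only records when a failed test is retried, the trace mode and the retry count belong together. A minimal sketch of a config that keeps them in step, written as a plain CommonJS object; the retry count of 2 on CI is an assumption.

```javascript
// playwright.config.js (sketch): pick the trace mode to match the retry setting.
const retries = process.env.CI ? 2 : 0;

const config = {
  retries,
  use: {
    // With retries, trace the first retry of a failed test.
    // Without retries, keep the trace of the failing run itself.
    trace: retries > 0 ? "on-first-retry" : "retain-on-failure",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
};

module.exports = config;
```

A downloaded trace can then be opened locally with npx playwright show-trace followed by the path to the trace zip.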

storageState reuse vs logging in inside each test?

Senior

Concept

Logging in through the UI in every test is slow and re-tests the login flow for no reason. storageState saves cookies and local storage once in a setup project, then every test starts already authenticated by loading that state. The pattern is to give login one dedicated test in a setup project, save the state to a file, and point the suite's context at it. For multiple roles, save one state file per role. Each test loads its own copy of the saved state into its own context, so cookies are not shared between tests; but if tests modify server-side state, give each parallel worker its own account and state file so workers do not interfere through a shared account.

Interview Strategy

Say you log in once in a setup project, save storageState to a file, and point every test's context at it, so the login flow gets one dedicated test and everything else is fast and isolated. For multiple roles, save one state file per role. The senior signal is mentioning per-worker accounts, each with its own state file, so parallel workers that change server-side state never step on each other.

How to phrase it

Logging in through the UI inside every test is the thing I want to avoid: it is slow, and it re-tests the same login flow dozens of times for no reason. The pattern I use is storageState. I have a dedicated setup project whose one job is to log in through the UI, and then I save the cookies and local storage to a file with context dot storageState. Every other test points its context at that file through the config, so it starts already authenticated and never touches the login form. The login flow itself gets exactly one test, which is where it belongs. For an app with multiple roles, I save one state file per role (an admin file, a viewer file) and use test dot use to point a describe block at the role it needs. The detail that makes this safe under parallelism is per-worker accounts: each test already gets its own copy of the saved cookies in its own context, but if tests change server-side state, workers sharing one account can still step on each other. In that case I authenticate a separate account per worker and save each to its own state file, keyed on the worker's parallel index. So I get speed, isolation, and a single honest test of the login flow.

Key points to hit

UI login in every test is slow and pointlessly re-tests the login flow.
Save storageState once in a setup project; every test starts authenticated from the file.
Login gets one dedicated test; everything else loads the saved state.
Multiple roles: one state file per role, selected with test.use.
Per-worker accounts and state files (keyed on the parallel index) keep workers from interfering when tests change server-side state.

Code

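// auth.setup.js
import { test as setup } from "@playwright/test";
setup("authenticate", async ({ page }) => {
  await page.goto("/login"); // hypothetical login route
  await page.getByLabel("Email").fill(process.env.TEST_USER); // hypothetical env var
  await page.getByRole("button", { name: "Sign in" }).click();
  await page.waitForURL("**/dashboard"); // hypothetical landing URL; wait so the cookies exist
  await page.context().storageState({ path: "auth/user.json" });
});
// config: use: { storageState: "auth/user.json" }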
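
The setup project and the projects that depend on it are wired together in the config. The following is a minimal sketch as a plain CommonJS object; the project names, the auth/ directory and the stateFileFor helper are assumptions for illustration.

```javascript
// playwright.config.js (sketch): run the setup project first, then reuse its state.
const config = {
  projects: [
    // Matches auth.setup.js; it logs in once and writes the state file.
    { name: "setup", testMatch: /.*\.setup\.js/ },
    {
      name: "chromium",
      // Every test in this project starts from the saved cookies and storage.
      use: { storageState: "auth/user.json" },
      // Do not start until the setup project has finished.
      dependencies: ["setup"],
    },
  ],
};

module.exports = config;

// Hypothetical helper for per-worker accounts: one state file per parallel index.
function stateFileFor(parallelIndex) {
  return `auth/worker-${parallelIndex}.json`;
}
```

For roles, a describe block can override the project default with test.use({ storageState: "auth/admin.json" }), where the admin file name is again only an example.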

toHaveText vs toContainText — which assertion and when?

Junior

Concept

toHaveText asserts the element's full text equals the expected value, whitespace-normalised; toContainText asserts the text includes a substring. Both auto-retry; the difference is exact match versus partial. A separate matcher, toHaveValue, is for form-control values — the typed value of an input is not its text content, a common mix-up. Both text matchers also accept a regex for controlled flexibility and an array to assert a whole list of elements in order.

Interview Strategy

Separate the matchers cleanly — toHaveText for exact full text, toContainText for a substring — and default to toHaveText because it catches extra or missing text. The trap to avoid, and a common real bug, is using toHaveText on an input whose typed value is not its text content; toHaveValue is the right matcher there. The senior signal is knowing both text matchers take a regex and an array.

How to phrase it

I default to toHaveText, because it asserts the element's full text, whitespace-normalised, matches exactly — so it catches extra or missing text that a partial match would let through. I drop to toContainText when the surrounding text is dynamic or incidental and I only care that a particular phrase is present — a price next to a label, say, where the rest of the line changes. Both of those auto-retry, so the only difference is exact versus partial. The distinction I always flag, because it is a real bug I have seen, is that neither of those is right for a form field: the value a user typed into an input is not the element's text content, so toHaveText on an input does not work — I use toHaveValue there. And to show range: both text matchers accept a regular expression when I want controlled flexibility, like a total that contains four thousand, and they also accept an array to assert a whole list of elements in order in a single retrying assertion. So my rule is exact by default with toHaveText, partial with toContainText when the copy is incidental, toHaveValue for inputs, and a regex or array when the shape calls for it.

Key points to hit

toHaveText: full text matches exactly (whitespace-normalised); toContainText: contains a substring.
Default to toHaveText — it catches extra or missing text.
Both auto-retry; both accept a regex and an array (assert a list in order).
Trap to avoid: toHaveText on an input — the typed value is not its text content; use toHaveValue.
Use toContainText when surrounding copy is dynamic or incidental.

Code

await expect(page.locator("h1")).toHaveText("Order confirmed");
await expect(page.locator(".note")).toContainText("will be delivered");
await expect(page.locator("#total")).toHaveText(/4,000/);
await expect(page.getByLabel("Email")).toHaveValue("a@test.invalid");
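
The exact-versus-partial distinction for string arguments can be modelled in a few lines of plain JavaScript. This is a sketch of the semantics described above, not Playwright's code; normalize, hasText and containsText are hypothetical helpers, and the real matchers also retry until a timeout.

```javascript
// Collapse runs of whitespace and trim, as a model of whitespace normalisation.
const normalize = (text) => text.replace(/\s+/g, " ").trim();

// Model of toHaveText with a string: the full normalised text must match.
function hasText(actual, expected) {
  return normalize(actual) === normalize(expected);
}

// Model of toContainText with a string: a normalised substring is enough.
function containsText(actual, expected) {
  return normalize(actual).includes(normalize(expected));
}

// Text as it might come out of the DOM, with stray spaces and a newline.
const rendered = "  Order   confirmed\n  Thank you ";

const checks = {
  exactFull: hasText(rendered, "Order confirmed Thank you"),
  exactPartial: hasText(rendered, "Order confirmed"),
  containsPartial: containsText(rendered, "Order confirmed"),
};
```

Here exactFull and containsPartial are true while exactPartial is false, which is why toHaveText catches unexpected extra copy that toContainText lets through.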

click() vs dispatchEvent('click') — what's the difference?

Mid

Concept

click() runs the full actionability sequence: scroll into view, wait for the element to be visible, stable, enabled and able to receive pointer events, then a real mouse click at the element's centre. dispatchEvent fires a synthetic DOM event directly, skipping all those checks and any overlay or pointer interception. click() behaves like a user and catches real bugs such as an element covered by a modal; dispatchEvent is an escape hatch for an element that genuinely cannot be clicked normally.

Interview Strategy

Say click() is what you want almost always because it behaves like a user and catches real bugs — an element covered by a modal, a button still disabled. The senior signal is naming the actionability sequence explicitly. The trap to avoid is reaching for dispatchEvent to make a stubborn click pass; call it an escape hatch and a workaround, never a default, because it hides exactly the bugs click() is designed to surface.

How to phrase it

click is what I want ninety-nine percent of the time, because it behaves like a real user. Before it clicks, Playwright runs the full actionability sequence: it scrolls the element into view, waits for it to be visible, stable and enabled, checks that nothing is intercepting the pointer, and then performs an actual mouse click at the element's centre. That is exactly why it catches real bugs: if a modal is covering the button, or the button is still disabled, click fails and tells me, which is the failure I want. dispatchEvent with a click event is completely different: it fires a synthetic DOM click event directly on the element, skipping every one of those checks and ignoring any overlay or pointer interception. So it will happily click a button that a real user could not reach. I only use it as an escape hatch, for an element that genuinely cannot be clicked normally, and I call it out in review as a workaround rather than a default, because the moment I reach for it to make a stubborn click pass, I am hiding the exact bug that click was designed to surface.

Key points to hit

click() runs actionability: scroll into view, wait for visible, stable, enabled and not obscured, then a real mouse click.
dispatchEvent fires a synthetic DOM event, skipping all checks and any overlay interception.
click() catches real bugs like an element covered by a modal or still disabled.
Use dispatchEvent only as an escape hatch for an element that genuinely cannot be clicked.
Trap to avoid: using dispatchEvent to force a stubborn click — it hides the bug click() surfaces.

Code

// Real, user-like click with actionability checks
await page.getByRole("button", { name: "Buy" }).click();
// Escape hatch — skips checks, hides interception bugs
// await page.locator("#hidden").dispatchEvent("click");
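
A toy model makes the hidden-bug argument concrete. The element object and both click functions below are hypothetical stand-ins written in plain JavaScript; the real click() waits and retries until a timeout rather than throwing immediately.

```javascript
// Toy element: a Buy button that is currently covered by a modal overlay.
function makeButton() {
  const button = { visible: true, enabled: true, coveredBy: "modal", clicks: 0 };
  button.onclick = () => { button.clicks += 1; };
  return button;
}

// Models click(): refuse to act unless the element passes every check.
function userLikeClick(el) {
  if (!el.visible) throw new Error("not visible");
  if (!el.enabled) throw new Error("not enabled");
  if (el.coveredBy) throw new Error(`pointer intercepted by ${el.coveredBy}`);
  el.onclick();
}

// Models dispatchEvent("click"): fire the handler directly, with no checks.
function syntheticClick(el) {
  el.onclick();
}

const covered = makeButton();
let clickError = null;
try {
  userLikeClick(covered); // surfaces the overlay bug
} catch (err) {
  clickError = err.message;
}
syntheticClick(covered); // "succeeds" on a button no user could reach
```

The user-like click reports the overlay while the synthetic one registers a click anyway, which is exactly the bug a forced click would bury.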

Strict mode: locator.first() vs a strict locator?

Mid

Concept

Locators are strict by default: if a locator matches more than one element, any action throws a strict mode violation rather than silently acting on the first match. This is intentional — an ambiguous locator signals a real test or application problem. The correct fix is almost always to scope to a parent container or filter by text, not to reach for .first() or .nth(0). A positional selector is acceptable only when order is genuinely meaningful and documented; otherwise a bare .first() hides a locator-quality problem.

Interview Strategy

Frame strict mode as a feature, not an annoyance — a locator matching two elements usually means your selector is too loose, and the error catches it before it becomes a flaky failure. The trap to avoid is slapping .first() or .nth(0) on the violation to silence it. The senior signal is scoping to a container to resolve the real ambiguity, and allowing a positional index only when it is genuinely unavoidable and documented.

How to phrase it

Strict mode means a locator that matches more than one element throws instead of quietly acting on the first one, and I treat that as a feature, not an obstacle. It is Playwright telling me my locator is ambiguous, which usually points at either a sloppy selector or a genuine duplication in the UI — and I would much rather know that now than have the test silently click the wrong element and flake later. The correct fix is almost always to scope to a parent container or filter by text, so the locator resolves to exactly one element by meaning. If I am clicking Delete and three Delete buttons match, I scope to the specific row first, then find Delete inside it. What I avoid is reaching for first or nth-zero just to make the error go away, because that hides the ambiguity rather than resolving it. If I do use nth, I document why — something like 'the first row is always the most recently created' — and a bare first with no explanation is a code smell I flag in review. The cleanest long-term fix, when the app allows it, is adding a data-testid so the element is unambiguous by design.

Key points to hit

Strict by default: a multi-match locator throws, never silently picks the first.
It is a feature — it surfaces ambiguous locators and real UI duplication before they flake.
Fix by scoping to a parent or filtering by text, not a blind .first() or .nth(0).
.first() or .nth() is acceptable only when order is meaningful and documented.
Best long-term fix: add a data-testid for unambiguous targeting.

Code

// Throws if two buttons match — good, fix the selector
// await page.getByRole("button").click();
await page.getByRole("row", { name: "Order 42" })
  .getByRole("button", { name: "Delete" }).click();
// .first() only for a genuinely order-defined list
await page.getByRole("row").first().click();
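
The same idea in a plain-JavaScript model: resolving to a single element or throwing, and fixing the ambiguity by scoping rather than by position. The rows data, findButtons and resolveStrict are hypothetical and only imitate the behaviour described above.

```javascript
// Two table rows, each with its own Delete button.
const rows = [
  { name: "Order 41", buttons: ["Edit", "Delete"] },
  { name: "Order 42", buttons: ["Edit", "Delete"] },
];

// Collect every matching button in scope, tagged with the row it lives in.
function findButtons(scope, label) {
  return scope.flatMap((row) =>
    row.buttons.filter((b) => b === label).map(() => `${label} in ${row.name}`)
  );
}

// Model of strict resolution: more than one match is an error, never a silent pick.
function resolveStrict(matches) {
  if (matches.length > 1) {
    throw new Error(`strict mode violation: resolved to ${matches.length} elements`);
  }
  return matches[0];
}

// Unscoped: two Delete buttons match, so this throws.
let violation = null;
try {
  resolveStrict(findButtons(rows, "Delete"));
} catch (err) {
  violation = err.message;
}

// Scoped to the row by meaning: exactly one match.
const scoped = rows.filter((row) => row.name === "Order 42");
const target = resolveStrict(findButtons(scoped, "Delete"));
```

Scoping resolves the ambiguity by meaning, so the test keeps pointing at the right button even if the row order changes.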

BrowserContext vs Page — what's the difference?

Mid

Concept

A BrowserContext is an isolated browser session — its own cookies, storage and cache, like a fresh incognito profile. A Page is a single tab inside a context. One context can hold many pages that share the same session. Each test gets its own context, which is how Playwright achieves isolation and safe parallelism without tests leaking state into each other. Multiple contexts in one test simulate two different users; multiple pages within a context handle multi-tab flows.

Interview Strategy

Lead with the analogy that lands instantly — a context is a fresh incognito profile, a page is one tab inside it — then connect it to the thing interviewers care about: each test gets its own context, which is how Playwright achieves isolation and safe parallelism. The senior signal is naming the two real uses: multiple contexts for multi-user tests, multiple pages for multi-tab flows.

How to phrase it

A BrowserContext is an isolated browser session, with its own cookies, its own local storage and its own cache, so the cleanest analogy is a fresh incognito profile. A Page is a single tab inside that context. One context can hold many pages, and those pages share the same session, the same cookies. The reason this matters is isolation: every test gets its own context, which is exactly how Playwright keeps tests from leaking state into each other and how it can run them in parallel safely. One test's login or cookies cannot bleed into another's, because they are separate sessions entirely. And the distinction gives me two powerful patterns. If I need to simulate two different users in one test (a buyer and a seller, or an admin and a customer), I create two contexts, each with its own session, and drive them independently. If I need a multi-tab flow within one user's session, such as clicking a link that opens a new tab, I capture the new page that opens inside the same context, or open a second page on that context myself, so the session is shared. So the rule of thumb is: separate users mean separate contexts, multiple tabs for one user mean multiple pages in one context.

Key points to hit

A BrowserContext is an isolated session (own cookies, storage, cache) — like fresh incognito.
A Page is a single tab inside a context; one context can hold many pages.
Each test gets its own context — the basis of isolation and safe parallelism.
Multiple contexts in one test simulate different users.
Multiple pages in one context handle multi-tab flows for one user.

Code

const buyer = await browser.newContext();
const seller = await browser.newContext();
const buyerPage = await buyer.newPage();
const sellerPage = await seller.newPage();

APIRequestContext vs driving the UI for setup?

Senior

Concept

Driving the UI to create test data (clicking through forms to seed an order) is slow, flaky, and couples your test to screens it is not even testing, so a broken create form breaks every test that depended on it. The request fixture provides an APIRequestContext, Playwright's built-in HTTP client (no Axios or Supertest needed), which makes calls directly. The request fixture is isolated and keeps its own cookies; when API calls should reuse an authenticated UI session, use page.request or context.request, which share the browser context's cookie storage. The default shape is create via API, test via UI, verify via API, clean up via API.

Interview Strategy

Name the create-act-verify pattern explicitly and justify each step, because structure is what is being assessed. The senior signal is the dependency-chain argument — building a precondition through the UI couples your test to screens it is not testing — plus the distinction that API verification confirms what the backend actually persisted, which a screen reading the right number does not.

How to phrase it

My most efficient UI test design is a four-step pattern built around the request fixture, which gives me an APIRequestContext, Playwright's built-in HTTP client, so I do not pull in Axios or Supertest. Step one, API setup: I create the test data through the API, which is fast and independent of UI state. Step two, the UI action: I exercise the actual scenario I am testing through the browser. Step three, API verification: I confirm via the API that the action had the correct backend effect, not just the correct visual one, because a UI can display success while the write silently failed, and only an API check catches that. Step four, API cleanup. The reason I avoid creating data through the UI is dependency chains: if the create form breaks, every test that relied on it to seed data fails too, and now I am debugging the wrong thing. A nice bonus is cookie sharing: page dot request and context dot request use the browser context's cookie storage, so once my UI session is authenticated, API calls made through them are authenticated too, whereas the standalone request fixture keeps its own isolated cookies. The senior line is: test through the UI only what the UI is responsible for, and do the rest (setup, verification, teardown) over HTTP.

Key points to hit

UI-based setup is slow and flaky and couples tests to forms they are not testing.
request fixture is APIRequestContext — Playwright's built-in HTTP client, no Axios or Supertest.
page.request and context.request share the browser context's cookies; the standalone request fixture is isolated.
Default shape: create via API, test via UI, verify via API, clean up via API.
API verification confirms the backend effect, not just the visual one.

Code

test("order shows in list", async ({ request, page }) => {
  const res = await request.post("/api/orders", { data: { item: "kit" } });
  const order = await res.json();
  await page.goto("/orders");
  await expect(page.getByText("kit")).toBeVisible();
  await request.delete(`/api/orders/${order.id}`); // cleanup
});

baseURL in config vs hardcoded URLs in tests?

Junior

Concept

Hardcoding full URLs ties every test to one environment, so pointing the suite at staging or local means a find-and-replace across the codebase. Setting baseURL in the config lets tests use relative paths like page.goto('/cart'), and you switch environments by changing one value or reading an env variable. baseURL lives in the use block alongside the other defaults every test inherits, and reading it from process.env lets CI inject the target.

Interview Strategy

Say you set baseURL once and use relative paths everywhere, so the same suite runs against local, staging and CI by overriding one setting. The senior signal is reading baseURL from an environment variable so CI can inject the target without a code change — a small thing that signals you build maintainable, environment-portable suites rather than one wired to a single host.

How to phrase it

Hardcoding full URLs is something I avoid, because it ties every test to one environment. The moment I want to point the suite at staging instead of local, I am doing a find-and-replace across the whole codebase, which is both tedious and error-prone. Instead I set baseURL once in the config's use block, where it sits alongside the other defaults every test inherits, and then every test uses a relative path — page dot goto of slash-cart, not the full host. Playwright resolves the relative path against baseURL, so switching environments is changing exactly one value. The piece I always add is reading baseURL from an environment variable, with a sensible local default, so CI can inject the target host without any code change — the same suite runs unchanged against local, staging and the CI environment, just by setting that variable. It is a small thing, but it is the difference between a suite that is portable across environments and one that is wired to a single host and has to be edited to move.

Key points to hit

Hardcoded URLs tie the suite to one environment and need a find-and-replace to move.
baseURL in the config lets tests use relative paths like page.goto('/cart').
Switch environments by changing one value.
baseURL lives in the use block with the other inherited defaults.
Read it from an env variable so CI injects the target without a code change.

Code

// playwright.config.js
use: { baseURL: process.env.BASE_URL || "http://localhost:3000" }
// test
await page.goto("/cart"); // resolves against baseURL
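
Relative paths are resolved against baseURL the way the standard URL constructor resolves them, which has one gotcha when the base includes a path. The sketch below uses only the built-in URL class; the /app/ prefix is a hypothetical example.

```javascript
// Resolve a test path against a base URL with the standard URL constructor.
const resolve = (path, baseURL) => new URL(path, baseURL).href;

const resolved = {
  // Host-only base: a leading slash is the natural choice.
  rootRelative: resolve("/cart", "http://localhost:3000"),
  // Base with a path: a leading slash discards the /app/ prefix.
  leadingSlashDropsBasePath: resolve("/cart", "http://localhost:3000/app/"),
  // Base with a trailing slash plus a dot-relative path keeps the prefix.
  dotRelativeKeepsBasePath: resolve("./cart", "http://localhost:3000/app/"),
  // Without the trailing slash, "app" is treated as a file and replaced.
  missingTrailingSlash: resolve("./cart", "http://localhost:3000/app"),
};
```

So when the application is served under a sub-path, end baseURL with a slash and navigate with "./cart" rather than "/cart".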

Retries vs actually fixing the flakiness?

Senior

Concept

Retries re-run a failed test and pass it if a later attempt succeeds, which is a useful safety net for genuinely non-deterministic CI noise. But a test that only passes on retry is usually flaky for a real, fixable reason — most often a missing web-first assertion or a hard sleep. The senior practice is to watch the flaky count in the report, treat anything marked flaky as a bug to investigate, and fix the root cause rather than leaning on retries to stay green.

Interview Strategy

Say retries are a safety net for CI noise, not a fix. The senior signal is naming what you do with the flaky count: watch it in the report, treat anything marked flaky as a bug — usually a missing web-first assertion or a hard sleep — and fix the root cause. Leaning on retries to stay green is the wrong answer; naming root-cause analysis is the right one.

How to phrase it

Retries re-run a failed test and let it pass if a later attempt succeeds, and I do configure them — typically two on CI, zero locally — but I am clear that they are a safety net for genuine non-determinism, not a fix for flakiness. The distinction matters, because a test that only passes on retry is almost always flaky for a real, fixable reason. So what I actually do is watch the flaky count in the report. Playwright marks a test flaky when it failed and then passed on retry, and I treat every one of those as a bug to investigate, not a pass to celebrate. When I dig in, the cause is nearly always one of two things: a missing web-first assertion, where someone read a value and asserted on it instead of letting expect retry, or a hard sleep that was too short. I fix the root cause — swap the snapshot for a retrying assertion, remove the sleep — and the flakiness goes away properly. Leaning on retries to keep the build green is the trap, because it hides intermittent failures and lets real bugs ride along. Retries buy me a stable CI signal while I do the actual work of driving the flaky count to zero.

Key points to hit

Retries re-run a failed test and pass it if a later attempt succeeds — a net for CI noise.
A test that only passes on retry is usually flaky for a real, fixable reason.
Watch the flaky count in the report and treat each flaky test as a bug.
Root cause is usually a missing web-first assertion or a hard sleep — fix that.
Leaning on retries to stay green hides intermittent failures; drive the flaky count to zero.

Code

// playwright.config.js — a net for CI, not a crutch
retries: process.env.CI ? 2 : 0,
// Then chase anything the report marks 'flaky' to its root cause.
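
What "flaky" means can be pinned down with a small model. This plain-JavaScript sketch classifies a test from its list of attempts; classify, flakyTests and the sample report are hypothetical and are not Playwright's reporter API.

```javascript
// Classify one test from its attempts, in the order they ran.
function classify(attempts) {
  const last = attempts[attempts.length - 1];
  if (last !== "passed") return "failed"; // never went green
  return attempts.length === 1 ? "passed" : "flaky"; // green only after a retry
}

// Names of the tests that need a root-cause investigation.
function flakyTests(report) {
  return Object.entries(report)
    .filter(([, attempts]) => classify(attempts) === "flaky")
    .map(([name]) => name);
}

// Sample run with retries set to 2.
const report = {
  "adds to cart": ["passed"],
  "applies coupon": ["failed", "passed"],
  "checkout": ["failed", "failed", "failed"],
};

const flaky = flakyTests(report);
```

The build is green for "applies coupon", but it still lands on the list to investigate, which is the habit the answer above describes.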

Playwright Test runner vs Jest for E2E?

Mid

Concept

Jest is a unit-test runner built around jsdom and module mocking — excellent for logic, not for real browsers. Playwright Test is purpose-built for browser testing: parallelism, fixtures, web-first expect with auto-retrying assertions, tracing, projects for cross-browser, and the HTML reporter, all in one. Bolting Playwright onto Jest is possible but you lose the fixtures, projects and reporting. The clean division is the right runner for the right layer: Jest or Vitest for unit and component logic, Playwright Test for anything touching a real browser.

Interview Strategy

Frame it as the right runner for the right layer rather than a contest. Jest or Vitest for unit and component logic, Playwright Test for anything touching a real browser. The senior signal is knowing that you can run Playwright under Jest but lose the fixtures, projects, web-first expect and reporting that are the whole point — so the integrated runner is the deliberate choice, not an accident.

How to phrase it

I think of this as the right runner for the right layer, not one being better than the other. Jest is a unit-test runner built around jsdom and module mocking — it is excellent for pure logic, for testing a function or a component's behaviour in isolation, and it is fast because there is no real browser. But it does not drive a real browser, so it is the wrong tool for end-to-end. Playwright Test is purpose-built for that: it gives me parallelism, fixtures for dependency injection, web-first expect with auto-retrying assertions, tracing for debugging, projects for running the same suite across Chromium, Firefox and WebKit, and the HTML reporter — all in one integrated runner. Now, you can technically run Playwright's library under Jest, and people ask about that, but when you do you lose the fixtures, the projects, the web-first assertions and the reporting, which is most of the value. So my clean answer is: I keep Jest or Vitest for unit and component logic, and I use Playwright Test for anything that touches a real browser, because each is built for its layer and forcing them together throws away what makes each good.

Key points to hit

Jest is a unit-test runner (Node with optional jsdom, module mocking): great for logic, not real browsers.
Playwright Test is purpose-built: parallelism, fixtures, web-first expect, tracing, projects, HTML reporter.
Running Playwright under Jest loses the fixtures, projects, web-first expect and reporting.
Right runner for the right layer: Jest or Vitest for unit, Playwright Test for browser.
The integrated runner is a deliberate choice, not an accident.

Code

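import { test, expect } from "@playwright/test";

// Relative URLs such as "/product/1" resolve against use.baseURL in the Playwright config.
test("adds to cart", async ({ page }) => {
  await page.goto("/product/1");
  await page.getByRole("button", { name: "Add to cart" }).click();
  // Web-first assertion: retries until the text is visible or the timeout expires.
  await expect(page.getByText("1 item")).toBeVisible();
});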
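The "projects" and "HTML reporter" points are easiest to see in the config file. Below is a minimal sketch of a `playwright.config.js`; the `testDir` path, the `baseURL` value and the retry counts are assumptions chosen for illustration, not settings taken from any real project.

```javascript
// playwright.config.js (sketch; testDir, baseURL and retry counts are assumed values)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e",
  fullyParallel: true, // run tests inside each file in parallel too
  retries: process.env.CI ? 2 : 0,
  reporter: "html",
  use: {
    baseURL: "http://localhost:3000", // lets tests call page.goto("/product/1")
    trace: "on-first-retry", // record a trace only when a test is retried
  },
  // One suite, three browser engines.
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
  ],
});
```

With a config along these lines, `npx playwright test` runs every test once per project, and `npx playwright test --project=firefox` narrows the run to one engine.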

Headed vs headless — which do you run where?

Junior

Concept

Headless runs the browser with no visible window — faster and the default for CI. Headed shows the real browser window, which you use locally to watch and debug a test, often with --debug or --ui mode to step through. Modern Playwright headless is the real browser, not a stripped-down build, so headed and headless results match and you can trust headless in CI.

Interview Strategy

Say you run headless in CI for speed and headed locally when debugging, often with --debug or --ui mode to step through. The senior signal is reassuring the interviewer about reliability: you trust headless results because Playwright's headless is the real browser, not a stripped-down one, so a pass in CI means the same thing as a pass you watched locally.

How to phrase it

Headless runs the browser with no visible window, and that is what I run in CI, because it is faster and there is nothing to watch on a build server anyway; it is the default there. Headed shows the actual browser window, and I use that locally when I am debugging a test and want to see what the page is doing. In practice, when I am debugging I reach for the richer tools: the --debug flag, which steps through with the Playwright Inspector, or --ui for UI mode, which lets me run, watch and time-travel through tests interactively. The reassurance I always give is about trust: Playwright's headless mode is the same real browser engine as headed, not a stripped-down or different build, so the results match. That means a green run in headless CI means exactly the same thing as a green run I watched headed on my machine, so I am not trading reliability for the speed of headless. So the split is simple: headless everywhere by default, especially CI, and headed locally with debug or UI mode when I need to see and step through a failure.

Key points to hit

Headless has no visible window — faster and the CI default.
Headed shows the real browser window — for local watching and debugging.
Debug locally with --debug (Inspector) or --ui (UI mode) to step through.
Modern headless is the real browser, not a stripped-down build.
Headed and headless results match, so you can trust headless in CI.

Code

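// Local debugging
// npx playwright test --headed   (visible browser window)
// npx playwright test --debug    (step through with the Playwright Inspector)
// npx playwright test --ui       (UI mode: run, watch, time-travel)
// CI: headless by default, nothing to set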
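If a team wants the headed/headless choice to live in config rather than on the command line, one possible sketch is below. The `headless` option under `use` is a real Playwright setting; the `HEADED` environment variable and the `resolveHeadless` helper are assumptions invented here for illustration.

```javascript
// Hypothetical helper: decide headless mode from environment variables.
// CI always runs headless; locally, HEADED=1 (an assumed variable name) opens a visible window.
function resolveHeadless(env) {
  if (env.CI) return true;
  return env.HEADED !== "1";
}

// Plain config object; in a real project this would be exported from playwright.config.js.
const config = {
  use: {
    headless: resolveHeadless(process.env),
  },
};
```

Keeping the decision in one small function makes the CI guarantee explicit: a stray `HEADED=1` on a build agent cannot open a window there.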

Codegen vs hand-writing tests — is the recorder enough?

Mid

Concept

Codegen records your clicks and keystrokes into a runnable script, picking user-facing locators automatically, so it is a fast way to discover good locators and scaffold a flow. But it produces flat, linear scripts with no page objects, weak or missing assertions, and brittle recorded steps that need cleaning up. It is a starting point inside a structured suite, not a finished framework: you refactor the output, extract page objects, replace recorded waits with web-first assertions, and parametrise the data.

Interview Strategy

Say you use codegen to discover good locators and scaffold a flow quickly, then refactor. The trap to avoid, and a junior tell, is selling recorded scripts as a finished framework. The senior signal is naming exactly what you change after recording — extract page objects, replace recorded steps with web-first assertions, parametrise data — so codegen is a tool inside a structured suite, not the suite itself.

How to phrase it

Codegen is genuinely useful, but it is a starting point, not the finish line. What it does well is record my clicks and typing into a runnable script and, importantly, pick user-facing locators automatically — getByRole, getByLabel — so it is the fastest way to discover good locators and scaffold a flow I am not sure how to express yet. Where it falls short is everything that makes a suite maintainable: it produces flat, linear scripts with no page objects, the assertions are weak or missing because it only records actions, and some of the recorded steps are brittle. So my workflow is to record, then refactor. I extract the flow into page objects so the logic is reusable, I replace any recorded waits or weak checks with proper web-first assertions like toHaveText and toBeVisible, and I parametrise the data so one recorded path becomes a data-driven test. Selling a pile of recorded scripts as a finished framework is the junior tell — using codegen as a tool to bootstrap locators and flows inside a properly structured suite is the senior move. So yes to codegen as a discovery and scaffolding tool, no to codegen as the framework.

Key points to hit

Codegen records actions into a runnable script and picks user-facing locators automatically.
It is a fast way to discover good locators and scaffold a flow.
It produces flat scripts with no page objects, weak assertions and brittle recorded steps.
Refactor the output: extract page objects, add web-first assertions, parametrise data.
Trap to avoid: selling recorded scripts as a finished framework.

Code

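// Scaffold, then refactor
// npx playwright codegen https://example.com
import { test, expect } from "@playwright/test";
test("opens pricing", async ({ page }) => {
  await page.goto("https://example.com");
  // Recorder gives a flat line:
  await page.getByRole("link", { name: "Pricing" }).click();
  // You add real assertions and extract page objects, for example:
  await expect(page).toHaveURL(/pricing/); // assumed URL pattern, for illustration
});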
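As a sketch of the "extract page objects, parametrise data" step, the recorded click can move into a small class that receives Playwright's `page`. The class name `MarketingSite`, its methods and the link names are hypothetical; only `getByRole` and `click` are real Playwright calls.

```javascript
// Hypothetical page object: class, method and link names are illustrative.
class MarketingSite {
  constructor(page) {
    this.page = page;
  }

  // Parametrised locator: one method covers every top-navigation link.
  navLink(name) {
    return this.page.getByRole("link", { name });
  }

  // The recorded click, now reusable from any test.
  async openSection(name) {
    await this.navLink(name).click();
  }
}

// Usage inside a Playwright test (sketch):
//   const site = new MarketingSite(page);
//   await site.openSection("Pricing");
//   await expect(page).toHaveURL(/pricing/); // assumed URL pattern
```

Because the section name is a parameter, looping over a list such as "Pricing", "Docs" and "Blog" turns one recorded path into a data-driven test without duplicating locators.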
