How QA Wolf’s parallel infrastructure works

Lauren Gibson
April 18, 2025

A well-designed test should run independently, without relying on execution order. If your tests are written to run in isolation, there’s no reason not to run them all at once. But doing it well means solving two hard problems at the same time.

First, your tests need to be atomic. That means no shared state, no failures caused by a previous test, and no data leaking between tests through shared variables. Second, your infrastructure needs to support true concurrency—booting up hundreds or thousands of isolated test environments simultaneously, then shutting them down just as fast, without bottlenecks or resource waste.
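Here’s what that looks like in practice: a Playwright test that provisions its own data and never touches anything another test created. The routes and labels below are made up for illustration; the pattern is what matters.

```typescript
import { test, expect } from '@playwright/test';

// Each test provisions its own data, so it can run in any order, or all at once
// alongside every other test. Nothing is shared with, or left behind for, another test.
test('a new user can update their display name', async ({ page }) => {
  const email = `user-${Date.now()}@example.com`; // unique per run, never reused

  await page.goto('/signup');
  await page.getByLabel('Email').fill(email);
  await page.getByLabel('Password').fill('a-sufficiently-strong-password');
  await page.getByRole('button', { name: 'Create account' }).click();

  await page.goto('/settings/profile');
  await page.getByLabel('Display name').fill('Test User');
  await page.getByRole('button', { name: 'Save' }).click();
  await expect(page.getByText('Profile updated')).toBeVisible();
});
```

Because the test owns its data end to end, it behaves the same whether it runs first, last, or alongside a thousand others.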

That’s the system we’ve built at QA Wolf. When your tests run, the entire suite executes in parallel and completes in the time it takes your longest-running test to finish, whether you have 200 tests or 20,000. We’ll walk through how that works under the hood, starting at the beginning of the test run.

Dynamic test configuration

Each test is stored independently in our database. When a run is triggered (by webhook or manually in the platform), the system logs a new request with a unique build number. This identifier ties together everything from execution results to logs, videos, and metadata.

We then generate what we call a run data file—an instruction set that includes the test code, any associated helper code, and any configuration or custom settings the customer defined in our editor. That file is passed to our runner, which allocates cloud compute resources to run the test.
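The exact format is internal to our platform, but conceptually the run data file carries something like the shape below. The field names are illustrative, not our actual schema.

```typescript
// Illustrative only; the real run data file schema is internal to QA Wolf.
interface RunDataFile {
  buildNumber: string;             // ties results, logs, and videos to this run
  test: {
    id: string;
    name: string;
    code: string;                  // the Playwright test code to execute
  };
  helpers: Record<string, string>; // shared helper modules, keyed by filename
  config: {
    baseUrl: string;                  // environment the test runs against
    settings: Record<string, string>; // customer-defined settings from the editor
  };
}
```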

Resource provisioning

[Diagram: our cluster with two nodes, each hosting two pods, each pod running one container]

Our infrastructure is built on Kubernetes clusters configured with pre-booted nodes—compute instances that are already online and ready to receive work. Each node runs a pre-built container image optimized for running Playwright-based end-to-end tests. These containers come preloaded with all required dependencies, so they can start executing immediately once test code is mounted.

The number of nodes adjusts based on expected demand, but test volume isn’t always predictable. When a run exceeds available capacity, the system spins up new nodes on the fly. That guarantees every test in the suite gets its own container, even if it means pulling capacity from multiple clusters. As shown above, each node hosts two pods, each running one container, and we can spin up as many nodes as needed to run the entire suite in parallel. At the same time, the pool of pre-booted containers is replenished to prepare for upcoming runs. 

Once the system knows how many nodes are required, it reserves them. Each container receives the code for a single test, which we mount to its file system. Containers begin running tests as soon as they’re ready, which usually happens within milliseconds across the entire suite. 
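As a rough sketch of the capacity math (simplified, and not our actual scheduler), with two pods per node the calculation comes down to a couple of lines:

```typescript
// Simplified sketch of the capacity math, not QA Wolf's actual scheduler.
// Each node hosts a fixed number of pods, and each pod runs one test container.
const PODS_PER_NODE = 2;

function nodesToProvision(testsInRun: number, preBootedNodes: number): number {
  const nodesNeeded = Math.ceil(testsInRun / PODS_PER_NODE);
  // If the run fits in the pre-booted pool, nothing new is provisioned.
  // Otherwise, boot just enough extra nodes for every test to get its own container.
  return Math.max(0, nodesNeeded - preBootedNodes);
}

// A 500-test run against a pool of 200 pre-booted nodes (400 containers)
// needs ceil(500 / 2) - 200 = 50 additional nodes.
console.log(nodesToProvision(500, 200)); // 50
```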

Test execution

Tests at QA Wolf run in headed browsers rather than headless mode, so we can capture video artifacts for every run. These recordings give our customers clear visibility into failures and make troubleshooting faster. That adds compute overhead, but the debugging value outweighs the cost. Videos are stored alongside logs, HAR files, and other artifacts, and linked to each test’s run history in the editor. If a test fails, we retry it up to three times to confirm that the failure isn’t caused by external factors, like an application restart or a temporary environment issue, before logging it as an actual failure.
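In plain Playwright terms, the equivalent settings look roughly like the config below. Our runner applies them internally; this isn’t a file customers need to maintain.

```typescript
// playwright.config.ts: roughly equivalent settings in plain Playwright.
// QA Wolf applies these inside its runner; this file is just an illustration.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: 3,                   // re-run a failure before reporting it as real
  use: {
    headless: false,            // run in a headed browser
    video: 'on',                // record video for every run, not just failures
    trace: 'retain-on-failure', // keep a trace alongside logs and other artifacts
  },
});
```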

We run each test in its own container for a reason. Shared environments introduce variability, and resource contention is a major cause of false positives (flaky failures). Test execution is processor-bound, but that CPU usage mostly comes from polling the state of the system under test (SUT). If the SUT is slow to respond, polling increases, and so does load. We can’t control the customer’s app, but we can isolate test execution. So we do.
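In a Playwright test, much of that polling comes from auto-retrying assertions and waits, which keep re-checking the page until a condition is met or a timeout expires. A slow SUT means more polling cycles, and more CPU, per assertion:

```typescript
import { test, expect } from '@playwright/test';

test('dashboard eventually shows the report', async ({ page }) => {
  await page.goto('/reports');
  // This assertion polls the page until the text appears or the timeout hits.
  // The slower the application responds, the longer the container spends polling.
  await expect(page.getByText('Report ready')).toBeVisible({ timeout: 30_000 });
});
```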

Result aggregation and reporting

Once all tests (and any retries) have finished, the system aggregates results into a unified database record. On the backend, we shut down all containers and release associated resources. Kubernetes handles cleanup, ensuring no unused cloud instance is left running. That efficiency keeps operational costs low, both for us and for our customers.
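Conceptually, the aggregated record ties every test’s outcome and artifacts back to the build number assigned at the start of the run. A rough sketch of that shape, with illustrative field names rather than our actual schema:

```typescript
// Illustrative shape of an aggregated run record; the real schema is internal.
type TestOutcome = 'passed' | 'failed' | 'passed-on-retry';

interface RunResult {
  buildNumber: string;
  startedAt: string;              // ISO timestamps for the run as a whole
  finishedAt: string;
  tests: Array<{
    testId: string;
    outcome: TestOutcome;
    attempts: number;             // 1 on a clean pass, up to 4 with three retries
    artifacts: {
      videoUrl: string;
      logUrl: string;
      harUrl: string;
    };
  }>;
}
```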

Notification

[Screenshot: the QA Wolf customer dashboard]

Customers are notified as soon as a test run completes and results are published. Notifications can be sent via email, messaging tools like Microsoft Teams, or pushed to dashboards such as Grafana or Tableau. Results are always available in the QA Wolf Dashboard for direct review.

Full, cloud-native test parallelization

QA Wolf’s infrastructure is 100% cloud-native, which allows us to parallelize test runs across all available compute at no added cost. Unlike traditional providers still relying on virtual machine-based systems, we don’t charge by the node or shard. That means your test suite runs in parallel by default—every test, every time.

The difference is clear. If you’re paying for 10 parallel nodes and you have 200 tests that take 5–10 minutes each, every node has to run 20 tests back to back, and the run stretches to somewhere between roughly an hour and a half and well over three hours. With full parallelization, the entire suite finishes in the time it takes your longest test to complete.
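The back-of-the-envelope math, using the worst case of that range:

```typescript
// Back-of-the-envelope math for the example above.
const tests = 200;
const parallelNodes = 10;
const minutesPerTest = 10; // worst case of the 5–10 minute range

// With 10 paid-for parallel nodes, tests queue behind one another on each node:
const shardedMinutes = Math.ceil(tests / parallelNodes) * minutesPerTest; // 200 minutes

// With full parallelization, the run takes as long as the slowest single test:
const fullyParallelMinutes = minutesPerTest; // 10 minutes

console.log({ shardedMinutes, fullyParallelMinutes });
```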

This isn’t a premium feature. It’s built into how QA Wolf works. When you sign on, you get test runs that complete in minutes, not hours, by design.
