Multi-tenant testing
Published: May 24, 2025
Last updated: May 24, 2025
During a recent episode of The Pragmatic Engineer podcast, I heard about an interesting approach to testing that Uber takes with its services by leveraging multi-tenancy.
The approach sounded like an interesting alternative to managing environments for microservices, something I've been pondering solutions to myself at work to cut down lead time to feature deployment.
This blog post builds out a minimal approach to using multi-tenancy within TypeScript microservices, then goes a step further to see how a unified approach to multi-tenancy can be complemented by network mocking solutions for those difficult-to-emulate response scenarios.
Getting started
For this project, I've built out a minimal Turborepo starting point with small Hono servers to illustrate some of the different approaches and trade-offs.
The tests and servers are not a true reflection of how I would normally approach these things; they are there to focus on some key points.
For a better reflection on some of my thoughts behind testing in TypeScript for projects, please read my blog post The Boundaries of TypeScript Testing.
For the companion code, please see here.
Our scenario
For this particular scenario, we are going to work with a chain of requests across three Hono servers:
- `service-a` will be our core server.
- `downstream-b` will be our stand-in for a service we need to call. This could be a placeholder for another microservice within our system.
- `downstream-c` will be our stand-in for a server that `downstream-b` needs to call. This could be a placeholder for another microservice or a third-party API we need to interact with.
The scenario is straightforward: A calls B, B calls C.
Let's see this in action.
Some of the code from Service A, which runs on `localhost:3000`:
app.post("/users", async (c) => { try { // ResultAsync.fromPromise comes from neverthrow const body = await ResultAsync.fromPromise(c.req.json(), (err) => err); if (body.isErr()) { c.status(400); return c.json(new NoJsonBodyError()); } const userBodyResult = UsersPostBody.safeParse(body); if (!userBodyResult.success) { c.status(400); return c.json(new ValidationError(userBodyResult.error)); } // We aren't storing this. Just illustrating some adjustments to the value const newUser = { id: v4(), name: userBodyResult.data.name, }; const res = await axios.post( "http://localhost:3001/notifications", new NewUserNotification(newUser) ); switch (res.data._tag) { case "ItemQueued": return c.json(new UserCreatedResponse(newUser)); default: return c.json(new UnhandledResponseTagError()); } } catch (err) { console.error("Something went wrong", err); return c.json(new InternalServerError()); } });
The "creates" a new user, then forwards the request onto another service Downstream B on localhost:3001
:
app.post("/notifications", async (c) => { try { const body = await c.req.json(); const notificationBodyResult = NotificationPostBody.safeParse(body); if (!notificationBodyResult.success) { c.status(400); return c.json(new ValidationError(notificationBodyResult.error)); } const res = await addToQueue( new NewUserNotification(notificationBodyResult.data.value) ); switch (res.data._tag) { case "SuccessfullyQueued": return c.json(new ItemQueuedResponse()); // Propagate the following errors back under one // unified error type for the sake of it. case "ValidationError": case "TimeoutError": return c.json(new QueueError(res.data)); default: return c.json(new InternalServerError()); } } catch (error) { console.error("Error processing notification", error); return c.json(new InternalServerError()); } });
That code attempts to add the notification to a "queue" (we are not really doing this) via another endpoint on `localhost:3002` for our service Downstream C.
If it is successfully queued, then it will return an instance of `ItemQueuedResponse`, but our logic also handles the `ValidationError` and `TimeoutError` scenarios, where it returns an instance of `QueueError`.
For example's sake, Downstream C does not actually implement handling for a timeout error; however, we can assume that it has been implemented and that Downstream B may return errors based on it.
For what it's worth, if the microservices in this example feel confusing, note that it is a contrived setup, but it certainly emulates real-world services that you may be working with. It exists to give us a scenario where we can compare testing solutions.
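For reference, here is a hypothetical sketch of what the `addToQueue` helper used above could look like. The `/queue` path and the response typing are assumptions on my part, not necessarily what the companion code does:

```typescript
import axios, { type AxiosResponse } from "axios";

// The response shape that Downstream B switches on; the exact tags come
// from Downstream C's responses.
type QueueResponse = { _tag: string };

// A hypothetical helper: posts the notification to Downstream C and hands
// the raw response back so the caller can switch on its _tag.
export async function addToQueue(
  notification: unknown,
  baseUrl = "http://localhost:3002"
): Promise<AxiosResponse<QueueResponse>> {
  // The "/queue" path is an assumption for this sketch.
  return axios.post<QueueResponse>(`${baseUrl}/queue`, notification);
}
```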
From the companion code, we can run `pnpm dev` from the root directory to start all three servers and test that our code is working:
```bash
# Start the servers
$ pnpm dev

# In another terminal window, testing the endpoint with HTTPie
$ http POST http://localhost:3000/users name="Dennis"
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 92
Date: Sat, 24 May 2025 05:06:39 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_tag": "UserCreated",
    "value": {
        "id": "37ba97de-df86-488f-9258-4867a655fec0",
        "name": "Dennis"
    }
}
```
At this point, we can tell that our configuration returns a response along our successful path.
We can also confirm that our `NoJsonBodyError` and `ValidationError` responses work by dropping `name` from the request or setting it to an invalid type:
```bash
# NoJsonBody Error
$ http POST http://localhost:3000/users
HTTP/1.1 400 Bad Request
Connection: keep-alive
Content-Length: 68
Date: Sat, 24 May 2025 05:31:17 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_tag": "NoJsonBody",
    "value": {
        "message": "Request body is required"
    }
}

# ValidationError
$ http POST http://localhost:3000/users name:=1
HTTP/1.1 400 Bad Request
Connection: keep-alive
Content-Length: 139
Date: Sat, 24 May 2025 05:33:30 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_tag": "ValidationError",
    "value": {
        "details": "✖ Invalid input: expected string, received number\n → at name",
        "message": "Bad Request"
    }
}
```
But what happens for error scenarios that are not so easy to represent from our remote services?
In our code, there are potential errors that bubble up from the downstream services, like `TimeoutError`, which are not so easy to replicate reliably. Let's take a look at a few of our options.
Options for testing
Our setup is non-trivial. Although my test setup demonstrates multiple microservices within the same repository, we may not always have that luxury.
At your own workplace, you may have services distributed across multiple teams and repositories, and coordination for this level of testing is of utmost importance across the board.
Let's walk through some possible solutions.
An approach using testing library mocks
The most basic possibility for emulating certain scenarios is to make use of library mocks.
Popular TypeScript testing frameworks such as Jest and Vitest have mocking capabilities out of the box.
Depending on how you configure this, you can stub a return value for a request library (like Axios from our examples) or even for a module that interacts with the service, in order to test certain scenarios.
The benefit of this approach is that it is quick and easy, coming built into our testing tools.
The major downside is that your test boundary ends where the mock starts. Your tests can no longer provide guarantees or confidence about the tools that are now mocked out: if we mock our request library, we have no confidence that the request library would actually return that response from a real network request.
This can be a major loss from the point of view of a testing approach that aims, at a minimum, to exercise all modules within a service up to (at least) the network boundary.
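As a quick illustration, here is a minimal Vitest sketch that stubs axios. The test body and response shape are assumptions for this example rather than code from the companion repository:

```typescript
import axios from "axios";
import { describe, it, vi } from "vitest";

describe("POST /users with a stubbed request library", () => {
  it("can emulate a queue timeout without touching the network", async () => {
    // Stub the call that Service A makes to Downstream B. Everything
    // from axios outward now sits outside the test boundary.
    vi.spyOn(axios, "post").mockResolvedValueOnce({
      data: { _tag: "QueueError", value: { _tag: "TimeoutError" } },
    } as any);

    // ...invoke the /users handler here and assert that it maps the
    // stubbed QueueError onto the expected HTTP response...
  });
});
```

The stub is convenient, but note that everything beyond `axios.post` is now taken on faith.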
An approach using mock server infrastructure
On the other end of the scale is to stand up infrastructure to emulate network traffic. This could be a scenario of running different environments for your service, but it can also move to more extreme cases of standing up infrastructure in an attempt to emulate third-party services and APIs that you interface with in production.
This approach certainly has a spectrum of complexity.
On the side of standing up internal service environments, it can give you the chance to push your test boundary deep enough to cover your data layers, such as your databases.
The trade-off with internal services is that there is always a cost to infrastructure, and it is not just monetary. There are time-to-production costs, infrastructure management, complexity increases and more. In scenarios where you have multiple environments and services with multiple internal downstream services, you can also face challenges keeping all environments at parity for running tests.
Running different environments for your services is still standard practice, but it's worth noting that the approach varies from company to company, and there is still such a thing as overkill or misprovisioning of resources.
In the case of third-party mock service infrastructure, you face the cost of re-implementing and maintaining an emulation of the third-party service's logic, and you run the risk of implementation mismatches.
An approach using Mock Service Worker
With JavaScript/TypeScript, there is the delightful Mock Service Worker (MSW) library which enables you to stub out responses at the network level.
This differs from the module-mocking approach we spoke about earlier in that your service still operates end-to-end within its service boundary. As long as your mocked responses match the expected success/error responses, MSW is an unbelievably useful option for simulating responses that are otherwise hard to reproduce.
A basic example of MSW from their documentation:
```typescript
// src/mocks/handlers.js
import { http, HttpResponse } from "msw";

export const handlers = [
  // Intercept "GET https://example.com/user" requests...
  http.get("https://example.com/user", () => {
    // ...and respond to them using this JSON response.
    return HttpResponse.json({
      id: "c7b3d8e0-5e0b-4b0f-8b3a-3b9f4b3d3b3d",
      firstName: "John",
      lastName: "Maverick",
    });
  }),
];
```
When MSW is intercepting network calls, a call to `https://example.com/user` from your request library will receive a valid HTTP response with a JSON body.
An approach using multi-tenancy
Uber takes an approach using multi-tenancy.
By enabling multiple tenants (such as test environments, canary releases, and product lines) to coexist within the same infrastructure, Uber achieves code isolation, flexible traffic routing, and improved integration testing, ultimately supporting rapid feature deployment and reliable service performance.
More relevant to integration testing, Uber uses routing to test services without the constraints of static environment configuration.
Consider a scenario where Uber needs to test a new version of Service B, which interacts with Services A, C, and D. Instead of deploying this new version in a separate staging environment, Uber deploys it as a distinct tenant within the production environment, referred to as Service B' (Service B Prime). Test traffic is routed specifically to Service B', while production traffic continues to flow through the original Service B. This setup ensures that the new version can be tested in a live environment without impacting the production services.
Uber uses HTTP headers to implement multi-tenancy for integration testing in their microservice architecture.
Here's how it works:
- Tenancy Context Propagation: When a request is made (for example, from a test client), Uber adds a special tenancy header to the HTTP request (e.g., X-Tenant-ID: integration-test-123).
- Traffic Routing: Uber's internal routing and service discovery systems use this header to direct the request to the correct tenant-specific instance of the service (e.g., a test version of Service B).
- End-to-End Isolation: This header is propagated across all service calls, ensuring that the entire chain of microservices (B → C → D) remains within the same test tenant. This allows for full end-to-end testing without touching production instances.
While this approach puts a lot of emphasis on a consistent system, it also reduces some of the complexities previously discussed around managing environments for microservices. We can leverage a similar approach using HTTP headers (or a consistent equivalent) to combine the previously explained processes, in an attempt to simplify testing new versions or routing to specific HTTP mock responses.
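To make the propagation mechanics concrete, here is a minimal Hono sketch of forwarding a tenancy header on downstream calls. The header name mirrors the one we use later in this post, and the URL is illustrative rather than anything from Uber's systems:

```typescript
import { Hono } from "hono";
import axios from "axios";

const app = new Hono();

app.post("/notifications", async (c) => {
  const tenancyId = c.req.header("X-Tenancy-ID");

  // Forward the tenancy header on every downstream call so that the
  // whole request chain (B → C → D) stays within the same tenant.
  const res = await axios.post(
    "http://localhost:3002/queue", // illustrative downstream URL
    await c.req.json(),
    tenancyId ? { headers: { "X-Tenancy-ID": tenancyId } } : undefined
  );

  return c.json(res.data);
});

export default app;
```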
Rethinking our approach to integration testing
Keeping in mind the different tooling and options that we currently have, we can start to pick-and-choose a combination of options to best reflect the trade-offs we want to make.
In our current setup, I'm thinking this will be what works best:
- Making use of headers to conditionally intercept requests when MSW is enabled.
- Configuring middleware to conditionally return certain HTTP responses from the service itself.
- Providing route management using the headers to conditionally alter where requests are sent.
With that in mind, let's work through what some of this can look like.
Taking another look at our Mock Service Worker approach
With the idea in mind around using HTTP headers, we can make use of MSW's passthrough helper in order to conditionally fire off the original request without mocking.
When running in test mode, we can enable MSW and then use those testing headers to drive our tests. Let's look at how that could work.
```typescript
import { http, passthrough, HttpResponse } from "msw";

class QueueError {
  readonly _tag = "QueueError";
  readonly value: unknown;

  constructor(value: unknown) {
    this.value = value;
  }
}

export const handlers = [
  http.post("http://localhost:3001/notifications", ({ request }) => {
    switch (request.headers.get("X-Tenancy-ID")) {
      case "DownstreamB/ValidationError":
        return HttpResponse.json(
          new QueueError({
            _tag: "ValidationError",
          })
        );
      case "DownstreamB/TimeoutError":
        return HttpResponse.json(
          new QueueError({
            _tag: "TimeoutError",
          })
        );
      default:
        return passthrough();
    }
  }),
];
```
In the above, I've chosen to use the header `X-Tenancy-ID`. This is arbitrary and can be swapped out for whatever best suits your testing paradigm.
The above code conditionally switches on the `X-Tenancy-ID` value if provided, and only when MSW is enabled and running. In scenarios where MSW is not enabled, it has no impact.
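As an aside, if you wanted to emulate the timeout itself at the network level (rather than Downstream B's error response for one), MSW's `delay` helper can stall the intercepted request. A minimal sketch, assuming the caller enforces its own request timeout:

```typescript
import { http, delay, HttpResponse } from "msw";

// Stalls the notification call long enough to trip a client-side timeout.
export const timeoutHandler = http.post(
  "http://localhost:3001/notifications",
  async () => {
    await delay(10_000);
    return HttpResponse.json({ _tag: "ItemQueued" });
  }
);
```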
In the case of the development server and testing, we can conditionally enable MSW for the duration of the tests.
Assuming a Vitest test setup that has MSW configured to run using the following hooks:
```typescript
// vitest.setup.ts
import { beforeAll, afterEach, afterAll } from "vitest";
import { server } from "../src/mocks/node";

// Start server before all tests
beforeAll(() => {
  server.listen({
    // If you wanted to, you could set warn here in dev mode
    onUnhandledRequest: "bypass",
  });
});

// Reset handlers after each test (important for test isolation)
afterEach(() => {
  server.resetHandlers();
});

// Close server after all tests
afterAll(() => {
  server.close();
});
```
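For completeness, the `../src/mocks/node` module imported above would look something like this minimal sketch, with the handlers coming from the file we defined earlier (the companion code's version may differ):

```typescript
// src/mocks/node.ts
import { setupServer } from "msw/node";
import { handlers } from "./handlers";

// A request-interception "server" for Node processes, built from our
// tenancy-aware handlers.
export const server = setupServer(...handlers);
```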
Then we can write tests to confirm that the `X-Tenancy-ID` configuration we've made works:
```typescript
import axios, { isAxiosError } from "axios";
import { ResultAsync } from "neverthrow";
import { describe, expect, it } from "vitest";

/**
 * These tests expect all the servers to be up and running.
 */
describe("app", () => {
  describe("creating a user", () => {
    describe("success states", () => {
      it("successfully creates a user", async () => {
        const response = await axios.post(
          "http://localhost:3000/users",
          {
            name: "Dennis",
          },
          {
            headers: {
              "Content-Type": "application/json",
            },
          }
        );

        expect(response.status).toBe(200);
        expect(response.data._tag).toBe("UserCreated");
      });
    });

    describe("error states", () => {
      it("requires a name property within the json body", async () => {
        const response = await ResultAsync.fromPromise(
          axios.post(
            "http://localhost:3000/users",
            {},
            {
              headers: {
                "Content-Type": "application/json",
              },
            }
          ),
          (err) => err
        );

        // There is likely a better way to handle this, but will
        // leave it in as it's the assertions that I care most
        // about using for illustration purposes.
        if (
          !response.isErr() ||
          !isAxiosError(response.error) ||
          !response.error.response
        ) {
          throw new Error("Unexpected request success");
        }

        expect(response.error.response.status).toBe(400);
        expect(response.error.response.data._tag).toBe("ValidationError");
      });

      it("returns information on the queue service TimeoutError", async () => {
        const response = await ResultAsync.fromPromise(
          axios.post(
            "http://localhost:3000/users",
            {
              name: "Dennis",
            },
            {
              headers: {
                "Content-Type": "application/json",
                "X-Tenancy-ID": "DownstreamB/TimeoutError",
              },
            }
          ),
          (err) => err
        );

        // There is likely a better way to handle this, but will
        // leave it in as it's the assertions that I care most
        // about using for illustration purposes.
        if (
          !response.isErr() ||
          !isAxiosError(response.error) ||
          !response.error.response
        ) {
          throw new Error("Unexpected request success");
        }

        expect(response.error.response.status).toBe(500);
        expect(response.error.response.data._tag).toBe("QueueError");
        expect(response.error.response.data.value._tag).toBe("TimeoutError");
      });

      it("returns information on the queue service ValidationError", async () => {
        const response = await ResultAsync.fromPromise(
          axios.post(
            "http://localhost:3000/users",
            {
              name: "Dennis",
            },
            {
              headers: {
                "Content-Type": "application/json",
                "X-Tenancy-ID": "DownstreamB/ValidationError",
              },
            }
          ),
          (err) => err
        );

        // There is likely a better way to handle this, but will
        // leave it in as it's the assertions that I care most
        // about using for illustration purposes.
        if (
          !response.isErr() ||
          !isAxiosError(response.error) ||
          !response.error.response
        ) {
          throw new Error("Unexpected request success");
        }

        expect(response.error.response.status).toBe(500);
        expect(response.error.response.data._tag).toBe("QueueError");
        expect(response.error.response.data.value._tag).toBe("ValidationError");
      });
    });
  });
});
```
The above tests require the servers to be running so that the localhost ports resolve.
Our tests check the bare minimum, and I've written them in a way that favors illustrating the conditional MSW mocking in action (they could certainly do with some TLC).
In our test scenarios, MSW mocks are configured to run for each test.
- Our successful scenario bypasses the MSW mock and calls into our microservices Downstream B and Downstream C.
- Our failure scenarios pass `X-Tenancy-ID` values that match our MSW mock conditions, so they are intercepted at the network level and never make it to Downstream B. That being said, the mocked JSON responses are a one-to-one representation of responses from Downstream B.
Of course, if MSW is enabled on our development servers, we can also use this to check that we are getting the expected responses.
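Here is a minimal sketch of conditionally enabling MSW on a development server behind an environment variable. The `ENABLE_MSW` flag is an assumption for illustration, not something the companion code necessarily uses:

```typescript
// A hypothetical entry-point snippet for a development server (ESM,
// so top-level await is available).
if (process.env.ENABLE_MSW === "true") {
  const { server } = await import("./mocks/node");
  server.listen({ onUnhandledRequest: "bypass" });
  console.log("MSW enabled: tenancy-based interception is active");
}
```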
```bash
# An example of an MSW intercepted endpoint
$ http POST http://localhost:3000/users "X-Tenancy-ID:DownstreamB/TimeoutError" name="Dennis"
HTTP/1.1 500 Internal Server Error
Connection: keep-alive
Content-Length: 53
Date: Sun, 25 May 2025 11:42:24 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_tag": "QueueError",
    "value": {
        "_tag": "TimeoutError"
    }
}
```
Great! We've managed to use the `X-Tenancy-ID` header to intercept requests for difficult-to-emulate downstream error responses in our tests, ensuring that Service A can handle them as expected on its own infrastructure.
The next step is to make use of that header for redirecting requests.
Implementing multi-tenancy for services
On real network infrastructure, there are a number of ways to manage routing. Examples include re-routing at the API gateway, load balancers, reverse proxies or within the services themselves (along with more alternatives!). Each comes with its own value propositions and trade-offs.
In our example, we're going to simplify things by conditionally routing requests based on the `X-Tenancy-ID` header from within a service application. Again, it's best to do a technical spike and review your own systems to understand what may be most beneficial for your needs.
For the example, I'm going to create a duplicate of the code in Downstream C. For the demonstration today, we'll name it Downstream C Prime, but you can think of it as emulating a separate, updated deployment of Downstream C that we want to test out.
For some contrived code in Downstream B that conditionally re-routes traffic based on the `X-Tenancy-ID` header, I have the following:
app.post("/notifications", async (c) => { try { const body = await c.req.json(); const notificationBodyResult = NotificationPostBody.safeParse(body); if (!notificationBodyResult.success) { c.status(400); return c.json(new ValidationError(notificationBodyResult.error)); } // This is shoe-horned in for demonstration. In reality, you should come up // with a better approach to resolving this value. let baseUrl; switch (c.req.header("X-Tenancy-ID")) { case "Feature/QueueUpdate": baseUrl = "http://localhost:3003"; break; default: baseUrl = "http://localhost:3002"; } const res = await addToQueue( new NewUserNotification(notificationBodyResult.data.value), baseUrl ); switch (res.data._tag) { case "SuccessfullyQueued": return c.json(new ItemQueuedResponse()); case "SuccessfullyQueuedAlt": return c.json(new ItemQueuedNewVersionResponse()); // Propagate the following errors back under one // unified error type for the sake of it. case "ValidationError": case "TimeoutError": return c.json(new QueueError(res.data)); default: return c.json(new InternalServerError()); } } catch (error) { console.error("Error processing notification", error); return c.json(new InternalServerError()); } });
In the above code, we have conditional logic baked into the service to resolve the base URL for the request, with a fallback to the default service.
In real-world projects, I would spike options for remotely resolving these URLs based on the ID, or have a policy in place where the target URL has some correlation to the tenancy ID you are aiming to use. If you do not, then new tenancy IDs would also require a deployment to the consuming services. Going further into that is out of scope for this blog post, but I will happily update this post if I find an approach that works well for my work.
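As a starting point, here is a minimal sketch of pulling that resolution out into a lookup table. In practice, the map could be fetched from remote configuration or service discovery so that new tenants don't require a redeploy; the names here are illustrative:

```typescript
// Illustrative tenancy-to-URL mapping; in a real system this could be
// resolved remotely rather than hard-coded.
const TENANT_BASE_URLS: Record<string, string> = {
  "Feature/QueueUpdate": "http://localhost:3003",
};

const DEFAULT_BASE_URL = "http://localhost:3002";

// Falls back to the default service when no tenant matches.
function resolveBaseUrl(tenancyId: string | undefined): string {
  return (tenancyId && TENANT_BASE_URLS[tenancyId]) || DEFAULT_BASE_URL;
}
```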
The updated Downstream C Prime service also returns a different response to emulate an "updated" version of the service.
We can test this works against our running servers by making a few calls:
```bash
# Without the tenancy ID
$ http POST http://localhost:3000/users name="Dennis"
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 92
Date: Sun, 25 May 2025 09:59:40 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_tag": "UserCreated",
    "value": {
        "id": "e0371a7e-56d4-4c4e-a3f3-8942caca0003",
        "name": "Dennis"
    }
}

# With the tenancy ID
$ http POST http://localhost:3000/users X-Tenancy-ID:Feature/QueueUpdate name="Dennis"
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 68
Date: Sun, 25 May 2025 09:58:48 GMT
Keep-Alive: timeout=5
content-type: application/json

{
    "_message": "This uses the alternative tenant",
    "_tag": "UserCreated"
}
```
As we can see, our servers now respond with different values depending on the tenancy header.
We can also configure our tests to cover both alternatives by adding the alternative route as another success case:
describe("success states", () => { it("successfully creates a user", async () => { const response = await axios.post( "http://localhost:3000/users", { name: "Dennis", }, { headers: { "Content-Type": "application/json", }, } ); expect(response.status).toBe(200); expect(response.data._tag).toBe("UserCreated"); }); it("successfully creates a user taking the alternative path", async () => { // This test will run against Downstream C Prime thanks // to the X-Tenancy-ID const response = await axios.post( "http://localhost:3000/users", { name: "Dennis", }, { headers: { "Content-Type": "application/json", "X-Tenancy-ID": "Feature/QueueUpdate", }, } ); expect(response.status).toBe(200); expect(response.data._tag).toBe("UserCreated"); expect(response.data._message).toBe("This uses the alternative tenant"); }); });
When we run our tests with `pnpm test`, we can see that all tests pass.
Testing these two tenancy resolutions is mostly for the sake of showing that both alternatives can be used and are conditionally resolved with our header. When writing integration tests for your own applications, you would need to consider how your strategy handles changes from merges or dropped tenancies. That is likely something for your release management to take ownership of and is out of scope for this blog post.
At this point, we now have a solution to optionally intercept or re-route traffic for our services based on the `X-Tenancy-ID` provided.
Conclusion
This post explored a proof-of-concept for multi-tenant testing, as well as using network mocking solutions to reduce the amount of infrastructure management required.
These examples show the potential this approach has with enough buy-in and unification. I believe that refining these ideas with a technical design relevant to your own situation can greatly simplify testing complicated microservice architectures.
Links and Further Reading
Photo credit: aleksdahlberg