
FastAPI's Dependency Injection Makes DB Access Effortless — and That's the Problem


FastAPI’s dependency injection feels like magic. Declare db: AsyncSession = Depends(get_db) in your route signature and the database session is there when you need it. No boilerplate, just pure convenience: exactly what FastAPI is known and loved for.

But this convenience masks when the connection is actually acquired. And that timing matters more than you think. Under concurrent load, the recommended dependency injection pattern often holds database connections for longer than necessary, creating a bottleneck that cuts throughput and, consequently, increases response times. The query finishes in milliseconds, but the connection stays open while your code processes data, calls external APIs, or validates and serializes responses.

This isn’t about FastAPI being slow or SQLAlchemy not working as expected. It’s about understanding what happens under the hood and making deliberate choices about connection lifetime.

For integrating FastAPI with SQLAlchemy, the recommended pattern is to use dependency injection:

from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.ext.asyncio import async_sessionmaker
from fastapi import Depends, FastAPI

engine = create_async_engine("postgresql+asyncpg://user:pw@localhost/db")
SessionLocal = async_sessionmaker(bind=engine)

async def get_db():
    async with SessionLocal() as session:
        yield session

app = FastAPI()

@app.get("/items/{item_id}")
async def get_item(item_id: int, db: AsyncSession = Depends(get_db)):
    # Item is an ORM model assumed to be defined elsewhere
    item = await db.get(Item, item_id)
    return item

This is clean, idiomatic, and straightforward to implement and understand. The dependency handles the session lifecycle automatically and you don’t have to think about connection management in your route handlers. For simple CRUD operations that return database objects directly, this is great.

However, most real endpoints don’t just fetch and return data. Here’s a more realistic example:

@app.get("/items/{item_id}")
async def get_item(item_id: int, db: AsyncSession = Depends(get_db)):
    # Simple query requires a database connection
    item = await db.get(Item, item_id)
    
    # Additional operations not requiring a database connection
    # Calls to external APIs (LLMs, file storage), processing, etc.
    result = await heavy_processing(item)
    enriched = await external_api.enrich(result)
    return enriched

The database query finishes in milliseconds, but the subsequent operations continue for hundreds of milliseconds or even multiple seconds. And during that entire time, the database connection is held and not released back to the pool.

Under the hood #

The issue comes down to when async with SessionLocal() acquires a connection from the pool. Here’s what happens:

  1. FastAPI resolves dependencies before calling your route handler
  2. get_db() executes and enters async with SessionLocal()
  3. Connection acquired from the pool immediately
  4. Session yielded to your route handler
  5. Your route handler runs (query + processing + API calls)
  6. Request completes and FastAPI runs dependency cleanup
  7. Connection released back to the pool

The connection is acquired at step 3 and held until step 7. Everything in between (processing, external API calls, response serialization) happens while that connection is leased from the pool.

With a pool of 5 connections and 20 concurrent requests where each holds a connection for 500ms, you create a queue. Only 10 requests per second can complete (5 connections * 2 requests per second per connection) because the pool is constantly exhausted. Requests wait not because the database is slow, but because connections aren’t available.
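The arithmetic above can be sketched with a toy stdlib-only simulation, using an asyncio.Semaphore to stand in for the connection pool (hold times are scaled down from 500ms to 100ms so the sketch runs quickly; the ratio is what matters):

```python
import asyncio
import time

POOL_SIZE = 5
REQUESTS = 20
HOLD = 0.1  # seconds each request holds a "connection" (scaled down from 500ms)

async def handle(pool: asyncio.Semaphore) -> None:
    async with pool:               # acquire a connection from the pool
        await asyncio.sleep(HOLD)  # query + processing, all while holding it

async def main() -> float:
    pool = asyncio.Semaphore(POOL_SIZE)
    start = time.monotonic()
    await asyncio.gather(*(handle(pool) for _ in range(REQUESTS)))
    return time.monotonic() - start

elapsed = asyncio.run(main())
# 20 requests / 5 connections = 4 sequential batches, each HOLD long
print(f"elapsed: {elapsed:.2f}s, throughput: {REQUESTS / elapsed:.1f} req/s")
```

With the real numbers (500ms holds), the same arithmetic yields 2 seconds for 20 requests, i.e. the 10 req/s ceiling described above.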

The timeline looks like this:

[acquire connection] ──> [query 5ms] ──> [processing 495ms] ──> [release]
└─────────────────────── connection held: 500ms ────────────────────────┘

You needed the connection for 5ms but held it for 500ms.

Measuring the impact #

To demonstrate this, I set up a simple benchmark with three endpoints doing identical work: a trivial database query followed by 500ms of simulated processing. The only difference is when the database connection is acquired and released.

The setup uses SQLAlchemy 2.0 with asyncpg, PostgreSQL and a connection pool limited to 5 connections with no overflow. Using Locust to simulate 20 concurrent users under load, here are the results:
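That pool limit can be expressed directly on the engine. A sketch of the configuration (the connection string is illustrative):

```python
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

# Fixed pool of 5 connections and no overflow: a sixth concurrent
# checkout waits until a connection is returned to the pool.
engine = create_async_engine(
    "postgresql+asyncpg://user:pw@localhost/db",
    pool_size=5,
    max_overflow=0,
)
SessionLocal = async_sessionmaker(bind=engine)
```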

Pattern              | Avg Response Time | Min   | Max    | Throughput
Dependency injection | 1971ms            | 503ms | 5082ms | 10 req/s
Lazy acquisition     | 507ms             | 501ms | 524ms  | 40 req/s
Factory injection    | 507ms             | 502ms | 537ms  | 40 req/s
The dependency injection pattern averages 1971ms, nearly four times slower than the alternatives. The throughput drops to 10 requests per second compared to 40 for the other patterns. The maximum response time of 5082ms shows requests queuing up waiting for connections to become available.

The lazy and factory patterns both average 507ms, which is essentially just the simulated work time plus minimal overhead. Connections are acquired only when needed and released immediately after the query completes.

Alternative patterns #

Both better-performing patterns share the same principle: acquire the connection inside the route handler, only when you’re about to use it and release it immediately after the query completes.

Lazy acquisition #

The simplest approach is to manage the session directly in your route handler:

@app.get("/items/{item_id}")
async def get_item(item_id: int):
    # Acquire connection only when needed
    async with SessionLocal() as db:
        item = await db.get(Item, item_id)
    
    # Connection already released
    result = await heavy_processing(item)
    enriched = await external_api.enrich(result)
    
    return enriched

The connection lifecycle is now explicit and scoped to just the database operation:

[acquire] ──> [query 5ms] ──> [release] ──> [processing 495ms]
└────── connection held: 5ms ──────┘

Trade-offs: This is the most straightforward pattern and makes connection lifetime obvious. The downside is that you lose FastAPI’s dependency injection benefits: no automatic session management, harder testing with mocked sessions, and no dependency composition. You’re also referencing a global SessionLocal directly, which couples your handler to a specific database setup.

Factory injection #

A middle ground is to inject the sessionmaker instead of a live session:

async def get_sessionmaker():
    return SessionLocal

@app.get("/items/{item_id}")
async def get_item(item_id: int, sessionmaker=Depends(get_sessionmaker)):
    # Acquire connection only when needed
    async with sessionmaker() as db:
        item = await db.get(Item, item_id)
    
    # Connection already released
    result = await heavy_processing(item)
    enriched = await external_api.enrich(result)
    
    return enriched

The connection lifetime is identical to the lazy pattern, acquired and released around just the query. But you keep FastAPI’s dependency injection, which means easier testing (inject a test sessionmaker), better composition (combine with other dependencies) and decoupling from the global SessionLocal.

Trade-offs: Slightly more boilerplate than the lazy approach and the pattern is less immediately obvious to developers unfamiliar with this technique. But the testability and flexibility are almost always worth it.
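The testability claim can be illustrated without FastAPI at all: the handler only requires a callable that returns an async context manager, so a test can pass in a fake. A minimal sketch (FakeSession and the simplified handler are illustrative, not the benchmark code):

```python
import asyncio
from contextlib import asynccontextmanager

class FakeSession:
    """Stands in for AsyncSession in tests; no real connection involved."""
    async def get(self, model, pk):
        return {"id": pk, "model": model}

@asynccontextmanager
async def fake_sessionmaker():
    # Nothing is acquired here, so "connection" lifetime is zero-cost
    yield FakeSession()

async def get_item(item_id, sessionmaker):
    # Same shape as the factory-injected handler: acquire, query, release
    async with sessionmaker() as db:
        item = await db.get("Item", item_id)
    return item

item = asyncio.run(get_item(7, fake_sessionmaker))
print(item)  # → {'id': 7, 'model': 'Item'}
```

In a real test suite you would register the fake via FastAPI’s app.dependency_overrides instead of calling the handler directly, but the principle is the same: the handler never touches the global SessionLocal.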

When this matters #

The database is rarely the bottleneck. Most queries complete in single-digit milliseconds. The problem is holding connections while your application does everything else: processing, calling external APIs and serializing responses. That’s when a 5ms query turns into a 500ms connection hold.

Not every endpoint needs this level of attention to connection lifetime. The canonical dependency pattern works fine when:

  • The endpoint is simple CRUD with no processing beyond the query
  • Traffic is low relative to your pool size and requests are short-lived
  • Response times aren’t critical and you have headroom (which you should not sacrifice for an inefficient pattern)

But you should care about connection lifetime when:

  • You do work after database queries: Processing results, calling external APIs, formatting complex responses, generating files, etc. Anything that takes more than a few milliseconds while the data is already in memory.

  • You have concurrent traffic: With a fixed pool of 5 connections, one slow endpoint can block the entire application. Say you have endpoints that typically hold connections for 5ms, but one endpoint holds a connection for 500ms while doing processing. Just 4 concurrent requests to that slow endpoint will occupy 4 of your 5 connections for half a second. Now your fast endpoints only have 1 connection and they start queueing even though their queries are quick. One inefficient pattern spreads latency across your entire API.

  • Your pool is sized for actual database load: Database connections consume memory and resources on the database server. If you find yourself needing a large pool just to handle moderate traffic, you’re likely holding connections too long. The solution isn’t to increase the pool size to compensate for inefficient usage. The solution is to release connections faster.

For truly long-running operations (reports that take minutes, batch processing or background jobs), you need a different solution entirely. Background tasks, job queues or async task systems are the right approach there. But most endpoints are doing legitimate synchronous work that should complete in milliseconds to seconds. Those endpoints should still release database connections the moment they’re done querying.

The cost of holding connections isn’t always obvious until you’re under load, and by then you’re debugging why response times suddenly spike or why requests start timing out. Measuring how long your endpoints hold connections versus how long they actually need them is worth doing before you hit production traffic.
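One low-effort way to get that measurement is to wrap the sessionmaker in a timing decorator before injecting it. A stdlib-only sketch (the fake sessionmaker acquires nothing; the 50ms "hold" is simulated work inside the with block):

```python
import asyncio
import time
from contextlib import asynccontextmanager

def timed_sessionmaker(sessionmaker, holds):
    """Wrap a sessionmaker-style factory; record how long each session is held."""
    @asynccontextmanager
    async def wrapper():
        start = time.monotonic()
        try:
            async with sessionmaker() as session:
                yield session
        finally:
            holds.append(time.monotonic() - start)
    return wrapper

@asynccontextmanager
async def fake_sessionmaker():
    # Stand-in for SessionLocal; a real one would check out a connection
    yield object()

async def main():
    holds = []
    sm = timed_sessionmaker(fake_sessionmaker, holds)
    async with sm() as session:
        await asyncio.sleep(0.05)  # work done while the session is held
    return holds

holds = asyncio.run(main())
print(f"held for {holds[0] * 1000:.0f}ms")
```

Comparing these hold times against your actual query durations tells you exactly how much connection time each endpoint is wasting.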

Conclusion #

The dependency injection pattern works well for endpoints that query and return immediately. But most real endpoints do something after fetching data: processing, validation, external API calls or response formatting. Once you add that work, the connection sits idle while your code does things that don’t need database access.

The factory injection pattern keeps dependency injection while giving you control over when connections are held. You still get testability and composability, but the async with block inside your handler makes it clear when the connection is acquired and released. In the benchmark above with endpoints that do 500ms of processing after queries, this delivered 4x faster response times and 4x higher throughput. Your improvement will depend on how much work your endpoints do after database queries.

Connection management isn’t usually something you think about when writing application code. Most of the time the framework handles it and you focus on business logic.

But connection pools are a shared resource with real constraints. When you have concurrent traffic and endpoints that hold connections longer than they query the database, those constraints start affecting your entire application’s performance. Paying attention to when connections are acquired and released isn’t premature optimization. It’s recognizing that some resources need deliberate management even when the framework offers to handle them automatically.

This is also the kind of issue worth watching for in code review. Endpoints that use the documented dependency pattern aren’t wrong, but if you’re reviewing code that does processing or API calls after database queries, it’s worth asking whether connections are being held longer than necessary. Performance issues like this are easy to miss when code is written quickly and looks clean on the surface. They only show up under load, by which time they’re difficult to debug and expensive to fix.