Per-Operation Timeouts in .NET: Why HttpClient.Timeout Is Not Enough
HttpClient.Timeout is a property of the infrastructure - it applies the same budget to every call. A per-operation timeout is a property of the use case - you set it per call using CancellationTokenSource and combine it with the caller's token via CreateLinkedTokenSource. The linked token fires when either the time budget expires or the caller cancels, whichever comes first.
Your API's latency is only as good as your slowest dependency - unless you draw a timed boundary around it. I keep coming back to the same pattern in .NET services that call external APIs: move the timeout from the HTTP client to the operation, combine it with the caller's token, and always know who cancelled. Here is how it works and why it matters.

What is wrong with HttpClient.Timeout?
Nothing is technically wrong with HttpClient.Timeout - it is just the wrong place to put the budget. It is a property of the infrastructure. It applies the same limit to every single call made through that client, regardless of context.
Consider a service that uses the same HttpClient for two operations: getting a shipping quote and exporting a report. A shipping quote should fail fast - 3 seconds is generous. A report export might legitimately take 2 minutes. Setting one HttpClient.Timeout means you either kill the export too early or wait too long on a hung quote endpoint.
    // Same timeout for every call - too blunt
    services.AddHttpClient<IGatewayClient, GatewayClient>(
        client =>
        {
            client.BaseAddress = new Uri(gatewayUrl);
            // kills the report export
            client.Timeout = TimeSpan.FromSeconds(3);
        });
The fix: set HttpClient.Timeout to Timeout.InfiniteTimeSpan and move the timeout to the operation level, where it belongs.
How do you set a per-operation timeout?
Each call gets its own CancellationTokenSource with a time budget. You then combine it with the caller's cancellation token - typically HttpContext.RequestAborted in ASP.NET Core - using CreateLinkedTokenSource. The resulting token is passed down to the gateway call.
    // HttpClient has no deadline of its own
    services.AddHttpClient<IGatewayClient, GatewayClient>(
        client =>
        {
            client.BaseAddress = new Uri(options.BaseUrl);
            // boundaries live at the call site
            client.Timeout = Timeout.InfiniteTimeSpan;
        });

    // In your service / use case handler
    public async Task<ShippingQuote> GetQuoteAsync(
        QuoteRequest request,
        CancellationToken callerToken) // HttpContext.RequestAborted
    {
        using var timeoutCts = new CancellationTokenSource(
            TimeSpan.FromMilliseconds(
                _options.TimeoutMilliseconds));
        using var linkedCts =
            CancellationTokenSource.CreateLinkedTokenSource(
                callerToken,
                timeoutCts.Token);
        return await _gateway.GetQuoteAsync(
            request, linkedCts.Token);
    }
Each operation now carries its own budget. The shipping quote handler gets 3 seconds; the report export handler gets 2 minutes. Same HTTP client, different boundaries - exactly what the business logic requires.
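If several handlers repeat the same two-source setup, it can be pulled into a small helper. The following is a minimal sketch, not part of the original service: the names OperationBudget and RunAsync are hypothetical, and it assumes the operation accepts a CancellationToken and honours it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (name is an assumption, not from the article).
public static class OperationBudget
{
    // Runs `operation` with a token that is cancelled when either the
    // caller cancels or `budget` elapses, whichever comes first.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan budget,
        CancellationToken callerToken)
    {
        using var timeoutCts = new CancellationTokenSource(budget);
        using var linkedCts =
            CancellationTokenSource.CreateLinkedTokenSource(
                callerToken, timeoutCts.Token);

        // Await here (rather than returning the task) so both sources
        // stay undisposed until the operation has finished.
        return await operation(linkedCts.Token);
    }
}
```

A call site could then look like `await OperationBudget.RunAsync(ct => _gateway.GetQuoteAsync(request, ct), TimeSpan.FromSeconds(3), callerToken)`. The trade-off is that the timeout source is hidden inside the helper, so a caller that needs to distinguish timeout from caller cancellation has to check callerToken itself.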
How does CreateLinkedTokenSource work?
CancellationTokenSource.CreateLinkedTokenSource creates a new token that is cancelled when any of the source tokens cancel. It is a logical OR.
    using var timeoutCts = new CancellationTokenSource(
        TimeSpan.FromSeconds(3));
    using var linkedCts =
        CancellationTokenSource.CreateLinkedTokenSource(
            callerToken,     // fires if client disconnects
            timeoutCts.Token // fires after 3 seconds
        );
    // cancelled when EITHER fires - whichever comes first
    await _gateway.GetQuoteAsync(request, linkedCts.Token);
This means the operation stops at the tightest active boundary. If the caller disconnects at 1 second and the timeout is 3 seconds, the operation stops at 1 second. If the caller is still connected but the gateway hangs, the operation stops at 3 seconds.
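The "tightest boundary wins" behaviour can be observed without a real gateway. In this sketch, Task.Delay with an infinite delay stands in for a hung gateway call and a timed CancellationTokenSource stands in for the caller disconnecting; the class and method names are illustrative assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Illustrative demo; names are assumptions, not from the article.
public static class LinkedTokenDemo
{
    // Waits on a "gateway" that never answers and returns how long
    // the wait lasted before the linked token cancelled it.
    public static async Task<TimeSpan> WaitUntilCancelledAsync(
        TimeSpan callerCancelsAfter,
        TimeSpan budget)
    {
        // Stand-in for the caller's token (e.g. a client disconnect).
        using var callerCts = new CancellationTokenSource(callerCancelsAfter);
        using var timeoutCts = new CancellationTokenSource(budget);
        using var linkedCts =
            CancellationTokenSource.CreateLinkedTokenSource(
                callerCts.Token, timeoutCts.Token);

        var stopwatch = Stopwatch.StartNew();
        try
        {
            // Never completes on its own; only cancellation ends it.
            await Task.Delay(Timeout.InfiniteTimeSpan, linkedCts.Token);
        }
        catch (OperationCanceledException)
        {
            // Expected: one of the two sources fired.
        }
        return stopwatch.Elapsed;
    }
}
```

Calling it with a caller that cancels after 100 ms and a 3 second budget should return after roughly 100 ms; swapping the two values should give roughly the same result, because the linked token does not care which source fired first.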
How do you tell who cancelled?
When OperationCanceledException is thrown, you do not always know which token fired. But you need to know - the correct response depends on the cause.
- Timeout fired: the gateway was too slow. Return 504 Gateway Timeout or a fallback value.
- Caller cancelled: the client closed the connection. Nobody is listening - skip the response, just log and stop.
    try
    {
        var quote = await _gateway.GetQuoteAsync(
            request, linkedCts.Token);
        return Results.Ok(quote);
    }
    catch (OperationCanceledException)
    {
        if (timeoutCts.IsCancellationRequested
            && !callerToken.IsCancellationRequested)
        {
            // Our budget expired - gateway was too slow
            _logger.LogWarning(
                "Gateway timeout after {Ms}ms",
                _options.TimeoutMilliseconds);
            return Results.StatusCode(504);
        }
        // Caller closed connection - no response needed
        _logger.LogInformation("Request cancelled by caller.");
        return Results.Empty;
    }
Checking timeoutCts.IsCancellationRequested and the caller's token separately gives you a clean fork. No guessing, no ambiguous catch blocks.
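The same fork can be written as a small pure function, which makes the rule easy to unit test. This is a sketch under assumptions: the CancellationCause enum and the Classify method are hypothetical names, and treating "both fired" as a caller cancellation follows the rule above (a timeout only counts when the caller's token has not fired).

```csharp
using System.Threading;

// Hypothetical names; not part of the original service.
public enum CancellationCause
{
    None,    // neither source fired; cancellation came from elsewhere
    Timeout, // our budget expired while the caller was still connected
    Caller   // the caller cancelled (possibly alongside the timeout)
}

public static class CancellationClassifier
{
    public static CancellationCause Classify(
        CancellationToken callerToken,
        CancellationTokenSource timeoutCts)
    {
        // The caller wins ties: if nobody is listening, a 504 is pointless.
        if (callerToken.IsCancellationRequested)
            return CancellationCause.Caller;

        if (timeoutCts.IsCancellationRequested)
            return CancellationCause.Timeout;

        return CancellationCause.None;
    }
}
```

Inside the catch block, a switch on the result would map Timeout to the 504 branch and Caller to the log-and-stop branch. How to handle None is a design choice left open here; rethrowing is one reasonable option.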
Why must you dispose the linked CancellationTokenSource?
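CreateLinkedTokenSource registers callbacks on the parent tokens. Those callbacks keep the linked CTS alive in memory for as long as the parent token source lives. If you do not dispose it, the callbacks are not removed until the parent itself is released - and when the parent is long-lived (an application-lifetime token, for example), a service handling thousands of requests leaks a little memory on every request. A CancellationTokenSource created with a timeout also holds a timer until it fires or is disposed.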
    // WRONG - callbacks accumulate, memory leak
    var linkedCts =
        CancellationTokenSource.CreateLinkedTokenSource(
            callerToken, timeoutCts.Token);
    await _gateway.GetQuoteAsync(request, linkedCts.Token);

    // CORRECT - using disposes both when the enclosing scope exits
    using var timeoutCts = new CancellationTokenSource(
        TimeSpan.FromSeconds(3));
    using var linkedCts =
        CancellationTokenSource.CreateLinkedTokenSource(
            callerToken, timeoutCts.Token);
    await _gateway.GetQuoteAsync(request, linkedCts.Token);
Every CancellationTokenSource you create must be disposed. The using keyword is the safest way to guarantee this - it disposes even when an exception is thrown.
What happens without any timeout boundary?
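Without a boundary, a hung dependency holds your resources hostage indefinitely. Each waiting request keeps its connection, socket, and request state alive for a response that never comes; if any code in the chain blocks synchronously (for example with .Result or .Wait()), it pins a thread-pool thread as well. Under load, the stuck requests pile up faster than they are released. Eventually the connection pool or the thread pool is exhausted and every incoming request queues behind the hung ones.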
One slow external gateway can take down your entire service - not because of high traffic, but because of missing boundaries. Fail fast beats waiting forever.
Frequently asked questions
Why not just set HttpClient.Timeout?
Because the timeout belongs at the operation level, not the infrastructure level. Setting HttpClient.Timeout applies the same budget to every call. Moving the timeout to a per-call CancellationTokenSource lets each operation define its own budget based on what the business logic actually requires.
What does CreateLinkedTokenSource do?
It creates a new CancellationTokenSource that is cancelled when any of the source tokens cancel - a logical OR. The resulting token stops the operation at whichever boundary fires first: the caller disconnecting, the time budget expiring, or any other token in the chain.
How do you tell whether the timeout or the caller cancelled?
Check timeoutCts.IsCancellationRequested and the caller's token separately after catching OperationCanceledException. If timeoutCts fired but the caller's token did not, it was your budget. If the caller's token fired, the client closed the connection and no response is needed.
Why must you dispose the linked CancellationTokenSource?
CreateLinkedTokenSource registers callbacks on the parent tokens. Without disposal those callbacks stay registered for as long as the parent lives, keeping the linked CTS alive in memory. On a high-throughput service with a long-lived parent token this becomes a memory leak on every request.
Should you use Polly instead of raw CancellationTokenSource?
For simple timeout-only scenarios, raw CancellationTokenSource is fine. When you need retry, circuit breaking, and timeout composing together, Polly is the better choice. Either way, understanding the raw mechanism helps you configure and debug the abstraction correctly.
Where does the caller's token come from?
It is HttpContext.RequestAborted, which ASP.NET Core injects automatically when you declare a CancellationToken parameter in a minimal API handler or controller action. It fires when the client closes the connection or the server aborts the request, for example on a server-level timeout.