Hello! On Thu, Oct 30, 2025 at 02:48:54PM +0800, solomon wrote:
> # HG changeset patch > # User Zhefeng Chen <[email protected]> > # Date 1761731339 -28800 > # Wed Oct 29 17:48:59 2025 +0800 > # Node ID b81b918263d445c32cb2cff5e2724e3cac62dab3 > # Parent c3be8460587196f376bccf29817c3a6523e18150 > Upstream: max_fails doesn't respect the next_upstream config > > In the description of `max_fails`, it says: > What is considered an unsuccessful attempt is defined by the > `proxy_next_upstream`, `fastcgi_next_upstreami`, > `uwsgi_next_upstream`, `scgi_next_upstream`, > `memcached_next_upstream`, and `grpc_next_upstream` directives. > > But actually 403 and 404 are always considered an successful > attempt, while other cases are always considered an > unsuccessful attempt. > > The `ngx_http_upstream_free_round_robin_peer` function depends > on this to update the health check state. The documentation of the proxy_next_upstream directive (and other directives mentioned) further clarifies the behaviour (http://freenginx.org/r/proxy_next_upstream): : The directive also defines what is considered an unsuccessful : attempt of communication with a server. The cases of error, : timeout and invalid_header are always considered unsuccessful : attempts, even if they are not specified in the directive. The : cases of http_500, http_502, http_503, http_504, and http_429 are : considered unsuccessful attempts only if they are specified in the : directive. The cases of http_403 and http_404 are never considered : unsuccessful attempts. Note that error, timeout, and invalid_header cases are always considered unsuccessful (because they are, indeed, unsuccessful, and there isn't much to be done here). This is what you probably mean by "other cases are always considered an unsuccessful attempt". This doesn't apply to all other cases though - http_500, http_502, http_503, http_504, and http_429 are only considered unsuccessful if they are explicitly specified in the directive. These are valid responses, but at the same time these are error responses, so depending on the particular use case [free]nginx can either accept them as is (and send to the client) or try to fallback to a different server. See ngx_http_upstream_test_next() for details on where these codes are checked. Note well that 403 and 404 are explicitly excluded from being considered unsuccessful attempts. This behaviour is intentional and designed to support some common use cases, notably: 1. When resources are distributed among multiple servers, and "proxy_next_upstream http_404;" is used to iterate over available servers till the requested resource is found. Switching off a server if the resource is not found in such a setup is not desired for obvious reasons. 2. Similarly, but with resources rsynced to all servers, so "proxy_next_upstream http_404 http_403;" makes it possible to fallback to other servers if the resource is not yet synced to a particular server (or the new directory is being synced, see 5231:05c53652e7b4). Could you please elaborate a bit more on why do you suggest to change this behaviour? And how the mentioned use cases are expected to be handled with the new behaviour? Is it intentional that error, timeout, and invalid_header cases won't be considered unsuccessful unless explicitly specified in the proxy_next_upstream directive with the proposed change? And invalid_header won't be considered unsuccessful by default? [...] -- Maxim Dounin http://mdounin.ru/
