Package: nginx
Version: 1.26.3-3+deb13u5
Severity: important

Dear Maintainer,

When QUIC (HTTP/3) is enabled, connections intermittently fail after
nginx is reloaded. The symptoms are slow site loading and/or client
fallback to TCP (HTTP/2 or HTTP/1.1). Testing with curl may produce,
for example:

% curl --http3-only https://example.org
curl: (7) QUIC connection has been shut down

An examination suggests that the problem stems from nginx worker
processes holding QUIC UDP sockets while they are shutting down. If such
a worker process is handling a long-lived TCP session (e.g. a websocket)
for another http server, the process may linger indefinitely, and during
that time any QUIC UDP packets delivered to it go unanswered.

Here is a test case:

server {
        listen 443 quic reuseport default_server;

        server_name _;

        ssl_certificate /etc/ssl/certs/ssl-cert-snakeoil.pem;
        ssl_certificate_key /etc/ssl/private/ssl-cert-snakeoil.key;

        location / {
                return 200 "OK\n";
        }
}

server {
        listen 80 default_server;

        server_name _;

        location / {
                proxy_pass http://localhost:8080;
                proxy_read_timeout 1h;
        }
}

Ensure nginx is freshly started:
# systemctl restart nginx.service

Simulate the proxy destination in a separate terminal:
% nc -l 8080

Demonstrate working QUIC:
% curl --http3-only --insecure https://127.0.0.1
OK
% curl --http3-only --insecure https://127.0.0.1
OK
% curl --http3-only --insecure https://127.0.0.1
OK
% curl --http3-only --insecure https://127.0.0.1
OK

Initiate a long-running TCP session in a separate terminal:
% curl http://127.0.0.1

Reload nginx:
# systemctl reload nginx.service

Demonstrate the problem:
% curl --http3-only --insecure https://127.0.0.1
OK
% curl --http3-only --insecure https://127.0.0.1
curl: (7) QUIC connection has been shut down
% curl --http3-only --insecure https://127.0.0.1
OK
% curl --http3-only --insecure https://127.0.0.1
curl: (7) QUIC connection has been shut down
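
Because the failures are intermittent, a small loop may show the
failure rate more clearly than individual invocations. The following
is only a minimal sketch and was not part of the test above:
count_results is a hypothetical helper name, and the attempt count and
the --max-time value in the usage line are arbitrary choices.

```shell
#!/bin/sh
# count_results N CMD...: run CMD N times and report how many runs
# succeeded (exit status 0) and how many failed.
# Hypothetical helper for illustration; not part of nginx or curl.
count_results() {
    n=$1
    shift
    ok=0
    fail=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # Discard the command's output; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok fail=$fail"
}
```

With the function loaded into the shell, it could be used as:
% count_results 20 curl --http3-only --insecure --max-time 5 https://127.0.0.1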

Confirm the QUIC listening socket is held by a worker process that is
shutting down:
# ss -ulnpH 'sport = 443'
UNCONN 0 0 0.0.0.0:443 0.0.0.0:* users:(("nginx",pid=258918,fd=7),("nginx",pid=258917,fd=7),("nginx",pid=257943,fd=7),("nginx",pid=257942,fd=7))
UNCONN 0 0 0.0.0.0:443 0.0.0.0:* users:(("nginx",pid=258918,fd=5),("nginx",pid=258917,fd=5),("nginx",pid=257943,fd=5),("nginx",pid=257942,fd=5))
% ps 257943
    PID TTY      STAT   TIME COMMAND
 257943 ?        S      0:00 nginx: worker process is shutting down
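
The manual check above could be scripted roughly as follows. This is a
sketch under assumptions: pids_from_ss and shutting_down_holders are
hypothetical helper names, the ss filter is written the same way as in
the command above, and the match relies on the worker's process title
containing "shutting down" as in the ps output shown.

```shell
#!/bin/sh
# pids_from_ss: read `ss -p` output on stdin and print each distinct
# PID found in the users:(...) column, one per line, sorted numerically.
# Hypothetical helper for illustration.
pids_from_ss() {
    grep -o 'pid=[0-9]*' | cut -d= -f2 | sort -un
}

# shutting_down_holders PORT: list processes that hold a UDP socket
# on PORT and whose process title says they are shutting down.
# Needs root so that ss can show the owning processes.
shutting_down_holders() {
    ss -ulnpH "sport = $1" | pids_from_ss | while read -r pid; do
        # Print "PID COMMAND" without a header line.
        ps -o pid=,args= -p "$pid"
    done | grep 'shutting down'
}
```

With the functions loaded into a root shell, it could be used as:
# shutting_down_holders 443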

Once all the long-running TCP sessions have ended and the lingering
worker processes finally exit, the problem goes away.

Similar upstream reports, which may or may not be related:
  - https://github.com/nginx/nginx/issues/425
  - https://github.com/nginx/nginx/issues/1399

I am not certain how to fix this, though perhaps a worker process that
starts shutting down needs to close its listening UDP sockets as soon as
they are no longer in use.


-- System Information:
Debian Release: 13.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.90+deb13.1-cloud-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages nginx depends on:
ii  iproute2      6.15.0-1
ii  libc6         2.41-12+deb13u3
ii  libcrypt1     1:4.4.38-1
ii  libpcre2-8-0  10.46-1~deb13u1
ii  libssl3t64    3.5.6-1~deb13u1
ii  nginx-common  1.26.3-3+deb13u5
ii  zlib1g        1:1.3.dfsg+really1.3.1-1+b1

nginx recommends no packages.

nginx suggests no packages.

-- no debconf information
