https://bz.apache.org/bugzilla/show_bug.cgi?id=70103

暴兴 <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #2 from 暴兴 <[email protected]> ---
Thank you for looking into this. I have captured a thread dump from the
production environment that clearly demonstrates this deadlock, regardless of
the OS version.

The issue is not OS-specific; it is a fundamental behavioral difference between
TCP and UDS connect() system calls when the listening side is not calling
accept(). Here is the exact deadlock captured in the wild:

1. The Acceptor Thread (Not consuming connections)

"http-nio-uds-Acceptor" #177 [255] daemon prio=5 os_prio=0 cpu=379673.17ms elapsed=89634.05s tid=0x0000557f50ca7750 nid=255 sleeping [0x00007f63ff595000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep0([email protected]/Native Method)
        at java.lang.Thread.sleep([email protected]/Unknown Source)
        at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:98)
        at java.lang.Thread.run([email protected]/Unknown Source)

Analysis: The Acceptor thread is in TIMED_WAITING (sleeping) at
Acceptor.java:98 (likely a brief error-recovery sleep after a transient
exception or just between loop cycles). Crucially, it is NOT blocking on
accept(), which means it cannot consume any incoming connections from the OS
kernel’s accept queue.

2. The Shutdown Thread (Blocked indefinitely on UDS connect)

"tomcat-shutdown" #178 [670] prio=5 os_prio=0 cpu=15.99ms elapsed=89381.36s tid=0x0000557f51d36890 nid=670 runnable [0x00007f63fc008000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.UnixDomainSockets.connect0([email protected]/Native Method)
        at sun.nio.ch.UnixDomainSockets.connect([email protected]/Unknown Source)
        at sun.nio.ch.UnixDomainSockets.connect([email protected]/Unknown Source)
        at sun.nio.ch.SocketChannelImpl.connect([email protected]/Unknown Source)
        at org.apache.tomcat.util.net.NioEndpoint.unlockAccept(NioEndpoint.java:417)
        at org.apache.tomcat.util.net.AbstractEndpoint.pause(AbstractEndpoint.java:1506)
        at org.apache.coyote.AbstractProtocol.pause(AbstractProtocol.java:712)
        at org.apache.catalina.connector.Connector.pause(Connector.java:1010)
        ...

Analysis: The shutdown thread is executing unlockAccept() and is stuck inside
the native method UnixDomainSockets.connect0(). Because the Acceptor thread is
not calling accept(), the OS kernel cannot complete the UDS connection. Unlike
TCP (where the kernel completes the handshake on its own and connect() returns
without waiting for accept()), a blocking UDS connect() can wait until the
listening side calls accept(). Since no timeout is configured, it blocks
forever.
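
To illustrate what a bounded connect could look like, here is a minimal sketch
using plain JDK NIO (Java 16 or later for Unix domain socket support). It is
not Tomcat's code and not a proposed patch; the class name BoundedUdsConnect,
the method connectWithTimeout and its parameters are hypothetical names chosen
for this example. The idea is to put the channel in non-blocking mode and wait
on a Selector for at most a given time instead of sitting in a blocking
connect().

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Tomcat.
final class BoundedUdsConnect {

    /**
     * Tries to connect to the Unix domain socket at socketPath and waits at
     * most timeoutMillis for the connection to be established.
     * Returns true if the connection completed, false if the wait expired.
     * The channel is closed before returning in either case.
     */
    static boolean connectWithTimeout(Path socketPath, long timeoutMillis)
            throws IOException {
        if (timeoutMillis <= 0) {
            // Selector.select(0) would wait without limit, which is what we want to avoid.
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX);
             Selector selector = Selector.open()) {
            channel.configureBlocking(false);
            // In non-blocking mode connect() returns right away: true if the
            // connection was established immediately, false if still pending.
            if (channel.connect(UnixDomainSocketAddress.of(socketPath))) {
                return true;
            }
            channel.register(selector, SelectionKey.OP_CONNECT);
            if (selector.select(timeoutMillis) == 0) {
                return false; // nothing became ready before the wait ended
            }
            return channel.finishConnect();
        }
    }
}
```

In non-blocking mode the connect attempt may also fail with an IOException
when the listener cannot take the connection; either way the calling thread is
not left waiting without a bound.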

The Deadlock Summary:

1. The Acceptor thread is sleeping/waiting and not calling accept().
2. The Shutdown thread is blocked indefinitely at unlockAccept() -> connect0(),
   waiting for an accept() that will never happen.
3. The shutdown process is permanently hung.

This proves the race condition is real and explains why it’s hard to reproduce
in a quiet local environment: if the Acceptor happens to be blocking on
accept() when unlockAccept() is called, the connect() succeeds instantly. The
bug only triggers when the Acceptor is temporarily outside the accept() call
during the exact window of the shutdown sequence.
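
For anyone who wants to observe the blocking connect() outside Tomcat, here is
a rough standalone sketch (Java 16 or later). It is not taken from Tomcat; the
class UdsConnectHangDemo and the method countCompletedConnects are hypothetical
names for this example. It binds a UDS listener that never calls accept(), then
issues blocking connects from a daemon thread and reports how many completed
within a wait period. It assumes Linux-style semantics, where a blocking
AF_UNIX connect() waits once the listener's backlog is full, so the backlog is
set to 1 to reach that state quickly; other platforms may behave differently.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo for illustration only; not part of Tomcat.
final class UdsConnectHangDemo {

    /**
     * Binds a listener that never accepts, then tries up to 'attempts'
     * blocking connects on a daemon thread. Returns how many connects had
     * completed after waiting at most waitMillis.
     */
    static int countCompletedConnects(int attempts, long waitMillis)
            throws IOException, InterruptedException {
        if (waitMillis <= 0) {
            // Thread.join(0) would wait without limit.
            throw new IllegalArgumentException("waitMillis must be positive");
        }
        Path dir = Files.createTempDirectory("uds-demo");
        Path socketPath = dir.resolve("demo.sock");
        // Keep connected channels referenced so they stay open during the demo.
        List<SocketChannel> connected = Collections.synchronizedList(new ArrayList<>());
        try (ServerSocketChannel listener =
                     ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
            listener.bind(UnixDomainSocketAddress.of(socketPath), 1); // tiny backlog
            Thread connector = new Thread(() -> {
                try {
                    for (int i = 0; i < attempts; i++) {
                        // Blocking connect; no accept() is ever called on the listener.
                        connected.add(SocketChannel.open(UnixDomainSocketAddress.of(socketPath)));
                    }
                } catch (IOException e) {
                    // Listener went away or the connect was refused: stop trying.
                }
            }, "uds-connector");
            connector.setDaemon(true); // a stuck connect must not keep the JVM alive
            connector.start();
            connector.join(waitMillis);
            return connected.size();
        } finally {
            synchronized (connected) {
                for (SocketChannel ch : connected) {
                    ch.close();
                }
            }
            Files.deleteIfExists(socketPath);
            Files.deleteIfExists(dir);
        }
    }
}
```

If the returned count is lower than the number of attempts, the connector
thread was still waiting inside connect() when the wait period ended, which is
the same state the "tomcat-shutdown" thread shows in the dump above.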
