Thank you for taking the time. Removing Gradle from equation did not make a difference.

I did what I should have done in the beginning though and turned up the logging level. When an endpoint gets shut down, since there is no other way of waking the Acceptor up, a fake connection is made. This is timing out apparently:

Feb 25, 2024 6:23:24 PM org.apache.tomcat.util.net.AbstractEndpoint unlockAccept
FINE: Caught exception trying to unlock accept on port [0]
java.net.SocketTimeoutException: Connect timed out
        at 
java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
        at java.base/java.net.Socket.connect(Socket.java:633)
        at 
org.apache.tomcat.util.net.AbstractEndpoint.unlockAccept(AbstractEndpoint.java:1124)
        at 
org.apache.tomcat.util.net.NioEndpoint.unlockAccept(NioEndpoint.java:390)
        at 
org.apache.tomcat.util.net.AbstractEndpoint.pause(AbstractEndpoint.java:1394)
        at org.apache.coyote.AbstractProtocol.pause(AbstractProtocol.java:678)
        at org.apache.catalina.connector.Connector.pause(Connector.java:963)
        at 
org.apache.catalina.core.StandardService.stopInternal(StandardService.java:484)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
        at 
org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:974)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
        at org.apache.catalina.startup.Tomcat.stop(Tomcat.java:447)
        at SlowStop.main(SlowStop.java:14)

You can see the procedure here: https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/AbstractEndpoint.java#L1086

What I noted:
- the local address for shutdown is NOT loopback
- the connection is timing out

After I did some magic to increase the connection timeout from 2000ms to 60 000ms - I got "No route to host".

So after looking up which IP the device belonged to - it wasn't the interface I was looking for.

On quick inspection the acceptor thread (https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/Acceptor.java#L128) was listening on [/[0:0:0:0:0:0:0:0]:39033] , which was correctly picked up at first, but then this local address got transformed:

https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/AbstractEndpoint.java#L1164

It started picking up interfaces to use, and it stopped at the first non-loopback non-link local address, which also happens to be some sort of a bridge network for one of my VMs. I don't quite know why there is no routing set up, but this interface should not have been picked in the first place.

Here's a list of my interfaces:

- lo (loopback)
- wlp4s0 (wifi)
- docker0 (docker networking stuff)
- br-e26d1e697a66 (some bridge)
- br-54aa505a4d52 (some bridge)

The code picked up br-54aa505a4d52 first, it has those IP addresses:

- /fe80:0:0:0:0:0:0:1%br-e26d1e697a66
- /fc00:f853:ccd:e793:0:0:0:1%br-e26d1e697a66
- /172.18.0.1

It stopped iterating once it got to the second one, tried using it and failed. It should not have tried to touch the bridge networks at all.

I'll take a look at how it works on my work laptop with MacOs, but I'd wager a guess that some corporate VPNs have interfaces which have messed up routing as well. Can't tell if it's a bug or a feature, but it sure is unexpected. Making this timeout for acceptor shutdown configurable would be one sort of band-aid.

Chuck - could you please share your thoughts on this?

Sample logs, I updated the repo with this script:

$ ./no_gradle.sh
Feb 25, 2024 7:13:08 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-nio-auto-1"]
Feb 25, 2024 7:13:08 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service [Tomcat]
Feb 25, 2024 7:13:08 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-nio-auto-1-37591"]
Feb 25, 2024 7:13:08 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-nio-auto-1-37591"]
Feb 25, 2024 7:13:08 PM org.apache.tomcat.util.net.AbstractEndpoint unlockAccept
FINER: About to unlock socket 
for:/[fc00:f853:ccd:e793:0:0:0:1%br-e26d1e697a66]:37591
Feb 25, 2024 7:13:09 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884789930; 
nextExpiration=0; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:10 PM org.apache.tomcat.util.net.AbstractEndpoint unlockAccept
FINE: Caught exception trying to unlock accept on port [0]
java.net.SocketTimeoutException: Connect timed out
        at 
java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
        at java.base/java.net.Socket.connect(Socket.java:633)
        at 
org.apache.tomcat.util.net.AbstractEndpoint.unlockAccept(AbstractEndpoint.java:1124)
        at 
org.apache.tomcat.util.net.NioEndpoint.unlockAccept(NioEndpoint.java:390)
        at 
org.apache.tomcat.util.net.AbstractEndpoint.pause(AbstractEndpoint.java:1394)
        at org.apache.coyote.AbstractProtocol.pause(AbstractProtocol.java:678)
        at org.apache.catalina.connector.Connector.pause(Connector.java:963)
        at 
org.apache.catalina.core.StandardService.stopInternal(StandardService.java:484)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
        at 
org.apache.catalina.core.StandardServer.stopInternal(StandardServer.java:974)
        at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:242)
        at org.apache.catalina.startup.Tomcat.stop(Tomcat.java:447)
        at SlowStop.main(SlowStop.java:14)

Feb 25, 2024 7:13:10 PM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service [Tomcat]
Feb 25, 2024 7:13:10 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-auto-1-37591"]
Feb 25, 2024 7:13:10 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884790964; 
nextExpiration=1708884790930; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:11 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884791967; 
nextExpiration=1708884791964; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:12 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884792968; 
nextExpiration=1708884792967; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:13 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884793969; 
nextExpiration=1708884793968; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:14 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884794971; 
nextExpiration=1708884794969; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:15 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884795972; 
nextExpiration=1708884795971; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:16 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884796974; 
nextExpiration=1708884796972; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:17 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884797976; 
nextExpiration=1708884797974; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:18 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884798977; 
nextExpiration=1708884798976; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:19 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884799979; 
nextExpiration=1708884799977; keyCount=0; hasEvents=false; eval=false
Feb 25, 2024 7:13:20 PM org.apache.tomcat.util.net.Acceptor stop
WARNING: The acceptor thread [http-nio-auto-1-Acceptor] did not stop cleanly
Feb 25, 2024 7:13:20 PM org.apache.tomcat.util.net.NioEndpoint$Poller timeout
FINER: timeout completed: keys processed=0; now=1708884800947; 
nextExpiration=1708884800979; keyCount=0; hasEvents=false; eval=false


Cheers,
Michał


On 25.02.2024 16:01, Chuck Caldarale wrote:

On Feb 23, 2024, at 10:18, Michał Szymborski <michalszymbor...@gmail.com> wrote:

Hi, I've encountered an issue where the acceptor doesn't stop cleanly when 
shutting Tomcat down on some machines. I'm using tomcat-embed-core 10.1.19, 
I've tested it on Ubuntu 22.04 and MacOs Sonoma 14.3.1.

Here is a minimum reproducible example (also available here 
https://github.com/lared/tomcat-acceptor-not-stopping-cleanly):

```
import org.apache.catalina.LifecycleException;
import org.apache.catalina.connector.Connector;
import org.apache.catalina.startup.Tomcat;

class SlowStop {
    public static void main(String[] args) throws LifecycleException {
        var connector = new 
Connector("org.apache.coyote.http11.Http11NioProtocol");
        connector.setPort(0);

        Tomcat tomcat = new Tomcat();
        tomcat.getService().addConnector(connector);

        tomcat.start();
        tomcat.stop();
    }
}
```


Whenever I run this code on setup as listed above, shutdown of the 
ProtocolHandler times out when trying to stop the acceptor. Here is an excerpt 
from logs:

```
Feb 23, 2024 4:42:34 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-8080"]
Feb 23, 2024 4:42:44 PM org.apache.tomcat.util.net.Acceptor stop
WARNING: The acceptor thread [http-nio-8080-Acceptor] did not stop cleanly
```

This can be traced down to `NioEndpoint` stopping, particularly stopping the 
acceptor:

https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/NioEndpoint.java#L308

https://github.com/apache/tomcat/blob/10.1.x/java/org/apache/tomcat/util/net/Acceptor.java#L180


This issue is not reproducible on every setup (see github actions run on the 
linked repo), and it only affects NIO - NIO2 works as expected.

I've encountered this issue when Spring Boot got upgraded to 3.2.x line (switch 
from Tomcat 9 to 10.1.x).

Could you please advise me where I can go from here? This is particularly 
annoying in our testing setup, where timely teardown of the Tomcat is very 
appreciated (even if disruptive).

Unfortunately I'm a fair bit out of my depth here when it comes to 
troubleshooting this further. It is rather surprising that even though people 
were not able to reproduce it on some of their setups (like MacOs 13.6.4), but 
I actually was able to reproduce it on two machines I own, my colleagues at 
work have the same issue.


I could not reproduce the error in several hundred executions of your test 
program on my M1 MacBook Pro (Sonoma 14.3.1, OpenJDK 21.0.1) using just command 
line execution:

chuck@Chuck-MacBookPro JavaTest > java -version
openjdk version "21.0.1" 2023-10-17 LTS
OpenJDK Runtime Environment Temurin-21.0.1+12 (build 21.0.1+12-LTS)
OpenJDK 64-Bit Server VM Temurin-21.0.1+12 (build 21.0.1+12-LTS, mixed mode)
chuck@Chuck-MacBookPro JavaTest > java -cp `find 
~/Downloads/apache-tomcat-10.1.19-embed -name '*.jar' | tr '\n' ':' | sed 
's/:$/:./'` SlowStop
Feb 25, 2024 8:55:54 AM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-nio-auto-1"]
Feb 25, 2024 8:55:54 AM org.apache.catalina.core.StandardService startInternal
INFO: Starting service [Tomcat]
Feb 25, 2024 8:55:54 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-nio-auto-1-56406"]
Feb 25, 2024 8:55:54 AM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-nio-auto-1-56406"]
Feb 25, 2024 8:55:54 AM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service [Tomcat]
Feb 25, 2024 8:55:54 AM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-auto-1-56406"]

Any chance the problem has something to do with your gradle environment?

   - Chuck



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to