I ran into the same problem on Tomcat 9.0.74 recently and I think I have found the answer.

Our case is:

1. Open several Chrome tabs. Each tab establishes a websocket connection and a websocket session with Tomcat. To keep the connection and session alive, a JS timer sends a STOMP heartbeat message to the Tomcat server every 10 seconds. Tomcat also sends a STOMP heartbeat to Chrome every 10 seconds. The timeout is 30 seconds on both sides. The JS establishes a new websocket connection if the old connection is closed. Open dev tools for each tab to observe and record the websocket connections.

2. Wait a few minutes and do nothing. We may find that:
   1) AbstractProtocol.waitingProcessors probably leaks.
   2) The hidden Chrome tab has established several websocket connections; only one is alive, the others were closed by the Tomcat server.
   3) Looking at the closed websocket connections carefully, the heartbeats from the server are normal, but there is no heartbeat to the server in the last 30 seconds before the connection is closed.
   4) Many TCP connections are in TIME_WAIT state.

The leak may happen when the WsSessions expire on the server side. I think the process is:

1. Chrome's Intensive Throttling can delay the JS timer in hidden tabs by up to 1 minute, so heartbeat messages are not sent within the 30-second timeout.

2. Tomcat checks WsSession expiration every second in WsBackgroundThread. The WsSession expires, Tomcat sends a close message to the client (Chrome), and the client sends a close message in response.

3. To fix the BZ 66508 deadlocks (https://bz.apache.org/bugzilla/show_bug.cgi?id=66508), WsRemoteEndpointImplServer releases control of the processor (UpgradeProcessorInternal for websocket) and the socket lock, then re-takes control. The fix may set socketWrapper.currentProcessor to null when contention on the semaphore (messagePartInProgress) happens. At this point the WsSession is OUTPUT_CLOSED while the socket is not closed.

4. The client sends a close message or a normal message to Tomcat, but socketWrapper.currentProcessor is now null instead of an UpgradeProcessorInternal, so AbstractProtocol/Http11NioProtocol takes an Http11Processor to process the websocket message. This causes a protocol error, which leads Tomcat to close the socket immediately. At this point the WsSession is OUTPUT_CLOSED and the socket is closed.

Normally, the processor is released by SocketWrapperBase.close(): SocketWrapperBase removes its currentProcessor from AbstractProtocol.waitingProcessors. But currentProcessor is null now and thus cannot be removed. After that, there is no further chance to remove the UpgradeProcessorInternal of the expired WsSession.

Here is my solution:

I think the key point is that socketWrapper.currentProcessor should not be set to null when the WsSession expires. socketWrapper.currentProcessor is changed by setCurrentProcessor() and takeCurrentProcessor(), both of which are invoked during client message processing and protected by socketWrapper.lock.

I've created a PR. Please review and check it, thanks.
https://github.com/apache/tomcat/pull/683

Liang
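
To make the leak mechanism described above easier to follow, here is a minimal, self-contained toy model in Java. It is only a sketch under stated assumptions: the class and method names (WaitingProcessorLeakSketch, SocketWrapper, Processor, simulate) are hypothetical stand-ins that loosely mirror the Tomcat names used in this message, and none of it is actual Tomcat code. It models just one idea: if currentProcessor has already been taken (set to null) when the socket is closed, close() has nothing to remove from waitingProcessors, so the entry stays behind.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the leak described above. NOT Tomcat code; all names are illustrative. */
public class WaitingProcessorLeakSketch {

    /** Hypothetical stand-in for the websocket upgrade processor. */
    static final class Processor {}

    /** Hypothetical stand-in for SocketWrapperBase, reduced to the currentProcessor handling. */
    static final class SocketWrapper {
        private final Set<Processor> waitingProcessors;
        private Processor currentProcessor;

        SocketWrapper(Set<Processor> waitingProcessors) {
            this.waitingProcessors = waitingProcessors;
        }

        void setCurrentProcessor(Processor p) {
            currentProcessor = p;
        }

        /** Returns the current processor and leaves the field null. */
        Processor takeCurrentProcessor() {
            Processor p = currentProcessor;
            currentProcessor = null;
            return p;
        }

        /** On close, only a non-null currentProcessor can be removed from the waiting set. */
        void close() {
            Processor p = takeCurrentProcessor();
            if (p != null) {
                waitingProcessors.remove(p);
            }
        }
    }

    /**
     * Runs one session-expiry scenario and returns how many processors are
     * still registered as waiting after the socket has been closed.
     */
    static int simulate(boolean restoreProcessorBeforeClose) {
        // Stand-in for AbstractProtocol.waitingProcessors.
        Set<Processor> waitingProcessors = ConcurrentHashMap.newKeySet();
        SocketWrapper socket = new SocketWrapper(waitingProcessors);

        // The upgraded websocket connection is idle and waiting for data.
        Processor upgradeProcessor = new Processor();
        socket.setCurrentProcessor(upgradeProcessor);
        waitingProcessors.add(upgradeProcessor);

        // Session expiry path: the processor is taken while the close message is written.
        Processor taken = socket.takeCurrentProcessor();
        if (restoreProcessorBeforeClose) {
            // Behaviour argued for in this message: do not leave the field null.
            socket.setCurrentProcessor(taken);
        }

        // The socket is closed afterwards (e.g. following the protocol error).
        socket.close();
        return waitingProcessors.size();
    }

    public static void main(String[] args) {
        System.out.println("left in waitingProcessors, field left null: " + simulate(false));
        System.out.println("left in waitingProcessors, field restored:  " + simulate(true));
    }
}
```

In this model, simulate(false) leaves one entry behind and simulate(true) leaves none. The real code paths in Tomcat involve locking and the connection handler, which this sketch deliberately leaves out.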