Hi,

Short version:

I use httpd on Windows as a reverse proxy for a microservice system. Some services communicate over WebSockets (more precisely: SignalR). From time to time I have to restart the server so that it reads a new configuration. I observe an increasing number of threads blocked by the SignalR connections. It is only a matter of time until the server freezes completely because no threads are available for other requests.

Details:

I reduced my system as much as possible and ended up with two microservices, A and B. A hosts a SignalR hub. Both A and B subscribe to the events of this hub, so there should be two connections.

Now the experiment:

1. Start the two microservices: They repeatedly try to connect, but fail. This is expected, because they are configured to connect via the reverse proxy and httpd is not running yet.

2. Start httpd (Windows service): As expected, both services establish their connection. This is confirmed by the service logs and by mod_status, which shows 2 connections.

3. Restart httpd: In production, I call httpd.exe -n "ServiceName" -k restart programmatically. For this experiment, I call it from PowerShell. What happens?

3a. The parent starts a new child and hands over 2 sockets, see error.log on Pastebin.

3b. The parent needs to stop the old child. The old child cannot stop because of the open connections. It waits for a grace period of 30s, then terminates the 2 threads. My services log that their connection was closed and attempt to reconnect. At this moment, 2 more connections appear in mod_status. However, I don't see any socket handover in error.log.

4. Repeat the httpd restart.

4a. The parent starts a new child and hands over 2 sockets, see error.log. It is still 2 sockets, although I saw 4 connections in mod_status in the previous step.

4b. The parent shuts down the old child. This time there is no grace period, but 18(!) threads that failed to exit are terminated, see error.log. Both services log a disconnect and reconnect.
However, no additional connections appear in mod_status; the count remains 4.

When I repeat the restart, most of the time it behaves as described in step 4. The only difference is a varying number of "threads that failed to exit". But sometimes additional connections appear in mod_status. I can't reproduce this on purpose. I suspect a race between how fast the old child is shut down, how fast the new one is started, and when my services try to reconnect, but I don't know the httpd source code.

To get my job done, I need to know: What can I do to keep the server from eventually running out of threads?

Out of curiosity, I would also like to know what exactly happens: how the SignalR connections are handed over to the next child, and why the first restart behaves differently from the later ones.

I appreciate any hint!

Some more information about server and configuration:

Version: 2.4.41

Some config snippets:

    # handy for debugging, not in production
    ThreadsPerChild 20

    RewriteEngine On
    RewriteCond %{HTTP:Upgrade} websocket [NC]
    RewriteCond %{HTTP:Connection} upgrade [NC]
    RewriteRule "^/my/microservice" "wss://hostname:53728%{REQUEST_URI}" [P]

    ProxyPass /my/microservice https://hostname:53728/my/microservice
    ProxyPassReverse /my/microservice https://hostname:53728/my/microservice

Link to error.log on Pastebin: https://pastebin.com/7a7B0bLb
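
In case someone wants to reproduce the experiment, here is a minimal sketch of how the restart and the mod_status check from steps 2 to 4 could be scripted. It relies on the machine-readable mod_status output (the "?auto" query), which reports a BusyWorkers field. The status URL, the location of httpd.exe, the number of rounds and the 45s wait are assumptions of mine and not part of my real setup; "ServiceName" is the same placeholder as above. Note that BusyWorkers also counts the thread that serves the status request itself.

```python
"""Sketch: restart httpd repeatedly and log BusyWorkers from mod_status."""
import subprocess
import time
import urllib.request

STATUS_URL = "http://localhost/server-status?auto"  # assumption: mod_status is mapped here
HTTPD_EXE = "httpd.exe"                              # assumption: httpd.exe is on PATH
SERVICE_NAME = "ServiceName"                         # placeholder, as in the post


def parse_auto_status(text):
    """Turn the 'Key: value' lines of the ?auto output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines without a 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields


def busy_workers(text):
    """Return the BusyWorkers count as int, or None if the field is missing."""
    value = parse_auto_status(text).get("BusyWorkers")
    return int(value) if value is not None else None


def fetch_status(url=STATUS_URL, timeout=5):
    """Download the machine-readable status page as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii", errors="replace")


def restart_and_sample(rounds=5, settle_seconds=45):
    """Restart the service, wait past the 30s grace period, print BusyWorkers."""
    for i in range(rounds):
        subprocess.run([HTTPD_EXE, "-n", SERVICE_NAME, "-k", "restart"], check=True)
        time.sleep(settle_seconds)
        print(i + 1, busy_workers(fetch_status()))


if __name__ == "__main__":
    restart_and_sample()
```

Run on the proxy host, this prints one line per restart, so a busy count that keeps growing across rounds becomes visible without watching the mod_status page by hand.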