On 19/05/2021 12:24, Paul P Wolf wrote:
Thank you Thomas. I carefully read your explanation. It makes sense to me and 
is completely different from what I understood up until this point. With this 
new understanding, the problem still persists. Please let me rephrase my issues 
in the light of what I just learned.

To summarize:
- thread limit defines how many requests can be processed concurrently.

Yes, via maxThreads.

- maxConnections defines how many connections are accepted by tomcat via 
socket.accept() and can be monitored by tomcat. this does not include the 
connections/requests being currently processed in an active thread.

Not quite. This does include connections currently being processed in an active thread.

- acceptCount is an OS backlog, which is not monitored by tomcat and the OS may 
decide to override the value.

Correct.

- if all threads, maxConnections and acceptCount backlog are full, further 
requests get refused by the OS

Threads don't matter here. Just maxConnections and acceptCount/backlog

Now my still persisting issues:
Say Tomcat can process 2000 requests a second and the typical client timeout
is 5s, then an acceptCount/backlog of anything up to 10000 should be OK but
above that some clients will time out because Tomcat won't be able to clear
the whole backlog before the unprocessed client connections time out.
If there are more requests than there is space in the backlog and the maxConnections is reached, 
why would you expect client timeouts instead of refused connections? Timeouts are what I see, but 
not what I expect, when I read "Any further simultaneous requests will receive 
"connection refused" errors".

Let me expand on the point I was trying to make. Using the 2000 req/s number above, a client timeout of 5s, an acceptCount of 20000 and no keep-alive I'd expect to see something close to the following:

0s  - 20000 connections in acceptCount
1s  - 18000 connections in acceptCount, 2000 completed requests
2s  - 16000 connections in acceptCount, 4000 completed requests
3s  - 14000 connections in acceptCount, 6000 completed requests
4s  - 12000 connections in acceptCount, 8000 completed requests
5s  - 10000 connections in acceptCount, 10000 completed requests
>5s - 10000 client timeouts, 10000 completed requests

The clients time out because they spend longer than the client timeout in the acceptCount/backlog queue waiting for Tomcat to call Socket.accept().
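
The timeline above can be reproduced with a few lines of Python. This is only an illustrative model of the arithmetic, not Tomcat code: the function name `drain_backlog` and its parameters are made up for the example, and it assumes every connection is queued at 0s and that Tomcat clears exactly 2000 per second.

```python
def drain_backlog(queued, rate_per_s, client_timeout_s):
    """Return (completed, timed_out) for connections that all arrive at 0s."""
    completed = 0
    for _ in range(client_timeout_s):
        # Each second Tomcat accepts and completes up to rate_per_s requests.
        served = min(rate_per_s, queued)
        queued -= served
        completed += served
    # Anything still in the backlog once the client timeout expires is lost.
    return completed, queued


print(drain_backlog(20000, 2000, 5))  # (10000, 10000)
```

With a backlog of 10000 or less the same call reports no timeouts, which is the "anything up to 10000 should be OK" rule of thumb.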

Different question around the same issue: What would need to happen, so that 
there would be refused connections instead of client timeouts?

Same scenario as above but with an acceptCount of 5000
0s - 5000 connections in acceptCount, 15000 refused connections
1s - 3000 connections in acceptCount, 15000 refused connections,
     2000 completed requests
2s - 1000 connections in acceptCount, 15000 refused connections,
     4000 completed requests
3s - 15000 refused connections, 5000 completed requests
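
The same scenario as a rough Python sketch. It is a simplified model under the assumptions used in the timeline (all 20000 connections arrive at 0s, the OS honours the configured acceptCount exactly, 2000 requests complete per second); `burst` and its parameter names are hypothetical, chosen only for this example.

```python
def burst(arrivals, accept_count, rate_per_s, client_timeout_s):
    """Split a burst of connections into refused, completed and timed out."""
    # The OS queues at most accept_count connections; the rest are refused.
    queued = min(arrivals, accept_count)
    refused = arrivals - queued
    completed = 0
    for _ in range(client_timeout_s):
        served = min(rate_per_s, queued)
        queued -= served
        completed += served
    # Connections still queued after the client timeout give up.
    return {"refused": refused, "completed": completed, "timed_out": queued}


print(burst(20000, 5000, 2000, 5))
# {'refused': 15000, 'completed': 5000, 'timed_out': 0}
```

Calling `burst(20000, 20000, 2000, 5)` instead gives no refusals and 10000 timeouts, so in this model the size of the backlog alone decides which failure the client sees.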

Your numbers are too close together. If you use numbers that are further
apart, the behaviour should be more obvious. Something like:
maxThreads=4
maxConnections=10
acceptCount=20

What do you mean by "numbers are too close together"? Why would that be an 
issue? What would be far enough? Is there any documentation? The processing speed 
shouldn't be an issue, as the endpoints sleep for 10s.

My point was that with values of 3, 2 and 1 and the off-by-one behaviour of maxConnections it is harder to match up observed numbers with configuration values. If the configuration values are further apart it should be easier to match observations, and changes in observations, with configuration values and changes in configuration values.

Keep in mind that my numbers above assume things happen instantly whereas in reality there is always going to be an ordering. The observed numbers can be slightly different from what you expect sometimes. If your configuration values are only 1 apart it will be hard to be sure what you are seeing.

Regardless, I tried your suggested configuration and nothing really changed: I 
see 31 successful requests and 19 timed out after 5 seconds. Still not a single 
refused connection. And considering the numbers, the OS acknowledged the 
configured acceptCount number.

So we have:
maxThreads=4
maxConnections=10
acceptCount=20

and a request processing time of 1 second.

I'd guess that the OS is using a much larger accept count. Let's model it.

0s - 50 connections in acceptCount
1s - 39 connections in acceptCount, 11 connections maintained by Tomcat,
     4 requests processing
2s - 35 connections in acceptCount, 11 connections maintained by Tomcat,
     4 requests processing, 4 completed requests
3s - 31 connections in acceptCount, 11 connections maintained by Tomcat,
     4 requests processing, 8 completed requests
4s - 27 connections in acceptCount, 11 connections maintained by Tomcat,
     4 requests processing, 12 completed requests
5s - 23 connections in acceptCount, 11 connections maintained by Tomcat,
     4 requests processing, 16 completed requests
6s - 19 connections timed out, 11 connections maintained by Tomcat,
     4 requests processing, 20 completed requests
7s - 19 connections timed out, 7 connections maintained by Tomcat,
     4 requests processing, 24 completed requests
8s - 19 connections timed out, 3 connections maintained by Tomcat,
     3 requests processing, 28 completed requests
9s - 19 connections timed out, 0 connections maintained by Tomcat,
     0 requests processing, 31 completed requests

That seems to match what you observed. That suggests the OS is using an acceptCount of at least 50.
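
The model above can be checked with a small tick-per-second simulation in Python. It is a sketch under stated assumptions, not a description of Tomcat internals: the name `simulate` is made up, the OS backlog is assumed large enough to hold every connection, Tomcat is assumed to hold maxConnections + 1 connections (the off-by-one mentioned earlier), and a connection accepted exactly at the timeout counts as in time. The clock here starts at 0s, so it runs one second ahead of the rows in the table.

```python
def simulate(clients, max_threads, max_connections, client_timeout_s):
    """One request takes 1s. Return (completed, timed_out)."""
    cap = max_connections + 1  # off-by-one behaviour of maxConnections
    queued, held, completed, timed_out = clients, 0, 0, 0
    t = 0
    while queued or held:
        # Tomcat accepts from the OS queue until it holds `cap` connections.
        accepted = min(queued, cap - held)
        queued -= accepted
        held += accepted
        # Assumption: clients still queued when the timeout is reached give up.
        if t >= client_timeout_s:
            timed_out += queued
            queued = 0
        # Each thread finishes one held request during the next second.
        finished = min(max_threads, held)
        held -= finished
        completed += finished
        t += 1
    return completed, timed_out


print(simulate(50, 4, 10, 5))  # (31, 19)
```

Like the table, this ignores read timeouts on connections that Tomcat has already accepted.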

Same question as before: what needs to change to make Tomcat refuse 
connections? This still seems like a bug to me.

Not up to Tomcat. Tomcat can only call Socket.accept() and does so under the control of maxConnections.

Connection refused == acceptCount/backlog full (or no listening socket).

Connection refusal is entirely under the control of the OS and will be driven largely by the actual value of acceptCount/backlog.

Other factors will complicate this further.

Clients may automatically retry connections. Browsers especially. I often use a custom HTTP client (or just telnet) when testing Tomcat to ensure that I can control exactly what the client is doing.
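
A minimal sketch of such a test client in Python, using only the standard `socket` module. The helper name `try_connect` and the 5 second default are assumptions for the example. It makes exactly one connection attempt, never retries, and reports whether the attempt connected, was refused, or timed out.

```python
import socket


def try_connect(host, port, timeout_s=5.0):
    """Make a single TCP connection attempt and classify the outcome."""
    try:
        # create_connection raises on failure; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout_s):
            return "connected"
    except ConnectionRefusedError:
        # The OS rejected the connection outright.
        return "refused"
    except socket.timeout:
        # No answer within timeout_s.
        return "timeout"
```

Running something like `try_connect("localhost", 8080)` from many threads at once (host and port are placeholders) shows how many attempts end up in each category without any hidden retries skewing the counts.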

Once the client has sent the request, it will try to read the response. If there is a large traffic spike and Tomcat takes a while to clear the acceptCount queue you can see clients timing out waiting for the response. You don't normally see this in browsers as they have a fairly long read timeout to handle slow web sites.

HTH,

Mark
