Am 03.11.2020 um 16:05 schrieb Eric Robinson:
>> -----Original Message-----
>> From: Eric Robinson <eric.robin...@psmnv.com>
>> Sent: Tuesday, November 3, 2020 8:21 AM
>> To: Tomcat Users List <users@tomcat.apache.org>
>> Subject: RE: Weirdest Tomcat Behavior Ever?
>>
>>> From: Mark Thomas <ma...@apache.org>
>>> Sent: Tuesday, November 3, 2020 2:06 AM
>>> To: Tomcat Users List <users@tomcat.apache.org>
>>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>>
>>> On 02/11/2020 12:16, Eric Robinson wrote:
>>>
>>> <snip/>
>>>
>>>> Gotcha, thanks for the clarification. Let's see what happens when
>>>> the users
>>> start hitting it at 8:00 am Eastern.
>>>
>>> Progress. The first attempt to write to the socket triggers the
>>> following
>>> exception:
>>>
>>> 02-Nov-2020 14:33:54.083 FINE [http-bio-3016-exec-13]
>>> org.apache.tomcat.util.net.JIoEndpoint$DebugOutputStream.write
>>> [301361476]
>>>  java.net.SocketException: Bad file descriptor (Write failed)
>>>     at java.net.SocketOutputStream.socketWrite0(Native Method)
>>>     at
>>> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
>>>     at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>>>     at
>>>
...

>>> Because this is an instance of an IOException, Tomcat assumes it has
>>> been caused by the client dropping the connection and silently
>>> swallows it. I'll be changing that later today so the exception is
>>> logged as DEBUG level for all new Tomcat versions.
>>>
>>> Possible causes of "java.net.SocketException: Bad file descriptor"
>>> I've been able to find are:
>>>
>>> 1. OS running out of file descriptors.
>>>
>>> 2.Trying to use a closed socket.
>>>
>>> I want to review the source code to see if there are any others.
>>>
>>> I don't think we are seeing 2 as there is no indication of the Socket,
>>> InputStream or OutputStream being closed in the logs.
>>>
>>> That leaves 1. Possible causes here are a file descriptor leak or
>>> normal operations occasionally needing more than the current limit. I
>>> don't think it is a leak as I'd expect to see many more errors of this
>>> type after the first and we aren't seeing that. That leaves the
>>> possibility of the current limit being a little too low.
>>>
>>> My recommendation at this point is to increase the limit for file 
>>> descriptors.
>>> Meanwhile, I'll look at the JRE source to see if there are any other
>>> possible triggers for this exception.
>>>
>>> Mark
>>>
>>>
>>
>> On the tomcat server, max open file descriptors is currently 2853957
>>
>> [root@001app01a ~]# sysctl fs.file-max
>> fs.file-max = 2853957
>>
>> Most of the time, the number of open files appears to run about 600,000.
>>
>>  What do you think of watching the open file count and seeing if the number
>> gets up around the ceiling when the socket write failure occurs? Something
>> like...
>>
>> [root@001app01a ~]#  while [ TRUE ];do FILES=$(lsof|wc -l);echo "$(date
>> +%H:%M:%S) $FILES";done
>> 09:11:15 591671
>> 09:11:35 627347
>> 09:11:54 626851
>> 09:12:11 626429
>> 09:12:26 545748
>> 09:12:42 548578
>> 09:12:58 551487
>> 09:13:14 516700
>> 09:13:30 513312
>> 09:13:45 512830
>> 09:14:02 533338
>> 09:14:18 568233
>> 09:14:35 570158
>> 09:14:51 566269
>> 09:15:07 547389
>> 09:15:23 544203
>> 09:15:38 546395
>>
>> It's not ideal; as it seems to take 15-20 seconds to count them using lsof.
>>
>>
>>
> 
> Wait, never mind. I realized the per-process limits are what matters. I 
> checked, and nofile was set to 4096 for the relevant java process.
> 
> I did...
> 
> # prlimit --pid 8730 --nofile=16384:16384
> 
> That should give java some extra breathing room if the issue is max open 
> files, right?

>From my experience you should see a different exception if you hit the
NOFILE limit: java.net.SocketException: Too many open files

But I've only seen that when you open or accept a new connection. Never
seen this later when something is written to that already open socket.

To me a bad file descriptor sounds more like a closed socket. This
reminds me of a database or http-client connection pool that handed out
connections with already closed sockets. I think this could be
suspicious because Mark wrote this happens on the first write to the socket.

- Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to