Dear André,
It took me a while to answer because I wanted to get more precise readings.
>>> Assuming that your client is really connecting to that HTTP connector on
>>> port 8080 mentioned above..
>> Yes, it has a forwarded port 80 (using FreeBSD ipfw) that also points to
>> 8080, and there is an Apache with mod_proxy_http that hooks into 8081. My
>> tests are on the vanilla port, though.
>
> Can you be a bit clearer on this part ? Do you see the problem happening for
> 1 in 10 posts, when your client connects directly to Tomcat's HTTP port 8080 ?
> Or is it only when the client connects to Tomcat via either one of these
> intermediate pieces of machinery ?
I re-ran my tests, making sure I knew exactly which path the packets travel.
Here is what I fished out of the log of a client app running on the same
machine as the server:
java.net.SocketException: Connection reset by peer
at java.net.PlainSocketImpl.socketSetOption(Native Method)
at java.net.AbstractPlainSocketImpl.setOption(AbstractPlainSocketImpl.java:267)
at java.net.Socket.setTcpNoDelay(Socket.java:940)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:400)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:483)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:213)
at sun.net.www.http.HttpClient.New(HttpClient.java:300)
at sun.net.www.http.HttpClient.New(HttpClient.java:316)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:992)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:971)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:846)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1087)
....
This particular client connects on the same machine as the Tomcat server, using
port 8080 (the configured HTTP connector for this Tomcat server). The URL used
is http://localhost:8080/<webapp>.
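For anyone who wants to reproduce this, here is a minimal sketch of a probe that makes the same kind of request. It is not the real client code: the class name PostProbe, the request body, the timeouts and the attempt count are made up for illustration, and the default URL only stands in for the webapp path above. It counts I/O failures such as the connection reset in the trace; an HTTP error status like 503 does not count as a failure here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical probe, not the actual client app.
class PostProbe {

    /** Sends one small POST and returns the HTTP status code. */
    static int postOnce(String target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(target).openConnection();
        conn.setConnectTimeout(5000); // assumed values, tune as needed
        conn.setReadTimeout(5000);
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // getOutputStream() is where the connection is opened,
        // which is where the trace above blows up.
        try (OutputStream out = conn.getOutputStream()) {
            out.write("ping=1".getBytes(StandardCharsets.UTF_8));
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /** Repeats the POST and counts attempts that ended in an IOException. */
    static int countFailures(String target, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                postOnce(target);
            } catch (IOException e) {
                failures++;
                System.err.println("attempt " + i + ": " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String target = args.length > 0 ? args[0] : "http://localhost:8080/";
        System.out.println(countFailures(target, 100) + " of 100 posts failed");
    }
}
```

Run against the connector port directly, this takes ipfw and Apache out of the picture.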
So whatever is happening, it is not happening on the network or in the ISP's
hardware. It seems to be local to my machine and with this I feel that Tomcat
is a suspect again. This goes against some of the things I said earlier. I was
wrong, sorry.
Is there any way to overwhelm a Tomcat connector without it serving 503
responses? This particular connector is capped at 256 connections, with
about 5% of them actually busy at any given time (as reported by JMX) and
spikes of up to 50% busy every hour or so.
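In case it helps to see where such a busy percentage comes from, below is a rough sketch of reading it over JMX. I am assuming the default "Catalina" JMX domain and the currentThreadsBusy and maxThreads attributes of Tomcat's ThreadPool MBeans; the class name ConnectorLoad is made up, and this is not necessarily how my monitoring computes it.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper; domain and attribute names are assumptions.
class ConnectorLoad {

    /** Returns busy threads as a percentage of maxThreads, per connector. */
    static Map<String, Double> busyPercent(MBeanServerConnection mbs) {
        Map<String, Double> result = new TreeMap<>();
        try {
            // Assumed: standalone Tomcat registers its pools in the "Catalina" domain.
            ObjectName pattern = new ObjectName("Catalina:type=ThreadPool,*");
            for (ObjectName name : mbs.queryNames(pattern, null)) {
                int busy = ((Number) mbs.getAttribute(name, "currentThreadsBusy")).intValue();
                int max = ((Number) mbs.getAttribute(name, "maxThreads")).intValue();
                result.put(name.getKeyProperty("name"), 100.0 * busy / max);
            }
        } catch (JMException | IOException e) {
            throw new IllegalStateException("JMX query failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Inside Tomcat's JVM this sees the connectors; in a plain JVM
        // there are no such MBeans and the map stays empty.
        System.out.println(busyPercent(ManagementFactory.getPlatformMBeanServer()));
    }
}
```

From outside the Tomcat JVM the same method would need a remote MBeanServerConnection instead of the platform MBean server.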
> Another thing : your client is effectively requesting non-keepalive
> connections, so Tomcat will close the connection after sending the response
> to each request. And your clients have to rebuild a new connection for each
> request.
> If the same client(s) make lots of small requests one after another, this may
> be counter-productive, because each connection build-up requires several
> packets going back and forth. Also, on the server side, when a connection is
> being closed, it will nevertheless "linger" for a while in CLOSE_WAIT state,
> waiting for the client's TCP stack to acknowledge the CLOSE. I have seen
> cases where a large number of such connections being in CLOSE_WAIT triggered
> bizarre issues, such as a server becoming unable to accept new TCP
> connections for a while.
I know this, but since I know the traffic pattern (connections are used once
per minute) I opted to make the connections non-keep-alive from the client
side. That way, the connections that come from browsers can make use of
keep-alive for performance.
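For the record, one way to ask for a non-keep-alive connection from an HttpURLConnection client is a "Connection: close" request header. The sketch below shows that approach; the class name NonKeepAlive is made up, and I am not claiming this is the exact mechanism the client app uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper showing one way to opt out of keep-alive per request.
class NonKeepAlive {

    /** Prepares (but does not yet open) a connection that asks the server to close after the response. */
    static HttpURLConnection open(String target) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(target).openConnection();
            conn.setRequestProperty("Connection", "close");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = open("http://localhost:8080/");
        System.out.println("Connection: " + conn.getRequestProperty("Connection"));
    }
}
```

Setting the http.keepAlive system property to false is the JVM-wide alternative, but that would affect every HTTP connection the client makes.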
I guess I should have gone for a model where I have a non-keep-alive HTTP
connector on Tomcat for the clients and another keep-alive-enabled connector
for the browser traffic. Hmmm. It may not be too late for that, but first I'd
like some proof this is an issue. :)
> It may be worth checking how many of such CLOSE_WAIT connections you have
> over time, and if this relates to when the problems happen.
> netstat -pan | grep CLOSE_WAIT
> would show this. If more than a couple of hundreds show up, I'd become
> suspicious of something like that.
I now graph these in Munin and I see spikes of up to 100 sockets in that state.
You say hundreds are a problem; what about one hundred?
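To make the counting concrete, here is a small sketch that tallies TCP states from netstat output, in the spirit of the grep above. The class name NetstatStates and the sample lines are invented for illustration (they are not real readings from my server), and this is not the actual Munin plugin.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical tally of TCP socket states from netstat text output.
class NetstatStates {

    // State names as printed by common netstat variants.
    private static final Set<String> STATES = new HashSet<>(Arrays.asList(
            "LISTEN", "SYN_SENT", "SYN_RCVD", "SYN_RECV", "ESTABLISHED",
            "FIN_WAIT_1", "FIN_WAIT1", "FIN_WAIT_2", "FIN_WAIT2",
            "CLOSE_WAIT", "CLOSING", "LAST_ACK", "TIME_WAIT", "CLOSED"));

    /** Counts TCP lines per state; non-TCP lines and headers are skipped. */
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("tcp")) {
                continue;
            }
            // The state column position differs between platforms,
            // so look for the first token that is a known state name.
            for (String token : trimmed.split("\\s+")) {
                if (STATES.contains(token)) {
                    counts.merge(token, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not real output.
        List<String> sample = Arrays.asList(
                "tcp4  0  0  127.0.0.1.8080  127.0.0.1.51234  CLOSE_WAIT",
                "tcp4  0  0  127.0.0.1.8080  127.0.0.1.51240  ESTABLISHED",
                "tcp4  0  0  127.0.0.1.8080  127.0.0.1.51251  CLOSE_WAIT",
                "tcp4  0  0  *.8080          *.*              LISTEN");
        System.out.println(count(sample)); // {CLOSE_WAIT=2, ESTABLISHED=1, LISTEN=1}
    }
}
```

Sampling this once a minute next to the failure log should show whether the resets line up with the CLOSE_WAIT spikes.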
--
Kees Jan
http://java-monitor.com/
[email protected]
+31651838192
Repairing cannot be completed, you can only stop doing it.
-- Belarusian proverb