Hi Christophe,

If you run into more stuck threads in the future, try increasing your open
files limit to 4096, using the following commands:

ulimit -Hn 10240
ulimit -Sn 4096

To make this persist across reboots, edit your /etc/security/limits.conf and
add the following lines:

* soft nofile 4096
* hard nofile 10240
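Note that limits.conf is normally applied by pam_limits at login, so the new
values should only take effect in a fresh session. A quick check after
logging back in:

ulimit -Sn    # soft limit, should now print 4096
ulimit -Hn    # hard limit, should now print 10240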
I have also changed my /etc/sysctl.conf with the following lines; maybe one
of them can help you if you have problems in the future:

# tune TCP
net.ipv4.tcp_window_scaling=0
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_mem=786432 1048576 1572864
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_keepalive_time=1800
net.ipv4.tcp_max_syn_backlog=4096
net.core.wmem_max=8388608
net.core.rmem_max=8388608
net.ipv4.tcp_rmem=4096 87380 8388608
net.ipv4.tcp_wmem=4096 87380 8388608
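To apply these without a reboot, reloading the file with sysctl should work:

sysctl -p                            # reloads /etc/sysctl.conf
sysctl net.ipv4.tcp_keepalive_time   # spot-check one of the new values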
I followed some tips from the IBM RedPaper "Linux Performance and Tuning
Guidelines": http://www.redbooks.ibm.com/abstracts/REDP4285.html

Best regards,
Clóvis

On Mon, Jul 7, 2008 at 6:02 AM, Christophe Fondacci
<[EMAIL PROTECTED]> wrote:

> Hi Clovis,
>
> Thanks for your answers.
>
> Open files on our production servers are 1024.
> Here is the complete output of ulimit -a:
>
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> max nice                        (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 36352
> max locked memory       (kbytes, -l) 32
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> max rt priority                 (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 36352
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> Our production servers are connected with gigabit ethernet. However, the
> servers used to reproduce the problem are only on 100 Mbps ethernet. The
> problem occurs in both the test and production environments.
>
> Our TCP keepalive settings are:
> tcp_keepalive_time is 7200
> tcp_keepalive_intvl is 75
> tcp_keepalive_probes is 9
>
> I am monitoring thread call stacks using JProfiler, which displays the
> stack from my initial mail.
>
> I've tried the suggestion from Filip Hanik (maxKeepAliveRequests="1" in my
> Tomcat connector), but I was still able to reproduce the problem.
>
> Then I switched to the NIO connector (I was previously using the default
> HTTP/1.1 connector). I was not able to reproduce the problem after hours
> of testing (it usually happens after 10-20 minutes of heavy load), so I
> pushed the configuration change to one of our 4 production servers to
> monitor its efficiency.
>
> So far we haven't had any problem on the server with the NIO connector
> after 5 production days...
>
> Christophe.
>
> ----- Original Message ----- From: "Clovis Wichoski" <[EMAIL PROTECTED]>
> To: "Tomcat Users List" <users@tomcat.apache.org>
> Sent: Friday, July 04, 2008 4:17 AM
> Subject: Re: Tomcat bottleneck on InternalInputBuffer.parseRequestLine
>
> Hi Christophe,
>
> Well, I still haven't found the reason for my problem, but some things
> have helped me keep it from occurring so frequently.
>
> I checked the limit for open files on Linux (you can check yours with
> ulimit -a); here I set it to 4096.
>
> How are the machines connected? Is it gigabit ethernet?
>
> Please show us all your configuration under /proc/sys/net/ipv4/. The most
> important is:
>
> cat /proc/sys/net/ipv4/tcp_keepalive_time
>
> But what had the most impact on performance was the right configuration
> of the JDBC driver in the pool. I use MaxDB, and the driver has a
> problem: when getting new physical connections, the driver uses a
> singleton pattern, so we can't get connections in parallel (really
> parallel, on multi-core processors), and when such parallel attempts to
> get connections occur, we get stuck threads. But note, maybe the problem
> isn't in the driver; it's just a suspect, since I don't have a solution
> for this.
>
> Another suspect is that, for some strange reason, the socket in Java
> still exists but the socket on Linux (the inode) doesn't exist anymore,
> and until Java learns this, the system is stuck until the timeout. But I
> can't confirm or simulate this, since it's a really rare case (for me it
> occurred only once) and I don't have a way to prove it. I'm trying to
> check my problem using the following script:
>
> #!/bin/bash
> today=`date +%Y%m%d%H%M%S`
> psId=`/opt/java/jdk1.6.0_06/bin/jps | grep Bootstrap | cut -d' ' -f1`
> /opt/java/jdk1.6.0_06/bin/jstack -l $psId > /mnt/logs/stack/stack${today}.txt
> echo "--- pstack ---" >> /mnt/logs/stack/stack${today}.txt
> pstack $psId >> /mnt/logs/stack/stack${today}.txt
> echo "--- lsof ---" >> /mnt/logs/stack/stack${today}.txt
> lsof >> /mnt/logs/stack/stack${today}.txt
> echo "--- ls -l /proc/${psId}/fd/ ---" >> /mnt/logs/stack/stack${today}.txt
> ls -l /proc/${psId}/fd/ >> /mnt/logs/stack/stack${today}.txt
> echo "stack of process $psId saved to /mnt/logs/stack/stack${today}.txt"
>
> When users report a stuck thread, I run this script manually, ten times,
> then compare the outputs. IBM has a tool you can use to read the jstack
> output more easily; I don't remember the name right now, but tomorrow I
> will post the link here.
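> A small wrapper along these lines could take the ten snapshots
> automatically instead of running the script by hand (the script path and
> the interval here are only placeholders, adjust to taste):
>
> #!/bin/bash
> # take ten snapshots, a few seconds apart, using the dump script above
> for i in 1 2 3 4 5 6 7 8 9 10; do
>     /mnt/logs/stack/dump.sh
>     sleep 5
> done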
> Let's see if we can share knowledge to win this fight ;)
>
> Regards,
>
> Clóvis
>
> On Tue, Jul 1, 2008 at 12:23 PM, Christophe Fondacci
> <[EMAIL PROTECTED]> wrote:
>
>> Hello all,
>>
>> We have a problem with Tomcat on our production server.
>> This problem may be related to the one listed here:
>> http://grokbase.com/profile/id:hNxqA0ZEdnD-6GYFRNs-iIkKEvF907FNWdczKYQ719Q
>>
>> Here it is:
>> - We have 2 Tomcat servers on 2 distinct machines.
>> - 1 server is our application (let's call it A for Application server).
>> - The other server is hosting Solr (let's call it S for Solr server).
>> - All servers are Tomcat 6.0.14 running on JDK 1.6.0_02-b05 on Linux
>>   (Fedora Core 6).
>> - Server A performs HTTP requests to server S in 2 ways:
>>   > An HTTP GET (using Apache Commons HttpClient) with a URL like:
>>     http://S:8080/solr/select/?q=cityuri%3AXEABDBFDDACCXmaidenheadXEABDBFDDACCX&facet=true&facet.field=price&fl=id&facet.sort=false&facet.mincount=1&facet.limit=-1
>>   > An HTTP POST to a URL like http://S:8080/solr/select/ with a set of
>>     12 NameValuePairs.
>>
>> When traffic is light on our server A, everything works great.
>> When traffic is high on our server A (a simulation of 40 simultaneous
>> users with JMeter), some requests to our server S take more than 200
>> seconds. It happens randomly, and we couldn't isolate a URL pattern: a
>> URL can return in less than 500ms and the exact same URL can take 300s
>> before returning...
>>
>> We performed deep JVM analysis (using JProfiler) to observe what was
>> going on on the Solr server. When the problem occurs, we can see threads
>> stuck with the following call stack:
>>
>> at java.net.SocketInputStream.socketRead0(Native Method)
>> at java.net.SocketInputStream.read(SocketInputStream.java:129)
>> at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:700)
>> at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
>> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:805)
>> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
>> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>> at java.lang.Thread.run(Thread.java:619)
>>
>> Requests that take 200s+ to return seem to spend almost all their time
>> reading this input stream...
>> The javadoc says parseRequestLine is used to parse the HTTP header. As I
>> stated above, our URLs seem quite small, so I can't understand why this
>> happens. The response from server S is very small as well.
>>
>> We are able to reproduce the problem with fewer than 40 threads, but it
>> is more difficult to reproduce.
>> As I said at the beginning, I found a user who had a similar problem,
>> but the mailing list thread does not give any solution...
>>
>> Does anyone have an idea of what is going on? Are there settings we can
>> use to avoid this problem?
>> I am out of ideas on what to try to fix this...
>>
>> Any help would be highly appreciated... thank you very much.
>> Christophe.
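A rough way to see what the sockets are actually doing while such a stall is
in progress is to count the connections on the Solr port by TCP state
(port 8080 here, per the URLs above):

# count TCP connections touching port 8080, grouped by state
netstat -tan | grep ':8080' | awk '{print $6}' | sort | uniq -c

A pile of connections sitting in ESTABLISHED while worker threads block in
socketRead0 would point at idle keep-alive connections holding workers,
which fits the behavior of the blocking HTTP/1.1 connector under load and
would explain why the NIO connector made the problem disappear.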