Peter, I tried the latest Tomcat source (I believe it is the 5.5.15 head, as stated in bug #37896). As you suggested, I used "fastasyncqueue". Here is the config for "fastasyncqueue":

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="fastasyncqueue"
        keepAliveTimeout="320000"
        threadPriority="10"
        queueTimeWait="true"
        queueDoStats="true"
        waitForAck="false"
        autoConnect="false"
        doTransmitterProcessingStats="true" />
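Incidentally, since fastasyncqueue with waitForAck="false" sends in the background without confirmation, dropped messages are silent. If loss rate matters more than latency, a variant with acknowledgements enabled might be worth testing; this is only a sketch (same attributes as above with waitForAck flipped — I have not measured the effect):

```xml
<!-- Hypothetical variant: fastasyncqueue with acknowledgements enabled.
     Trades some throughput for confirmation that each message arrived. -->
<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="fastasyncqueue"
        keepAliveTimeout="320000"
        threadPriority="10"
        queueTimeWait="true"
        queueDoStats="true"
        waitForAck="true"
        autoConnect="false"
        doTransmitterProcessingStats="true" />
```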
But it did not work (I am not able to use sticky-session load balancing at the moment). The error rate was very high (> 34%), so I reverted to "pooled" mode, but removed the "autoConnect" attribute. I still saw the "Broken pipe" exceptions, so I wondered whether the problem is really fixed or not. I further tried to tweak the listener threads and the sender socket pool limit:

<Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
          tcpListenAddress="auto"
          tcpListenPort="4001"
          tcpThreadCount="50"/>

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="pooled"
        keepAliveTimeout="-1"
        maxPoolSocketLimit="200"
        doTransmitterProcessingStats="true" />

With the new configuration I am getting a lot of the following SEVERE errors:

SEVERE: Exception initializing page context
java.lang.IllegalStateException: Cannot create a session after the response has been committed
    at org.apache.catalina.connector.Request.doGetSession(Request.java:2214)
    at org.apache.catalina.connector.Request.getSession(Request.java:2024)
    at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:831)
    at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:215)
    at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:544)
    at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:493)
    at org.apache.jasper.runtime.PageContextImpl._initialize(PageContextImpl.java:148)
    at org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:123)
    at org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImpl.java:104)
    at org.apache.jasper.runtime.JspFactoryImpl.getPageContext(JspFactoryImpl.java:61)
    at org.apache.jsp.dynaLeftMenuItems_jsp._jspService(org.apache.jsp.dynaLeftMenuItems_jsp:50)

Having said all that: since the JMeter script fails at the initial steps, the successive steps fail too, so the overall error rate went up to 30% and the throughput was 19 req/sec. I am confused, trying to analyze the situation, as to why the "Broken pipe" exception would occur at all (when it was supposed to be fixed) and then disappear when the listener thread count and sender socket limit are increased. Is it some kind of timing issue, a balance between the number of listener threads and the number of sender sockets in the pool? I did not find anything in the documentation about the effect of changing those parameters, nor any recommendation.

Thanks
Yogesh

On 12/16/05, Peter Rossbach <[EMAIL PROTECTED]> wrote:
>
> Hey Yogesh,
>
> please update to the current svn head.
>
> See the following bug, which is now fixed:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=37896
>
> See the 5.5.14 changelog to see that more bugs existed in 5.5.12.
>
> Please report whether it works!
>
> Peter
>
> Tip: for high load the fastasyncqueue sender mode is better.
> Also, you don't need autoConnect!
>
>
> Yogesh Prajapati wrote:
>
> > The details of the Tomcat clustering load-testing environment:
> >
> > Application: a web portal; a pure JSP/servlet implementation using JDBC
> > (Oracle 10g RAC), OLTP in nature.
> >
> > Load-test tool: JMeter
> >
> > Clustering setup: 4 nodes
> >
> > OS: SUSE Enterprise 9 (SP2) on all nodes (kernel 2.6.5-7.97)
> >
> > Software: JDK 1.5.0_05, Tomcat 5.5.12
> >
> > Hardware configuration:
> > Node #1: dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
> > Node #2: single Intel XEON 2.00 GHz, 1 GB RAM
> > Node #3: dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
> > Node #4: single Intel XEON 2.00 GHz, 1 GB RAM
> >
> > Network configuration: all nodes are behind an Alteon load balancer
> > (response-time-based load balancing); all have two NICs, with subnet
> > 10.1.13.0 for the load-balancing network and 10.1.11.0 for the private
> > LAN. The private NIC has multicast enabled. All private NICs are
> > connected to a 10/100 Fast Ethernet switch.
> > Tomcat cluster configuration (same on all nodes):
> >
> > <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
> >          managerClassName="org.apache.catalina.cluster.session.DeltaManager"
> >          expireSessionsOnShutdown="false"
> >          useDirtyFlag="true"
> >          notifyListenersOnReplication="true">
> >
> >     <Membership
> >         className="org.apache.catalina.cluster.mcast.McastService"
> >         mcastAddr="228.0.0.4"
> >         mcastPort="45564"
> >         mcastFrequency="1000"
> >         mcastDropTime="35000"
> >         mcastBindAddr="auto"
> >         />
> >
> >     <Receiver
> >         className="org.apache.catalina.cluster.tcp.ReplicationListener"
> >         tcpListenAddress="auto"
> >         tcpListenPort="4001"
> >         tcpThreadCount="24"/>
> >
> >     <Sender
> >         className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
> >         replicationMode="pooled"
> >         autoConnect="true"
> >         keepAliveTimeout="-1"
> >         maxPoolSocketLimit="600"
> >         doTransmitterProcessingStats="true"
> >         />
> >
> >     <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
> >            filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
> >
> >     <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
> >               tempDir="/tmp/war-temp/"
> >               deployDir="/tmp/war-deploy/"
> >               watchDir="/tmp/war-listen/"
> >               watchEnabled="false"/>
> >
> >     <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
> > </Cluster>
> >
> > Note: session availability on all the nodes is a must for this application, so I am using "pooled" mode.
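One way to read the pooled-mode numbers above (my own assumption about the mechanics, not something the cluster documentation states): each node may open up to maxPoolSocketLimit sender sockets towards its peers, while each receiver drains all incoming channels with only tcpThreadCount worker threads. If peers can collectively hold far more channels open than the receiver can service, writes back up until connections are dropped and the sending side sees "Broken pipe". A sketch of a pairing that keeps the two sides roughly balanced for a 4-node cluster (the numbers are illustrative only, not tested):

```xml
<!-- Illustrative sizing only (assumption, not from the docs): with 3 peers
     each sending on up to 16 pooled sockets, a receiver with 50 worker
     threads has headroom to drain every incoming channel. -->
<Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
          tcpListenAddress="auto"
          tcpListenPort="4001"
          tcpThreadCount="50"/>

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="pooled"
        keepAliveTimeout="-1"
        maxPoolSocketLimit="16"
        doTransmitterProcessingStats="true"/>
```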
> > Tomcat VM parameters (additional switches for VM tuning):
> > -XX:+AggressiveHeap -Xms832m -Xmx832m -XX:+UseParallelGC -XX:+PrintGCDetails
> > -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=9
> >
> > After starting Tomcat on all the nodes, when I run the JMeter scripts with
> > 20-70 concurrent user threads, the entire cluster works fine (almost 0%
> > errors), but at a high number of users, like > 200 concurrent user threads,
> > Tomcat cluster session replication starts failing consistently and
> > replication messages get lost. Here is what I get in the Tomcat logs on
> > all the nodes (many times):
> >
> > WARNING: Message lost: [10.1.11.95:4,001] type=[org.apache.catalina.cluster.session.SessionMessageImpl], id=[40FC741DB987BF5161C3AEEB32570A8E-1134732225260]
> > java.net.SocketException: Broken pipe
> >     at java.net.SocketOutputStream.socketWrite0(Native Method)
> >     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >     at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
> >     at org.apache.catalina.cluster.tcp.DataSender.writeData(DataSender.java:858)
> >     at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:799)
> >     at org.apache.catalina.cluster.tcp.DataSender.sendMessage(DataSender.java:623)
> >     at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:128)
> >     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:867)
> >     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
> >     at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1012)
> >     at org.apache.catalina.cluster.session.DeltaManager.send(DeltaManager.java:629)
> >     at org.apache.catalina.cluster.session.DeltaManager.sendCreateSession(DeltaManager.java:617)
> >     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:593)
> >     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:572)
> > .............................
> > .............................
> >
> > Also, I have noticed a few times on two of the nodes (#3, #4) the
> > following error:
> >
> > SEVERE: TCP Worker thread in cluster caught 'java.lang.ArrayIndexOutOfBoundsException: 1025' closing channel
> > java.lang.ArrayIndexOutOfBoundsException: 1025
> >     at org.apache.catalina.cluster.io.XByteBuffer.toInt(XByteBuffer.java:231)
> >     at org.apache.catalina.cluster.io.XByteBuffer.countPackages(XByteBuffer.java:164)
> >     at org.apache.catalina.cluster.io.ObjectReader.append(ObjectReader.java:87)
> >     at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:127)
> >     at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69)
> >
> > With all the above warnings/exceptions I get the following JMeter results
> > (the script runs at 200 concurrent threads, 5 iterations, 0 sec ramp-up
> > period):
> >
> > Rate: 28 req/sec
> > Errors: 9.07%
> >
> > The rate is acceptable, but the error rate is very high, and at a higher
> > number of user threads the error % goes up further. I have run the JMeter
> > script several times while tweaking the cluster configuration, but I am
> > not able to figure out what I am doing wrong.
> >
> > Is "Broken pipe" some kind of failure and a serious blocker, or can it
> > safely be ignored?
> >
> > The "ArrayIndexOutOfBoundsException" looks like a bug to me; it may
> > already have been reported, but I don't know yet.
> >
> > With the current scenario the memory usage is below 600 MB. My target is
> > to reach 2000 concurrent user threads while keeping errors within 3% and
> > maintaining the same req/sec. Does this mean I have to add more memory
> > (making it 2 GB on each node)?
> >
> > Is there something else I am missing that I need to look at?
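A back-of-envelope check of the memory question above (every input is an assumption, not a measurement from this cluster; since DeltaManager replicates each session to every node, each node holds the full session set):

```java
// Back-of-envelope heap estimate for fully replicated sessions.
// All inputs are ASSUMED values, not measurements from this cluster.
public class SessionHeapEstimate {
    public static void main(String[] args) {
        int sessions = 2000;      // target concurrent user threads
        int kbPerSession = 50;    // ASSUMED average serialized session size
        double overhead = 2.0;    // ASSUMED factor for deltas, queues, GC headroom
        double mb = sessions * kbPerSession * overhead / 1024.0;
        // With DeltaManager every node holds every session, so this is per node.
        System.out.printf("~%.0f MB heap for session state per node%n", mb);
    }
}
```

With these assumed numbers it prints "~195 MB heap for session state per node", which would suggest the 832 MB heap is not obviously exhausted by session state alone; if real sessions are much larger than 50 KB, the answer changes proportionally.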
> > Any suggestions, ideas, or tips are most welcome and appreciated.
> >
> > Thanks
> >
> > Yogi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]