Peter, I tried the latest Tomcat source (I believe it is the 5.5.15 head, as stated in bug #37896). As you suggested, I used "fastasyncqueue". Here is the config for "fastasyncqueue":

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="fastasyncqueue"
        keepAliveTimeout="320000"
        threadPriority="10"
        queueTimeWait="true"
        queueDoStats="true"
        waitForAck="false"
        autoConnect="false"
        doTransmitterProcessingStats="true" />
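Incidentally, since fastasyncqueue with waitForAck="false" sends in the background without confirmation, dropped messages are silent. If loss rate matters more than latency, a variant with acknowledgements enabled might be worth testing; this is only a sketch (same attributes as above with waitForAck flipped — I have not measured the effect):

```xml
<!-- Hypothetical variant: fastasyncqueue with acknowledgements enabled.
     Trades some throughput for confirmation that each message arrived. -->
<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="fastasyncqueue"
        keepAliveTimeout="320000"
        threadPriority="10"
        queueTimeWait="true"
        queueDoStats="true"
        waitForAck="true"
        autoConnect="false"
        doTransmitterProcessingStats="true" />
```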
But it did not work (I am not able to use sticky-session load balancing at the moment). The error rate was very high (> 34%), so I reverted to "pooled" mode, but removed the "autoConnect" attribute. I still saw the "Broken pipe" exceptions, so I wondered whether the problem is really fixed or not. I further tried to tweak the listener threads and the sender socket pool limit:

<Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
          tcpListenAddress="auto"
          tcpListenPort="4001"
          tcpThreadCount="50"/>

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="pooled"
        keepAliveTimeout="-1"
        maxPoolSocketLimit="200"
        doTransmitterProcessingStats="true" />

With the new configuration I am getting a lot of the following SEVERE errors:

SEVERE: Exception initializing page context
java.lang.IllegalStateException: Cannot create a session after the response has been committed
    at org.apache.catalina.connector.Request.doGetSession(Request.java:2214)
    at org.apache.catalina.connector.Request.getSession(Request.java:2024)
    at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:831)
    at javax.servlet.http.HttpServletRequestWrapper.getSession(HttpServletRequestWrapper.java:215)
    at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:544)
    at org.apache.catalina.core.ApplicationHttpRequest.getSession(ApplicationHttpRequest.java:493)
    at org.apache.jasper.runtime.PageContextImpl._initialize(PageContextImpl.java:148)
    at org.apache.jasper.runtime.PageContextImpl.initialize(PageContextImpl.java:123)
    at org.apache.jasper.runtime.JspFactoryImpl.internalGetPageContext(JspFactoryImpl.java:104)
    at org.apache.jasper.runtime.JspFactoryImpl.getPageContext(JspFactoryImpl.java:61)
    at org.apache.jsp.dynaLeftMenuItems_jsp._jspService(org.apache.jsp.dynaLeftMenuItems_jsp:50)

Having said all that: since the JMeter script fails at the initial steps, the successive steps fail too, so the overall error rate went up to 30% and the throughput was 19 req/sec. I am confused, trying to analyze the situation, as to why the "Broken pipe" exception would occur at all (when it was supposed to be fixed) and then disappear when the listener thread count and sender socket limit are increased. Is it some kind of timing issue, a balance between the number of listener threads and the number of sender sockets in the pool? I did not find anything in the documentation about the effect of changing those parameters, nor any recommendation.

Thanks
Yogesh

On 12/16/05, Peter Rossbach <[EMAIL PROTECTED]> wrote:
>
> Hey Yogesh,
>
> please update to the current svn head.
>
> See the following bug, which is now fixed:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=37896
>
> See the 5.5.14 changelog to see that more bugs existed in 5.5.12.
>
> Please report whether it works!
>
> Peter
>
> Tip: for high load the fastasyncqueue sender mode is better.
> Also, you don't need autoConnect!
>
>
> Yogesh Prajapati wrote:
>
> > The details of the Tomcat clustering load-testing environment:
> >
> > Application: a web portal; a pure JSP/servlet implementation using JDBC
> > (Oracle 10g RAC), OLTP in nature.
> >
> > Load-test tool: JMeter
> >
> > Clustering setup: 4 nodes
> >
> > OS: SUSE Enterprise 9 (SP2) on all nodes (kernel 2.6.5-7.97)
> >
> > Software: JDK 1.5.0_05, Tomcat 5.5.12
> >
> > Hardware configuration:
> > Node #1: dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
> > Node #2: single Intel XEON 2.00 GHz, 1 GB RAM
> > Node #3: dual Pentium III (Coppermine) 1 GHz, 1 GB RAM
> > Node #4: single Intel XEON 2.00 GHz, 1 GB RAM
> >
> > Network configuration: all nodes are behind an Alteon load balancer
> > (response-time-based load balancing); all have two NICs, with subnet
> > 10.1.13.0 for the load-balancing network and 10.1.11.0 for the private
> > LAN. The private NIC has multicast enabled. All private NICs are
> > connected to a 10/100 Fast Ethernet switch.
> > Tomcat cluster configuration (same on all nodes):
> >
> > <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
> >          managerClassName="org.apache.catalina.cluster.session.DeltaManager"
> >          expireSessionsOnShutdown="false"
> >          useDirtyFlag="true"
> >          notifyListenersOnReplication="true">
> >
> >     <Membership
> >         className="org.apache.catalina.cluster.mcast.McastService"
> >         mcastAddr="228.0.0.4"
> >         mcastPort="45564"
> >         mcastFrequency="1000"
> >         mcastDropTime="35000"
> >         mcastBindAddr="auto"
> >         />
> >
> >     <Receiver
> >         className="org.apache.catalina.cluster.tcp.ReplicationListener"
> >         tcpListenAddress="auto"
> >         tcpListenPort="4001"
> >         tcpThreadCount="24"/>
> >
> >     <Sender
> >         className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
> >         replicationMode="pooled"
> >         autoConnect="true"
> >         keepAliveTimeout="-1"
> >         maxPoolSocketLimit="600"
> >         doTransmitterProcessingStats="true"
> >         />
> >
> >     <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
> >            filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
> >
> >     <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
> >               tempDir="/tmp/war-temp/"
> >               deployDir="/tmp/war-deploy/"
> >               watchDir="/tmp/war-listen/"
> >               watchEnabled="false"/>
> >
> >     <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
> > </Cluster>
> >
> > Note: session availability on all the nodes is a must for this application, so I am using "pooled" mode.
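One way to read the pooled-mode numbers above (my own assumption about the mechanics, not something the cluster documentation states): each node may open up to maxPoolSocketLimit sender sockets towards its peers, while each receiver drains all incoming channels with only tcpThreadCount worker threads. If peers can collectively hold far more channels open than the receiver can service, writes back up until connections are dropped and the sending side sees "Broken pipe". A sketch of a pairing that keeps the two sides roughly balanced for a 4-node cluster (the numbers are illustrative only, not tested):

```xml
<!-- Illustrative sizing only (assumption, not from the docs): with 3 peers
     each sending on up to 16 pooled sockets, a receiver with 50 worker
     threads has headroom to drain every incoming channel. -->
<Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
          tcpListenAddress="auto"
          tcpListenPort="4001"
          tcpThreadCount="50"/>

<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="pooled"
        keepAliveTimeout="-1"
        maxPoolSocketLimit="16"
        doTransmitterProcessingStats="true"/>
```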
> > Tomcat VM parameters (additional switches for VM tuning):
> > -XX:+AggressiveHeap -Xms832m -Xmx832m -XX:+UseParallelGC -XX:+PrintGCDetails
> > -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=9
> >
> > After starting Tomcat on all the nodes, when I run the JMeter scripts with
> > 20-70 concurrent user threads, the entire cluster works fine (almost 0%
> > errors), but at a high number of users, like > 200 concurrent user threads,
> > Tomcat cluster session replication starts failing consistently and
> > replication messages get lost. Here is what I get in the Tomcat logs on
> > all the nodes (many times):
> >
> > WARNING: Message lost: [10.1.11.95:4,001] type=[org.apache.catalina.cluster.session.SessionMessageImpl], id=[40FC741DB987BF5161C3AEEB32570A8E-1134732225260]
> > java.net.SocketException: Broken pipe
> >     at java.net.SocketOutputStream.socketWrite0(Native Method)
> >     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >     at java.net.SocketOutputStream.write(SocketOutputStream.java:124)
> >     at org.apache.catalina.cluster.tcp.DataSender.writeData(DataSender.java:858)
> >     at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:799)
> >     at org.apache.catalina.cluster.tcp.DataSender.sendMessage(DataSender.java:623)
> >     at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:128)
> >     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:867)
> >     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageClusterDomain(ReplicationTransmitter.java:460)
> >     at org.apache.catalina.cluster.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:1012)
> >     at org.apache.catalina.cluster.session.DeltaManager.send(DeltaManager.java:629)
> >     at org.apache.catalina.cluster.session.DeltaManager.sendCreateSession(DeltaManager.java:617)
> >     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:593)
> >     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:572)
> > .............................
> > .............................
> >
> > Also, I have noticed a few times on two of the nodes (#3, #4) the
> > following error:
> >
> > SEVERE: TCP Worker thread in cluster caught 'java.lang.ArrayIndexOutOfBoundsException: 1025' closing channel
> > java.lang.ArrayIndexOutOfBoundsException: 1025
> >     at org.apache.catalina.cluster.io.XByteBuffer.toInt(XByteBuffer.java:231)
> >     at org.apache.catalina.cluster.io.XByteBuffer.countPackages(XByteBuffer.java:164)
> >     at org.apache.catalina.cluster.io.ObjectReader.append(ObjectReader.java:87)
> >     at org.apache.catalina.cluster.tcp.TcpReplicationThread.drainChannel(TcpReplicationThread.java:127)
> >     at org.apache.catalina.cluster.tcp.TcpReplicationThread.run(TcpReplicationThread.java:69)
> >
> > With all the above warnings/exceptions I get the following JMeter results
> > (the script runs at 200 concurrent threads, 5 iterations, 0 sec ramp-up
> > period):
> >
> > Rate: 28 req/sec
> > Errors: 9.07%
> >
> > The rate is acceptable, but the error rate is very high, and at a higher
> > number of user threads the error % goes up further. I have run the JMeter
> > script several times while tweaking the cluster configuration, but I am
> > not able to figure out what I am doing wrong.
> >
> > Is "Broken pipe" some kind of failure and a serious blocker, or can it
> > safely be ignored?
> >
> > The "ArrayIndexOutOfBoundsException" looks like a bug to me; it may
> > already have been reported, but I don't know yet.
> >
> > With the current scenario the memory usage is below 600 MB. My target is
> > to reach 2000 concurrent user threads while keeping errors within 3% and
> > maintaining the same req/sec. Does this mean I have to add more memory
> > (making it 2 GB on each node)?
> >
> > Is there something else I am missing that I need to look at?
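A back-of-envelope check of the memory question above (every input is an assumption, not a measurement from this cluster; since DeltaManager replicates each session to every node, each node holds the full session set):

```java
// Back-of-envelope heap estimate for fully replicated sessions.
// All inputs are ASSUMED values, not measurements from this cluster.
public class SessionHeapEstimate {
    public static void main(String[] args) {
        int sessions = 2000;      // target concurrent user threads
        int kbPerSession = 50;    // ASSUMED average serialized session size
        double overhead = 2.0;    // ASSUMED factor for deltas, queues, GC headroom
        double mb = sessions * kbPerSession * overhead / 1024.0;
        // With DeltaManager every node holds every session, so this is per node.
        System.out.printf("~%.0f MB heap for session state per node%n", mb);
    }
}
```

With these assumed numbers it prints "~195 MB heap for session state per node", which would suggest the 832 MB heap is not obviously exhausted by session state alone; if real sessions are much larger than 50 KB, the answer changes proportionally.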
> > Any suggestions, ideas, or tips are most welcome and appreciated.
> >
> > Thanks
> >
> > Yogi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]