I haven't tested clustering on Solaris 9, but on Linux it works great.
Something is off with your multicast: as your logs show, members are
being added and dropped all the time.
Try increasing your mcastDropTime; that should keep the members in the
cluster for a longer time.
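
As a minimal sketch of that change, the Membership element from the
configuration quoted below might look like this with a longer drop time.
The value 30000 is an illustrative assumption, not a tested
recommendation; every other attribute is copied unchanged from the
posted configuration.

```xml
<!-- mcastDropTime raised from 3000; 30000 ms is an assumed, illustrative value -->
<Membership
    className="org.apache.catalina.cluster.mcast.McastService"
    mcastAddr="228.0.0.3"
    mcastPort="45564"
    mcastFrequency="500"
    mcastDropTime="30000"/>
```

With mcastFrequency at 500 ms, a 3000 ms drop time tolerates roughly six
missed heartbeats before a member is dropped. A longer drop time rides
out longer pauses, at the cost of slower detection of a node that really
is down.
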
Contact me at my apache.org email if you need help with debugging.

Filip

-----Original Message-----
From: Ilyschenko, Vlad [mailto:[EMAIL PROTECTED]
Sent: Sunday, February 22, 2004 5:15 PM
To: [EMAIL PROTECTED]
Subject: tomcat 5.0.19 cluster problem


Hi,



We are running three Solaris 9 boxes with Tomcat 5.0.19 on them. The
cluster configuration is as follows:



        <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
                 managerClassName="org.apache.catalina.cluster.session.DeltaManager"
                 expireSessionsOnShutdown="false"
                 useDirtyFlag="true">

            <Membership
                className="org.apache.catalina.cluster.mcast.McastService"
                mcastAddr="228.0.0.3"
                mcastPort="45564"
                mcastFrequency="500"
                mcastDropTime="3000"/>

            <Receiver
                className="org.apache.catalina.cluster.tcp.ReplicationListener"
                tcpListenAddress="auto"
                tcpListenPort="4001"
                tcpSelectorTimeout="100"
                tcpThreadCount="60"/>

            <Sender
                className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
                replicationMode="pooled"/>

            <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
                   filter=".*\.gif;.*\.js;.*\.jpg;.*\.htm;.*\.html;.*\.txt;"/>

        </Cluster>



Yesterday Tomcat on one of the servers ran out of memory, which
coincided with the clustered web application hanging across all three
servers. All Tomcat instances started exhibiting cluster problems in one
form or another. I wonder whether the 5.0.19 cluster code has memory
leaks. I did not experience OutOfMemory problems on those boxes while
running 5.0.16 for over a month.



In any case, could a cluster node that ran out of memory bring down the
entire cluster?





You can find log fragments from the three boxes below:



Box #1 (IP: 192.168.64.40) - the one with memory problems:



22 Feb 2004 00:26:43 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112504278]
22 Feb 2004 00:26:43 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112532838]
22 Feb 2004 00:26:53 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112532838]
22 Feb 2004 00:26:53 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112540488]
22 Feb 2004 00:26:58 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112540488]
22 Feb 2004 00:26:58 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112548138]
22 Feb 2004 00:27:04 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.41:4001,192.168.64.41,4001, alive=113937290]
22 Feb 2004 00:27:04 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.41:4001,192.168.64.41,4001, alive=113967890]
22 Feb 2004 00:27:09 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112548138]
22 Feb 2004 00:27:09 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112558338]
22 Feb 2004 00:27:19 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.41:4001,192.168.64.41,4001, alive=113967890]
22 Feb 2004 00:27:19 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.41:4001,192.168.64.41,4001, alive=113981150]
22 Feb 2004 00:27:27 ERROR TP-Processor16 - An exception or error occurred in the container during the request processing
java.lang.OutOfMemoryError
22 Feb 2004 00:27:27 DEBUG Finalizer - result finalized
22 Feb 2004 00:27:27 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112558338]
22 Feb 2004 00:27:27 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112573638]
22 Feb 2004 00:27:27 INFO TP-Processor16 - Unknown message 0
22 Feb 2004 00:27:34 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112573638]
22 Feb 2004 00:27:34 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.36:4001,192.168.64.36,4001, alive=112581288]





Box #2 (IP: 192.168.64.36):



22 Feb 2004 00:26:43 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117485053]
22 Feb 2004 00:26:48 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117485053]
22 Feb 2004 00:26:53 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117495344]
22 Feb 2004 00:26:56 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117495344]
22 Feb 2004 00:26:58 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117500276]
22 Feb 2004 00:27:01 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117500276]
22 Feb 2004 00:27:03 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117505583]
22 Feb 2004 00:27:06 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117505583]
22 Feb 2004 00:27:08 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117510798]
22 Feb 2004 00:27:14 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117510798]
22 Feb 2004 00:27:19 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117520986]
22 Feb 2004 00:27:22 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117520986]
22 Feb 2004 00:27:26 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117528626]
22 Feb 2004 00:27:29 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117528626]
22 Feb 2004 00:27:30 INFO TP-Processor1 - Unknown message 0
22 Feb 2004 00:27:34 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117536379]



Box #3:





22 Feb 2004 00:26:40 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117477359]
22 Feb 2004 00:26:42 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117485053]
22 Feb 2004 00:26:45 WARN ContainerBackgroundProcessor[StandardEngine[Catalina]] - Wasn't able to read acknowledgement from server in 15000 ms. Disconnecting socket, and trying again.
22 Feb 2004 00:26:45 WARN ContainerBackgroundProcessor[StandardEngine[Catalina]] - Unable to send replicated message, is server down?
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        ...
22 Feb 2004 00:26:48 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117485053]
22 Feb 2004 00:26:52 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117495344]
22 Feb 2004 00:26:55 WARN TP-Processor1 - Wasn't able to read acknowledgement from server in 15000 ms. Disconnecting socket, and trying again.
22 Feb 2004 00:26:55 WARN TP-Processor1 - Unable to send replicated message, is server down?
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.net.SocketInputStream.read(SocketInputStream.java:182)
        at org.apache.catalina.cluster.tcp.SocketSender.waitForAck(SocketSender.java:181)
        at org.apache.catalina.cluster.tcp.SocketSender.sendMessage(SocketSender.java:172)
        at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:166)
        ...
22 Feb 2004 00:26:55 WARN TP-Processor3 - Wasn't able to read acknowledgement from server in 15000 ms. Disconnecting socket, and trying again.
22 Feb 2004 00:26:55 WARN TP-Processor20 - Wasn't able to read acknowledgement from server in 15000 ms. Disconnecting socket, and trying again.
22 Feb 2004 00:26:55 WARN TP-Processor16 - Wasn't able to read acknowledgement from server in 15000 ms. Disconnecting socket, and trying again.
22 Feb 2004 00:26:55 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117495344]
22 Feb 2004 00:26:57 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117500276]
22 Feb 2004 00:27:00 INFO Cluster-MembershipReceiver - Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117500276]
22 Feb 2004 00:27:02 INFO Cluster-MembershipReceiver - Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://192.168.64.40:4001,192.168.64.40,4001, alive=117505583]
22 Feb 2004 05:26:07 WARN TP-Processor138 - No socket sender available for client=/192.168.64.40:4001 did it disappear?
22 Feb 2004 05:26:07 WARN TP-Processor157 - No socket sender available for client=/192.168.64.40:4001 did it disappear?
22 Feb 2004 05:26:07 INFO TP-Processor8 - Unknown message 0
22 Feb 2004 05:26:07 WARN TP-Processor128 - No socket sender available for client=/192.168.64.40:4001 did it disappear?
22 Feb 2004 05:26:07 INFO TP-Processor32 - Unknown message 0
22 Feb 2004 05:26:07 WARN ContainerBackgroundProcessor[StandardEngine[Catalina]] - Unable to send replicated message, is server down?
java.lang.IllegalStateException: Socket pool is closed.
        at org.apache.catalina.cluster.tcp.PooledSocketSender$SenderQueue.getSender(PooledSocketSender.java:217)
        at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:160)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:164)
        at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:196)
        at org.apache.catalina.cluster.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:450)
        ...

Thanks,

Vlad









