Hi again,
I tried the config with keepAliveTime set to 10:
<Transport
className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
timeout="60000" keepAliveTime="10"
keepAliveCount="0"/>
Once again, the cluster is not working. The big problem is that I cannot
reproduce the error on my backup server, which works perfectly.
Node 2 logs a warning at 12:58 PM; at about the same time, node 1 starts
reporting "ClusterError" continuously.
(The errors occur on every hit; the server handles 1 hit per second.)
Logs:
NODE 2 - LOG
=============
Jan 31, 2008 12:58:13 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-31 12:58:10.208
NODE 1 - LOG
=============
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
Jan 31, 2008 12:58:04 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
    at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
    at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
    at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
    at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
    at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
    at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
    at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
    at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
    at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
    at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
    at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
    at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
    at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
    at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
    at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
    at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
[...] repeats on every hit.
========================
I cannot understand the node 2 log. Why is node 2 crashing?
What can I do?
Thanks in advance.
Raúl.
-----Original Message-----
From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
Sent: Monday, January 28, 2008 1:45
To: Tomcat Users List
Subject: Re: Tomcat 6 - Cluster error.
I'd set keepAliveTime to 10 as well,
Filip
Raúl García wrote:
> Hi again, and once again thanks for your time, but we still have problems.
>
> We applied the keepAliveCount="0" parameter and restarted both nodes last
> Wednesday, 23 Jan.
>
> Around 11 hours after startup, node 1 reported a new error, but both nodes
> are working perfectly.
>
> I cannot imagine why the member disappears unexpectedly. I am reposting the
> error and the config files.
>
> INSTANCE 1 - LOG
> ================
> Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]] message. Will verify.
> Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]]
> Jan 24, 2008 10:25:54 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
> SEVERE: Unable to send message through cluster sender.
> org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
>     at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
>     at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
>     at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
>     at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
>     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>     at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
>     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>     at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
>     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>     at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
>     at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
>     at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
>     at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
>     at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
>     at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
>     at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
>     at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
>     at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
>     at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
>     at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
>     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
>     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
>     at java.lang.Thread.run(Thread.java:619)
>
> Jan 24, 2008 10:26:54 PM
> org.apache.catalina.tribes.group.interceptors.TcpFailureDetector
> memberDisappeared
> INFO: Received memberDisappeared [...] repeats only once again.
>
> Jan 25, 2008 5:37:52 AM
> org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
> INFO: ThroughputInterceptor Report[
> Tx Msg:66167 messages
> Sent:37.02 MB (total)
> Sent:37.02 MB (application)
> Time:118.53 seconds
> Tx Speed:0.31 MB/sec (total)
> TxSpeed:0.31 MB/sec (application)
> Error Msg:2
> Rx Msg:90000 messages
> Rx Speed:0.00 MB/sec (since 1st msg)
> Received:41.06 MB]
>
>
>
>
> INSTANCE-1 --- Server.xml
> ==========================
> NOTE:: 111.111.111.111 is the server ip address.
> ==========================
> <Server port="8006" shutdown="SHUTDOWN" debug="0">
>   <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" debug="0"/>
>
>   <GlobalNamingResources>
>     <Environment name="InstanceName" type="java.lang.String" value="pro1"/>
>
>     <Resource name="UserDatabase" auth="Container"
>               type="org.apache.catalina.UserDatabase"
>               description="User database that can be updated and saved"
>               factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
>               pathname="conf/tomcat-users.xml" />
>   </GlobalNamingResources>
>
>   <Service name="Catalina">
>
>     <Connector port="8081" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
>                emptySessionPath="true"
>                maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
>                enableLookups="false" redirectPort="81443" acceptCount="1000"
>                debug="0" connectionTimeout="20000"
>                disableUploadTimeout="true"
>                compression="on"
>                compressionMinSize="2048"
>                noCompressionUserAgents="gozilla, traviata"
>                compressableMimeType="text/html,text/xml" />
>
>     <Engine name="Catalina" defaultHost="localhost" debug="0" jvmRoute="PR1">
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelSendOptions="6">
>
>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>                  expireSessionsOnShutdown="false"
>                  notifyListenersOnReplication="true"/>
>
>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>                       address="228.0.0.4"
>                       port="45564"
>                       frequency="1000"
>                       dropTime="30000"/>
>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>                     address="127.0.0.1"
>                     port="4001"
>                     autoBind="100"
>                     selectorTimeout="5000"
>                     maxThreads="12"/>
>
>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>                        timeout="60000" keepAliveCount="0"/>
>           </Sender>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>         </Channel>
>
>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>                filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
>
>         <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
>                   tempDir="/tmp/war-temp/"
>                   deployDir="/tmp/war-deploy/"
>                   watchDir="/tmp/war-listen/"
>                   watchEnabled="false"/>
>
>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>       </Cluster>
>       <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
>              debug="0" resourceName="UserDatabase"/>
>       <Host name="localhost" debug="0" appBase="webapps"
>             unpackWARs="true" autoDeploy="true"
>             xmlValidation="false" xmlNamespaceAware="false">
>         <Valve className="org.apache.catalina.valves.RemoteAddrValve"
>                allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
>       </Host>
>     </Engine>
>   </Service>
> </Server>
> ==============================================
>
>
> INSTANCE-2 server.xml
> =====================
> <Server port="8007" shutdown="SHUTDOWN" debug="0">
>
>   <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" debug="0"/>
>   <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" debug="0"/>
>
>   <GlobalNamingResources>
>
>     <Environment name="InstanceName" type="java.lang.String" value="pro2"/>
>
>     <Resource name="UserDatabase" auth="Container"
>               type="org.apache.catalina.UserDatabase"
>               description="User database that can be updated and saved"
>               factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
>               pathname="conf/tomcat-users.xml"/>
>   </GlobalNamingResources>
>
>   <Service name="Catalina">
>
>     <Connector port="8082" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
>                emptySessionPath="true"
>                maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
>                enableLookups="false" redirectPort="82443" acceptCount="1000"
>                debug="0" connectionTimeout="20000"
>                disableUploadTimeout="true"
>                compression="on"
>                compressionMinSize="2048"
>                noCompressionUserAgents="gozilla, traviata"
>                compressableMimeType="text/html,text/xml" />
>     <Engine name="Catalina" defaultHost="localhost" debug="0" jvmRoute="PR2">
>
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelSendOptions="6">
>
>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>                  expireSessionsOnShutdown="false"
>                  notifyListenersOnReplication="true"/>
>
>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>                       address="228.0.0.4"
>                       port="45564"
>                       frequency="1000"
>                       dropTime="30000"/>
>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>                     address="127.0.0.1"
>                     port="4002"
>                     autoBind="100"
>                     selectorTimeout="5000"
>                     maxThreads="12"/>
>
>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>                        timeout="60000" keepAliveCount="0"/>
>           </Sender>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
>         </Channel>
>
>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>                filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
>         <!-- <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/> -->
>
>         <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
>                   tempDir="/tmp/war-temp/"
>                   deployDir="/tmp/war-deploy/"
>                   watchDir="/tmp/war-listen/"
>                   watchEnabled="false"/>
>
>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>       </Cluster>
>
>       <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
>              resourceName="UserDatabase" debug="0"/>
>
>       <Host name="localhost" debug="0" appBase="webapps"
>             unpackWARs="true" autoDeploy="true"
>             xmlValidation="false" xmlNamespaceAware="false">
>
>         <Valve className="org.apache.catalina.valves.RemoteAddrValve"
>                allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
>       </Host>
>     </Engine>
>   </Service>
> </Server>
> ===============================
>
> -----Original Message-----
> From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 17, 2008 19:01
> To: Tomcat Users List
> Subject: Re: Tomcat 6 - Cluster error.
>
> already replied to your old thread
>
> ok, it looks like you might have ended up with a rogue socket,
> and what happens is that any message sent to that socket just gets lost
> in the ether, since it doesn't have any interest ops.
> There is a workaround for this: turn off keep-alives altogether, or
> implement a keep-alive timeout.
>
> Option 1 - no keep alives at all
>
> <Transport
> className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
> timeout="60000"
> keepAliveCount="0"/>
>
> Option 2 - implement a keep alive timeout
>
> <Transport
> className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
> timeout="60000"
> keepAliveTime="120000"/>
>
> or make a combination of both values
>
> either option should work for you.
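>
> A minimal sketch of the combined form, assuming nothing beyond the attribute
> values already shown in Option 1 and Option 2 above. It has not been tested
> against this setup, and whether the timeout still matters once
> keepAliveCount is 0 is not established here.
>
> ```xml
> <!-- Sketch only: both workarounds on one Transport element;
>      the values are copied from Option 1 and Option 2 above -->
> <Transport
>   className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>   timeout="60000"
>   keepAliveCount="0"
>   keepAliveTime="120000"/>
> ```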
>
> On a side note, I'm interested in whether the scenario you ran into is
> reproducible. If it keeps happening over and over again, then, if possible,
> I'd like to get some debug logs from you.
>
> Filip
>
>
>
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: [email protected]
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]