OK, it looks like you might have ended up with a rogue socket. Any message sent to that socket just gets lost in the ether, since the socket doesn't have any interest ops. There are two workarounds for this: turn off keep-alives altogether, or implement a keep-alive timeout.

Option 1 - no keep alives at all

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
           timeout="60000"
           keepAliveCount="0"/>

Option 2 - implement a keep alive timeout

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
           timeout="60000"
           keepAliveTime="120000"/>

Either option should work for you.
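
For reference, here is a rough sketch of where Option 2 would sit in the server.xml you posted. It assumes the existing <Sender> element is kept as it is and only the <Transport> child gains the keepAliveTime attribute. You would presumably want the same change on both instances.

```xml
<!-- Sketch only: the <Sender> element from the posted server.xml,
     with the Option 2 keep-alive timeout added to <Transport>. -->
<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
  <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
             timeout="60000"
             keepAliveTime="120000"/>
</Sender>
```

For Option 1, swap keepAliveTime="120000" for keepAliveCount="0" in the same place.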

On a side note, I'm interested in whether the scenario you ran into is
reproducible. If it keeps happening over and over again, then, if possible,
I'd like to get some debug logs from you.

Filip





Raúl García wrote:
Hi all, I'm Raúl.

We had a Tomcat 5.0.28 server configured with 2 clustered instances (working
perfectly), and we decided to migrate to Java 6 and the new Tomcat 6.0.14.

We modified the configuration files to match the new Tomcat 6.0.14 structure,
but now we have really annoying problems with the cluster.

This server receives approx. 1 request per second. We use pen as the load
balancer between the 2 instances.

Once started, both instances report member.added and seem to be clustered.
Everything works fine.
But around 12 days after being started, instance number 2 reports that a
cluster member disappeared. After that, session replication stops working,
and instance 2 becomes unstable and gives a timeout error (but instance 1 is
still working!!).

Instance 2 log:
===============
Jan 14, 2008 7:05:17 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-14 19:05:12.847
Jan 14, 2008 7:05:22 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-14 19:05:17.848
Jan 14, 2008 7:05:27 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-14 19:05:22.85
Jan 14, 2008 7:05:35 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-14 19:05:30.111
Jan 14, 2008 7:05:35 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-14 19:05:27.852
Jan 15, 2008 1:56:37 AM org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
INFO: ThroughputInterceptor Report[
        Tx Msg:20000 messages
        Sent:11.00 MB (total)
        Sent:11.00 MB (application)
        Time:28.14 seconds
        Tx Speed:0.39 MB/sec (total)
        TxSpeed:0.39 MB/sec (application)
        Error Msg:0
        Rx Msg:12195 messages
        Rx Speed:0.00 MB/sec (since 1st msg)
        Received:6.74 MB]
Jan 15, 2008 8:48:36 AM org.apache.catalina.ha.tcp.SimpleTcpCluster memberDisappeared
INFO: Received member disappeared:org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4001,localhost,4001, alive=72702012,id={-44 -75 50 38 26 -53 72 -63 -76 -94 12 82 127 106 126 -61 }, payload={}, command={}, domain={}, ]
Jan 15, 2008 8:48:36 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector performBasicCheck
INFO: Suspect member, confirmed dead.[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4001,localhost,4001, alive=72702012,id={-44 -75 50 38 26 -53 72 -63 -76 -94 12 82 127 106 126 -61 }, payload={}, command={}, domain={}, ]]
Jan 15, 2008 8:48:39 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4001,localhost,4001, alive=72705018,id={-44 -75 50 38 26 -53 72 -63 -76 -94 12 82 127 106 126 -61 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 15, 2008 8:48:39 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4001,localhost,4001, alive=72705018,id={-44 -75 50 38 26 -53 72 -63 -76 -94 12 82 127 106 126 -61 }, payload={}, command={}, domain={}, ]]
Jan 15, 2008 8:48:39 AM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4001;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
        at org.apache.catalina.ha.session.DeltaManager.send(DeltaManager.java:586)
        at org.apache.catalina.ha.session.DeltaManager.sendCreateSession(DeltaManager.java:575)
        at org.apache.catalina.ha.session.DeltaManager.createSession(DeltaManager.java:551)
        at org.apache.catalina.ha.session.DeltaManager.createSession(DeltaManager.java:534)
        at org.apache.catalina.connector.Request.doGetSession(Request.java:2312)
        at org.apache.catalina.connector.Request.getSession(Request.java:2075)
        at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:833)
        at pad.kernel.Resolver.service(Resolver.java:266)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at pad.kernel.EntryPointFilter.doFilter(EntryPointFilter.java:365)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:269)
        at org.apache.catalina.valves.RemoteAddrValve.invoke(RemoteAddrValve.java:81)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:347)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)

[...]
SEVERE: Unable to send message through cluster sender. [...] <---- Repeats every 60000ms

Resolver.java at line 266
==========================

// I just get a new session (request is a javax.servlet.http.HttpServletRequest)
javax.servlet.http.HttpSession sesion = request.getSession(true);

==================

We modified server.xml to match exactly what the documentation recommends,
with help from some people on the dev-apache mailing list (the wrong list,
so I am posting this problem again here).

The config files of each instance are:

INSTANCE-1 --- Server.xml
==========================
NOTE: 111.111.111.111 is the server IP address.
==========================
<Server port="8006" shutdown="SHUTDOWN" debug="0">
  <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
  <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"
debug="0"/>
  <Listener
className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
debug="0"/>

  <GlobalNamingResources>
    <Environment name="InstanceName" type="java.lang.String" value="pro1"/>

    <Resource name="UserDatabase" auth="Container"
              type="org.apache.catalina.UserDatabase"
              description="User database that can be updated and saved"
              factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
              pathname="conf/tomcat-users.xml" />
  </GlobalNamingResources>

  <Service name="Catalina">

    <Connector port="8081" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
emptySessionPath="true"
               maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
               enableLookups="false" redirectPort="81443" acceptCount="1000"
               debug="0" connectionTimeout="20000"
disableUploadTimeout="true"
               compression="on"
                           compressionMinSize="2048"
                           noCompressionUserAgents="gozilla, traviata"
                           compressableMimeType="text/html,text/xml" />

    <Engine name="Catalina" defaultHost="localhost" debug="0"
jvmRoute="PR1">
                        <Cluster
className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">

          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel
className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership
className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="1000"
                        dropTime="30000"/>
            <Receiver
className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="127.0.0.1"
                      port="4001"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="12"/>

            <Sender
className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport
className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
timeout="60000"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

          <Deployer
className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener
className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
          <ClusterListener
className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>
      <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
             debug="0" resourceName="UserDatabase"/>
      <Host name="localhost" debug="0" appBase="webapps"
            unpackWARs="true" autoDeploy="true"
            xmlValidation="false" xmlNamespaceAware="false">
          <Valve className="org.apache.catalina.valves.RemoteAddrValve"
                   allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
      </Host>
    </Engine>
  </Service>
</Server>
==============================================


INSTANCE-2 server.xml
=====================
<Server port="8007" shutdown="SHUTDOWN" debug="0">

  <Listener className="org.apache.catalina.core.JasperListener" debug="0"/>
  <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"
debug="0"/>
  <Listener
className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
debug="0"/>

  <GlobalNamingResources>

    <Environment name="InstanceName" type="java.lang.String" value="pro2"/>

    <Resource name="UserDatabase" auth="Container"
              type="org.apache.catalina.UserDatabase"
              description="User database that can be updated and saved"
              factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
              pathname="conf/tomcat-users.xml"/>
  </GlobalNamingResources>

  <Service name="Catalina">

    <Connector port="8082" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
emptySessionPath="true"
               maxThreads="150" minSpareThreads="100" maxSpareThreads="300"
               enableLookups="false" redirectPort="82443" acceptCount="1000"
               debug="0" connectionTimeout="20000"
disableUploadTimeout="true"
               compression="on"
                           compressionMinSize="2048"
                           noCompressionUserAgents="gozilla, traviata"
                           compressableMimeType="text/html,text/xml" />
    <Engine name="Catalina" defaultHost="localhost" debug="0"
jvmRoute="PR2">

                        <Cluster
className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">


          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel
className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership
className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="1000"
                        dropTime="30000"/>
            <Receiver
className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="127.0.0.1"
                      port="4002"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="12"/>

            <Sender
className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport
className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
timeout="60000"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
          <!-- <Valve
className="org.apache.catalina.ha.session.JvmRouteBinderValve"/> -->

          <Deployer
className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener
className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
          <ClusterListener
className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>

      <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
             resourceName="UserDatabase" debug="0"/>

      <Host name="localhost" debug="0" appBase="webapps"
            unpackWARs="true" autoDeploy="true"
            xmlValidation="false" xmlNamespaceAware="false">

          <Valve className="org.apache.catalina.valves.RemoteAddrValve"
                   allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
      </Host>
    </Engine>
  </Service>
</Server>
===============================

Can you see anything wrong that could cause those timeouts?
I can paste more config files if you need them.

Thank you very much
Raúl.



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

