Thanks Mark. :) My comments are inline...

Over and above, the underlying network pipeline also seems to be fine. I still don't understand what is wrong. After setting the log level to FINE, I could see the following:

Jun 14, 2011 10:26:38 AM org.apache.catalina.tribes.transport.ReceiverBase getBind
FINE: Starting replication listener on address:xx.xx.xx.xxx
Jun 14, 2011 10:26:38 AM org.apache.catalina.tribes.transport.ReceiverBase bind
INFO: Receiver Server Socket bound to:/xx.xx.xx.xxx:4000
Jun 14, 2011 10:26:38 AM org.apache.catalina.tribes.membership.McastServiceImpl setupSocket
INFO: Setting cluster mcast soTimeout to 500
Jun 14, 2011 10:26:38 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:4
Jun 14, 2011 10:26:39 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
INFO: Done sleeping, membership established, start level:4
Jun 14, 2011 10:26:39 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:8
Jun 14, 2011 10:26:40 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
INFO: Done sleeping, membership established, start level:8

Since the cluster members do not recognize each other, there is no communication in the logs! :(

On Tue, Jun 14, 2011 at 5:07 AM, Mark Eggers <its_toas...@yahoo.com> wrote:

> ----- Original Message -----
>
> > From: Nilesh - MiKu <niles...@directi.com>
> > To: users@tomcat.apache.org
> > Cc:
> > Sent: Monday, June 13, 2011 8:36 AM
> > Subject: Tomcat 6.0.18 clustering problem
> >
> > Hi people...
> >
> > Background :
> >
> > I have two nodes (say, n1 and n2) running 3 instances of tomcat (say t1, t2, t3), with n1 running t1, t3 and n2 running t2. (All running same application.). I want to make clustering for n1-t1 and n2-t2.
> >
> > Clustering config for n1-t1 is....
> >
> > <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
> >          channelSendOptions="8">
> >
> >   <Manager className="org.apache.catalina.ha.session.DeltaManager"
> >            expireSessionsOnShutdown="false"
> >            notifyListenersOnReplication="true"/>
> >
> >   <Channel className="org.apache.catalina.tribes.group.GroupChannel">
> >     <Membership className="org.apache.catalina.tribes.membership.McastService"
> >                 address="228.0.0.4"
> >                 port="45564"
> >                 frequency="500"
> >                 dropTime="3000"/>
> >
> >     <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
> >               address="auto"
> >               port="4000"
> >               autoBind="100"
> >               selectorTimeout="5000"
> >               maxThreads="6"/>
> >
> >     <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
> >       <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
> >     </Sender>
> >
> >     <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
> >     <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
> >     <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
> >   </Channel>
> >
> >   <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
> >          filter=".*\.ico;.*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.css;.*\.txt;"/>
> >
> >   <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
> >
> > </Cluster>
> >
> > Clustering config for n2-t2 is same as above....
> >
> > n1-t3 has element <Cluster> commented and is not participating in clustering at all. It's being used for some other special purpose.
> >
> > Here is what i get when i start the tomcat instance.
> >
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.core.AprLifecycleListener init
> > INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /usr/lib/jvm/java-1.6.0-sun-1.6.0.13/jre/lib/amd64/server:/usr/lib/jvm/java-1.6.0-sun-1.6.0.13/jre/lib/amd64:/usr/lib/jvm/java-1.6.0-sun-1.6.0.13/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib
> > Jun 11, 2011 9:26:18 AM org.apache.coyote.http11.Http11Protocol init
> > INFO: Initializing Coyote HTTP/1.1 on http-8080
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.startup.Catalina load
> > INFO: Initialization processed in 446 ms
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.core.StandardService start
> > INFO: Starting service Catalina
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.core.StandardEngine start
> > INFO: Starting Servlet Engine: Apache Tomcat/6.0.18
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.ha.tcp.SimpleTcpCluster start
> > INFO: Cluster is about to start
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.tribes.transport.ReceiverBase bind
> > INFO: Receiver Server Socket bound to:/70.87.28.134:4000
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.tribes.membership.McastServiceImpl setupSocket
> > INFO: Setting cluster mcast soTimeout to 500
> > Jun 11, 2011 9:26:18 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
> > INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:4
> > Jun 11, 2011 9:26:19 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
> > INFO: Done sleeping, membership established, start level:4
> > Jun 11, 2011 9:26:19 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
> > INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:8
> > Jun 11, 2011 9:26:20 AM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers
> > INFO: Done sleeping, membership established, start level:8
> > Jun 11, 2011 9:26:20 AM org.apache.catalina.loader.WebappClassLoader validateJarFile
> > INFO: validateJarFile(/opt/mail.pw/webapps/pw-mail/WEB-INF/lib/selenium-server-0.9.2-standalone.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
> > Jun 11, 2011 9:26:20 AM org.apache.catalina.loader.WebappClassLoader validateJarFile
> > INFO: validateJarFile(/opt/mail.pw/webapps/pw-mail/WEB-INF/lib/servlet-api-2.5-6.1.11.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
> > Jun 11, 2011 9:26:21 AM org.apache.catalina.ha.session.DeltaManager start
> > INFO: Register manager /pw-mail to cluster element Engine with name Catalina
> > Jun 11, 2011 9:26:21 AM org.apache.catalina.ha.session.DeltaManager start
> > INFO: Starting clustering manager at /pw-mail
> > Jun 11, 2011 9:26:21 AM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
> > INFO: Manager [localhost#/pw-mail]: skipping state transfer. No members active in cluster group.
> > Jun 11, 2011 9:26:28 AM org.apache.catalina.ha.session.DeltaManager start
> > INFO: Register manager /manager to cluster element Engine with name Catalina
> > Jun 11, 2011 9:26:28 AM org.apache.catalina.ha.session.DeltaManager start
> > INFO: Starting clustering manager at /manager
> > Jun 11, 2011 9:26:28 AM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
> > INFO: Manager [localhost#/manager]: skipping state transfer. No members active in cluster group.
> > Jun 11, 2011 9:26:28 AM org.apache.coyote.http11.Http11Protocol start
> > INFO: Starting Coyote HTTP/1.1 on http-8080
> > Jun 11, 2011 9:26:28 AM org.apache.jk.common.ChannelSocket init
> > INFO: JK: ajp13 listening on /0.0.0.0:8009
> > Jun 11, 2011 9:26:28 AM org.apache.jk.server.JkMain start
> > INFO: Jk running ID=0 time=0/24 config=null
> > Jun 11, 2011 9:26:28 AM org.apache.catalina.startup.Catalina start
> > INFO: Server startup in 10245 ms
> >
> > Note : context for all instances is pw-mail.
> >
> > Can anyone say what is wrong with this configuration.
> >
> >
> > --
> > Best Regards,
> > Nilesh Mevada
> >
>
> This looks like an AMD 64 bit Linux platform? I'm just guessing based on the paths in your mail message.
>

Yes.

> At any rate, I'll make some comments which will hopefully help.
>
> First of all, I would recommend upgrading to the latest Tomcat 6 version (6.0.32) and JRE version if possible. There have been a lot of cluster-related patches since 6.0.18. If possible, look at upgrading to the latest Tomcat 7 version (7.0.14).
>
> From the output, it looks like you have the Selenium server included in your application. I think the server version includes an embedded Jetty server, and Tomcat is complaining about classes that are included.
>
> See:
>
> > INFO: validateJarFile(/opt/mail.pw/webapps/pw-mail/WEB-INF/lib/selenium-server-0.9.2-standalone.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
>
> I think you'll need the corresponding coreless version, but check the Selenium documentation to make sure.
>
> Also, you've included the servlet API in your application. This is shown by:
>
> > INFO: validateJarFile(/opt/mail.pw/webapps/pw-mail/WEB-INF/lib/servlet-api-2.5-6.1.11.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class
>
> Don't do this. Your IDE should enable you to write servlet code without it packaging up the API. Each IDE is different, so read your documentation.
>

I will take care of the above two.

> Make sure that your application is marked distributable by adding <distributable/> in your web.xml file. Make sure that all session properties implement Serializable.
>

Yes, it's marked that way.

> Your cluster configuration doesn't look too much different than one I use.
>
> > <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
> >          channelSendOptions="8">
>
> I'm not sure why you are using ASYNCHRONOUS as your channelSendOptions (especially without an ACK). This will allow session updates to be processed in a different order from which they were sent. I don't know how this will impact your application.
>

Just to make responses faster. The app is not heavily used, though, so I can change the value to 4, i.e. sync with ack.

> > <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
>
> You're using an interceptor that doesn't seem to be documented. Looking at the source code, it appears that this interceptor sends a ping message out every second.
>
> > <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
> >        filter=".*\.ico;.*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.css;.*\.txt;"/>
>
> You've removed the htm and html items from the filter, and added ico. I'll assume that there are no htm / html files in your application (all pages are generated dynamically).
>

Yes. There are a couple of html files, but since I am using nginx, requests for those static html files never reach my Tomcat instance. Hence no worries.

> In short, I don't see any show stoppers in your configuration, but maybe other list members have some ideas.
>
> However, there could be some system issues that are preventing multicasting from working.
>
> 1. Make sure your system is set up for multicasting
>
> Check to see if your interface is enabled for multicasting. Mine looks like this in part:
>
> eth0 Link encap:Ethernet
>      UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>

I have MASTER - SLAVE (bonded) interfaces, something like:

bond0   Link encap:Ethernet
        UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
bond0:0 Link encap:Ethernet
        UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
eth0    Link encap:Ethernet
        UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1

Anyway, things look proper as far as this point is concerned.

> 2. Make sure your firewall allows multicasting. By default, Fedora 15 does not. Other Linux distributions may be different. Add the following to your firewall rules (adjust for your distribution).
>
> -A INPUT -d 224.0.0.0/4 -m state --state NEW -j ACCEPT
>
> You'll probably want this to be much more restrictive, but this may get you up and running.
>
> 3. Add a multicast route
>
> Adjust this to fit your distribution and configuration.
>
> /sbin/ip route add to multicast 224.0.0.0/4 dev eth0
>
> Filip Hanik published a link to a multicast test tool (MCaster) that was included in (a now ancient version of) Tomcat 4. This was useful in order to confirm that you had multicasting set up correctly on your systems. You might be able to dig it up and build it by following the Archives link on the Tomcat home page.
>

Both of the above points (2 and 3) seem to be correct on my hosts.

> 4. Don't announce multicast on the localhost address.
>
> By default, Tomcat gets the host address for multicasting via java.net.InetAddress.getLocalHost().getHostAddress(). Make sure you're not advertising 127.0.0.1. In Linux, the most common source of this problem is by adding your host name to the localhost line in /etc/hosts.
>

Thanks for the separate log config. I could set it up, and I can see that n1-t1 and n2-t2 are binding to the appropriate n1 and n2 IP addresses.
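
For anyone who wants to check point 4 directly rather than infer it from the logs, here is a minimal sketch. It only assumes what is stated above, namely that the address comes from java.net.InetAddress.getLocalHost(). The class name LocalAddressCheck and the helper isUnusableForCluster are hypothetical names for illustration and are not part of Tomcat.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class for illustration; not part of Tomcat.
public class LocalAddressCheck {

    // Returns true when other cluster members could not reach this address:
    // a loopback address (127.x.x.x, ::1) or the wildcard address (0.0.0.0).
    public static boolean isUnusableForCluster(InetAddress addr) {
        return addr.isLoopbackAddress() || addr.isAnyLocalAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Same lookup that the quoted text says Tomcat uses by default.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("Host name:    " + local.getHostName());
        System.out.println("Host address: " + local.getHostAddress());

        if (isUnusableForCluster(local)) {
            System.out.println("WARNING: host name resolves to a loopback or"
                    + " wildcard address; check /etc/hosts");
        } else {
            System.out.println("OK: host name resolves to a non-loopback address");
        }
    }
}
```

Run it on each node with the same JRE that starts Tomcat. If it prints a 127.x.x.x address, the host name entry in /etc/hosts is the first thing to look at.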

> You can also set up separate logging for clustering by making some changes to $CATALINA_HOME/conf/logging.properties
>
> For example:
>
> # Added a cluster logging handler
> handlers = 1catalina.org.apache.juli.FileHandler, 2localhost.org.apache.juli.FileHandler, 3manager.org.apache.juli.FileHandler, 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler, 5cluster.org.apache.juli.FileHandler
>
> # specify the level and where to store the information
> 5cluster.org.apache.juli.FileHandler.level = FINER
> 5cluster.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
> 5cluster.org.apache.juli.FileHandler.prefix = cluster.
>
> # various cluster components logging
> org.apache.catalina.tribes.MESSAGES.level = FINE
> org.apache.catalina.tribes.MESSAGES.handlers = 5cluster.org.apache.juli.FileHandler
>
> org.apache.catalina.tribes.level = FINE
> org.apache.catalina.tribes.handlers = 5cluster.org.apache.juli.FileHandler
>
> org.apache.catalina.ha.level = FINE
> org.apache.catalina.ha.handlers = 5cluster.org.apache.juli.FileHandler
>
> org.apache.catalina.ha.deploy.level = INFO
> org.apache.catalina.ha.deploy.handlers = 5cluster.org.apache.juli.FileHandler
>
> Adjust the logging levels accordingly.
>
> . . . . just my two cents.
>
> /mde/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>

--
Best Regards,
Nilesh Mevada