Hi,

I am using DeltaManager (static membership) with non-sticky load balancing on two nodes. I have observed even load, and requests with the same JSESSIONID being served successfully by both Tomcat instances. This leads me to conclude that session replication is working as expected when both nodes are up.
However, when I restart either one of them, the newly restarted Tomcat is unable to serve requests from old sessions. The logs indicate that node discovery is working but the session sync times out. New logins/sessions work just fine though, implying that replication is working successfully again.

*tomcat1.log*

13-Jan-2024 14:16:35.713 INFO [GroupChannel-Heartbeat-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.memberDisappeared Received member disappeared:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090, alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 }, payload={}, command={}, domain={}, ]

13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.memberAdded Replication member added:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090, alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 }, payload={}, command={}, domain={}, ]

13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1] org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.performBasicCheck Suspect member, confirmed alive.[org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090, alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 }, payload={}, command={}, domain={}, ]]

*13-Jan-2024 14:45:24.354 WARNING [Tribes-Task-Receiver-4] org.apache.catalina.ha.session.DeltaManager.deserializeSessions overload existing session XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX*

*tomcat2.log*

13-Jan-2024 14:45:24.290 INFO [localhost-startStop-1] org.apache.catalina.ha.session.DeltaManager.startInternal Register manager localhost# to cluster element Engine with name Catalina

13-Jan-2024 14:45:24.291 INFO [localhost-startStop-1] org.apache.catalina.ha.session.DeltaManager.startInternal Starting clustering manager at localhost#

13-Jan-2024 14:45:24.363 INFO [localhost-startStop-1] org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.report ThroughputInterceptor Report[ Tx Msg:1 messages Sent:0.00 MB (total) Sent:0.00 MB (application) Time:0.06 seconds Tx Speed:0.01 MB/sec (total) TxSpeed:0.01 MB/sec (application) Error Msg:0 Rx Msg:15 messages Rx Speed:0.00 MB/sec (since 1st msg) Received:0.00 MB]

13-Jan-2024 14:45:24.368 INFO [localhost-startStop-1] org.apache.catalina.ha.session.DeltaManager.getAllClusterSessions Manager [localhost#], requesting session state from org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat1:8090,tomcat1,8090, alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 }, payload={}, command={}, domain={}, ]. This operation will timeout if no session state has been received within 60 seconds.

*13-Jan-2024 14:46:24.459 SEVERE [localhost-startStop-1] org.apache.catalina.ha.session.DeltaManager.waitForSendAllSessions Manager [localhost#]: No session state send at 1/13/24 2:45 PM received, timing out after 60,167 ms.*

There is also a warning on tomcat1 (the highlighted "overload existing session" line), but I am unsure of its significance. I have tried setting sendAllSessions to false and increasing stateTransferTimeout, to no avail.
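
For reference, sendAllSessions and stateTransferTimeout are both attributes of the DeltaManager element. A minimal sketch of that kind of change is below; the timeout of 120 seconds is purely an illustrative value, since the exact value tried is not recorded here.

```xml
<!-- Sketch only: stateTransferTimeout is given in seconds (default 60).
     The value 120 is an assumed example, not the value actually tested. -->
<Manager className="org.apache.catalina.ha.session.DeltaManager"
         sendAllSessions="false"
         stateTransferTimeout="120"/>
```

This element would stand in for the bare Manager element inside the Cluster configuration on each node.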

This is my clustering config for tomcat1 (the config for tomcat2 is the same, except that the static member's host is tomcat1 and its uniqueId is {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}):

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="6" channelStartOptions="3">
  <Manager className="org.apache.catalina.ha.session.DeltaManager"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver" address="0.0.0.0" port="8090" autoBind="0"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
      <Member className="org.apache.catalina.tribes.membership.StaticMember" port="8090" host="tomcat2" uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2}"/>
    </Interceptor>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>

Any help would be greatly appreciated.

Sincerely,
Manak Bisht