Re: Tomcat not syncing existing sessions on restart

2024-01-23 Thread Manak Bisht
Thanks for going the extra mile to help me out on this. I really appreciate
it.
As far as I am aware, the auto detection of local member is only available
post v9.0.17 and the <LocalMember> tag was added in v8.5.1. Unfortunately,
I happen to be working in an environment where 8.5.0 is the highest non-EOL
version available. I know I am playing very fast and loose with the
definition of EOL when the current version is 8.5.98. Since the
StaticMembershipInterceptor has been available for a long time, I thought I
could make it work without those two features.

Sincerely,
Manak Bisht

On Tue, Jan 23, 2024 at 3:56 PM Mark Thomas wrote:

> The other difference is that you don't appear to have defined the local
> member of the cluster. You should define all members of the cluster,
> including the local member, on each node. The local member can be
> defined explicitly as LocalMember or as an ordinary Member and Tomcat
> will figure out it is the local one.
>


Re: Tomcat not syncing existing sessions on restart

2024-01-23 Thread Mark Thomas
I have configured my standard cluster test environment for a 2-node 
cluster, using DeltaManager and static membership. httpd is configured 
for non-sticky load-balancing.


Each node has the Manager web application and my simple cluster-test 
deployed.

https://people.apache.org/~markt/dev/cluster-test.war

Starting both nodes and connecting directly to each manager 
instance shows no sessions in cluster-test, as expected.


Requesting the cluster index page via httpd triggers the creation of a 
single session in cluster-test. Requests alternate between node 1 and 
node 2 as expected. Examining the session via the manager app shows that 
the changes to the session are being correctly replicated.


Stopping node 2 causes further requests to be directed to node 1 only.

Starting node 2 shows that the session is replicated correctly from node 
1. I see the updated session in both nodes via the Manager app.


Also the following test works:
- create a session
- stop node 2
- further requests (handled by node 1)
- stop requests
- start node 2
- stop node 1
- resume requests (handled by node 2)

One difference is that I am using the StaticMembershipService rather 
than the StaticMembershipInterceptor. I don't think that will make any 
difference.


The other difference is that you don't appear to have defined the local 
member of the cluster. You should define all members of the cluster, 
including the local member, on each node. The local member can be 
defined explicitly as LocalMember or as an ordinary Member and Tomcat 
will figure out it is the local one.
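
As a minimal sketch, the membership definition on node 1 might look something like the fragment below, assuming the StaticMembershipService mentioned above. The tomcat2 host name, port 8090, and the id ending in 2 are taken from the log excerpts elsewhere in this thread; the uniqueId for the local member is an illustrative assumption, and attribute names should be checked against the cluster documentation for the Tomcat version in use.

```xml
<!-- Sketch only: goes inside the <Channel> element of the <Cluster> on node 1. -->
<Membership className="org.apache.catalina.tribes.membership.StaticMembershipService">
  <!-- Local member (this node). The uniqueId is an assumed example value. -->
  <LocalMember className="org.apache.catalina.tribes.membership.StaticMember"
               uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}"/>
  <!-- Remote member (node 2). Host, port and id as they appear in the logs in this thread. -->
  <Member className="org.apache.catalina.tribes.membership.StaticMember"
          host="tomcat2" port="8090"
          uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2}"/>
</Membership>
```

On node 2 the roles would be swapped: the entry with the id ending in 2 becomes the local member and node 1 is listed as the remote Member.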


Mark


On 22/01/2024 08:39, Manak Bisht wrote:

I thought that this https://marc.info/?l=tomcat-user=119376798217922=2
might be the problem.
*"The uniqueId is used to be able to differentiate between the same node
joining a cluster, then crashing and then rejoining again. if the uniqueId
didn't change in between this, there is no way to tell the difference
between a node going down, or just leaving the cluster and rejoining."*
So, I tried creating a session while one of the nodes was down, but that
session did not sync either when the other node came online again.
In that case, I would also expect
org.apache.catalina.ha.session.DeltaManager.waitForSendAllSessions to
proceed with no state sync rather than timing out.

I have also checked the time on both the servers using the Linux date
command and they seem to be in sync. The timezone flag passed to the
JAVA_OPTS argument in catalina.sh is also the same. Please let me know if
any more information is required to help debug this issue.

Sincerely,
Manak Bisht

On Sun, Jan 14, 2024 at 11:09 PM Manak Bisht wrote:


Hi,
I am using DeltaManager (static membership) with non-sticky load balancing
on two nodes. I have observed even load, and requests with the same
JSESSIONID being served successfully by both tomcats. This leads me to
conclude that session replication is working as expected when both nodes
are up.

However, when I restart either one of them, the newly restarted tomcat is
unable to serve requests from old sessions. The logs indicate that node
discovery is working but the session sync times out. New logins/sessions
work just fine though, implying that replication is working successfully
again.

*tomcat1.log*
13-Jan-2024 14:16:35.713 INFO [GroupChannel-Heartbeat-1]
org.apache.catalina.ha.tcp.SimpleTcpCluster.memberDisappeared Received
member
disappeared:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
payload={}, command={}, domain={}, ]
13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1]
org.apache.catalina.ha.tcp.SimpleTcpCluster.memberAdded Replication member
added:org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
payload={}, command={}, domain={}, ]
13-Jan-2024 14:44:16.457 INFO [GroupChannel-Heartbeat-1]
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.performBasicCheck
Suspect member, confirmed
alive.[org.apache.catalina.tribes.membership.StaticMember[tcp://tomcat2:8090,tomcat2,8090,
alive=0, securePort=-1, UDP Port=-1, id={0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 },
payload={}, command={}, domain={}, ]]
*13-Jan-2024 14:45:24.354 WARNING [Tribes-Task-Receiver-4]
org.apache.catalina.ha.session.DeltaManager.deserializeSessions overload
existing session *


*tomcat2.log*
13-Jan-2024 14:45:24.290 INFO [localhost-startStop-1]
org.apache.catalina.ha.session.DeltaManager.startInternal Register manager
localhost# to cluster element Engine with name Catalina
13-Jan-2024 14:45:24.291 INFO [localhost-startStop-1]
org.apache.catalina.ha.session.DeltaManager.startInternal Starting
clustering manager at localhost#
13-Jan-2024 14:45:24.363 INFO [localhost-startStop-1]
org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.report