Hi. Here's the problem in a nutshell: a 3-node cluster with a shared tree cache. 
Nodes 1 and 2 go away at around the same time (via an unplugged network cable). 
Node 3 gets notification within 10-12 seconds that Node 1 is gone and makes a 
few changes to the cache (within a transaction). The cache tries to replicate to 
Node 2 (not knowing it has gone away) and fails with a ReplicationException. 
Node 3 thinks that its local cache has been updated, but it hasn't, because of 
the replication failure. Node 3 receives notification that Node 2 has gone away 
after ~50 seconds and updates its cache again, which works because there is no 
one left to replicate to.

There are two things I need help with:
1. I need my local cache to be updated even when replication fails.
2. Why does it take so long to receive notification that the second node has 
gone away, when both nodes were on the same network cable that I unplugged? My 
JGroups timeout is set to 12 seconds max (counting retries), yet the two JGroups 
viewChange notifications are sometimes more than 60 seconds apart.
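
To make item 1 concrete, here is a minimal sketch of the behaviour I see versus 
the behaviour I want. It is a toy model only and does not use JBoss Cache or 
JGroups: ToyReplicatedCache, Peer, ReplicationFailure and the keepLocalOnFailure 
flag are all names I made up for illustration, and the "replicate first, then 
apply locally" order is my assumption about what happens, not something I have 
confirmed in the cache source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: not JBoss Cache or JGroups code. All names are hypothetical. */
public class ToyReplicatedCache {

    /** Stand-in for the ReplicationException raised when a peer is unreachable. */
    static class ReplicationFailure extends RuntimeException {
        ReplicationFailure(String message) { super(message); }
    }

    /** A remote node as seen from this node; reachable=false models the unplugged cable. */
    static class Peer {
        final String name;
        boolean reachable = true;
        final Map<String, String> data = new HashMap<>();

        Peer(String name) { this.name = name; }

        void apply(Map<String, String> changes) {
            if (!reachable) {
                throw new ReplicationFailure(name + " did not respond");
            }
            data.putAll(changes);
        }
    }

    final Map<String, String> local = new HashMap<>();
    final List<Peer> view = new ArrayList<>(); // peers this node still believes are alive
    final boolean keepLocalOnFailure;          // false = what I observe, true = what I want

    ToyReplicatedCache(boolean keepLocalOnFailure) {
        this.keepLocalOnFailure = keepLocalOnFailure;
    }

    /** Replicates to every peer in the view, then applies locally. Returns true if replication succeeded. */
    boolean commit(Map<String, String> changes) {
        boolean replicated = true;
        try {
            for (Peer peer : view) {
                peer.apply(changes);
            }
        } catch (ReplicationFailure e) {
            replicated = false;
        }
        if (replicated || keepLocalOnFailure) {
            local.putAll(changes);
        }
        return replicated;
    }

    /** Node 3 commits while Node 2 is down but still in the view; reports whether the local copy changed. */
    static boolean localUpdatedWhenPeerDown(boolean keepLocalOnFailure) {
        ToyReplicatedCache node3 = new ToyReplicatedCache(keepLocalOnFailure);
        Peer node2 = new Peer("node2");
        node2.reachable = false;
        node3.view.add(node2);
        node3.commit(Collections.singletonMap("/a/b", "value"));
        return node3.local.containsKey("/a/b");
    }

    public static void main(String[] args) {
        System.out.println("observed (local update lost): " + localUpdatedWhenPeerDown(false));
        System.out.println("wanted   (local update kept): " + localUpdatedWhenPeerDown(true));
    }
}
```

With the flag set to false the local map stays unchanged after the failed 
replication, which matches what I see on Node 3. With it set to true the local 
change survives, which is what I am after. Once Node 2 is removed from the view, 
commit() succeeds either way because there is nobody left to replicate to.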

Thanks for the help!
Jim

View the original post : 
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3973649#3973649
