Hello,
we have a cluster with 2 nodes, and we have to restart one server while the
other is running. Sometimes, after several restarts of each node in turn, the
TreeCache does not retrieve the state on startup, because it does not see the
other node. After some time it finds the other node, but it still does not
pick up that node's state. This problem is very serious for us: we store user
sessions in the cache, so if the new node does not receive the state from the
existing node, all user requests to the new node fail.

Here is a log excerpt from the starting node:

  | 2007-03-02 19:45:08,367 DEBUG [org.jboss.cache.TreeCache] Starting jboss.cache:service=GearSessionsTreeCache
  | 2007-03-02 19:45:11,878 INFO  [org.jboss.cache.TreeCache] viewAccepted(): [192.168.3.71:7810|0] [192.168.3.71:7810]
  | 2007-03-02 19:45:11,878 INFO  [org.jboss.cache.TreeCache] TreeCache local address is 192.168.3.71:7810
  | 2007-03-02 19:45:11,878 DEBUG [org.jboss.cache.TreeCache] transferred state is null (may be first member in cluster)
  | 2007-03-02 19:45:11,894 INFO  [org.jboss.cache.TreeCache] State could not be retrieved (we are the first member in group)
  | ...
  | 2007-03-02 19:45:24,192 INFO  [org.jboss.cache.TreeCache] viewAccepted(): MergeView::[192.168.3.65:7810|5] [192.168.3.65:7810, 192.168.3.71:7810], subgroups=[[192.168.3.71:7810|4] [192.168.3.65:7810], [192.168.3.71:7810|0] [192.168.3.71:7810]]
  | 
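
To make the symptom easier to spot, the view lines in the excerpt above can be checked mechanically: the first view lists only the local node, while the later MergeView lists both. The following is only a rough sketch. The class `ViewLogCheck` and its methods are hypothetical names made up for illustration, and the code only parses the log text as printed above; it does not use any JBoss Cache or JGroups API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of JBoss Cache or JGroups.
public class ViewLogCheck {

    // Matches "[coordinator|viewId] [member, member, ...]" as it appears
    // after "viewAccepted():" in the log excerpt.
    private static final Pattern VIEW =
        Pattern.compile("\\[([^\\]|]+)\\|(\\d+)\\]\\s+\\[([^\\]]*)\\]");

    /** Number of members in the first view found in the text, or -1 if no view is found. */
    public static int memberCount(String text) {
        Matcher m = VIEW.matcher(text);
        if (!m.find()) {
            return -1;
        }
        String members = m.group(3).trim();
        if (members.isEmpty()) {
            return 0;
        }
        return members.split("\\s*,\\s*").length;
    }

    /** True if the view contains only one member, i.e. the node came up alone. */
    public static boolean startedAlone(String text) {
        return memberCount(text) == 1;
    }

    public static void main(String[] args) {
        String first = "viewAccepted(): [192.168.3.71:7810|0] [192.168.3.71:7810]";
        String merged = "viewAccepted(): MergeView::[192.168.3.65:7810|5] "
            + "[192.168.3.65:7810, 192.168.3.71:7810]";
        System.out.println(startedAlone(first));   // prints "true"
        System.out.println(memberCount(merged));   // prints "2"
    }
}
```

A singleton first view followed by a MergeView a few seconds later is the pattern seen in our failing restarts.
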
We use the TCP stack of JGroups:

  |           <TCP bind_addr="192.168.3.71" start_port="7810" loopback="true"/>
  |           <TCPPING initial_hosts="192.168.3.65[7810]"
  |                    port_range="3"
  |                    timeout="3500"
  |                    num_initial_members="3"
  |                    up_thread="true"
  |                    down_thread="true"/>
  |           <MERGE2 min_interval="5000" max_interval="10000"/>
  |           <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
  |           <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
  |           <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100" retransmit_timeout="3000" />
  |           <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
  |           <pbcast.GMS join_timeout="5000"
  |                       join_retry_timeout="2000"
  |                       shun="false"
  |                       print_local_addr="false"
  |                       down_thread="true"
  |                       up_thread="true"/>
  |           <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
  | 
Any help with this would be greatly appreciated.

View the original post : 
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4024692#4024692
