We are using JBoss Cache 1.4.1 SP6 with JGroups 2.4.x.
We have a cluster of cache instances running on two Sun Solaris machines and 
multiple RHEL machines.

When one of the RHEL instances is restarted, the view of the cache instances on 
the Solaris machines isn't updated, i.e. the view passed to viewAccepted() still 
contains the old RHEL instance along with the new RHEL instance (the one that 
was restarted).

e.g.: [172.16.11.200:65261, 172.16.11.12:50903, 172.16.11.10:41912, 
172.16.11.20:51156, 172.16.11.10:43789, 172.16.11.20:57771, 
172.16.11.10:51722, 172.16.11.20:35858, 172.16.11.11:51210]

172.16.11.10 - RHEL Instance 1
172.16.11.20 - RHEL Instance 2
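
One way to spot the stale entries is to group the members of the view by host and flag any host that appears more than once. The sketch below is a minimal, standalone illustration that works on the printed "host:port" strings rather than on JGroups Address objects. The names ViewHostGrouping and groupByHost are hypothetical, and it assumes Java 8 or later and a single cache instance per host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of JBoss Cache or JGroups.
final class ViewHostGrouping {

    // Groups "host:port" member strings by host, preserving the view order.
    static Map<String, List<String>> groupByHost(List<String> members) {
        Map<String, List<String>> byHost = new LinkedHashMap<>();
        for (String member : members) {
            int colon = member.lastIndexOf(':');
            String host = colon < 0 ? member : member.substring(0, colon);
            byHost.computeIfAbsent(host, h -> new ArrayList<>()).add(member);
        }
        return byHost;
    }

    public static void main(String[] args) {
        // The view reported by viewAccepted() on the Solaris machines.
        List<String> view = Arrays.asList(
            "172.16.11.200:65261", "172.16.11.12:50903", "172.16.11.10:41912",
            "172.16.11.20:51156", "172.16.11.10:43789", "172.16.11.20:57771",
            "172.16.11.10:51722", "172.16.11.20:35858", "172.16.11.11:51210");

        // Assumption: one cache instance per host, so duplicates are suspicious.
        groupByHost(view).forEach((host, addrs) -> {
            if (addrs.size() > 1) {
                System.out.println(host + " appears " + addrs.size() + " times: " + addrs);
            }
        });
    }
}
```

With the view above, this flags 172.16.11.10 and 172.16.11.20, each of which appears three times.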

It's assumed that when a cache instance goes down, the view should be updated 
immediately when FD_SOCK is configured. But it wasn't updated as expected.

Instead, the view passed to viewAccepted() was updated to contain only the 
active members, and the problem resolved itself, only after some hours.

We got a ReplicationException timeout:

Received Throwable from remote node org.jboss.cache.ReplicationException: 
rsp=sender=172.16.11.10:41912, retval=null, received=false, suspected=false

The cluster configuration is as follows:

<attribute name="ClusterConfig">
    <config>
        <!-- UDP: if you have a multihomed machine,
        set the bind_addr attribute to the appropriate NIC IP address, e.g bind_addr="192.168.0.2"
        -->
        <!-- UDP: On Windows machines, because of the media sense feature
         being broken with multicast (even after disabling media sense)
         set the loopback attribute to true -->
        <UDP mcast_addr="224.7.8.9" mcast_port="45567"
            ip_ttl="64" ip_mcast="true"
            mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
            ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
            loopback="true" bind_addr="16.150.24.69"/>
        <PING timeout="2000" num_initial_members="3"
            up_thread="false" down_thread="false"/>
        <MERGE2 min_interval="10000" max_interval="20000"/>
        <!--        <FD shun="true" up_thread="true" down_thread="true" />-->
        <FD_SOCK/>
        <VERIFY_SUSPECT timeout="1500"
            up_thread="false" down_thread="false"/>
        <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
            max_xmit_size="8192" up_thread="false" down_thread="false"/>
        <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
            down_thread="false"/>
        <pbcast.STABLE desired_avg_gossip="20000"
            up_thread="false" down_thread="false"/>
        <FRAG frag_size="8192"
            down_thread="false" up_thread="false"/>
        <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
            shun="true" print_local_addr="true"/>
        <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
    </config>
</attribute>



From the exception message we infer that the cache instance 172.16.11.10:41912 
had been restarted and that the currently active instance was 
172.16.11.10:51722.
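
To make that inference less error-prone, the sender and the received/suspected flags can be pulled out of the exception text and compared against the current view. The following is only a sketch that relies on the message format shown in this post; the class RspInfo and its parse method are hypothetical names, not part of JBoss Cache or JGroups, and it assumes a reasonably current JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of JBoss Cache or JGroups.
final class RspInfo {

    // Matches the "sender=..., retval=..., received=..., suspected=..." part
    // of the message, as printed in the exception text quoted in this post.
    private static final Pattern RSP = Pattern.compile(
        "sender=([^,\\s]+),\\s*retval=([^,]*),\\s*received=(true|false),\\s*suspected=(true|false)");

    final String sender;
    final boolean received;
    final boolean suspected;

    private RspInfo(String sender, boolean received, boolean suspected) {
        this.sender = sender;
        this.received = received;
        this.suspected = suspected;
    }

    // Returns the parsed fields, or null if the text is not in the expected format.
    static RspInfo parse(String message) {
        Matcher m = RSP.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new RspInfo(m.group(1),
            Boolean.parseBoolean(m.group(3)),
            Boolean.parseBoolean(m.group(4)));
    }

    public static void main(String[] args) {
        RspInfo info = parse(
            "Received Throwable from remote node org.jboss.cache.ReplicationException: "
            + "rsp=sender=172.16.11.10:41912, retval=null, received=false, suspected=false");
        System.out.println(info.sender + " received=" + info.received
            + " suspected=" + info.suspected);
    }
}
```

In our case received=false together with suspected=false suggests that no response arrived from 172.16.11.10:41912 and that the member had not been marked as suspected when the call timed out.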

View the original post : 
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4221191#4221191
