[ http://jira.jboss.com/jira/browse/JGRP-39?page=history ]
Bela Ban updated JGRP-39:
-------------------------
    Fix Version: 2.2.8

> A TCP stack does not correctly detect failure (pulled cable) for certain
> TCPPING configurations
> -----------------------------------------------------------------------------------------------
>
>          Key: JGRP-39
>          URL: http://jira.jboss.com/jira/browse/JGRP-39
>      Project: JGroups
>         Type: Bug
>     Versions: 2.2.9
>     Reporter: Ovidiu Feodorov
>     Assignee: Bela Ban
>      Fix For: 2.2.8
>
>
> Physical hosts "A" (192.168.1.1, coordinator) and "B" (192.168.1.2) run
> JGroups processes configured with TCP/TCPPING stacks.
>
> "A" stack configuration:
>
> TCP(bind_addr=192.168.1.1;start_port=11800;loopback=true):
> TCPPING(initial_hosts=192.168.1.2[11800];port_range=3;timeout=3500;num_initial_members=3;up_thread=true;down_thread=true):
> MERGE2(min_interval=5000;max_interval=10000):
> FD(shun=true;timeout=1500;max_tries=3;up_thread=true;down_thread=true):
> VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false):
> pbcast.NAKACK(down_thread=true;up_thread=true;gc_lag=100;retransmit_timeout=3000):
> pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):
> pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false;down_thread=true;up_thread=true)
>
> "B" stack configuration:
>
> TCP(bind_addr=192.168.1.2;start_port=11800;loopback=true):
> TCPPING(initial_hosts=192.168.1.1[11800];port_range=3;timeout=3500;num_initial_members=3;up_thread=true;down_thread=true):
> MERGE2(min_interval=5000;max_interval=10000):
> FD(shun=true;timeout=1500;max_tries=3;up_thread=true;down_thread=true):
> VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false):
> pbcast.NAKACK(down_thread=true;up_thread=true;gc_lag=100;retransmit_timeout=3000):
> pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):
> pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false;down_thread=true;up_thread=true)
>
> If I pull the cable under B, the B stack immediately and correctly
> identifies A as suspect and installs a new view containing itself only.
> However, A does not recognize B as suspect and nondeterministically spews out
> various info and warning messages. The view (A, B) stays incorrectly "valid"
> for a long time; sometimes it gets replaced by (A), sometimes not.
>
> I tracked the cause of the problem down to A's TCPPING configuration
> and the TCP queue. If A's TCPPING is configured with port_range=1, the
> problem goes away and the new view is installed in the A stack immediately.
> It seems that if the TCP queue holds messages other than the SUSPECT message
> generated by FD, they interfere and the SUSPECT message gets stuck in
> the queue, with nondeterministic results.
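
To make the numbers in the report above concrete, here is a minimal Java sketch (not JGroups code). It expands A's TCPPING `initial_hosts`/`port_range` values into the addresses that discovery would probe, and estimates how long failure detection should take with the reported FD and VERIFY_SUSPECT settings. The class and method names (`Jgrp39Sketch`, `probeTargets`, `detectionBudgetMillis`) are hypothetical. Two things are assumptions rather than facts from the report: that `port_range` counts consecutive ports starting at the listed port, and that FD raises a suspicion after `max_tries` missed heartbeats of `timeout` ms each, which VERIFY_SUSPECT then confirms after its own `timeout`.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; this is not how JGroups itself is implemented. */
public class Jgrp39Sketch {

    /**
     * Expands a TCPPING-style initial_hosts value such as "192.168.1.2[11800]"
     * into "host:port" probe targets.
     * ASSUMPTION: port_range counts consecutive ports starting at the listed port.
     */
    static List<String> probeTargets(String initialHosts, int portRange) {
        List<String> targets = new ArrayList<>();
        for (String entry : initialHosts.split(",")) {
            entry = entry.trim();
            if (entry.isEmpty()) {
                continue;
            }
            int open = entry.indexOf('[');
            int close = entry.indexOf(']');
            if (open < 0 || close < open) {
                throw new IllegalArgumentException("expected host[port], got: " + entry);
            }
            String host = entry.substring(0, open);
            int basePort = Integer.parseInt(entry.substring(open + 1, close));
            for (int port = basePort; port < basePort + portRange; port++) {
                targets.add(host + ":" + port);
            }
        }
        return targets;
    }

    /**
     * Rough time until a dead member should be suspected and verified.
     * ASSUMPTION: FD needs maxTries missed heartbeats of fdTimeout ms each,
     * then VERIFY_SUSPECT waits verifyTimeout ms before confirming.
     */
    static long detectionBudgetMillis(long fdTimeout, int maxTries, long verifyTimeout) {
        return fdTimeout * maxTries + verifyTimeout;
    }

    public static void main(String[] args) {
        // A's reported setting: three candidate ports on B.
        System.out.println(probeTargets("192.168.1.2[11800]", 3));
        // The workaround from the report: a single candidate port.
        System.out.println(probeTargets("192.168.1.2[11800]", 1));
        // FD(timeout=1500;max_tries=3) plus VERIFY_SUSPECT(timeout=1500).
        System.out.println(detectionBudgetMillis(1500, 3, 1500) + " ms");
    }
}
```

Under these assumptions, `port_range=3` gives A three addresses on the unreachable host to contact instead of one, and B should be excluded from the view after roughly six seconds rather than "a long time". Whether the extra probe traffic is what blocks the SUSPECT message in the TCP queue is the reporter's hypothesis; the sketch does not confirm it.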