[ 
https://issues.apache.org/jira/browse/GEODE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203682#comment-16203682
 ] 

ASF subversion and git services commented on GEODE-3780:
--------------------------------------------------------

Commit 081bccd8391709381f2b3c4cce3b1cf6df49b1ce in geode's branch 
refs/heads/develop from [~bschuchardt]
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=081bccd ]

GEODE-3780 suspected member is never watched again after passing final check

A member going through a final health check is now put in the
suspected members collection and a new "neighbour" is selected for
the background monitor thread, ensuring that it doesn't interfere
with the health check.  Once the health check is done the member is
removed from the suspected members collection and a new "neighbour"
is selected, allowing the monitor thread to once again consider the
suspected member.

A message is also sent to the node that initiated suspicion so that
it, too, resumes watching the formerly suspected member.
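
The bookkeeping described in this commit message can be illustrated with a
minimal sketch. It is not Geode's actual code: the class name, the method
names, and the use of plain strings as member IDs are all hypothetical, and
the notification sent back to the node that initiated suspicion is left out.
The sketch assumes the "neighbour" is simply the next non-suspected member
after the local one in the membership view.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; these are not Geode's real classes. */
public class SuspectTrackingSketch {
  private final String self;
  private final List<String> view;                       // ordered membership view
  private final Set<String> suspects = new HashSet<>();  // members under final check
  private String neighbour;                              // member the monitor thread watches

  public SuspectTrackingSketch(String self, List<String> view) {
    this.self = self;
    this.view = new ArrayList<>(view);
    selectNeighbour();
  }

  /** Final check starts: hide the member from the background monitor. */
  public synchronized void beginFinalCheck(String member) {
    suspects.add(member);
    selectNeighbour();
  }

  /** Final check is done: make the member eligible for monitoring again. */
  public synchronized void endFinalCheck(String member) {
    suspects.remove(member);
    selectNeighbour();
  }

  public synchronized String neighbour() {
    return neighbour;
  }

  /** Picks the next member after self in the view that is not suspected. */
  private void selectNeighbour() {
    int start = view.indexOf(self);
    neighbour = null;
    for (int i = 1; i < view.size(); i++) {
      String candidate = view.get((start + i) % view.size());
      if (!suspects.contains(candidate)) {
        neighbour = candidate;
        return;
      }
    }
  }
}
```

In this sketch, with a view of A, B, C and A as the local member, the
neighbour is B. It becomes C while B is under final check and returns to B
once the check ends. The second re-selection is the step whose absence left a
suspected member unwatched.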


> suspected member is never watched again after passing final check
> -----------------------------------------------------------------
>
>                 Key: GEODE-3780
>                 URL: https://issues.apache.org/jira/browse/GEODE-3780
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>
> In a network-down test we saw a node on the losing side of the network 
> partition perform final checks on members on the winning side.  One of the 
> final checks mysteriously succeeded:
> [info 2017/09/17 12:24:45.552 PDT 
> gemfire1_rs-FullRegression-2017-09-15-21-00-35-client-10_8941 <Geode Failure 
> Detection thread 4> tid=0x128] Final check failed but detected recent message 
> traffic for suspect member 
> 10.32.109.252(gemfire3_rs-FullRegression-2017-09-15-21-00-35-client-16_6135:6135)<v2>:1026
> [info 2017/09/17 12:24:45.552 PDT 
> gemfire1_rs-FullRegression-2017-09-15-21-00-35-client-10_8941 <Geode Failure 
> Detection thread 4> tid=0x128] Final check passed for suspect member 
> 10.32.109.252(gemfire3_rs-FullRegression-2017-09-15-21-00-35-client-16_6135:6135)<v2>:1026
> After this the suspected member was never checked again and the losing side 
> failed to shut down.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
