[ https://issues.apache.org/jira/browse/GEODE-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795416#comment-16795416 ]

ASF subversion and git services commented on GEODE-6522:
--------------------------------------------------------

Commit a12dca02ca0a1a39a924c0d028104bd0fffe82b6 in geode's branch 
refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=a12dca0 ]

GEODE-6522 Server hangs during shutdown after becoming membership coordinator

Schedule removal of members that have failed availability checks before
becoming coordinator.  Do not shut down the View Creator thread if the
view was created locally, even if it seems to have a different
coordinator.
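
The two rules in this commit message can be illustrated with a small
sketch. It is not the GMSJoinLeave code: the class, the View type, and
the method names are hypothetical, members are plain strings, and the
sketch assumes that the first member in a view's list is treated as its
coordinator. It also filters failed members out directly, where the
commit schedules their removal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class CoordinatorViewSketch {

    // Hypothetical stand-in for a membership view: who created it and who is in it.
    static final class View {
        final String creator;
        final List<String> members;

        View(String creator, List<String> members) {
            this.creator = creator;
            this.members = members;
        }

        // Assumption for this sketch: the first listed member acts as coordinator.
        String coordinator() {
            return members.isEmpty() ? null : members.get(0);
        }
    }

    // Rule 1 (simplified): the first view built by a new coordinator drops
    // members that are leaving and members that failed availability checks.
    static View firstViewAsCoordinator(String self, List<String> oldMembers,
            Set<String> leaving, Set<String> failedAvailabilityCheck) {
        List<String> members = new ArrayList<>();
        for (String member : oldMembers) {
            if (!leaving.contains(member) && !failedAvailabilityCheck.contains(member)) {
                members.add(member);
            }
        }
        return new View(self, members);
    }

    // Rule 2: stop the view creator only when the view names another coordinator
    // AND the view was not created locally.
    static boolean shouldStopViewCreator(String self, View view) {
        boolean createdLocally = self.equals(view.creator);
        boolean otherCoordinator = !self.equals(view.coordinator());
        return otherCoordinator && !createdLocally;
    }
}
```

With the first rule applied, a failed locator no longer sits at the head
of the new view, so the local member appears as coordinator. The second
rule covers the case where a stale member is still listed first: a view
that was created locally never stops the local view creator.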


> Server hangs during shutdown after becoming membership coordinator
> ------------------------------------------------------------------
>
>                 Key: GEODE-6522
>                 URL: https://issues.apache.org/jira/browse/GEODE-6522
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Recent changes to the processing of "leave" requests can cause a member to
> become the coordinator when it receives such a request from one member and
> all other potential coordinators have failed availability checks.
> Unfortunately, this causes a hang during shutdown.  The new coordinator
> sends out a new view, but that view still contains the members that failed
> availability checks.  This causes the new View Creator thread to be
> stopped.  Another one isn't started until additional suspect processing is
> performed, but that doesn't always happen.  As a result, shutdown can hang
> while other threads try to contact servers that are no longer there.
>  
> {noformat}
> [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147] 
> finished waiting for responses to view preparation
> [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147] 
> received new view: 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, 
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, 
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> old view is: 
> View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3] 
> members: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, 
> turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, 
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]
> [info 2019/03/13 08:38:24.974 PDT <Geode Membership View Creator> tid=0x147] 
> Failure detection is now watching 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015; suspects are 
> {turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002=View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3]
>  members: 
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, 
> turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, 
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]}
> [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147] 
> sending new view 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, 
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, 
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147] 
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
> [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147] 
> View Creator thread is exiting
> [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147] 
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2379)
> [info 2019/03/13 08:38:26.416 PDT <vm_4_thr_8_bridge_2_1_host1_17023> 
> tid=0x150] GemFireCache[id = 66144348; isClosing = true; isShutDownAll = 
> false; created = Wed Mar 13 08:34:41 PDT 2019; server = false; copyOnRead = 
> false; lockLease = 120; lockTimeout = 60]: Now closing.
> ...
> [info 2019/03/13 08:38:27.495 PDT <Geode Membership View Creator> tid=0x161] 
> View Creator thread is starting
> [info 2019/03/13 08:38:27.507 PDT <Geode Membership View Creator> tid=0x161] 
> preparing new view 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161] 
> finished waiting for responses to view preparation
> [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161] 
> received new view: 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> old view is: 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000, 
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003, 
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> [info 2019/03/13 08:38:27.566 PDT <Geode Membership View Creator> tid=0x161] 
> sending new view 
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members: 
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002, 
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead}, 
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015, 
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown: 
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161] 
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
> at 
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
> [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161] 
> View Creator thread is exiting
> {noformat}
> etc.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
