[ https://issues.apache.org/jira/browse/GEODE-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Schuchardt resolved GEODE-6522.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

> Server hangs during shutdown after becoming membership coordinator
> ------------------------------------------------------------------
>
>                 Key: GEODE-6522
>                 URL: https://issues.apache.org/jira/browse/GEODE-6522
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Recent changes to processing of "leave" requests can cause a member to become
> the coordinator when it receives a request from one member and all other
> potential coordinators have failed availability checks. Unfortunately this
> is causing a hang during shutdown. The new coordinator sends out a new view
> but that view doesn't have the members that failed availability checks
> removed. This causes the new View Creator thread to be stopped. Another one
> isn't started until additional suspect processing is performed but that
> doesn't always happen. This can cause shutdown to hang with other threads
> trying to contact servers that are no longer there.
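
The description above implies that a new coordinator's first view should leave out both the members that asked to leave and the members that failed availability checks. The following is a minimal sketch of that filtering step only. The class and method names (ViewSketch, nextViewMembers) and the member strings are hypothetical; this is not Geode's GMSJoinLeave code and not necessarily how the fix was implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: hypothetical names, not part of Geode's membership API. */
public class ViewSketch {

  /**
   * Builds the member list for the next view: every member of the current
   * view except those that announced they are leaving and those that failed
   * an availability check. The order of the remaining members is preserved.
   */
  public static List<String> nextViewMembers(
      List<String> current, Set<String> leaving, Set<String> failedCheck) {
    List<String> next = new ArrayList<>();
    for (String member : current) {
      if (!leaving.contains(member) && !failedCheck.contains(member)) {
        next.add(member);
      }
    }
    return next;
  }

  public static void main(String[] args) {
    // Made-up member names standing in for locators and servers.
    List<String> current = List.of("locator-a", "locator-b", "server-1", "server-2");
    Set<String> leaving = Set.of("locator-a");
    Set<String> failedCheck = Set.of("locator-b");
    // prints [server-1, server-2]
    System.out.println(nextViewMembers(current, leaving, failedCheck));
  }
}
```

If the failedCheck set were ignored, "locator-b" would stay in the new view even though it is unreachable, which is the situation the description attributes the hang to.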
>
> {noformat}
> [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147]
> finished waiting for responses to view preparation
> [info 2019/03/13 08:38:24.959 PDT <Geode Membership View Creator> tid=0x147]
> received new view:
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000,
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003,
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> old view is:
> View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3]
> members: [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000,
> turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003,
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]
> [info 2019/03/13 08:38:24.974 PDT <Geode Membership View Creator> tid=0x147]
> Failure detection is now watching
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015; suspects are
> {turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002=View[turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000|3]
> members:
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000,
> turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003,
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023]}
> [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147]
> sending new view
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000,
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003,
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> [info 2019/03/13 08:38:24.981 PDT <Geode Membership View Creator> tid=0x147]
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
> [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147]
> View Creator thread is exiting
> [info 2019/03/13 08:38:24.982 PDT <Geode Membership View Creator> tid=0x147]
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2379)
> [info 2019/03/13 08:38:26.416 PDT <vm_4_thr_8_bridge_2_1_host1_17023>
> tid=0x150] GemFireCache[id = 66144348; isClosing = true; isShutDownAll =
> false; created = Wed Mar 13 08:34:41 PDT 2019; server = false; copyOnRead =
> false; lockLease = 120; lockTimeout = 60]: Now closing.
> ...
> [info 2019/03/13 08:38:27.495 PDT <Geode Membership View Creator> tid=0x161]
> View Creator thread is starting
> [info 2019/03/13 08:38:27.507 PDT <Geode Membership View Creator> tid=0x161]
> preparing new view
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161]
> finished waiting for responses to view preparation
> [info 2019/03/13 08:38:27.508 PDT <Geode Membership View Creator> tid=0x161]
> received new view:
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> old view is:
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|15] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(locatorgemfire_2_3_host1_17698:17698:locator)<ec><v0>:41000,
> turtle(locatorgemfire_2_4_host1_17715:17715:locator)<ec><v1>:41003,
> turtle(locatorgemfire_2_2_host1_17676:17676:locator)<ec><v1>:41005]
> [info 2019/03/13 08:38:27.566 PDT <Geode Membership View Creator> tid=0x161]
> sending new view
> View[turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014|21] members:
> [turtle(locatorgemfire_2_1_host1_17653:17653:locator)<ec><v1>:41002,
> turtle(bridgegemfire_2_1_host1_17023:17023)<ec><v2>:41014{lead},
> turtle(bridgegemfire_2_4_host1_17100:17100)<ec><v2>:41015,
> turtle(bridgegemfire_2_3_host1_17064:17064)<ec><v3>:41023] shutdown:
> [turtle(bridgegemfire_2_2_host1_17048:17048)<ec><v3>:41019]
> [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161]
> BRUCE: setting shutdown flag in view creator
> java.lang.Exception: stack trace
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.setShutdownFlag(GMSJoinLeave.java:2247)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.prepareAndSendView(GMSJoinLeave.java:2713)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.sendInitialView(GMSJoinLeave.java:2220)
> at
> org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave$ViewCreator.run(GMSJoinLeave.java:2299)
> [info 2019/03/13 08:38:27.567 PDT <Geode Membership View Creator> tid=0x161]
> View Creator thread is exiting
> {noformat}
> etc.
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)