[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016334#comment-16016334 ]

ASF subversion and git services commented on GEODE-2875:

Commit 15245dfd2b78a593697e46c8710d23783fc4 in geode's branch refs/heads/feature/GEODE-2929-1 from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=15245df ]

GEODE-2875 shutdown is taking as long as 20 seconds

With a 1.2.0 release pending I am backing out the part of the fix for this issue that routed certain messages over UDP unicast. This change needs more testing as Hitesh suspects it is implicated in a number of hangs he has seen in his tests.

The Shutdown message is still routed over UDP but all others are now directed to TCP/IP stream sockets, as they were before.

> shutdown is taking as long as 20 seconds
>
> Key: GEODE-2875
> URL: https://issues.apache.org/jira/browse/GEODE-2875
> Project: Geode
> Issue Type: Bug
> Components: membership
> Reporter: Bruce Schuchardt
> Assignee: Bruce Schuchardt
> Fix For: 1.2.0
>
> Recent changes have introduced a bug where sometimes, particularly during
> shutdown of a lot of servers, the shutdown process will stall for as long as
> 20 seconds. This appears to be due to changes that keep a membership
> coordinator from sending out a new view during shutdown.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
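The routing rule described in this commit message (the Shutdown message over UDP, everything else over TCP/IP stream sockets) could be sketched roughly as follows. This is only an illustration and not Geode's code: the class, the enum names, and the message kinds other than SHUTDOWN are hypothetical.

```java
// Illustrative sketch only; none of these names come from the Geode source.
class RoutingSketch {
  enum Transport { UDP_UNICAST, TCP_STREAM }

  // Hypothetical message kinds; only SHUTDOWN is named in the commit message.
  enum MessageKind { SHUTDOWN, VIEW_ACK, REGION_OPERATION }

  // After the back-out, only the shutdown message stays on UDP;
  // every other kind goes back to a TCP/IP stream socket.
  static Transport transportFor(MessageKind kind) {
    return kind == MessageKind.SHUTDOWN ? Transport.UDP_UNICAST : Transport.TCP_STREAM;
  }

  public static void main(String[] args) {
    for (MessageKind kind : MessageKind.values()) {
      System.out.println(kind + " -> " + transportFor(kind));
    }
  }
}
```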
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015896#comment-16015896 ]

ASF subversion and git services commented on GEODE-2875:

Commit 15245dfd2b78a593697e46c8710d23783fc4 in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=15245df ]

GEODE-2875 shutdown is taking as long as 20 seconds

With a 1.2.0 release pending I am backing out the part of the fix for this issue that routed certain messages over UDP unicast. This change needs more testing as Hitesh suspects it is implicated in a number of hangs he has seen in his tests.

The Shutdown message is still routed over UDP but all others are now directed to TCP/IP stream sockets, as they were before.
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008989#comment-16008989 ]

ASF subversion and git services commented on GEODE-2875:

Commit 48f6e11adb84145187f9b1f715b6b368d94cee68 in geode's branch refs/heads/feature/GEODE-2900 from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=48f6e11 ]

GEODE-2875 shutdown is taking as long as 20 seconds

The fix for this issue causes one of the test cases in LocatorDUnitTest to fail consistently. With the fix we don't create any TCP/IP connections in this test during startup, but the test expects one to have been created and expects the connection's reader-thread to have initiated suspect processing.

The test needs to be altered to ensure that this thread has been created by sending a message that requires a reply. The reply will be sent over the expected connection, ensuring that there is a reader-thread.
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008819#comment-16008819 ]

ASF subversion and git services commented on GEODE-2875:

Commit 48f6e11adb84145187f9b1f715b6b368d94cee68 in geode's branch refs/heads/feature/GEODE-2632-15 from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=48f6e11 ]

GEODE-2875 shutdown is taking as long as 20 seconds

The fix for this issue causes one of the test cases in LocatorDUnitTest to fail consistently. With the fix we don't create any TCP/IP connections in this test during startup, but the test expects one to have been created and expects the connection's reader-thread to have initiated suspect processing.

The test needs to be altered to ensure that this thread has been created by sending a message that requires a reply. The reply will be sent over the expected connection, ensuring that there is a reader-thread.
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008735#comment-16008735 ]

ASF subversion and git services commented on GEODE-2875:

Commit 48f6e11adb84145187f9b1f715b6b368d94cee68 in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=48f6e11 ]

GEODE-2875 shutdown is taking as long as 20 seconds

The fix for this issue causes one of the test cases in LocatorDUnitTest to fail consistently. With the fix we don't create any TCP/IP connections in this test during startup, but the test expects one to have been created and expects the connection's reader-thread to have initiated suspect processing.

The test needs to be altered to ensure that this thread has been created by sending a message that requires a reply. The reply will be sent over the expected connection, ensuring that there is a reader-thread.
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003026#comment-16003026 ]

ASF subversion and git services commented on GEODE-2875:

Commit d2edad5eb1d50762a01f372f430464f6919e65fd in geode's branch refs/heads/develop from [~ukohlmeyer]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=d2edad5 ]

GEODE-2875: SpotlessApply
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002903#comment-16002903 ]

ASF subversion and git services commented on GEODE-2875:

Commit f0b99b48dca354cd80d0f8620031ecbe780e8fa6 in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=f0b99b4 ]

GEODE-2875 shutdown is taking as long as 20 seconds

The band-aid fix for this problem was to reduce the wait-time on joining a thread sending shutdown messages.

This change set alters the membership manager, reviving the path of sending certain messages like ShutdownMessage over UDP instead of TCP/IP stream sockets. This avenue doesn't block trying to form point-to-point connections, so the join() can complete in a short amount of time.
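For readers unfamiliar with the "band-aid" mentioned above, a bounded wait on a sender thread might look roughly like the sketch below. It is a minimal illustration using the standard Thread.join(long) call and is not taken from the Geode source: the class name, the helper method, and the simulated slow sender are all hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; not Geode code.
class BoundedJoinSketch {

  // Waits at most maxWaitMillis for the sender thread to finish.
  // Returns true if the thread has terminated, false if it is still running
  // (or if the waiting thread was interrupted).
  static boolean joinWithLimit(Thread sender, long maxWaitMillis) {
    try {
      sender.join(maxWaitMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
    return !sender.isAlive();
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for a sender that blocks while trying to form
    // a point-to-point connection to a member that is also shutting down.
    Thread slowSender = new Thread(() -> {
      try {
        Thread.sleep(2_000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    slowSender.setDaemon(true); // do not keep the JVM alive for the stalled sender
    slowSender.start();

    long start = System.nanoTime();
    boolean finished = joinWithLimit(slowSender, 200);
    long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    System.out.println("finished=" + finished + " waitedMs=" + waitedMs);
  }
}
```

The bounded join only caps how long shutdown waits; the sender itself may still be blocked, which is why the commit message treats routing the message over UDP as the more complete fix.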
[jira] [Commented] (GEODE-2875) shutdown is taking as long as 20 seconds
[ https://issues.apache.org/jira/browse/GEODE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995747#comment-15995747 ]

ASF subversion and git services commented on GEODE-2875:

Commit c3a70efaf43b470bd16ab6c1a3a003e5f99686d0 in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=c3a70ef ]

GEODE-2875 shutdown is taking as long as 20 seconds

Hitesh reviewed this change for me. A more complete fix is in the works but needs a lot of testing. This change will reduce the stall from 20 seconds down to 5 seconds.