[jira] [Resolved] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called
[ https://issues.apache.org/jira/browse/GEODE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson resolved GEODE-10306.
-
Fix Version/s: 1.16.0
Resolution: Fixed

Moved up the acceptor close

> CacheServerImpl should stop the acceptor immediately after stop is called
> -
>
> Key: GEODE-10306
> URL: https://issues.apache.org/jira/browse/GEODE-10306
> Project: Geode
> Issue Type: Bug
> Reporter: Mark Hanson
> Assignee: Mark Hanson
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, after cache server stop is called, it takes a while for the
> acceptor to stop taking new data, which can be a problem because the bigger
> the window of time, the greater the risk of data loss.
>
> {noformat}
> public synchronized void stop() {
>   if (!isRunning()) {
>     return;
>   }
>   RuntimeException firstException = null;
>   try {
>     if (loadMonitor != null) {
>       loadMonitor.stop();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing load monitor", e);
>     firstException = e;
>   }
>   try {
>     if (advisor != null) {
>       advisor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing advisor", e);
>     firstException = e;
>   }
> PROBLEM -> try {
>     if (acceptor != null) {
>       acceptor.close();
>     }
>   } catch (RuntimeException e) {
>     logger.warn("CacheServer - Error closing acceptor monitor", e);
>     if (firstException != null) {
>       firstException = e;
>     }
>   }
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
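The resolution note says only "Moved up the acceptor close". A minimal sketch of what that reordering could look like follows, assuming the fix simply closes the acceptor before the load monitor and the advisor. The class StopOrderSketch, the Stoppable interface, the running flag, and the rethrow at the end are hypothetical stand-ins for illustration, not Geode's actual CacheServerImpl code. The sketch keeps only the first exception by checking == null; the quoted snippet checks != null in the acceptor branch, which would drop the acceptor's exception whenever it is the first one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a stop() that closes the acceptor before anything else. */
final class StopOrderSketch {

  /** Stand-in for any component that can be shut down (acceptor, load monitor, advisor). */
  interface Stoppable {
    void stop();
  }

  private final Stoppable acceptor;
  private final Stoppable loadMonitor;
  private final Stoppable advisor;
  private boolean running = true;

  StopOrderSketch(Stoppable acceptor, Stoppable loadMonitor, Stoppable advisor) {
    this.acceptor = acceptor;
    this.loadMonitor = loadMonitor;
    this.advisor = advisor;
  }

  synchronized void stop() {
    if (!running) {
      return;
    }
    running = false;
    RuntimeException firstException = null;
    // The acceptor comes first so that no new client traffic is accepted
    // while the remaining components are shutting down.
    Stoppable[] inOrder = {acceptor, loadMonitor, advisor};
    for (Stoppable component : inOrder) {
      try {
        if (component != null) {
          component.stop();
        }
      } catch (RuntimeException e) {
        if (firstException == null) {
          firstException = e; // remember only the first failure
        }
      }
    }
    // Assumption: the first failure is rethrown once every component was attempted.
    if (firstException != null) {
      throw firstException;
    }
  }

  /** Stops a server built from recording components and returns the observed order. */
  static List<String> demoOrder() {
    List<String> order = new ArrayList<>();
    StopOrderSketch server = new StopOrderSketch(
        () -> order.add("acceptor"),
        () -> order.add("loadMonitor"),
        () -> order.add("advisor"));
    server.stop();
    server.stop(); // second call is a no-op because the server is no longer running
    return order;
  }

  public static void main(String[] args) {
    System.out.println(demoOrder());
  }
}
```

The loop still attempts every component even if an earlier one throws, so a failing acceptor close does not prevent the load monitor and advisor from being stopped.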
[jira] [Assigned] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called
[ https://issues.apache.org/jira/browse/GEODE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-10306: --- Assignee: Mark Hanson
[jira] [Created] (GEODE-10306) CacheServerImpl should stop the acceptor immediately after stop is called
Mark Hanson created GEODE-10306: --- Summary: CacheServerImpl should stop the acceptor immediately after stop is called Key: GEODE-10306 URL: https://issues.apache.org/jira/browse/GEODE-10306 Project: Geode Issue Type: Bug Reporter: Mark Hanson
[jira] [Updated] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.
[ https://issues.apache.org/jira/browse/GEODE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson updated GEODE-10265:

Description:
This test uses a static cache.xml that hardcodes the server port. As a result, a second instance of the test started in parallel fails with a bind error because the port is already in use. We should consider generating the file rather than using a static one.

Stress-new-test failure.
[https://concourse.apachegeode-ci.info/builds/48751343]

This issue was discovered as part of the stress-new-test of GEODE-10228's PR

{noformat}
The Problem > < The Problem org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil$ControlListener
{noformat}
{noformat}
DurableClientSimpleDUnitTest > testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple Failures (2 failures)
org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call in VM 0 running on Host heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal with 4 VMs
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm0.log' at line 450

[error 2022/04/28 00:39:54.901 UTC tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
org.apache.geode.GemFireIOException: While starting cache server CacheServer on port=10188 client subscription config policy=entry client subscription config capacity=1000 client subscription config overflow directory=.
at org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801) at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600) at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339) at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202) at org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620) at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158) at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142) at org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253) at org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123) at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) at 
sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.r
[jira] [Updated] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.
[ https://issues.apache.org/jira/browse/GEODE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson updated GEODE-10265:

Description:
This test uses a static cache.xml that hardcodes the server port. As a result, a second instance of the test started in parallel fails with a bind error because the port is already in use.

Stress-new-test failure.
[https://concourse.apachegeode-ci.info/builds/48751343]

This issue was discovered as part of the stress-new-test of GEODE-10228's PR

{noformat}
DurableClientSimpleDUnitTest > testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple Failures (2 failures)
org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call in VM 0 running on Host heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal with 4 VMs
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm0.log' at line 450

[error 2022/04/28 00:39:54.901 UTC tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
org.apache.geode.GemFireIOException: While starting cache server CacheServer on port=10188 client subscription config policy=entry client subscription config capacity=1000 client subscription config overflow directory=.
at org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801) at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600) at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339) at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202) at org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620) at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158) at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142) at org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253) at org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123) at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) at 
sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:
[jira] [Created] (GEODE-10265) DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.
Mark Hanson created GEODE-10265:
---

Summary: DurableClientSimpleDUnitTest.testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML cannot be run in parallel with itself.
Key: GEODE-10265
URL: https://issues.apache.org/jira/browse/GEODE-10265
Project: Geode
Issue Type: Bug
Components: tests
Reporter: Mark Hanson

This test uses a static cache.xml that hardcodes the server port. As a result, a second instance of the test started in parallel fails with a bind error because the port is already in use.

Stress-new-test failure.
https://concourse.apachegeode-ci.info/builds/48751343

{noformat}
DurableClientSimpleDUnitTest > testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML FAILED
org.gradle.internal.exceptions.DefaultMultiCauseException: Multiple Failures (2 failures)
org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest$$Lambda$364/438711076.call in VM 0 running on Host heavy-lifter-f7bd4fb4-95bb-5e71-b25c-83f8d8a79c56.c.apachegeode-ci.internal with 4 VMs
java.lang.AssertionError: Suspicious strings were written to the log during this run.
Fix the strings or use IgnoredException.addIgnoredException to ignore.
---
Found suspect string in 'dunit_suspect-vm0.log' at line 450

[error 2022/04/28 00:39:54.901 UTC tid=32] Cache initialization for GemFireCache[id = 1097663966; isClosing = false; isShutDownAll = false; created = Thu Apr 28 00:37:54 UTC 2022; server = true; copyOnRead = false; lockLease = 120; lockTimeout = 60] failed because:
org.apache.geode.GemFireIOException: While starting cache server CacheServer on port=10188 client subscription config policy=entry client subscription config capacity=1000 client subscription config overflow directory=.
at org.apache.geode.internal.cache.xmlcache.CacheCreation.startCacheServers(CacheCreation.java:801) at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:600) at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:339) at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4202) at org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1620) at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1445) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:191) at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:158) at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:142) at org.apache.geode.internal.cache.tier.sockets.CacheServerTestUtil.createCacheServerFromXmlN(CacheServerTestUtil.java:253) at org.apache.geode.internal.cache.tier.sockets.DurableClientSimpleDUnitTest.lambda$testReadyForEventsNotCalledImplicitlyForRegisterInterestWithCacheXML$515fd116$1(DurableClientSimpleDUnitTest.java:584) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123) at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) at 
sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPo
[jira] [Commented] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover
[ https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529520#comment-17529520 ]

Mark Hanson commented on GEODE-10228:
-

Tracked down a new issue found during the stress-new-test run of the PR for GEODE-10228. The basic problem is that this test uses a hardcoded port in its cache.xml. That means the test cannot be run in parallel with itself, which is what stress-new-test was doing. If we want to fix this test (I suggest it should be a low priority), we should stop using a static cache.xml and shift to a dynamically generated one. I am submitting a new bug for that particular issue and merging the PR, as the test in question is not new.

> CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover
> -
>
> Key: GEODE-10228
> URL: https://issues.apache.org/jira/browse/GEODE-10228
> Project: Geode
> Issue Type: Bug
> Components: client/server, tests
> Affects Versions: 1.15.0
> Reporter: Kirk Lund
> Assignee: Mark Hanson
> Priority: Major
> Labels: needsTriage, pull-request-available
>
> {{testDurableHAFailover}} has a history of flakiness, though the stacks do
> seem to have changed some since the older versions of the bug were resolved.
> {noformat}
> DurableClientTestCase > testDurableHAFailover FAILED
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.internal.IdentifiableRunnable.run in VM 2 running on Host heavy-lifter-7bbf0b58-8bc0-5ca8-840d-7bcf83293b6d.c.apachegeode-ci.internal with 4 VMs
>     at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:435)
>     at org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.durableFailover(DurableClientTestCase.java:520)
>     at org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.testDurableHAFailover(DurableClientTestCase.java:439)
> Caused by:
> org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase
> expected: null
> but was: "0"="0" within 5 minutes.
>     at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167)
>     at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
>     at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
>     at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985)
>     at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769)
>     at org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$durableFailover$3f73998b$1(DurableClientTestCase.java:521)
> Caused by:
> org.opentest4j.AssertionFailedError:
> expected: null
> but was: "0"="0"
>     at sun.reflect.GeneratedConstructorAccessor199.newInstance(Unknown Source)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$null$2(DurableClientTestCase.java:525)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
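The comment above suggests replacing the static cache.xml with a dynamically generated one. A minimal sketch of that idea follows, assuming the test can ask the operating system for a free port and write a fresh file per run. The class DynamicCacheXmlSketch and its methods are hypothetical, and the XML is an illustrative fragment rather than a schema-complete Geode cache.xml. Note that a port found this way can still be taken by another process before the server binds to it, so this narrows the collision window rather than closing it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: generate a cache.xml per test run instead of shipping a static one. */
final class DynamicCacheXmlSketch {

  /** Asks the OS for a currently free TCP port by binding to port 0 and releasing it. */
  static int findFreePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Renders an illustrative cache.xml fragment that uses the given server port. */
  static String renderCacheXml(int port) {
    return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<cache>\n"
        + "  <cache-server port=\"" + port + "\"/>\n"
        + "</cache>\n";
  }

  /** Writes the rendered XML to a new temporary file and returns its path. */
  static Path writeCacheXml(int port) {
    try {
      Path file = Files.createTempFile("cache-", ".xml");
      Files.write(file, renderCacheXml(port).getBytes(StandardCharsets.UTF_8));
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int port = findFreePort();
    Path file = writeCacheXml(port);
    System.out.println("Wrote " + file + " with server port " + port);
  }
}
```

Each parallel run of the test would call writeCacheXml with its own port and point the cache at the returned file, so two runs no longer compete for the same hardcoded port.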
[jira] [Commented] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover
[ https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528380#comment-17528380 ] Mark Hanson commented on GEODE-10228: - Awaiting PR review approvals at this point. The initial code change was put in as well as a bunch of reviewer comment changes.
[jira] [Resolved] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string
[ https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-10248. - Fix Version/s: 1.15.0 Resolution: Fixed > CI: DeployToMultiGroupDUnitTest encountered suspect string > -- > > Key: GEODE-10248 > URL: https://issues.apache.org/jira/browse/GEODE-10248 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Xiaojian Zhou >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage, pull-request-available > Fix For: 1.15.0 > > > > Task :geode-assembly:distributedTest > DeployToMultiGroupDUnitTest > executionError FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 571 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? > ---YMBX204KTK7fmoVc8vVmUZOfJOmATtYGRLlAK > Content-Disposition: form-data; name="config" > Content-Type: application/json > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 592 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? 
> --w3iZZ1eYF3P3Eh2pe2x4sTm2w24zOxfn2XIcRWX1 > Content-Disposition: form-data; name="config" > Content-Type: application/json > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:422) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:438) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:183) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccess
[jira] [Commented] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string
[ https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528379#comment-17528379 ] Mark Hanson commented on GEODE-10248: - The core problem was that the test was outputting a string the suspicious string parser didn't like. So I added a special case for "Management Request: " followed by packet data, which is what that log statement outputs. I think we should not be logging like this at the info level. > CI: DeployToMultiGroupDUnitTest encountered suspect string > -- > > Key: GEODE-10248 > URL: https://issues.apache.org/jira/browse/GEODE-10248 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Xiaojian Zhou >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage, pull-request-available > > > Task :geode-assembly:distributedTest > DeployToMultiGroupDUnitTest > executionError FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 571 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? > ---YMBX204KTK7fmoVc8vVmUZOfJOmATtYGRLlAK > Content-Disposition: form-data; name="config" > Content-Type: application/json > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 592 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? 
> --w3iZZ1eYF3P3Eh2pe2x4sTm2w24zOxfn2XIcRWX1 > Content-Disposition: form-data; name="config" > Content-Type: application/json > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:422) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:438) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:183) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) >
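The special case described in the comment on this issue can be pictured roughly as follows. This is a minimal sketch, not the actual DUnit suspect-string checker: the class name, the `SUSPECT` pattern, and the `findSuspects` method are all hypothetical, and only the "Management Request: " prefix comes from the comment itself. It handles single lines only; a request dump that spans several lines, as in the failure quoted above, would need additional handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: NOT the real DUnit suspect-string checker. */
final class SuspectStringSketch {
  // Hypothetical suspect pattern; the real checker's rules are not shown in this thread.
  private static final Pattern SUSPECT =
      Pattern.compile("\\b(error|severe|fatal)\\b|exception", Pattern.CASE_INSENSITIVE);

  // Prefix named in the comment; lines carrying it are treated as known-benign.
  private static final String MANAGEMENT_REQUEST_PREFIX = "Management Request: ";

  /** Returns the lines that match the suspect pattern, skipping management request dumps. */
  static List<String> findSuspects(List<String> logLines) {
    List<String> suspects = new ArrayList<>();
    for (String line : logLines) {
      if (line.contains(MANAGEMENT_REQUEST_PREFIX)) {
        continue; // special case: raw request data, never reported as suspect
      }
      if (SUSPECT.matcher(line).find()) {
        suspects.add(line);
      }
    }
    return suspects;
  }

  public static void main(String[] args) {
    List<String> log = Arrays.asList(
        "[info] Management Request: PK error Class1.class",
        "[error] something failed",
        "[info] all good");
    // Only the second line is reported; the first is excluded by the special case.
    System.out.println(findSuspects(log));
  }
}
```

Excluding by prefix keeps the checker strict for everything else in the log, which is presumably why a narrow special case was preferred over relaxing the suspect patterns.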
[jira] [Commented] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string
[ https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525157#comment-17525157 ] Mark Hanson commented on GEODE-10248: - Not sure why this is failing now. This is a log statement that has been in the code for a while. There is nothing wrong here. I have changed the default to a property "geode.management.request.logging" and the log statement goes away. > CI: DeployToMultiGroupDUnitTest encountered suspect string > -- > > Key: GEODE-10248 > URL: https://issues.apache.org/jira/browse/GEODE-10248 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Xiaojian Zhou >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > > Task :geode-assembly:distributedTest > DeployToMultiGroupDUnitTest > executionError FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 571 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? > ---YMBX204KTK7fmoVc8vVmUZOfJOmATtYGRLlAK > Content-Disposition: form-data; name="config" > Content-Type: application/json > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 592 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? 
> --w3iZZ1eYF3P3Eh2pe2x4sTm2w24zOxfn2XIcRWX1 > Content-Disposition: form-data; name="config" > Content-Type: application/json > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:422) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:438) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:183) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProc
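One way to put a log statement behind a property such as "geode.management.request.logging" is sketched below. Only the property name comes from the comment on this issue; the class, the `requestLogLine` helper, and the choice of a JVM system property read with `Boolean.getBoolean` are assumptions for illustration, not the actual Geode change.

```java
import java.util.Optional;

/** Illustrative only: shows a log statement that is off unless a property enables it. */
final class ManagementRequestLoggingSketch {
  // Property name taken from the comment; reading it as a JVM system property is an assumption.
  static final String PROPERTY = "geode.management.request.logging";

  /** Returns the line that would be logged, or empty when request logging is disabled. */
  static Optional<String> requestLogLine(String requestSummary) {
    // Boolean.getBoolean is true only when the system property is set to "true".
    if (!Boolean.getBoolean(PROPERTY)) {
      return Optional.empty();
    }
    return Optional.of("Management Request: " + requestSummary);
  }

  public static void main(String[] args) {
    // Disabled by default, so nothing is printed unless the JVM is started
    // with -Dgeode.management.request.logging=true
    requestLogLine("POST /management/v1/deployments").ifPresent(System.out::println);
  }
}
```

With the default off, the raw request data never reaches the log during a normal test run, so the suspect-string check has nothing to flag.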
[jira] [Assigned] (GEODE-10248) CI: DeployToMultiGroupDUnitTest encountered suspect string
[ https://issues.apache.org/jira/browse/GEODE-10248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-10248: --- Assignee: Mark Hanson > CI: DeployToMultiGroupDUnitTest encountered suspect string > -- > > Key: GEODE-10248 > URL: https://issues.apache.org/jira/browse/GEODE-10248 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Xiaojian Zhou >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > > Task :geode-assembly:distributedTest > DeployToMultiGroupDUnitTest > executionError FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 571 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? > ---YMBX204KTK7fmoVc8vVmUZOfJOmATtYGRLlAK > Content-Disposition: form-data; name="config" > Content-Type: application/json > --- > Found suspect string in 'dunit_suspect-vm0.log' at line 592 > > $?? > ???PK???L?Tk??6??Class1.classPK???L?T{6}? > ?timestampPK??u? 
> --w3iZZ1eYF3P3Eh2pe2x4sTm2w24zOxfn2XIcRWX1 > Content-Disposition: form-data; name="config" > Content-Type: application/json > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:422) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:438) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:183) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.inv
[jira] [Assigned] (GEODE-10228) CI Failure: DurableClientTestCase > testDurableHAFailover times out in await for failover
[ https://issues.apache.org/jira/browse/GEODE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-10228: --- Assignee: Mark Hanson > CI Failure: DurableClientTestCase > testDurableHAFailover times out in await > for failover > - > > Key: GEODE-10228 > URL: https://issues.apache.org/jira/browse/GEODE-10228 > Project: Geode > Issue Type: Bug > Components: client/server, tests >Reporter: Kirk Lund >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > Fix For: 1.15.0 > > > {{testDurableHAFailover}} has a history of flakiness, though the stacks do > seem to have changed somewhat since the older versions of the bug were resolved. > {noformat} > DurableClientTestCase > testDurableHAFailover FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.test.dunit.internal.IdentifiableRunnable.run in VM 2 running > on Host > heavy-lifter-7bbf0b58-8bc0-5ca8-840d-7bcf83293b6d.c.apachegeode-ci.internal > with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:631) > at org.apache.geode.test.dunit.VM.invoke(VM.java:435) > at > org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.durableFailover(DurableClientTestCase.java:520) > at > org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.testDurableHAFailover(DurableClientTestCase.java:439) > Caused by: > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase > expected: null > but was: "0"="0" within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:167) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:985) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:769) > at > org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$durableFailover$3f73998b$1(DurableClientTestCase.java:521) > Caused by: > org.opentest4j.AssertionFailedError: > expected: null > but was: "0"="0" > at > sun.reflect.GeneratedConstructorAccessor199.newInstance(Unknown Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > org.apache.geode.internal.cache.tier.sockets.DurableClientTestCase.lambda$null$2(DurableClientTestCase.java:525) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9704. Fix Version/s: 1.15.0 Resolution: Fixed > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available > Fix For: 1.15.0 > > > This is the old Geode behavior, but it may or may not be the correct behavior. > When a durable client recovers, a queueTimer thread runs the > `QueueManagerImpl.recoverPrimary` method, which: > - makes a new connection to the server > - sends readyForEvents (which causes the server to start sending the > queued events) > - recovers interest > - clears the region of the keys of interest > - re-registers interest > Because it sends readyForEvents before it clears the region of the keys of interest, any > events the server sends for those keys in between are cleared, so > it appears to the user that the client region doesn't have those keys. > > To reproduce, run the geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient() > with the InterestResultPolicy changed to NONE; the test then fails > occasionally. Adding a sleep in QueueManagerImpl.recoverPrimary between > `createNewPrimary` and `recoverInterest` makes the test fail more > consistently.
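The ordering problem described in this issue can be modeled with a small deterministic simulation. This is a sketch under stated assumptions, not Geode code: `RecoveryOrderSketch`, `recover`, and the plain maps standing in for the client region and the server's queued events are all hypothetical, and the first branch models the worst-case interleaving in which every queued event arrives before the clear. The second branch is just one ordering that avoids the loss in this model; the thread does not say which change was actually made to resolve the issue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a single-threaded model of the recovery ordering, not Geode code. */
final class RecoveryOrderSketch {
  /**
   * Returns the client region contents after recovery.
   *
   * @param readyBeforeClear true models sending readyForEvents before clearing keys of interest
   * @param keysOfInterest keys the client had registered interest in
   * @param queuedEvents events the server queued while the client was away
   */
  static Map<String, String> recover(boolean readyBeforeClear, Set<String> keysOfInterest,
      Map<String, String> queuedEvents) {
    Map<String, String> region = new HashMap<>();
    if (readyBeforeClear) {
      region.putAll(queuedEvents);               // server starts delivering immediately
      region.keySet().removeAll(keysOfInterest); // the clear then wipes the delivered entries
    } else {
      region.keySet().removeAll(keysOfInterest); // clear first
      region.putAll(queuedEvents);               // events arrive only after the clear
    }
    return region;
  }

  public static void main(String[] args) {
    Map<String, String> events = new HashMap<>();
    events.put("k1", "v1");
    Set<String> keys = Collections.singleton("k1");
    System.out.println("ready before clear: " + recover(true, keys, events));
    System.out.println("clear before ready: " + recover(false, keys, events));
  }
}
```

In the first ordering the delivered entry for `k1` is removed by the clear, which matches the symptom reported above: the client region appears not to have the key.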
[jira] [Created] (GEODE-10195) MicrometerBinderTest > processorMetricsBinderExists FAILED
Mark Hanson created GEODE-10195: --- Summary: MicrometerBinderTest > processorMetricsBinderExists FAILED Key: GEODE-10195 URL: https://issues.apache.org/jira/browse/GEODE-10195 Project: Geode Issue Type: Bug Components: core Reporter: Mark Hanson windows-acceptance-test-openjdk11 failed with the following error. {noformat} MicrometerBinderTest > processorMetricsBinderExists FAILED org.apache.geode.cache.client.ServerOperationException: remote server on heavy-lifter-ceacbfa8-6147-51ca-affd-b497cd16e2ef(4420:loner):54545:7074b0d7: Function named CheckIfMeterExistsFunction is not registered to FunctionService at org.apache.geode.cache.client.internal.ExecuteFunctionOp$ExecuteFunctionOpImpl.processResponse(ExecuteFunctionOp.java:394) at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:234) at org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:209) at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:394) at org.apache.geode.cache.client.internal.AbstractOpWithTimeout.attempt(AbstractOpWithTimeout.java:45) at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284) at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:358) at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:760) at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:151) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:820) at org.apache.geode.cache.client.internal.ExecuteFunctionOp.execute(ExecuteFunctionOp.java:100) at org.apache.geode.internal.cache.execute.ServerFunctionExecutor.executeOnServer(ServerFunctionExecutor.java:217) at org.apache.geode.internal.cache.execute.ServerFunctionExecutor.executeFunction(ServerFunctionExecutor.java:104) at 
org.apache.geode.internal.cache.execute.ServerFunctionExecutor.execute(ServerFunctionExecutor.java:368) at org.apache.geode.internal.cache.execute.ServerFunctionExecutor.execute(ServerFunctionExecutor.java:377) at org.apache.geode.metrics.MicrometerBinderTest.processorMetricsBinderExists(MicrometerBinderTest.java:152) {noformat}
[jira] [Resolved] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset
[ https://issues.apache.org/jira/browse/GEODE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-10153. - Resolution: Duplicate > Benchmarks: PartitionedPutAllBenchmark: > net.schmizz.sshj.transport.TransportException: Connection reset > > > Key: GEODE-10153 > URL: https://issues.apache.org/jira/browse/GEODE-10153 > Project: Geode > Issue Type: Bug > Components: benchmarks >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > This looks like GEODE-10147 > [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223] > > {noformat} > 2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching > /usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath > .geode-performance/lib/SERVER-2/* > -Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 > -DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 > -server -Djava.awt.headless=true > -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 > -Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true > -Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g > -Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly > -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark > -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC > -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 > -Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= > org.apache.geode.perftest.jvms.rmi.ChildJVM on > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed. 
> 21:26:18net.schmizz.sshj.transport.TransportException: Connection reset > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) > 21:26:18 at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) > 21:26:18 at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) > 21:26:18 at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) > 21:26:18 at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86) > 21:26:18 at > org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68) > 21:26:18Caused by: java.net.SocketException: Connection reset > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:210) > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:141) > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:224) > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211) > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) > 21:26:18 ... 5 more > 21:31:18 > 21:31:18PartitionedPutAllBenchmark > run() FAILED > 21:31:18java.lang.IllegalStateException: Workers failed to start in 5 > minute > 21:31:18at > org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133) > 21:31:18at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97) > 21:31:18at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65) > 21:31:18at > org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52) > {noformat}
[jira] [Resolved] (GEODE-10173) CI failure: P2pPartitionedGetBenchmark > run()
[ https://issues.apache.org/jira/browse/GEODE-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-10173. - Resolution: Duplicate > CI failure: P2pPartitionedGetBenchmark > run() > -- > > Key: GEODE-10173 > URL: https://issues.apache.org/jira/browse/GEODE-10173 > Project: Geode > Issue Type: Bug >Reporter: Jianxia Chen >Priority: Major > Labels: needsTriage > > {code:java} > org.apache.geode.benchmark.tests.P2pPartitionedGetBenchmark > run() FAILED > 15:49:10net.schmizz.sshj.transport.TransportException: Connection reset > 15:49:10at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:181) > 15:49:10at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:771) > 15:49:10at > net.schmizz.sshj.SocketClient.connect(SocketClient.java:150) > 15:49:10at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:75) > 15:49:10at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.copyFromNode(SshInfrastructure.java:186) > 15:49:10at > org.apache.geode.perftest.jvms.RemoteJVMs.copyResults(RemoteJVMs.java:87) > 15:49:10at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:136) > 15:49:10at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:68) > 15:49:10at > org.apache.geode.benchmark.tests.P2pPartitionedGetBenchmark.run(P2pPartitionedGetBenchmark.java:44) > 15:49:10 > 15:49:10Caused by: > 15:49:10java.net.SocketException: Connection reset > 15:49:10at > java.net.SocketInputStream.read(SocketInputStream.java:210) > 15:49:10at > java.net.SocketInputStream.read(SocketInputStream.java:141) > 15:49:10at > java.net.SocketInputStream.read(SocketInputStream.java:224) > 15:49:10at > net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:198) > 15:49:10at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:174) > 15:49:10... 
8 more {code}
[jira] [Resolved] (GEODE-10178) CI Failure: PartitionedGetLongBenchmark > run()
[ https://issues.apache.org/jira/browse/GEODE-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-10178. - Resolution: Duplicate > CI Failure: PartitionedGetLongBenchmark > run() > --- > > Key: GEODE-10178 > URL: https://issues.apache.org/jira/browse/GEODE-10178 > Project: Geode > Issue Type: Bug >Reporter: Jianxia Chen >Priority: Major > Labels: needsTriage > > {code:java} > PartitionedGetLongBenchmark > run() FAILED > 01:08:25net.schmizz.sshj.transport.TransportException: Connection reset > 01:08:25at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) > 01:08:25at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) > 01:08:25at > net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) > 01:08:25at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) > 01:08:25at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.copyFromNode(SshInfrastructure.java:185) > 01:08:25at > org.apache.geode.perftest.jvms.RemoteJVMs.copyResults(RemoteJVMs.java:87) > 01:08:25at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:112) > 01:08:25at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65) > 01:08:25at > org.apache.geode.benchmark.tests.PartitionedGetLongBenchmark.run(PartitionedGetLongBenchmark.java:45) > 01:08:25 > 01:08:25Caused by: > 01:08:25java.net.SocketException: Connection reset > 01:08:25at > java.net.SocketInputStream.read(SocketInputStream.java:210) > 01:08:25at > java.net.SocketInputStream.read(SocketInputStream.java:141) > 01:08:25at > java.net.SocketInputStream.read(SocketInputStream.java:224) > 01:08:25at > net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211) > 01:08:25at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) > 01:08:25... 
8 more {code}
[jira] [Resolved] (GEODE-10154) Benchmarks: PartitionedIndexedQueryBenchmark: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
[ https://issues.apache.org/jira/browse/GEODE-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-10154. - Resolution: Fixed This was a benchmark job path issue; it has been fixed in Concourse. > Benchmarks: PartitionedIndexedQueryBenchmark: > net.schmizz.sshj.transport.TransportException: Server closed connection > during identification exchange > > > Key: GEODE-10154 > URL: https://issues.apache.org/jira/browse/GEODE-10154 > Project: Geode > Issue Type: Bug > Components: benchmarks >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-with-security-manager/builds/224] > This same framework is reporting an error in GEODE-10153 and GEODE-10147 > {noformat} > 02:36:47PartitionedIndexedQueryBenchmark > run() FAILED > 02:36:47java.util.concurrent.CompletionException: > java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: > Server closed connection during identification exchange > 02:36:47at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) > 02:36:47at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) > 02:36:47at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643) > 02:36:47at > java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1632) > 02:36:47at > java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > 02:36:47at > java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) > 02:36:47at > java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) > 02:36:47at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) > 02:36:47 > 02:36:47Caused by: > 02:36:47java.io.UncheckedIOException: > net.schmizz.sshj.transport.TransportException: Server closed connection > during identification exchange > 
02:36:47at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:176) > 02:36:47at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) > 02:36:47... 5 more > 02:36:47 > 02:36:47Caused by: > 02:36:47net.schmizz.sshj.transport.TransportException: Server > closed connection during identification exchange > 02:36:47at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) > 02:36:47at > net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) > 02:36:47at > net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) > 02:36:47at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) > 02:36:47at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:158) > 02:36:47... 6 more > 02:36:47 > 02:36:47Caused by: > 02:36:47net.schmizz.sshj.transport.TransportException: Server > closed connection during identification exchange > 02:36:47at > net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:214) > 02:36:47at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) > 02:36:47... 10 more {noformat}
[jira] [Assigned] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset
[ https://issues.apache.org/jira/browse/GEODE-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-10153: --- Assignee: Mark Hanson > Benchmarks: PartitionedPutAllBenchmark: > net.schmizz.sshj.transport.TransportException: Connection reset > > > Key: GEODE-10153 > URL: https://issues.apache.org/jira/browse/GEODE-10153 > Project: Geode > Issue Type: Bug > Components: benchmarks >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > This looks like GEODE-10147 > [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223] > > {noformat} > 2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching > /usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath > .geode-performance/lib/SERVER-2/* > -Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 > -DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 > -server -Djava.awt.headless=true > -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 > -Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true > -Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g > -Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly > -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark > -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC > -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 > -Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= > org.apache.geode.perftest.jvms.rmi.ChildJVM on > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed. 
> 21:26:18net.schmizz.sshj.transport.TransportException: Connection reset > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) > 21:26:18 at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) > 21:26:18 at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) > 21:26:18 at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) > 21:26:18 at > org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86) > 21:26:18 at > org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68) > 21:26:18Caused by: java.net.SocketException: Connection reset > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:210) > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:141) > 21:26:18 at java.net.SocketInputStream.read(SocketInputStream.java:224) > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211) > 21:26:18 at > net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) > 21:26:18 ... 5 more > 21:31:18 > 21:31:18PartitionedPutAllBenchmark > run() FAILED > 21:31:18java.lang.IllegalStateException: Workers failed to start in 5 > minute > 21:31:18at > org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133) > 21:31:18at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97) > 21:31:18at > org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65) > 21:31:18at > org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512627#comment-17512627 ] Mark Hanson commented on GEODE-9704: PR 7442 is available. I have made changes to fix the behavior that was causing the problems. The core of the problem was that register interest should be called before readyForEvents; the order was effectively reversed, and that has been corrected. LocalRegionUpdateTest.java was created to house two unit tests for the new code. AuthExpirationDUnitTest has a test by Jinmei that has been uncommented; it would typically be flaky, but with this fix it no longer fails. I believe this bug is done, except for the review phase of the PR and any associated changes. > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys.
> > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
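The ordering problem described in GEODE-9704 can be illustrated with a small, self-contained simulation. This is only a sketch and not Geode code: `RecoveryOrderSketch`, `recover`, and the in-memory map standing in for the client region are hypothetical names invented for illustration. It assumes interest covers every key and is registered with a NONE result policy (so registering fetches no values), and it replays the worst-case interleaving in which all queued events arrive before the keys of interest are cleared.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RecoveryOrderSketch {

  /**
   * Replays one recovery of a hypothetical durable client and returns the
   * resulting client region. Interest covers every key and is registered
   * with a NONE result policy, so registering itself fetches no values.
   */
  static Map<String, String> recover(boolean readyForEventsFirst,
      Map<String, String> queuedEvents) {
    Map<String, String> region = new LinkedHashMap<>();

    if (readyForEventsFirst) {
      // Reported order: the server may start draining its queue right away ...
      region.putAll(queuedEvents);
      // ... and recovering interest then clears the keys of interest,
      // discarding the events that just arrived.
      region.clear();
    } else {
      // Corrected order: clear and re-register interest first ...
      region.clear();
      // ... then signal readiness, so queued events land in a settled region.
      region.putAll(queuedEvents);
    }
    return region;
  }

  public static void main(String[] args) {
    Map<String, String> queued = new LinkedHashMap<>();
    queued.put("key1", "value1");
    System.out.println("readyForEvents first: " + recover(true, queued));
    System.out.println("interest first:       " + recover(false, queued));
  }
}
```

In the real client the loss is timing dependent, which is consistent with the report that adding a sleep between `createNewPrimary` and `recoverInterest` makes the test fail more consistently; the sketch simply fixes the interleaving so the effect is deterministic.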
[jira] [Updated] (GEODE-5564) Flaky test ConcurrentIndexInitOnOverflowRegionDUnitTest > testIndexUpdateWithRegionClear
[ https://issues.apache.org/jira/browse/GEODE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-5564: --- Affects Version/s: 1.14.0 > Flaky test ConcurrentIndexInitOnOverflowRegionDUnitTest > > testIndexUpdateWithRegionClear > > > Key: GEODE-5564 > URL: https://issues.apache.org/jira/browse/GEODE-5564 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.8.0, 1.14.0 >Reporter: Jacob Barrett >Priority: Major > Labels: flaky > > {noformat} > org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest > > testIndexUpdateWithRegionClear FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest$12.run > in VM 0 running on Host 92f89c21d1b0 with 4 VMs > at org.apache.geode.test.dunit.VM.invoke(VM.java:443) > at org.apache.geode.test.dunit.VM.invoke(VM.java:412) > at org.apache.geode.test.dunit.VM.invoke(VM.java:355) > at > org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest.testIndexUpdateWithRegionClear(ConcurrentIndexInitOnOverflowRegionDUnitTest.java:411) > Caused by: > java.lang.AssertionError: After clear region size is supposed to be > zero as all index updates are blocked. Current region size is: 13 > at org.junit.Assert.fail(Assert.java:88) > at > org.apache.geode.cache.query.internal.index.ConcurrentIndexInitOnOverflowRegionDUnitTest$12.run2(ConcurrentIndexInitOnOverflowRegionDUnitTest.java:430) > {noformat} > Failing: > https://concourse.apachegeode-ci.info/teams/main/pipelines/pr-develop/jobs/DistributedTest/builds/556 > Passing: > https://concourse.apachegeode-ci.info/teams/main/pipelines/pr-develop/jobs/DistributedTest/builds/547 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-10154) Benchmarks: PartitionedIndexedQueryBenchmark: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange
Mark Hanson created GEODE-10154: --- Summary: Benchmarks: PartitionedIndexedQueryBenchmark: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange Key: GEODE-10154 URL: https://issues.apache.org/jira/browse/GEODE-10154 Project: Geode Issue Type: Bug Components: benchmarks Affects Versions: 1.15.0 Reporter: Mark Hanson [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-with-security-manager/builds/224] This same framework is reporting an error in GEODE-10153 and GEODE-10147 {noformat} 02:36:47PartitionedIndexedQueryBenchmark > run() FAILED 02:36:47java.util.concurrent.CompletionException: java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange 02:36:47at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) 02:36:47at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) 02:36:47at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643) 02:36:47at java.util.concurrent.CompletableFuture$AsyncRun.exec(CompletableFuture.java:1632) 02:36:47at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) 02:36:47at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 02:36:47at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 02:36:47at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 02:36:47 02:36:47Caused by: 02:36:47java.io.UncheckedIOException: net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange 02:36:47at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:176) 02:36:47at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) 02:36:47... 
5 more 02:36:47 02:36:47Caused by: 02:36:47net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange 02:36:47at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) 02:36:47at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) 02:36:47at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) 02:36:47at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) 02:36:47at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.lambda$copyToNodes$1(SshInfrastructure.java:158) 02:36:47... 6 more 02:36:47 02:36:47Caused by: 02:36:47net.schmizz.sshj.transport.TransportException: Server closed connection during identification exchange 02:36:47at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:214) 02:36:47at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) 02:36:47... 10 more {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-10153) Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset
Mark Hanson created GEODE-10153: --- Summary: Benchmarks: PartitionedPutAllBenchmark: net.schmizz.sshj.transport.TransportException: Connection reset Key: GEODE-10153 URL: https://issues.apache.org/jira/browse/GEODE-10153 Project: Geode Issue Type: Bug Components: benchmarks Affects Versions: 1.15.0 Reporter: Mark Hanson This looks like GEODE-10147 [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/benchmark-base/builds/223] {noformat} 2022-03-23 04:26:18.839 ERROR RemoteJVMFactory - Launching /usr/lib/jvm/bellsoft-java8-amd64/jre/bin/java -classpath .geode-performance/lib/SERVER-2/* -Djava.library.path=/home/geode/META-INF/native -DRMI_HOST=172.31.40.73 -DRMI_PORT=3 -DJVM_ID=2 -DROLE=SERVER -DOUTPUT_DIR=output/SERVER-2 -server -Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 -Dgemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true -Dgemfire.launcher.registerSignalHandlers=true -XX:+DisableExplicitGC -Xmx8g -Xms8g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseNUMA -XX:+ScavengeBeforeFullGC -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=32768 -Dbenchmark.withSslProtocols= -Dbenchmark.withSslCiphers= org.apache.geode.perftest.jvms.rmi.ChildJVM on org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure$SshNode@3cad46e8Failed. 
21:26:18net.schmizz.sshj.transport.TransportException: Connection reset 21:26:18at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:194) 21:26:18at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:793) 21:26:18at net.schmizz.sshj.SocketClient.connect(SocketClient.java:178) 21:26:18at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.getSSHClient(SshInfrastructure.java:74) 21:26:18at org.apache.geode.perftest.infrastructure.ssh.SshInfrastructure.onNode(SshInfrastructure.java:86) 21:26:18at org.apache.geode.perftest.jvms.JVMLauncher$1.run(JVMLauncher.java:68) 21:26:18Caused by: java.net.SocketException: Connection reset 21:26:18at java.net.SocketInputStream.read(SocketInputStream.java:210) 21:26:18at java.net.SocketInputStream.read(SocketInputStream.java:141) 21:26:18at java.net.SocketInputStream.read(SocketInputStream.java:224) 21:26:18at net.schmizz.sshj.transport.TransportImpl.receiveServerIdent(TransportImpl.java:211) 21:26:18at net.schmizz.sshj.transport.TransportImpl.init(TransportImpl.java:187) 21:26:18... 5 more 21:31:18 21:31:18PartitionedPutAllBenchmark > run() FAILED 21:31:18java.lang.IllegalStateException: Workers failed to start in 5 minute 21:31:18at org.apache.geode.perftest.jvms.RemoteJVMFactory.launch(RemoteJVMFactory.java:133) 21:31:18at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:97) 21:31:18at org.apache.geode.perftest.runner.DefaultTestRunner.runTest(DefaultTestRunner.java:65) 21:31:18at org.apache.geode.benchmark.tests.PartitionedPutAllBenchmark.run(PartitionedPutAllBenchmark.java:52) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Reopened] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reopened GEODE-9704: > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9704: --- Labels: GeodeOperationAPI blocks-1.15.1 pull-request-available (was: GeodeOperationAPI blocks-1.15.0 pull-request-available) > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9704: --- Labels: GeodeOperationAPI blocks-1.15.0 pull-request-available (was: GeodeOperationAPI blocks-1.15.1 pull-request-available) > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.0, pull-request-available > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (GEODE-10039) BucketProfiles can be stale in rare cases.
Mark Hanson created GEODE-10039: --- Summary: BucketProfiles can be stale in rare cases. Key: GEODE-10039 URL: https://issues.apache.org/jira/browse/GEODE-10039 Project: Geode Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Mark Hanson When a server is starting as a member of a partitioned region during a rebalance, it is possible for the starting server not to receive a profile removal for a bucket that has been relocated. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9815. Fix Version/s: 1.15.0 Resolution: Fixed The solution to this was to change the logic to deal with the situation where there were multiple copies in the same redundancy zone by deleting the extra copy in the zone, but also ensure a new copy gets made in another zone. > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.0, needsTriage, > pull-request-available > Fix For: 1.15.0 > > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some > cases > With a slight tweak to the scenario fixed with GEODE-9554 we can end up never > getting out of the situation where we have two copies of a bucket in the same > zone. > Steps: > 1. Start two redundancy zones A and B with two members each. Bucket 0 is on > member A1 and B1. > 2. Shutdown member A1. > 3. Rebalance - this will create bucket 0 on A2. > 4. Shutdown B1. Revoke it's disk store and delete the data > 5. Startup A1 - it will recover bucket 0. > 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that > situation. > *Problem 2:* We may never delete extra copies of a bucket > The fix for GEODE-9554 introduces a new problem if we have more than 2 > redundancy zones > Steps > 1. Start three redundancy zones A,B,C with one member each. Bucket 0 is on A1 > and B1 > 2. Shutdown A1 > 3. Rebalance - this will create Bucket 0 on C1 > 4. Startup A1 - this will recreate bucket 0 > 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy. > I think the overall fix is probably to do something different than prevent > removing the last copy of a bucket from a redundancy zone. Instead, I think > we should do something like this: > 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* > buckets that have two copies in the same zone, as well as any buckets that > are actually over redundancy. > 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra > copies of a bucket in the same zone first > 3. Back out the changes for GEODE-9554 and let the last copy be deleted from > a zone. -- This message was sent by Atlassian Jira (v8.20.1#820001)
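The selection order proposed in steps 1 and 2 of GEODE-9815 can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `PartitionRegionLoadModel` code: the class `ZoneAwareRemovalSketch`, the `Copy` type, and the helper `firstSameZoneDuplicate` are hypothetical, and the methods only borrow the names `findBestRemove` and the idea of `getOverRedundancyBuckets` from the description. Which copy is dropped when no zone is duplicated is an arbitrary choice here (the last one listed).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ZoneAwareRemovalSketch {

  /** One hosted copy of a bucket: the member holding it and that member's redundancy zone. */
  static final class Copy {
    final String member;
    final String zone;

    Copy(String member, String zone) {
      this.member = member;
      this.zone = zone;
    }
  }

  /** Finds a copy whose zone already holds an earlier copy of the same bucket. */
  static Optional<Copy> firstSameZoneDuplicate(List<Copy> copies) {
    Map<String, Copy> firstCopyInZone = new HashMap<>();
    for (Copy copy : copies) {
      if (firstCopyInZone.putIfAbsent(copy.zone, copy) != null) {
        return Optional.of(copy);
      }
    }
    return Optional.empty();
  }

  /** Step 1: a bucket qualifies if it is over redundancy OR two copies share a zone. */
  static boolean needsRemoval(List<Copy> copies, int desiredCopies) {
    return copies.size() > desiredCopies || firstSameZoneDuplicate(copies).isPresent();
  }

  /** Step 2: prefer removing a same-zone duplicate; otherwise drop any surplus copy. */
  static Optional<Copy> findBestRemove(List<Copy> copies, int desiredCopies) {
    Optional<Copy> duplicate = firstSameZoneDuplicate(copies);
    if (duplicate.isPresent()) {
      return duplicate;
    }
    if (copies.size() > desiredCopies) {
      // Arbitrary choice for this sketch: remove the last listed copy.
      return Optional.of(copies.get(copies.size() - 1));
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Problem 1: bucket 0 ends up on A1 and A2, both in zone A.
    List<Copy> sameZone = Arrays.asList(new Copy("A1", "A"), new Copy("A2", "A"));
    System.out.println(findBestRemove(sameZone, 2).map(c -> c.member).orElse("none"));

    // Problem 2: bucket 0 ends up on A1, B1, and C1 with only two copies wanted.
    List<Copy> threeZones =
        Arrays.asList(new Copy("A1", "A"), new Copy("B1", "B"), new Copy("C1", "C"));
    System.out.println(findBestRemove(threeZones, 2).map(c -> c.member).orElse("none"));
  }
}
```

The sketch covers removal only. The resolution note also says a new copy must be created in another zone after a same-zone duplicate is deleted, and that step is not modeled here.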
[jira] [Updated] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9704: --- Description: This is the old Geode behavior, but may or may not be the correct behavior. When durable clients recovers, there is a queueTimer thread that runs `QueueManagerImp.recoverPrimary` method, it * makes new connection to server - sends readyForEvents (which will cause the server to start sending the queued events) - recovers interest - clears the region of keys of interest - re-registers interest It sends readyForEvents before it clears region of keys of interest, if server sends some events of those keys in between, it will clear them, thus it seems to the user that the client region doesn't have those keys. Run geode-core distributedTest AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), change the InterestResultPolicy to NONE, you would see the test would fail occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between `createNewPrimary` and `recoverInterest` would make the test fail more consistently. was: This is the old Geode behavior, but may or may not be the correct behavior. When durable clients recovers, there is a queueTimer thread that runs `QueueManagerImp.recoverPrimary` method, it * makes new connection to server - sends readyForEvents (which will cause the server to start sending the queued events) - recovers interest - clears the region of keys of interest - re-registers interest It sends readyForEvents before it clears region of keys of interest, if server sends some events of those keys in between, it will clear them, thus it seems to the user that the client region doesn't have those keys. 
Run geode-core distributedTest AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient(), change the InterestResultPolicy to NONE, you would see the test would fail occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between `createNewPrimary` and `recoverInterest` would make the test fail more consistently. > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1 > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
[ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9704: -- Assignee: Mark Hanson (was: Kirk Lund) > When durable clients recovers, it sends "ready for event" signal before > register for interest, this might cause problem for caching_proxy regions > - > > Key: GEODE-9704 > URL: https://issues.apache.org/jira/browse/GEODE-9704 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Jinmei Liao >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, blocks-1.15.1 > > This is the old Geode behavior, but may or may not be the correct behavior. > When durable clients recovers, there is a queueTimer thread that runs > `QueueManagerImp.recoverPrimary` method, it > * makes new connection to server > - sends readyForEvents (which will cause the server to start sending the > queued events) > - recovers interest > - clears the region of keys of interest > - re-registers interest > It sends readyForEvents before it clears region of keys of interest, if > server sends some events of those keys in between, it will clear them, thus > it seems to the user that the client region doesn't have those keys. > > Run geode-core distributedTest > AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKey_durableClient(), > change the InterestResultPolicy to NONE, you would see the test would fail > occasionally, Adding sleep code in QueueManagerImp.recoverPrimary between > `createNewPrimary` and `recoverInterest` would make the test fail more > consistently. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9920. Resolution: Won't Fix This is a resource issue where we saw a 10 second delay on wakeup for stats. That indicates we ran out of CPU. There really isn't anything to fix. We could reduce the number of concurrent tests, but for 1.12 there is no point. > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failured with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > > Updated: This is part of the long standing problem with port binding and the > imperfection in handling default ports in tests. In this case 41000.
> {code:java} > org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest > > testWithInvalidMemberID FAILED > java.lang.AssertionError: > Expecting: > <"Member Count : 1 > Name| Id > - | -- > locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator] > "> > to contain: > <"locatorToStop"> > at > org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87) > at > org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240) > {code} > {code:java} > org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > > testLimitedAccess FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on > Host 07d663f91562 with 4 VMs > Caused by: > org.apache.geode.distributed.DistributedSystemDisconnectedException: > This connection to a distributed system has been disconnected., caused by > org.apache.geode.ForcedDisconnectException: Member isn't responding to > heartbeat requests > Caused by: > org.apache.geode.ForcedDisconnectException: Member isn't > responding to heartbeat requests > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. 
> --- > Found suspect string in log4j at line 1125 > [fatal 2022/01/04 01:04:33.305 GMT > tid=100] Membership service failure: Member isn't responding to heartbeat > requests > > org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: > Member isn't responding to heartbeat requests > at > org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264) > at org.jgroups.JChannel.invokeCallback(JChannel.java:816) > at org.jgroups.JChannel.up(JChannel.java:741) > at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030) > at org.jgroups.protocols.FRAG2.up(FRAG2.java:165) > at org.jgroups.protocols.FlowControl.up(FlowControl.java:390) > at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077) > at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792) > at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433) > at > org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72
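The issue description above attributes the failure to tests sharing a fixed default port (41000 in this case). As a general illustration only, and not a description of Geode's port reservation system, a test can ask the operating system for a free ephemeral port by binding to port 0. The class and method names below (`EphemeralPortSketch`, `bindToFreePort`, `twoDistinctPorts`) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper for illustration; not part of Geode or its test framework.
class EphemeralPortSketch {

  // Binding to port 0 asks the OS to pick any currently free port.
  static ServerSocket bindToFreePort() {
    try {
      return new ServerSocket(0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Opens two sockets at the same time and returns the ports the OS assigned.
  // While both sockets are open, the two ports cannot be the same.
  static int[] twoDistinctPorts() {
    try (ServerSocket first = bindToFreePort();
        ServerSocket second = bindToFreePort()) {
      return new int[] {first.getLocalPort(), second.getLocalPort()};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int[] ports = twoDistinctPorts();
    System.out.println("first=" + ports[0] + " second=" + ports[1]);
  }
}
```

Note that a port is only guaranteed to be reserved while its socket stays open. Closing the socket and later rebinding the same port number leaves a window in which another process can take it, which is one reason fixed or pre-selected ports can collide between tests.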
[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470138#comment-17470138 ] Mark Hanson commented on GEODE-9920: Our initial diagnosis was that this was a port issue. This was incorrect. Further investigation led us to a system load issue. As an aside, 1.12 still uses Docker, which means port collisions are very unlikely. > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > > Updated: This is part of the long-standing problem with port binding and the > imperfection in handling default ports in tests. In this case 41000. 
[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469595#comment-17469595 ] Mark Hanson commented on GEODE-9920: I closed it, but then we started re-examining the problem after some discussion. > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > > Updated: This is part of the long-standing problem with port binding and the > imperfection in handling default ports in tests. In this case 41000. 
[jira] [Closed] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson closed GEODE-9920. -- > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failured with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > > Updated: This is part of the long standing problem with port binding and the > imperfection in handling default ports in tests. In this case 41000. 
[jira] [Resolved] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9920. Resolution: Won't Fix This port issue has been fixed by Dale's changes on develop, which are now in 1.14 and 1.15. It was decided that we would not backport that change to 1.12. > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > > Updated: This is part of the long-standing problem with port binding and the > imperfection in handling default ports in tests. In this case 41000. 
[jira] [Updated] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9920: --- Description: StopLocatorCommandDUnitTest.testWithInvalidMemberID failured with AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with a suspicious string with a failure to respond to heartbeats. They are in the same CI run so it seems like this is a port conflict where there is overlap between the two tests as one is shutting down and the other is starting up. Updated: This is part of the long standing problem with port binding and the imperfection in handling default ports in tests. In this case 41000. {code:java} org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest > testWithInvalidMemberID FAILED java.lang.AssertionError: Expecting: <"Member Count : 1 Name| Id - | -- locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator] "> to contain: <"locatorToStop"> at org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87) at org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240) {code} {code:java} org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > testLimitedAccess FAILED org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on Host 07d663f91562 with 4 VMs Caused by: org.apache.geode.distributed.DistributedSystemDisconnectedException: This connection to a distributed system has been disconnected., caused by org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests Caused by: org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests java.lang.AssertionError: Suspicious strings were written to the log during this run. 
Fix the strings or use IgnoredException.addIgnoredException to ignore. --- Found suspect string in log4j at line 1125 [fatal 2022/01/04 01:04:33.305 GMT tid=100] Membership service failure: Member isn't responding to heartbeat requests org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: Member isn't responding to heartbeat requests at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016) at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083) at org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686) at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325) at org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264) at org.jgroups.JChannel.invokeCallback(JChannel.java:816) at org.jgroups.JChannel.up(JChannel.java:741) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030) at org.jgroups.protocols.FRAG2.up(FRAG2.java:165) at org.jgroups.protocols.FlowControl.up(FlowControl.java:390) at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077) at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792) at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433) at org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72) at org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70) at org.jgroups.protocols.TP.passMessageUp(TP.java:1658) at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876) at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10) at org.jgroups.protocols.TP.handleSingleMessage(TP.java:1789) at org.jgroups.protocols.TP.receive(TP.java:1714) at 
org.apache.geode.distributed.internal.membership.gms.messenger.Transport.receive(Transport.java:160) at org.jgroups.protocols.UDP$PacketReceiver.run(UDP.java:701) at java.lang.Thread.run(Thread.java:748) --- Found suspect string in log4j at line 1191 [error 2022/01/04 01:04:34.715 GMT tid=33] Cache initialization for GemFireCache[id = 1852143676; isClosing = false; isShutDownAll = false; created = Tue Jan 04 01:04:20 GMT 2022; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60
[jira] [Commented] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469554#comment-17469554 ] Mark Hanson commented on GEODE-9920: 1.12 does not have [~demery]'s changes to make our port reservation system less likely to hit bind issues. This looks like an interaction between two tests using the same port. I tend to think we should just close this as Won't Fix. Thoughts? > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failed with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. 
[jira] [Assigned] (GEODE-9920) CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port conflict
[ https://issues.apache.org/jira/browse/GEODE-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9920: -- Assignee: Mark Hanson > CI Failure: StopLocatorCommandDUnitTest > testWithInvalidMemberID and > RegionReliabilityDistNoAckDUnitTest > testLimitedAccess failed with port > conflict > --- > > Key: GEODE-9920 > URL: https://issues.apache.org/jira/browse/GEODE-9920 > Project: Geode > Issue Type: Bug > Components: tests >Affects Versions: 1.12.8 >Reporter: Hale Bales >Assignee: Mark Hanson >Priority: Major > Labels: CI, needsTriage > > StopLocatorCommandDUnitTest.testWithInvalidMemberID failured with > AssertionError and RegionReliabilityDistNoAckDUnitTest > testLimitedAccess > failed with a suspicious string with a failure to respond to heartbeats. They > are in the same CI run so it seems like this is a port conflict where there > is overlap between the two tests as one is shutting down and the other is > starting up. > {code:java} > org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest > > testWithInvalidMemberID FAILED > java.lang.AssertionError: > Expecting: > <"Member Count : 1 > Name| Id > - | -- > locator-0 | 172.17.0.20(locator-0:108:locator):41000 [Coordinator] > "> > to contain: > <"locatorToStop"> > at > org.apache.geode.test.junit.assertions.CommandResultAssert.containsOutput(CommandResultAssert.java:87) > at > org.apache.geode.management.internal.cli.commands.StopLocatorCommandDUnitTest.testWithInvalidMemberID(StopLocatorCommandDUnitTest.java:240) > {code} > {code:java} > org.apache.geode.cache30.RegionReliabilityDistNoAckDUnitTest > > testLimitedAccess FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.RegionReliabilityTestCase$7.run in VM 0 running on > Host 07d663f91562 with 4 VMs > Caused by: > org.apache.geode.distributed.DistributedSystemDisconnectedException: > This connection to a distributed system has been disconnected., caused 
by > org.apache.geode.ForcedDisconnectException: Member isn't responding to > heartbeat requests > Caused by: > org.apache.geode.ForcedDisconnectException: Member isn't > responding to heartbeat requests > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in log4j at line 1125 > [fatal 2022/01/04 01:04:33.305 GMT > tid=100] Membership service failure: Member isn't responding to heartbeat > requests > > org.apache.geode.distributed.internal.membership.api.MemberDisconnectedException: > Member isn't responding to heartbeat requests > at > org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.forceDisconnect(GMSMembership.java:2016) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.forceDisconnect(GMSJoinLeave.java:1083) > at > org.apache.geode.distributed.internal.membership.gms.membership.GMSJoinLeave.processMessage(GMSJoinLeave.java:686) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1325) > at > org.apache.geode.distributed.internal.membership.gms.messenger.JGroupsMessenger$JGroupsReceiver.receive(JGroupsMessenger.java:1264) > at org.jgroups.JChannel.invokeCallback(JChannel.java:816) > at org.jgroups.JChannel.up(JChannel.java:741) > at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1030) > at org.jgroups.protocols.FRAG2.up(FRAG2.java:165) > at org.jgroups.protocols.FlowControl.up(FlowControl.java:390) > at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1077) > at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:792) > at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:433) > at > org.apache.geode.distributed.internal.membership.gms.messenger.StatRecorder.up(StatRecorder.java:72) > at > 
org.apache.geode.distributed.internal.membership.gms.messenger.AddressManager.up(AddressManager.java:70) > at org.jgroups.protocols.TP.passMessageUp(TP.java:1658) > at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1876) > at org.jgroups.util.DirectExecutor.execute(DirectExecutor.java:10) > at org.jgroups.protocols.TP.handleSingleM
[jira] [Commented] (GEODE-9885) StringsDUnitTest.givenBucketsMoveDuringAppend_thenDataIsNotLost fails with duplicated append
[ https://issues.apache.org/jira/browse/GEODE-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468855#comment-17468855 ] Mark Hanson commented on GEODE-9885: I just took a look at the test and this doesn't look like a test problem. We should probably take a deeper look at this. > StringsDUnitTest.givenBucketsMoveDuringAppend_thenDataIsNotLost fails with > duplicated append > > > Key: GEODE-9885 > URL: https://issues.apache.org/jira/browse/GEODE-9885 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Ray Ingles >Priority: Major > Labels: needsTriage > > The test appends a lot of strings to a key. It wound up adding (at least one) > extra string to the stored string: > > {{java.util.concurrent.ExecutionException: java.lang.AssertionError: > unexpected -\{append0}-key-3-27680- at index 27681 iterationCount=61995 in > string}} > > The string "\{append0}-key-3-27680-" appeared twice in sequence. -- This message was sent by Atlassian Jira (v8.20.1#820001)
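The failure quoted above reports a segment that appears twice in sequence in the stored string. The following is a minimal sketch of that kind of check, assuming the stored value should be exactly the concatenation of the appended segments in order. The class and method names (AppendSequenceCheck, firstMismatch) are hypothetical and are not taken from StringsDUnitTest.

```java
import java.util.Arrays;
import java.util.List;

public final class AppendSequenceCheck {

  /**
   * Walks the stored string segment by segment.
   * Returns the index of the first expected segment that is not found at its
   * expected offset, expectedSegments.size() if all segments match but extra
   * characters follow, or -1 if the stored value is exactly the concatenation.
   */
  public static int firstMismatch(String stored, List<String> expectedSegments) {
    int offset = 0;
    for (int i = 0; i < expectedSegments.size(); i++) {
      String segment = expectedSegments.get(i);
      if (!stored.startsWith(segment, offset)) {
        return i; // something else (for example a repeated segment) sits here
      }
      offset += segment.length();
    }
    return offset == stored.length() ? -1 : expectedSegments.size();
  }

  public static void main(String[] args) {
    // Illustrative segments only; the real test's segment format is not assumed here.
    List<String> expected = Arrays.asList("key-0-", "key-1-", "key-2-");
    System.out.println(firstMismatch("key-0-key-1-key-2-", expected));
    System.out.println(firstMismatch("key-0-key-1-key-1-key-2-", expected));
  }
}
```

A check like this only detects that an append was applied twice; it says nothing about why, which is the open question in the comment above.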
[jira] [Commented] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468755#comment-17468755 ] Mark Hanson commented on GEODE-9815: I have addressed all of the concerns with a solution that I am happy with. There is a PR out for review. > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some cases.
> With a slight tweak to the scenario fixed by GEODE-9554, we can end up never getting out of the situation where we have two copies of a bucket in the same zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on members A1 and B1.
> 2. Shut down member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shut down B1. Revoke its disk store and delete the data.
> 5. Start up A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that situation.
> *Problem 2:* We may never delete extra copies of a bucket.
> The fix for GEODE-9554 introduces a new problem if we have more than 2 redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, and C with one member each. Bucket 0 is on A1 and B1.
> 2. Shut down A1.
> 3. Rebalance - this will create bucket 0 on C1.
> 4. Start up A1 - this will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something other than preventing the removal of the last copy of a bucket from a redundancy zone. Instead, I think we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* buckets that have two copies in the same zone, as well as any buckets that are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra copies of a bucket in the same zone first.
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from a zone.
-- This message was sent by Atlassian Jira (v8.20.1#820001)
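The three proposed changes in the description above can be sketched as follows. This is an illustration under stated assumptions, not Geode's PartitionRegionLoadModel code: the ZoneAwareRedundancy class, the Copy type, and the needsRemoval and chooseRemoval methods are all hypothetical names, and the choice of which copy to remove ignores member load, which a real implementation would presumably weigh.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class ZoneAwareRedundancy {

  /** One hosted copy of a bucket: the member holding it and that member's redundancy zone. */
  public static final class Copy {
    public final String member;
    public final String zone;

    Copy(String member, String zone) {
      this.member = member;
      this.zone = zone;
    }
  }

  public static Copy copy(String member, String zone) {
    return new Copy(member, zone);
  }

  /**
   * Proposal 1: a bucket is a removal candidate if it has more copies than
   * configured OR if any zone holds more than one copy of it.
   */
  public static boolean needsRemoval(List<Copy> copies, int desiredCopies) {
    return copies.size() > desiredCopies || firstCopyInCrowdedZone(copies) != null;
  }

  /**
   * Proposal 2: prefer removing a copy that shares its zone with another copy.
   * Proposal 3: otherwise any copy may go, even the last one in its zone
   * (here simply the last in the list, as a placeholder for a load-based choice).
   */
  public static Copy chooseRemoval(List<Copy> copies) {
    Copy crowded = firstCopyInCrowdedZone(copies);
    if (crowded != null) {
      return crowded;
    }
    return copies.isEmpty() ? null : copies.get(copies.size() - 1);
  }

  private static Copy firstCopyInCrowdedZone(List<Copy> copies) {
    Map<String, Integer> perZone = new HashMap<>();
    for (Copy c : copies) {
      perZone.merge(c.zone, 1, Integer::sum);
    }
    for (Copy c : copies) {
      if (perZone.get(c.zone) > 1) {
        return c;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    // Problem 1: bucket 0 on A1 and A2, two copies desired.
    List<Copy> problem1 = Arrays.asList(copy("A1", "A"), copy("A2", "A"));
    System.out.println(needsRemoval(problem1, 2) + " " + chooseRemoval(problem1).member);

    // Problem 2: bucket 0 on A1, B1, and C1, two copies desired.
    List<Copy> problem2 = Arrays.asList(copy("A1", "A"), copy("B1", "B"), copy("C1", "C"));
    System.out.println(needsRemoval(problem2, 2) + " " + chooseRemoval(problem2).member);
  }
}
```

In the Problem 1 scenario the bucket is flagged even though its copy count matches the desired count, because both copies sit in zone A. Recreating the removed copy in another zone is left to the existing redundancy recovery and is outside this sketch.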
[jira] [Created] (GEODE-9878) PostgresJdbcLoaderIntegrationTest. initializationError failed
Mark Hanson created GEODE-9878: -- Summary: PostgresJdbcLoaderIntegrationTest. initializationError failed Key: GEODE-9878 URL: https://issues.apache.org/jira/browse/GEODE-9878 Project: Geode Issue Type: Bug Components: jdbc, tests Affects Versions: 1.15.0 Reporter: Mark Hanson [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk11/builds/42.1] failed with the stack trace shown below for PostgresJdbcLoaderIntegrationTest. initializationError {noformat} org.testcontainers.containers.ContainerLaunchException: Container startup failed at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:330) at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:311) at org.testcontainers.containers.ContainerisedDockerCompose.invoke(DockerComposeContainer.java:646) at org.testcontainers.containers.DockerComposeContainer.runWithCompose(DockerComposeContainer.java:309) at org.testcontainers.containers.DockerComposeContainer.createServices(DockerComposeContainer.java:233) at org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:177) at org.apache.geode.connectors.jdbc.test.junit.rules.SqlDatabaseConnectionRule$1.evaluate(SqlDatabaseConnectionRule.java:57) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) at java.util.Iterator.forEachRemaining(Iterator.java:133) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:566) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(Con
[jira] [Created] (GEODE-9877) GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse failed
Mark Hanson created GEODE-9877: -- Summary: GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse failed Key: GEODE-9877 URL: https://issues.apache.org/jira/browse/GEODE-9877 Project: Geode Issue Type: Bug Components: redis Affects Versions: 1.15.0 Reporter: Mark Hanson [https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/acceptance-test-openjdk8/builds/43] failed with GeodeRedisServerStartupDUnitTest. startupFailsGivenPortAlreadyInUse {noformat} java.net.BindException: Address already in use (Bind failed) at java.net.PlainSocketImpl.socketBind(Native Method) at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387) at java.net.Socket.bind(Socket.java:662) at org.apache.geode.redis.GeodeRedisServerStartupDUnitTest.startupFailsGivenPortAlreadyInUse(GeodeRedisServerStartupDUnitTest.java:115) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:138) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at 
org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.Iterator.forEachRemaining(Iterator.java:116) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLau
[jira] [Commented] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
[ https://issues.apache.org/jira/browse/GEODE-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454911#comment-17454911 ] Mark Hanson commented on GEODE-9876: Hi Mario, This looks like an error related to GEODE-8202. Can you take a look? Thanks, Mark > SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED > - > > Key: GEODE-9876 > URL: https://issues.apache.org/jira/browse/GEODE-9876 > Project: Geode > Issue Type: Bug > Components: wan >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mario Kevo >Priority: Major > > > {noformat} > java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' > column header not found > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.assertTrue(Assert.java:42) > at org.junit.Assert.assertNotNull(Assert.java:713) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at 
java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > at java.util.stream.AbstractPipeline.evalua
[jira] [Assigned] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
[ https://issues.apache.org/jira/browse/GEODE-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9876: -- Assignee: Mario Kevo > SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED > - > > Key: GEODE-9876 > URL: https://issues.apache.org/jira/browse/GEODE-9876 > Project: Geode > Issue Type: Bug > Components: wan >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mario Kevo >Priority: Major > > > {noformat} > java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' > column header not found > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.assertTrue(Assert.java:42) > at org.junit.Assert.assertNotNull(Assert.java:713) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208) > at > org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) >
[jira] [Created] (GEODE-9876) SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED
Mark Hanson created GEODE-9876: -- Summary: SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest > testSerialGatewaySenderThreadsConnectToSameReceiver FAILED Key: GEODE-9876 URL: https://issues.apache.org/jira/browse/GEODE-9876 Project: Geode Issue Type: Bug Components: wan Affects Versions: 1.15.0 Reporter: Mark Hanson {noformat} java.lang.AssertionError: Error parsing gfsh output. 'Senders Connected' column header not found at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.assertTrue(Assert.java:42) at org.junit.Assert.assertNotNull(Assert.java:713) at org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.parseSendersConnectedFromGfshOutput(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:236) at org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.allDispatchersConnectedToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:208) at org.apache.geode.cache.wan.SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.testSerialGatewaySenderThreadsConnectToSameReceiver(SeveralGatewayReceiversWithSamePortAndHostnameForSendersTest.java:176) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.apache.geode.rules.DockerComposeRule$1.evaluate(DockerComposeRule.java:104) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.Iterator.forEachRemaining(Iterator.java:116) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at 
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(Engi
[jira] [Commented] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452113#comment-17452113 ] Mark Hanson commented on GEODE-9815: I have addressed the two cases above. I am not particularly satisfied with my implementation of change 1. I need to get a code review from [~upthewaterspout] . A draft PR is in place. > Recovering persistent members can result in extra copies of a bucket or two > copies in the same redundancy zone > -- > > Key: GEODE-9815 > URL: https://issues.apache.org/jira/browse/GEODE-9815 > Project: Geode > Issue Type: Bug > Components: regions >Affects Versions: 1.15.0 >Reporter: Dan Smith >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > The fix in GEODE-9554 is incomplete for some cases, and it also introduces a > new issue when removing buckets that are over redundancy. > GEODE-9554 and these new issues are all related to using redundancy zones and > having persistent members. > With persistence, when we start up a member with persisted buckets, we always > recover the persisted buckets on startup, regardless of whether redundancy is > already met or what zone the existing buckets are on. This is necessary to > ensure that we can recover all colocated buckets that might be persisted on > the member. > Because recovering these persistent buckets may cause us to go over > redundancy, after we recover from disk, we run a "restore redundancy" task > that actually removes copies of buckets that are over redundancy. > GEODE-9554 addressed one case where we end up removing the last copy of a > bucket from one redundancy zone while leaving two copies in another > redundancy zone. It did so by disallowing the removal of a bucket if it is > the last copy in a redundancy zone. > There are a couple of issues with this approach. 
> *Problem 1:* We may end up with two copies of the bucket in one zone in some cases.
> With a slight tweak to the scenario fixed by GEODE-9554, we can end up never getting out of the situation where we have two copies of a bucket in the same zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on members A1 and B1.
> 2. Shut down member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shut down B1. Revoke its disk store and delete the data.
> 5. Start up A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that situation.
> *Problem 2:* We may never delete extra copies of a bucket.
> The fix for GEODE-9554 introduces a new problem if we have more than 2 redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, and C with one member each. Bucket 0 is on A1 and B1.
> 2. Shut down A1.
> 3. Rebalance - this will create bucket 0 on C1.
> 4. Start up A1 - this will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
> I think the overall fix is probably to do something other than preventing the removal of the last copy of a bucket from a redundancy zone. Instead, I think we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* buckets that have two copies in the same zone, as well as any buckets that are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra copies of a bucket in the same zone first.
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from a zone.
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.
[ https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450699#comment-17450699 ] Mark Hanson commented on GEODE-9856: Seems related. Should be confirmed. > SMoveNativeRedisAcceptanceTest is failing with cluster is down. > --- > > Key: GEODE-9856 > URL: https://issues.apache.org/jira/browse/GEODE-9856 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > {noformat} > SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED > 12:05:00redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN > The cluster is down > 12:05:00at > redis.clients.jedis.Protocol.processError(Protocol.java:125) > 12:05:00at redis.clients.jedis.Protocol.process(Protocol.java:169) > 12:05:00at redis.clients.jedis.Protocol.read(Protocol.java:223) > 12:05:00at > redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352) > 12:05:00at > redis.clients.jedis.Connection.getIntegerReply(Connection.java:294) > 12:05:00at redis.clients.jedis.Jedis.sadd(Jedis.java:1391) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45) > 12:05:00at > redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975) > 12:05:00at > org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112) > 12:05:00at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 12:05:00at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 12:05:00at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 12:05:00at java.lang.reflect.Method.invoke(Method.java:498) > 12:05:00at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 12:05:00at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 12:05:00at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 12:05:00at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 12:05:00at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 12:05:00at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 12:05:00at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 12:05:00at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 12:05:00at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 12:05:00at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 12:05:00at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 12:05:00at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 12:05:00at > org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118) > 12:05:00at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 12:05:00at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > 12:05:00at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > 12:05:00at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > 12:05:00at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > 12:05:00at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > 12:05:00at java.util.Iter
[jira] [Comment Edited] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.
[ https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450699#comment-17450699 ] Mark Hanson edited comment on GEODE-9856 at 11/29/21, 7:45 PM: --- GEODE-9428 Seems related. Should be confirmed. was (Author: mhansonp): Seems related. Should be confirmed. > SMoveNativeRedisAcceptanceTest is failing with cluster is down. > --- > > Key: GEODE-9856 > URL: https://issues.apache.org/jira/browse/GEODE-9856 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > {noformat} > SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED > 12:05:00redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN > The cluster is down > 12:05:00at > redis.clients.jedis.Protocol.processError(Protocol.java:125) > 12:05:00at redis.clients.jedis.Protocol.process(Protocol.java:169) > 12:05:00at redis.clients.jedis.Protocol.read(Protocol.java:223) > 12:05:00at > redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352) > 12:05:00at > redis.clients.jedis.Connection.getIntegerReply(Connection.java:294) > 12:05:00at redis.clients.jedis.Jedis.sadd(Jedis.java:1391) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45) > 12:05:00at > redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975) > 12:05:00at > org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112) > 12:05:00at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 12:05:00at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 12:05:00at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 12:05:00at java.lang.reflect.Method.invoke(Method.java:498) > 12:05:00at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 12:05:00at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 12:05:00at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 12:05:00at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 12:05:00at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 12:05:00at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 12:05:00at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 12:05:00at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 12:05:00at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 12:05:00at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 12:05:00at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 12:05:00at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 12:05:00at > org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118) > 12:05:00at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 
12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 12:05:00at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > 12:05:00at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > 12:05:00at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > 12:05:00at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > 12:05
[jira] [Created] (GEODE-9861) Windows: ResultModelTest.serializeFileToDownload Failed
Mark Hanson created GEODE-9861:
--

Summary: Windows: ResultModelTest.serializeFileToDownload Failed
Key: GEODE-9861
URL: https://issues.apache.org/jira/browse/GEODE-9861
Project: Geode
Issue Type: Bug
Components: tests
Affects Versions: 1.12.5
Reporter: Mark Hanson

{noformat}
org.apache.geode.management.internal.cli.result.model.ResultModelTest > serializeFileToDownload FAILED
    java.io.IOException: Access is denied
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1035)
        at org.junit.rules.TemporaryFolder.newFile(TemporaryFolder.java:67)
        at org.apache.geode.management.internal.cli.result.model.ResultModelTest.serializeFileToDownload(ResultModelTest.java:176)
{noformat}

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-9859) Mass-Test-Run: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson updated GEODE-9859:
---
Summary: Mass-Test-Run: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED (was: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED)

> Mass-Test-Run: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
> Issue Type: Bug
> Components: wan
> Affects Versions: 1.15.0
> Reporter: Mark Hanson
> Assignee: Alberto Gomez
> Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to GEODE-9369 as the most likely origination.
>
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
>     java.lang.AssertionError:
>     Expecting elements:
>       ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
>      to have exactly 1 times execution error
>         at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
> {noformat}
[jira] [Created] (GEODE-9860) NativeRedisRenameRedirectionsDUnitTest. initializationError
Mark Hanson created GEODE-9860: -- Summary: NativeRedisRenameRedirectionsDUnitTest. initializationError Key: GEODE-9860 URL: https://issues.apache.org/jira/browse/GEODE-9860 Project: Geode Issue Type: Bug Components: redis Affects Versions: 1.15.0 Reporter: Mark Hanson {noformat} NativeRedisRenameRedirectionsDUnitTest > initializationError FAILED java.lang.RuntimeException: java.lang.NullPointerException at org.rnorth.ducttape.timeouts.Timeouts.callFuture(Timeouts.java:68) at org.rnorth.ducttape.timeouts.Timeouts.doWithTimeout(Timeouts.java:60) at org.testcontainers.containers.wait.strategy.WaitAllStrategy.waitUntilReady(WaitAllStrategy.java:53) at org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:285) at java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1597) at org.testcontainers.containers.DockerComposeContainer.waitUntilServiceStarted(DockerComposeContainer.java:265) at org.testcontainers.containers.DockerComposeContainer.start(DockerComposeContainer.java:179) at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:84) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.Iterator.forEachRemaining(Iterator.java:116) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClass
[jira] [Assigned] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson reassigned GEODE-9859:
--
Assignee: Alberto Gomez

> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
> Issue Type: Bug
> Components: wan
> Affects Versions: 1.15.0
> Reporter: Mark Hanson
> Assignee: Alberto Gomez
> Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to GEODE-9369 as the most likely origination.
>
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
>     java.lang.AssertionError:
>     Expecting elements:
>       ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
>      to have exactly 1 times execution error
>         at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
> {noformat}
[jira] [Updated] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson updated GEODE-9859:
---
Description:
Looks like this might be failing from the original PR. I have linked to GEODE-9369 as the most likely origination.

{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
    java.lang.AssertionError:
    Expecting elements:
      ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
     to have exactly 1 times execution error
        at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
{noformat}

was:
{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
    java.lang.AssertionError:
    Expecting elements:
      ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
     to have exactly 1 times execution error
        at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
{noformat}

> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
> Issue Type: Bug
> Components: wan
> Affects Versions: 1.15.0
> Reporter: Mark Hanson
> Priority: Major
>
> Looks like this might be failing from the original PR. I have linked to GEODE-9369 as the most likely origination.
>
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
>     java.lang.AssertionError:
>     Expecting elements:
>       ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
>      to have exactly 1 times execution error
>         at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
> {noformat}
[jira] [Commented] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
[ https://issues.apache.org/jira/browse/GEODE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450670#comment-17450670 ]

Mark Hanson commented on GEODE-9859:

This test was failing on Windows on the original PR.

> WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
> -
>
> Key: GEODE-9859
> URL: https://issues.apache.org/jira/browse/GEODE-9859
> Project: Geode
> Issue Type: Bug
> Components: wan
> Affects Versions: 1.15.0
> Reporter: Mark Hanson
> Priority: Major
>
> {noformat}
> WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
>     java.lang.AssertionError:
>     Expecting elements:
>       ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
>      to have exactly 1 times execution error
>         at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
> {noformat}
[jira] [Created] (GEODE-9859) WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
Mark Hanson created GEODE-9859:
--

Summary: WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(false, false) [0] FAILED
Key: GEODE-9859
URL: https://issues.apache.org/jira/browse/GEODE-9859
Project: Geode
Issue Type: Bug
Components: wan
Affects Versions: 1.15.0
Reporter: Mark Hanson

{noformat}
WanCopyRegionCommandDUnitTest > testRegionDestroyedDuringExecution(false, false) [0] FAILED
    java.lang.AssertionError:
    Expecting elements:
      ["Execution failed. Error: org.apache.geode.cache.EntryDestroyedException: 937"]
     to have exactly 1 times execution error
        at org.apache.geode.cache.wan.internal.cli.commands.WanCopyRegionCommandDUnitTest.testRegionDestroyedDuringExecution(WanCopyRegionCommandDUnitTest.java:450)
{noformat}
[jira] [Created] (GEODE-9858) Mass-Test-Run failure PingOpDistributedTest. memberShouldCorrectlyRedirectPingMessage
Mark Hanson created GEODE-9858: -- Summary: Mass-Test-Run failure PingOpDistributedTest. memberShouldCorrectlyRedirectPingMessage Key: GEODE-9858 URL: https://issues.apache.org/jira/browse/GEODE-9858 Project: Geode Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Mark Hanson {noformat} java.lang.AssertionError: Expecting actual: 1638052119621L to be greater than: 1638052119621L at org.apache.geode.internal.cache.tier.sockets.PingOpDistributedTest.memberShouldCorrectlyRedirectPingMessage(PingOpDistributedTest.java:205) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) at org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) at org.apache.geode.test.dunit.rules.AbstractDistributedRule$1.evaluate(AbstractDistributedRule.java:59) at org.apache.geode.test.junit.rules.serializable.SerializableTemporaryFolder$1.evaluate(SerializableTemporaryFolder.java:130) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.Iterator.forEachRemaining(Iterator.java:116) at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.
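The assertion message in GEODE-9858 above shows both compared values as 1638052119621L, so a strict "greater than" check failed on two equal millisecond readings. One possible reading, which the report does not confirm, is that both timestamps were captured within the same millisecond. The following is a minimal Java sketch of that situation only; the class and method names are hypothetical and are not taken from PingOpDistributedTest or from Geode.

```java
// Hypothetical illustration only; not code from Geode or PingOpDistributedTest.
public class TimestampComparison {

  // Strict comparison: returns false when both readings are the same millisecond value.
  public static boolean strictlyAfter(long before, long after) {
    return after > before;
  }

  // Tolerant comparison: also accepts readings with the same millisecond value.
  public static boolean notBefore(long before, long after) {
    return after >= before;
  }

  public static void main(String[] args) {
    long before = 1638052119621L; // value quoted in the assertion message
    long after = 1638052119621L; // identical value, as in the reported failure
    System.out.println("strictlyAfter: " + strictlyAfter(before, after));
    System.out.println("notBefore: " + notBefore(before, after));
  }
}
```

Whether a tolerant comparison is appropriate for this test is a separate question that the report does not settle.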
[jira] [Created] (GEODE-9857) ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator
Mark Hanson created GEODE-9857:
--

Summary: ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator
Key: GEODE-9857
URL: https://issues.apache.org/jira/browse/GEODE-9857
Project: Geode
Issue Type: Bug
Components: tests
Affects Versions: 1.15.0
Reporter: Mark Hanson

{noformat}
ShowMissingDiskStoreCommandDUnitTest > stopAllMembersAndStart2ndLocator FAILED
    org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest
    Expecting value to be true but was false within 5 minutes.
        at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
        at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
        at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
        at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
        at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.stopAllMembersAndStart2ndLocator(ShowMissingDiskStoreCommandDUnitTest.java:201)
        Caused by:
        org.opentest4j.AssertionFailedError:
        Expecting value to be true but was false
            at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at org.apache.geode.test.junit.rules.GfshCommandRule.connectAndVerify(GfshCommandRule.java:153)
            at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.lambda$stopAllMembersAndStart2ndLocator$3(ShowMissingDiskStoreCommandDUnitTest.java:201)
{noformat}
[jira] [Updated] (GEODE-9857) ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator
[ https://issues.apache.org/jira/browse/GEODE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson updated GEODE-9857:
---
Labels: needsTriage (was: )

> ShowMissingDiskStoreCommandDUnitTest. stopAllMembersAndStart2ndLocator
> --
>
> Key: GEODE-9857
> URL: https://issues.apache.org/jira/browse/GEODE-9857
> Project: Geode
> Issue Type: Bug
> Components: tests
> Affects Versions: 1.15.0
> Reporter: Mark Hanson
> Priority: Major
> Labels: needsTriage
>
> {noformat}
> ShowMissingDiskStoreCommandDUnitTest > stopAllMembersAndStart2ndLocator FAILED
>     org.awaitility.core.ConditionTimeoutException: Assertion condition defined as a lambda expression in org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest
>     Expecting value to be true but was false within 5 minutes.
>         at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
>         at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939)
>         at org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723)
>         at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.stopAllMembersAndStart2ndLocator(ShowMissingDiskStoreCommandDUnitTest.java:201)
>         Caused by:
>         org.opentest4j.AssertionFailedError:
>         Expecting value to be true but was false
>             at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
>             at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>             at org.apache.geode.test.junit.rules.GfshCommandRule.connectAndVerify(GfshCommandRule.java:153)
>             at org.apache.geode.management.internal.cli.commands.ShowMissingDiskStoreCommandDUnitTest.lambda$stopAllMembersAndStart2ndLocator$3(ShowMissingDiskStoreCommandDUnitTest.java:201)
> {noformat}
[jira] [Updated] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.
[ https://issues.apache.org/jira/browse/GEODE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9856: --- Labels: needsTriage (was: ) > SMoveNativeRedisAcceptanceTest is failing with cluster is down. > --- > > Key: GEODE-9856 > URL: https://issues.apache.org/jira/browse/GEODE-9856 > Project: Geode > Issue Type: Bug > Components: redis >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > {noformat} > SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED > 12:05:00redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN > The cluster is down > 12:05:00at > redis.clients.jedis.Protocol.processError(Protocol.java:125) > 12:05:00at redis.clients.jedis.Protocol.process(Protocol.java:169) > 12:05:00at redis.clients.jedis.Protocol.read(Protocol.java:223) > 12:05:00at > redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352) > 12:05:00at > redis.clients.jedis.Connection.getIntegerReply(Connection.java:294) > 12:05:00at redis.clients.jedis.Jedis.sadd(Jedis.java:1391) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973) > 12:05:00at > redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121) > 12:05:00at > redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45) > 12:05:00at > redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975) > 12:05:00at > org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112) > 12:05:00at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 12:05:00at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 12:05:00at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 12:05:00at 
java.lang.reflect.Method.invoke(Method.java:498) > 12:05:00at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 12:05:00at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 12:05:00at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 12:05:00at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 12:05:00at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 12:05:00at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > 12:05:00at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > 12:05:00at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > 12:05:00at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > 12:05:00at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > 12:05:00at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > 12:05:00at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > 12:05:00at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > 12:05:00at > org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118) > 12:05:00at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 12:05:00at org.junit.rules.RunRules.evaluate(RunRules.java:20) > 12:05:00at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 12:05:00at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > 12:05:00at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) > 12:05:00at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > 12:05:00at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > 12:05:00at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > 12:05:00at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > 12:05:00at java.util.Iterator.forEachRemaining(Iterator.java:116) > 12:05:00at
[jira] [Created] (GEODE-9856) SMoveNativeRedisAcceptanceTest is failing with cluster is down.
Mark Hanson created GEODE-9856:
--

Summary: SMoveNativeRedisAcceptanceTest is failing with cluster is down.
Key: GEODE-9856
URL: https://issues.apache.org/jira/browse/GEODE-9856
Project: Geode
Issue Type: Bug
Components: redis
Affects Versions: 1.15.0
Reporter: Mark Hanson

{noformat}
SMoveNativeRedisAcceptanceTest > testSMoveNegativeCases FAILED
12:05:00 redis.clients.jedis.exceptions.JedisClusterException: CLUSTERDOWN The cluster is down
12:05:00     at redis.clients.jedis.Protocol.processError(Protocol.java:125)
12:05:00     at redis.clients.jedis.Protocol.process(Protocol.java:169)
12:05:00     at redis.clients.jedis.Protocol.read(Protocol.java:223)
12:05:00     at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
12:05:00     at redis.clients.jedis.Connection.getIntegerReply(Connection.java:294)
12:05:00     at redis.clients.jedis.Jedis.sadd(Jedis.java:1391)
12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:973)
12:05:00     at redis.clients.jedis.JedisCluster$70.execute(JedisCluster.java:970)
12:05:00     at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:121)
12:05:00     at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
12:05:00     at redis.clients.jedis.JedisCluster.sadd(JedisCluster.java:975)
12:05:00     at org.apache.geode.redis.internal.executor.set.AbstractSMoveIntegrationTest.testSMoveNegativeCases(AbstractSMoveIntegrationTest.java:112)
12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
12:05:00     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
12:05:00     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
12:05:00     at java.lang.reflect.Method.invoke(Method.java:498)
12:05:00     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
12:05:00     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
12:05:00     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
12:05:00     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
12:05:00     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
12:05:00     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
12:05:00     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
12:05:00     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
12:05:00     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
12:05:00     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
12:05:00     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
12:05:00     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
12:05:00     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
12:05:00     at org.apache.geode.redis.NativeRedisClusterTestRule$1.evaluate(NativeRedisClusterTestRule.java:118)
12:05:00     at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
12:05:00     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
12:05:00     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
12:05:00     at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
12:05:00     at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
12:05:00     at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
12:05:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
12:05:00     at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
12:05:00     at java.util.Iterator.forEachRemaining(Iterator.java:116)
12:05:00     at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
12:05:00     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
12:05:00     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
12:05:00     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
12:05:00     at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluat
[jira] [Commented] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly
[ https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445643#comment-17445643 ]

Mark Hanson commented on GEODE-8644:

I have rerun this on a variety of cloud instances trying to reproduce this and I have not been successful. I think we may need to add more logging into the code so when it does fail we have more detail.

> SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly
> ---
>
> Key: GEODE-8644
> URL: https://issues.apache.org/jira/browse/GEODE-8644
> Project: Geode
> Issue Type: Bug
> Affects Versions: 1.15.0
> Reporter: Benjamin P Ross
> Assignee: Mark Hanson
> Priority: Major
> Labels: GeodeOperationAPI, needsTriage, pull-request-available
>
> Currently the test SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() relies on a 2 second delay to allow for queues to finish draining after finishing the put operation. If queues take longer than 2 seconds to drain the test will fail. We should change the test to wait for the queues to be empty with a long timeout in case the queues never fully drain.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
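The change proposed in the GEODE-8644 description (replace the fixed 2 second delay with a bounded wait for empty queues) could look roughly like the sketch below. It uses only the JDK. The class `QueueDrainWaiter` and the method `waitUntil` are hypothetical names invented for this illustration, not Geode APIs, and the real test may be better served by whatever await utility the Geode test framework already provides.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Geode.
final class QueueDrainWaiter {
  private QueueDrainWaiter() {}

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition held before the deadline, false on
   * timeout or if the waiting thread was interrupted.
   */
  static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true; // e.g. the queue reports empty
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // the queue never fully drained within the timeout
      }
      // Sleep for the poll interval, but never past the deadline.
      long sleepMillis = Math.max(1, Math.min(pollInterval.toMillis(), remainingNanos / 1_000_000));
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
  }
}
```

A test would pass its own "queue is empty" check as the condition and fail if `waitUntil` returns false. A fast drain then finishes the test early, while a slow drain is tolerated up to the long timeout instead of failing at the 2 second mark.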
[jira] [Assigned] (GEODE-9815) Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
[ https://issues.apache.org/jira/browse/GEODE-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Hanson reassigned GEODE-9815:
--
Assignee: Mark Hanson

> Recovering persistent members can result in extra copies of a bucket or two copies in the same redundancy zone
> ---
>
> Key: GEODE-9815
> URL: https://issues.apache.org/jira/browse/GEODE-9815
> Project: Geode
> Issue Type: Bug
> Components: regions
> Affects Versions: 1.15.0
> Reporter: Dan Smith
> Assignee: Mark Hanson
> Priority: Major
> Labels: GeodeOperationAPI, needsTriage
>
> The fix in GEODE-9554 is incomplete for some cases, and it also introduces a new issue when removing buckets that are over redundancy.
> GEODE-9554 and these new issues are all related to using redundancy zones and having persistent members.
> With persistence, when we start up a member with persisted buckets, we always recover the persisted buckets on startup, regardless of whether redundancy is already met or what zone the existing buckets are on. This is necessary to ensure that we can recover all colocated buckets that might be persisted on the member.
> Because recovering these persistent buckets may cause us to go over redundancy, after we recover from disk, we run a "restore redundancy" task that actually removes copies of buckets that are over redundancy.
> GEODE-9554 addressed one case where we end up removing the last copy of a bucket from one redundancy zone while leaving two copies in another redundancy zone. It did so by disallowing the removal of a bucket if it is the last copy in a redundancy zone.
> There are a couple of issues with this approach.
>
> *Problem 1:* We may end up with two copies of the bucket in one zone in some cases.
> With a slight tweak to the scenario fixed with GEODE-9554, we can end up never getting out of the situation where we have two copies of a bucket in the same zone.
> Steps:
> 1. Start two redundancy zones A and B with two members each. Bucket 0 is on member A1 and B1.
> 2. Shut down member A1.
> 3. Rebalance - this will create bucket 0 on A2.
> 4. Shut down B1. Revoke its disk store and delete the data.
> 5. Start up A1 - it will recover bucket 0.
> 6. At this point, bucket 0 is on A1 and A2, and nothing will resolve that situation.
>
> *Problem 2:* We may never delete extra copies of a bucket.
> The fix for GEODE-9554 introduces a new problem if we have more than 2 redundancy zones.
> Steps:
> 1. Start three redundancy zones A, B, C with one member each. Bucket 0 is on A1 and B1.
> 2. Shut down A1.
> 3. Rebalance - this will create bucket 0 on C1.
> 4. Start up A1 - this will recreate bucket 0.
> 5. Now we have bucket 0 on A1, B1, and C1. Nothing will remove the extra copy.
>
> I think the overall fix is probably to do something other than prevent the removal of the last copy of a bucket from a redundancy zone. Instead, I think we should do something like this:
> 1. Change PartitionRegionLoadModel.getOverRedundancyBuckets to return *any* buckets that have two copies in the same zone, as well as any buckets that are actually over redundancy.
> 2. Change PartitionRegionLoadModel.findBestRemove to always remove extra copies of a bucket in the same zone first.
> 3. Back out the changes for GEODE-9554 and let the last copy be deleted from a zone.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
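The selection rules proposed at the end of the GEODE-9815 description can be sketched in isolation as follows. This is not the `PartitionRegionLoadModel` code: the class `RedundancyZoneSketch`, its method names, and the plain-map data model (bucket id to a member-name-to-zone map) are all assumptions made for illustration. The sketch only shows which buckets would be flagged and which copy would be picked for removal first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration only; not Geode's PartitionRegionLoadModel.
final class RedundancyZoneSketch {
  private RedundancyZoneSketch() {}

  /** Returns the first member whose zone also holds another copy, or null if every copy is in its own zone. */
  static String firstMemberInSharedZone(Map<String, String> memberToZone) {
    Map<String, Integer> copiesPerZone = new HashMap<>();
    for (String zone : memberToZone.values()) {
      copiesPerZone.merge(zone, 1, Integer::sum);
    }
    for (Map.Entry<String, String> copy : memberToZone.entrySet()) {
      if (copiesPerZone.get(copy.getValue()) > 1) {
        return copy.getKey();
      }
    }
    return null;
  }

  /** Proposal step 1: flag buckets that are over redundancy OR have two copies in one zone. */
  static List<Integer> overRedundancyBuckets(Map<Integer, Map<String, String>> buckets, int redundantCopies) {
    List<Integer> flagged = new ArrayList<>();
    for (Map.Entry<Integer, Map<String, String>> bucket : buckets.entrySet()) {
      Map<String, String> copies = bucket.getValue();
      boolean tooManyCopies = copies.size() > redundantCopies + 1;
      if (tooManyCopies || firstMemberInSharedZone(copies) != null) {
        flagged.add(bucket.getKey());
      }
    }
    return flagged;
  }

  /** Proposal step 2: prefer removing a copy that shares a zone; otherwise any copy of an over-redundant bucket. */
  static String chooseRemoval(Map<String, String> memberToZone, int redundantCopies) {
    String sameZoneCopy = firstMemberInSharedZone(memberToZone);
    if (sameZoneCopy != null) {
      return sameZoneCopy;
    }
    if (memberToZone.size() > redundantCopies + 1) {
      return memberToZone.keySet().iterator().next(); // arbitrary choice in this sketch
    }
    return null; // nothing to remove
  }
}
```

With one redundant copy configured, the end state of Problem 1 (bucket 0 on A1 and A2, both in zone A) is flagged even though the copy count is not over redundancy, and the end state of Problem 2 (A1, B1, C1 in three zones) is flagged because it has three copies. The sketch does not model what happens after a removal, such as recreating the copy in another zone.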
[jira] [Assigned] (GEODE-8644) SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() intermittently fails when queues drain too slowly
[ https://issues.apache.org/jira/browse/GEODE-8644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-8644: -- Assignee: Mark Hanson > SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() > intermittently fails when queues drain too slowly > --- > > Key: GEODE-8644 > URL: https://issues.apache.org/jira/browse/GEODE-8644 > Project: Geode > Issue Type: Bug >Affects Versions: 1.15.0 >Reporter: Benjamin P Ross >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, needsTriage, pull-request-available > > Currently the test > SerialGatewaySenderQueueDUnitTest.unprocessedTokensMapShouldDrainCompletely() > relies on a 2 second delay to allow for queues to finish draining after > finishing the put operation. If queues take longer than 2 seconds to drain > the test will fail. We should change the test to wait for the queues to be > empty with a long timeout in case the queues never fully drain. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down
[ https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9425: -- Assignee: (was: Mark Hanson) > AutoConnectionSource thread in client can't query for available locators when > it is connected to a locator that was shut down > - > > Key: GEODE-9425 > URL: https://issues.apache.org/jira/browse/GEODE-9425 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.15.0 >Reporter: Lynn Gallinat >Priority: Major > > The AutoConnectionSource thread runs in a client and queries the locator that > client is connected to so it can update the list of available locators. > But if the locator the client is connected to was shut down, the client > can't get an updated locator list. > In this case the locator was shut down and is not coming back, but there is > another available locator. > However we can't find out what that available locator is because we can't > complete the query. > To summarize: The AutoConnectionSource thread that runs in a client to update > the list of available locators should be able to get a list of available > locators even when that client is connected to a locator that was shut down. > The AutoConnectionSource thread starts and runs every 10 seconds. This is > from the client's system log. > [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 > tid=0x2d] AutoConnectionSource > UpdateLocatorListTask started with interval=1 ms. > After the locator is shut down the AutoConnectionSource thread can't complete > its work so we get stuck threads. > This stuck thread stack shows it is trying to run UpdateLocatorListTask. 
> {noformat} > clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT > clientgemfire1_host1_881 tid=0x36] Thread <286> (0x11e) that > was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 > seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack for "poolTimer-pool-24" (0x11e): > java.lang.ThreadState: RUNNABLE (in native) > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > - locked java.net.SocksSocketImpl@3e95a505 > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:607) > at > org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102) > at > org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51) > at > org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68) > at > 
org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458) > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5 > {noformat} > Impact on running cache operations: > Any operations in progress by the client connected to a locator that was > shut down can take 59 seconds to complete, which is the def
[jira] [Commented] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client
[ https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17442142#comment-17442142 ] Mark Hanson commented on GEODE-8616: Added a couple of reproductions to the bug as attachments. These were reproduced using develop. > ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException fails with > ServerConnectivityException : Pool unexpected socket timed out on client > -- > > Key: GEODE-8616 > URL: https://issues.apache.org/jira/browse/GEODE-8616 > Project: Geode > Issue Type: Bug >Affects Versions: 1.12.1 >Reporter: Donal Evans >Priority: Major > Labels: GeodeOperationAPI, flaky-test > Attachments: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz, > hansonm-findfailures-11-10-2021-23-52-45-logs.tgz > > > {noformat} > > Task :geode-core:distributedTest > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run > in VM 2 running on Host c1346ab7b3e3 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117) > Caused by: > org.apache.geode.cache.client.ServerConnectivityException: Pool > unexpected socket timed out on client connection=Pooled Connection to > c1346ab7b3e3:35437: Connection[DESTROYED]). 
Server unreachable: could not > connect after 1 attempts > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108) > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774) > at > org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91) > at > org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116) > at > org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795) > at > org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472) > at > org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445) > at > org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306) > at > org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120) > {noformat} > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > 
http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz > This is a flaky failure. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client
[ https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-8616: --- Attachment: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz hansonm-findfailures-11-10-2021-23-52-45-logs.tgz > ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException fails with > ServerConnectivityException : Pool unexpected socket timed out on client > -- > > Key: GEODE-8616 > URL: https://issues.apache.org/jira/browse/GEODE-8616 > Project: Geode > Issue Type: Bug >Affects Versions: 1.12.1 >Reporter: Donal Evans >Priority: Major > Labels: GeodeOperationAPI, flaky-test > Attachments: hansonm-findfailures-11-10-2021-23-52-38-logs.tgz, > hansonm-findfailures-11-10-2021-23-52-45-logs.tgz > > > {noformat} > > Task :geode-core:distributedTest > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run > in VM 2 running on Host c1346ab7b3e3 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117) > Caused by: > org.apache.geode.cache.client.ServerConnectivityException: Pool > unexpected socket timed out on client connection=Pooled Connection to > c1346ab7b3e3:35437: Connection[DESTROYED]). 
Server unreachable: could not > connect after 1 attempts > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108) > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774) > at > org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91) > at > org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116) > at > org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795) > at > org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472) > at > org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445) > at > org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306) > at > org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120) > {noformat} > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > 
http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz > This is a flaky failure. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client
[ https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-8616: -- Assignee: Mark Hanson > ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException fails with > ServerConnectivityException : Pool unexpected socket timed out on client > -- > > Key: GEODE-8616 > URL: https://issues.apache.org/jira/browse/GEODE-8616 > Project: Geode > Issue Type: Bug >Affects Versions: 1.12.1 >Reporter: Donal Evans >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, flaky-test > > {noformat} > > Task :geode-core:distributedTest > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run > in VM 2 running on Host c1346ab7b3e3 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117) > Caused by: > org.apache.geode.cache.client.ServerConnectivityException: Pool > unexpected socket timed out on client connection=Pooled Connection to > c1346ab7b3e3:35437: Connection[DESTROYED]). 
Server unreachable: could not > connect after 1 attempts > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108) > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774) > at > org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91) > at > org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116) > at > org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795) > at > org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472) > at > org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445) > at > org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306) > at > org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120) > {noformat} > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > 
http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz > This is a flaky failure. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (GEODE-8616) ClientServerCacheOperationDUnitTest > largeObjectPutWithReadTimeoutThrowsException fails with ServerConnectivityException : Pool unexpected socket timed out on client
[ https://issues.apache.org/jira/browse/GEODE-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-8616: -- Assignee: (was: Mark Hanson) > ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException fails with > ServerConnectivityException : Pool unexpected socket timed out on client > -- > > Key: GEODE-8616 > URL: https://issues.apache.org/jira/browse/GEODE-8616 > Project: Geode > Issue Type: Bug >Affects Versions: 1.12.1 >Reporter: Donal Evans >Priority: Major > Labels: GeodeOperationAPI, flaky-test > > {noformat} > > Task :geode-core:distributedTest > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest > > largeObjectPutWithReadTimeoutThrowsException FAILED > org.apache.geode.test.dunit.RMIException: While invoking > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest$$Lambda$177/0x000100b52040.run > in VM 2 running on Host c1346ab7b3e3 with 4 VMs > at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:610) > at org.apache.geode.test.dunit.VM.invoke(VM.java:437) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.largeObjectPutWithReadTimeoutThrowsException(ClientServerCacheOperationDUnitTest.java:117) > Caused by: > org.apache.geode.cache.client.ServerConnectivityException: Pool > unexpected socket timed out on client connection=Pooled Connection to > c1346ab7b3e3:35437: Connection[DESTROYED]). 
Server unreachable: could not > connect after 1 attempts > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:659) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:501) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:153) > at > org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:108) > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:774) > at > org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:91) > at > org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:116) > at > org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2795) > at > org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1472) > at > org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1445) > at > org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:196) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1382) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1321) > at > org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1306) > at > org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:436) > at > org.apache.geode.cache30.ClientServerCacheOperationDUnitTest.lambda$largeObjectPutWithReadTimeoutThrowsException$3ab01cf6$2(ClientServerCacheOperationDUnitTest.java:120) > {noformat} > =-=-=-=-=-=-=-=-=-=-=-=-=-=-= Test Results URI > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-results/distributedTest/1601514101/ > =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= > Test report artifacts from this job are available at: > 
http://files.apachegeode-ci.info/builds/apache-support-1-12-main/1.12.1-build.0106/test-artifacts/1601514101/distributedtestfiles-OpenJDK11-1.12.1-build.0106.tgz > This is a flaky failure. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down
[ https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437768#comment-17437768 ] Mark Hanson commented on GEODE-9425: Are there more logs available? > AutoConnectionSource thread in client can't query for available locators when > it is connected to a locator that was shut down > - > > Key: GEODE-9425 > URL: https://issues.apache.org/jira/browse/GEODE-9425 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.15.0 >Reporter: Lynn Gallinat >Assignee: Mark Hanson >Priority: Major > > The AutoConnectionSource thread runs in a client and queries the locator that > client is connected to so it can update the list of available locators. > But if the locator the client is connected to was shut down, the client > can't get an updated locator list. > In this case the locator was shut down and is not coming back, but there is > another available locator. > However we can't find out what that available locator is because we can't > complete the query. > To summarize: The AutoConnectionSource thread that runs in a client to update > the list of available locators should be able to get a list of available > locators even when that client is connected to a locator that was shut down. > The AutoConnectionSource thread starts and runs every 10 seconds. This is > from the client's system log. > [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 > tid=0x2d] AutoConnectionSource > UpdateLocatorListTask started with interval=1 ms. > After the locator is shut down the AutoConnectionSource thread can't complete > its work so we get stuck threads. > This stuck thread stack shows it is trying to run UpdateLocatorListTask. 
> {noformat} > clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT > clientgemfire1_host1_881 tid=0x36] Thread <286> (0x11e) that > was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 > seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack for "poolTimer-pool-24" (0x11e): > java.lang.ThreadState: RUNNABLE (in native) > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > - locked java.net.SocksSocketImpl@3e95a505 > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:607) > at > org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102) > at > org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51) > at > org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68) > at > 
org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458) > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5 > {noformat} > Impact on running cache operations: > Any operations in progress by the client connected to a
[jira] [Assigned] (GEODE-9425) AutoConnectionSource thread in client can't query for available locators when it is connected to a locator that was shut down
[ https://issues.apache.org/jira/browse/GEODE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9425: -- Assignee: Mark Hanson > AutoConnectionSource thread in client can't query for available locators when > it is connected to a locator that was shut down > - > > Key: GEODE-9425 > URL: https://issues.apache.org/jira/browse/GEODE-9425 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.15.0 >Reporter: Lynn Gallinat >Assignee: Mark Hanson >Priority: Major > > The AutoConnectionSource thread runs in a client and queries the locator that > client is connected to so it can update the list of available locators. > But if the locator the client is connected to was shut down, the client > can't get an updated locator list. > In this case the locator was shut down and is not coming back, but there is > another available locator. > However we can't find out what that available locator is because we can't > complete the query. > To summarize: The AutoConnectionSource thread that runs in a client to update > the list of available locators should be able to get a list of available > locators even when that client is connected to a locator that was shut down. > The AutoConnectionSource thread starts and runs every 10 seconds. This is > from the client's system log. > [info 2021/07/07 19:37:33.723 GMT clientgemfire1_host1_881 > tid=0x2d] AutoConnectionSource > UpdateLocatorListTask started with interval=1 ms. > After the locator is shut down the AutoConnectionSource thread can't complete > its work so we get stuck threads. > This stuck thread stack shows it is trying to run UpdateLocatorListTask. 
> {noformat} > clientgemfire1_881/system.log: [warn 2021/07/07 19:47:25.784 GMT > clientgemfire1_host1_881 tid=0x36] Thread <286> (0x11e) that > was executed at <07 Jul 2021 19:46:03 GMT> has been stuck for <82.041 > seconds> and number of thread monitor iteration <1> > Thread Name state > Executor Group > Monitored metric > Thread stack for "poolTimer-pool-24" (0x11e): > java.lang.ThreadState: RUNNABLE (in native) > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > - locked java.net.SocksSocketImpl@3e95a505 > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:607) > at > org.apache.geode.distributed.internal.tcpserver.AdvancedSocketCreatorImpl.connect(AdvancedSocketCreatorImpl.java:102) > at > org.apache.geode.internal.net.SCAdvancedSocketCreator.connect(SCAdvancedSocketCreator.java:51) > at > org.apache.geode.distributed.internal.tcpserver.ClusterSocketCreatorImpl.connect(ClusterSocketCreatorImpl.java:96) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.getServerVersion(TcpClient.java:246) > at > org.apache.geode.distributed.internal.tcpserver.TcpClient.requestToServer(TcpClient.java:151) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocatorUsingConnection(AutoConnectionSourceImpl.java:217) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryOneLocator(AutoConnectionSourceImpl.java:207) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.queryLocators(AutoConnectionSourceImpl.java:254) > at > org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.access$200(AutoConnectionSourceImpl.java:68) > at > 
org.apache.geode.cache.client.internal.AutoConnectionSourceImpl$UpdateLocatorListTask.run2(AutoConnectionSourceImpl.java:458) > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1334) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Locked ownable synchronizers: > - java.util.concurrent.ThreadPoolExecutor$Worker@24cd39b5 > {noformat} > Impact on running cache operations: > Any operations in progress by the client connected to a locator that was > shut down can take 59 seconds to co
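The stack above shows the update task blocked inside `Socket.connect` while contacting the locator that was shut down. As a rough illustration of the behavior the report asks for (moving on to another known locator), here is a minimal sketch using only the JDK. `LocatorProbeSketch`, `tcpProbe`, and `firstReachable` are hypothetical names invented for this example; this is not Geode's `AutoConnectionSourceImpl`, and it assumes the client already holds a list of candidate locator addresses.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not Geode's AutoConnectionSourceImpl.
final class LocatorProbeSketch {

  // Returns a probe that reports whether a TCP connect succeeds within timeoutMillis.
  static Predicate<InetSocketAddress> tcpProbe(int timeoutMillis) {
    return address -> {
      try (Socket socket = new Socket()) {
        // A positive timeout makes connect() throw SocketTimeoutException
        // instead of blocking until the OS gives up.
        socket.connect(address, timeoutMillis);
        return true;
      } catch (IOException e) {
        return false; // refused, unreachable, or timed out
      }
    };
  }

  // Tries each known locator in order and returns the first one the probe accepts.
  static Optional<InetSocketAddress> firstReachable(
      List<InetSocketAddress> locators, Predicate<InetSocketAddress> probe) {
    for (InetSocketAddress locator : locators) {
      if (probe.test(locator)) {
        return Optional.of(locator);
      }
    }
    return Optional.empty();
  }
}
```

Passing the probe in as a `Predicate` keeps the failover loop testable without a network; in real use one would call `firstReachable(locators, tcpProbe(someTimeoutMillis))`, with the timeout chosen to stay well under the task's scheduling interval.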
[jira] [Resolved] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails
[ https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9645. Fix Version/s: 1.15.0 Resolution: Fixed The code will not send DataSerializer registration notifications when using multiuser authentication. > MultiUserAuth: DataSerializerRecoveryListener is called without auth > information. Promptly fails > > > Key: GEODE-9645 > URL: https://issues.apache.org/jira/browse/GEODE-9645 > Project: Geode > Issue Type: Bug > Components: core >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > Fix For: 1.15.0 > > > When multiuserSecureModeEnabled is enabled, a user may register a > DataSerializer. When the endpoint manager detects a new endpoint, it will attempt > to register the data serializers with other machines. This is a problem because > there is no authentication information in the background process to > authenticate. Hence the error seen below. > > {noformat} > [warn 2021/09/27 18:03:02.804 PDT tid=0x62] > DataSerializerRecoveryTask - Error recovering dataSerializers: > java.lang.UnsupportedOperationException: Use Pool APIs for doing operations > when multiuser-secure-mode-enabled is set to true. 
> at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) > > at > org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) > > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) > > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
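The resolution note says the code will no longer send DataSerializer registration notifications under multiuser authentication. A guard of that kind can be sketched as below. This is only an illustration under stated assumptions: `SerializerRecoverySketch`, `ClientPool`, `isMultiuserAuthentication`, and `recoverDataSerializers` are hypothetical stand-ins invented here, not Geode's `PoolImpl` or `DataSerializerRecoveryListener`, and the actual fix may be structured differently.

```java
// Hypothetical stand-in types; not Geode's PoolImpl or DataSerializerRecoveryListener.
final class SerializerRecoverySketch {

  // Minimal view of a client pool, invented for this sketch.
  interface ClientPool {
    boolean isMultiuserAuthentication();

    void registerDataSerializers();
  }

  // Returns true if registration was sent, false if it was skipped.
  static boolean recoverDataSerializers(ClientPool pool) {
    if (pool.isMultiuserAuthentication()) {
      // The background task carries no per-user credentials, so the send
      // would be rejected during authentication; skip it instead of throwing.
      return false;
    }
    pool.registerDataSerializers();
    return true;
  }
}
```

Returning a boolean rather than throwing lets a caller log the skip at a low level instead of emitting the warning and stack trace shown in the report.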
[jira] [Assigned] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size
[ https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9617: -- Assignee: Mark Hanson > CI Failure: PartitionedRegionSingleHopDUnitTest fails with > ConditionTimeoutException waiting for server to bucket map size > -- > > Key: GEODE-9617 > URL: https://issues.apache.org/jira/browse/GEODE-9617 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.15.0 >Reporter: Kirk Lund >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage, pull-request-available > > {noformat} > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > > testClientMetadataForPersistentPrs FAILED > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses > org.apache.geode.cache.client.internal.ClientMetadataService, > org.apache.geode.cache.client.internal.ClientMetadataServiceorg.apache.geode.cache.Region > > Expecting actual not to be null within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971) > Caused by: > java.lang.AssertionError: > Expecting actual not to be null > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976) > {noformat} > {noformat} > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > > testMetadataServiceCallAccuracy_FromGetOp FAILED > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses > org.apache.geode.cache.client.internal.ClientMetadataService > Expecting value to be false but was true expected:<[fals]e> but > was:<[tru]e> within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394) > Caused by: > org.junit.ComparisonFailure: > Expecting value to be false but was true expected:<[fals]e> but > was:<[tru]e> > at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-9617) CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size
[ https://issues.apache.org/jira/browse/GEODE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427402#comment-17427402 ] Mark Hanson commented on GEODE-9617: I did a little cleanup and added an assert that should help in future test runs. It actually moved the error to the latch assertions that I made, which means the failure was being missed before. > CI Failure: PartitionedRegionSingleHopDUnitTest fails with > ConditionTimeoutException waiting for server to bucket map size > -- > > Key: GEODE-9617 > URL: https://issues.apache.org/jira/browse/GEODE-9617 > Project: Geode > Issue Type: Bug > Components: client/server >Affects Versions: 1.15.0 >Reporter: Kirk Lund >Priority: Major > Labels: needsTriage, pull-request-available > > {noformat} > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > > testClientMetadataForPersistentPrs FAILED > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses > org.apache.geode.cache.client.internal.ClientMetadataService, > org.apache.geode.cache.client.internal.ClientMetadataServiceorg.apache.geode.cache.Region > > Expecting actual not to be null within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testClientMetadataForPersistentPrs(PartitionedRegionSingleHopDUnitTest.java:971) > Caused by: > java.lang.AssertionError: > Expecting actual not to be null > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testClientMetadataForPersistentPrs$26(PartitionedRegionSingleHopDUnitTest.java:976) > {noformat} > {noformat} > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest > > testMetadataServiceCallAccuracy_FromGetOp FAILED > org.awaitility.core.ConditionTimeoutException: Assertion condition > defined as a lambda expression in > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest that uses > org.apache.geode.cache.client.internal.ClientMetadataService > Expecting value to be false but was true expected:<[fals]e> but > was:<[tru]e> within 5 minutes. 
> at > org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:166) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119) > at > org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31) > at > org.awaitility.core.ConditionFactory.until(ConditionFactory.java:939) > at > org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:723) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.testMetadataServiceCallAccuracy_FromGetOp(PartitionedRegionSingleHopDUnitTest.java:394) > Caused by: > org.junit.ComparisonFailure: > Expecting value to be false but was true expected:<[fals]e> but > was:<[tru]e> > at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at > org.apache.geode.internal.cache.PartitionedRegionSingleHopDUnitTest.lambda$testMetadataServiceCallAccuracy_FromGetOp$6(PartitionedRegionSingleHopDUnitTest.java:395) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
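The comment above mentions that adding latch assertions moved the failure earlier, meaning a timed-out wait had previously gone unnoticed. The test code itself is not shown in this thread, so the following is only a guess at the general pattern: check the boolean returned by `CountDownLatch.await` and fail immediately when it is false. `LatchAssertSketch` and `assertReleased` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical test helper; the real test's assertions are not shown in the thread.
final class LatchAssertSketch {

  // Fails fast if the latch is not released within the timeout, instead of
  // silently continuing the way an unchecked latch.await(...) call would.
  static void assertReleased(CountDownLatch latch, long timeout, TimeUnit unit, String what) {
    boolean released;
    try {
      released = latch.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      throw new AssertionError("Interrupted while waiting for " + what, e);
    }
    if (!released) {
      throw new AssertionError(
          "Timed out waiting for " + what + " (remaining count " + latch.getCount() + ")");
    }
  }
}
```

With a check like this, a setup step that never completes fails at the wait itself rather than surfacing minutes later as a `ConditionTimeoutException` in an unrelated assertion.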
[jira] [Updated] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails
[ https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9645: --- Labels: pull-request-available (was: needsTriage pull-request-available) > MultiUserAuth: DataSerializerRecoveryListener is called without auth > information. Promptly fails > > > Key: GEODE-9645 > URL: https://issues.apache.org/jira/browse/GEODE-9645 > Project: Geode > Issue Type: Bug > Components: core >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > > When multiuserSecureModeEnabled is enabled, a user may register a > DataSerializer. When the endpoint manager detects a new endpoint, it will attempt > to register the data serializers with other machines. This is a problem because > there is no authentication information in the background process to > authenticate. Hence the error seen below. > > {noformat} > [warn 2021/09/27 18:03:02.804 PDT tid=0x62] > DataSerializerRecoveryTask - Error recovering dataSerializers: > java.lang.UnsupportedOperationException: Use Pool APIs for doing operations > when multiuser-secure-mode-enabled is set to true. > at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) > > at > org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) > > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) > > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
[ https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9647: --- Labels: (was: needsTriage) > MultiUserAuth: DataSerializer.Register throws when attempting to register a > new DataSerializer. > --- > > Key: GEODE-9647 > URL: https://issues.apache.org/jira/browse/GEODE-9647 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > > When multiuserSecureModeEnabled is set, a user may attempt to register a > DataSerializer, but will get the following error. The reason is that the > PoolImpl needs credentials to authenticate against, which it does not have. > > {noformat} > [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering > instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs > for doing operations when multiuser-secure-mode-enabled is set to true. > at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) > > at > org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) > > at > org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) > > at > org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) > > at > org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) > at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) > at > org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 
at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > > at > org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) > > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) > > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) > > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
[ https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9647. Resolution: Duplicate One solution can fix both of these issues as they are related. So closing this issue as a duplicate. > MultiUserAuth: DataSerializer.Register throws when attempting to register a > new DataSerializer. > --- > > Key: GEODE-9647 > URL: https://issues.apache.org/jira/browse/GEODE-9647 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > When multiuserSecureModeEnabled is set, a user may attempt to register a > DataSerializer, but will get the following error. The reason is that the > PoolImpl needs credentials to authenticate against, which it does not have. > > {noformat} > [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering > instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs > for doing operations when multiuser-secure-mode-enabled is set to true. 
> at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) > > at > org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) > > at > org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) > > at > org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) > > at > org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) > at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) > at > org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > > at > org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) > > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) > > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) > > at com.intellij.rt.junit.JUnitStarter.main(JUnitStart
[jira] [Assigned] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
[ https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9647: -- Assignee: Mark Hanson > MultiUserAuth: DataSerializer.Register throws when attempting to register a > new DataSerializer. > --- > > Key: GEODE-9647 > URL: https://issues.apache.org/jira/browse/GEODE-9647 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > When multiuserSecureModeEnabled is set, a user may attempt to register a > DataSerializer, but will get the following error. The reason is that the > PoolImpl needs credentials to authenticate against, which it does not have. > > {noformat} > [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering > instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs > for doing operations when multiuser-secure-mode-enabled is set to true. > at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) > > at > org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) > > at > org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) > > at > org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) > > at > org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) > at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) > at > org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) > > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > > at > org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) > > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) > > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) > > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails
[ https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9645: -- Assignee: Mark Hanson > MultiUserAuth: DataSerializerRecoveryListener is called without auth > information. Promptly fails > > > Key: GEODE-9645 > URL: https://issues.apache.org/jira/browse/GEODE-9645 > Project: Geode > Issue Type: Bug > Components: core >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: needsTriage > > When multiuserSecureModeEnabled is enabled, a user may register a > DataSerializer. When the endpoint manager detects a new endpoint, it will attempt > to register the data serializers with other machines. This is a problem because > there is no authentication information in the background process to > authenticate. Hence the error seen below. > > {noformat} > [warn 2021/09/27 18:03:02.804 PDT tid=0x62] > DataSerializerRecoveryTask - Error recovering dataSerializers: > java.lang.UnsupportedOperationException: Use Pool APIs for doing operations > when multiuser-secure-mode-enabled is set to true. > at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) > > at > org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) > > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) > > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
[ https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421565#comment-17421565 ] Mark Hanson commented on GEODE-9647: The solution appears to be to plumb the regionService down into the DataSerializer class call to register. Doing so appears to alleviate the issue. > MultiUserAuth: DataSerializer.Register throws when attempting to register a > new DataSerializer. > --- > > Key: GEODE-9647 > URL: https://issues.apache.org/jira/browse/GEODE-9647 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.15.0 >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > When multiuserSecureModeEnabled is set, a user may attempt to register a > DataSerializer, but will get the following error. The reason is that the > PoolImpl needs credentials to authenticate against, which it does not have. > > {noformat} > [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering > instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs > for doing operations when multiuser-secure-mode-enabled is set to true. 
> at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) > > at > org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) > > at > org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) > > at > org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) > > at > org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) > at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) > at > org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > > at > org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) > > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) 
> at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) > > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) > > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) > > at com.intel
[jira] [Updated] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
[ https://issues.apache.org/jira/browse/GEODE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9647: --- Description: When multiuserSecureModeEnabled is set, a user may attempt to register a DataSerializer, but will get the following error. The reason is that the PoolImpl needs credentials to authenticate against, which it does not have. {noformat} [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) at org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) at org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) at org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) at org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) at org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) {noformat} was: When multiuserSecureModeEnabled is set, a user may attempt to register a DataSerializer, but will get 
the following error. The reason is that the PoolImpl needs credentials to authenticate against, which it does not have. {noformat} [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering instantiator on pool:[warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:3
[jira] [Created] (GEODE-9647) MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer.
Mark Hanson created GEODE-9647: -- Summary: MultiUserAuth: DataSerializer.Register throws when attempting to register a new DataSerializer. Key: GEODE-9647 URL: https://issues.apache.org/jira/browse/GEODE-9647 Project: Geode Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Mark Hanson When multiuserSecureModeEnabled is set, a user may attempt to register a DataSerializer, but will get the following error. The reason is that the PoolImpl needs credentials to authenticate against, which it does not have. {noformat} [warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering instantiator on pool:[warn 2021/09/28 10:32:42.470 PDT tid=0x1] Error registering instantiator on pool:java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:800) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:34) at org.apache.geode.internal.cache.PoolManagerImpl.allPoolsRegisterDataSerializers(PoolManagerImpl.java:264) at org.apache.geode.internal.InternalDataSerializer.sendRegistrationMessageToServers(InternalDataSerializer.java:1197) at org.apache.geode.internal.InternalDataSerializer._register(InternalDataSerializer.java:1093) at org.apache.geode.internal.InternalDataSerializer.register(InternalDataSerializer.java:966) at org.apache.geode.DataSerializer.register(DataSerializer.java:2900) at org.apache.geode.management.internal.security.MultiUserAuthenticationDUnitTest.multiAuthenticatedView(MultiUserAuthenticationDUnitTest.java:152) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.apache.geode.test.junit.rules.DescribedExternalResource$1.evaluate(DescribedExternalResource.java:40) at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:139) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails
[ https://issues.apache.org/jira/browse/GEODE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9645: --- Description: When multiuserSecureModeEnabled is enabled, a user may register a DataSerializer. When the endpoint manager detects a new endpoint, it will attempt to register the data serializers with other machines. This is a problem because there is no authentication information in the background process to authenticate. Hence the error seen below. {noformat} [warn 2021/09/27 18:03:02.804 PDT tid=0x62] DataSerializerRecoveryTask - Error recovering dataSerializers: java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) at org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){noformat} was: When multiuserSecureModeEnabled is enabled, a user may register a DataSerializer. When the endpoint manager detects a new endpoint, it will attempt to register the data serializers with other machines. This is a problem because there is no authentication information in the background process to authenticate. Hence the error seen below. 
{noformat} [warn 2021/09/27 18:03:02.804 PDT tid=0x62] DataSerializerRecoveryTask - Error recovering dataSerializers: [warn 2021/09/27 18:03:02.804 PDT tid=0x62] DataSerializerRecoveryTask - Error recovering dataSerializers: java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) at org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){noformat} > MultiUserAuth: DataSerializerRecoveryListener is called without auth > information. Promptly fails > > > Key: GEODE-9645 > URL: https://issues.apache.org/jira/browse/GEODE-9645 > Project: Geode > Issue Type: Bug > Components: core >Reporter: Mark Hanson >Priority: Major > Labels: needsTriage > > When multiuserSecureModeEnabled is enabled, a user may register a > DataSerializer. When the endpoint manager detects a new endpoint, it will attempt > to register the data serializers with other machines. This is a problem because > there is no authentication information in the background process to > authenticate. Hence the error seen below. > > {noformat} > [warn 2021/09/27 18:03:02.804 PDT tid=0x62] > DataSerializerRecoveryTask - Error recovering dataSerializers: > java.lang.UnsupportedOperationException: Use Pool APIs for doing operations > when multiuser-secure-mode-enabled is set to true. 
> at > org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) > > at > org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) > at > org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) > > at > org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) > > at > org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) > > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-9645) MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails
Mark Hanson created GEODE-9645: -- Summary: MultiUserAuth: DataSerializerRecoveryListener is called without auth information. Promptly fails Key: GEODE-9645 URL: https://issues.apache.org/jira/browse/GEODE-9645 Project: Geode Issue Type: Bug Components: core Reporter: Mark Hanson When multiuserSecureModeEnabled is enabled, a user may register a DataSerializer. When the endpoint manager detects a new endpoint, it will attempt to register the data serializers with other machines. This is a problem because there is no authentication information in the background process to authenticate. Hence the error seen below. {noformat} [warn 2021/09/27 18:03:02.804 PDT tid=0x62] DataSerializerRecoveryTask - Error recovering dataSerializers: [warn 2021/09/27 18:03:02.804 PDT tid=0x62] DataSerializerRecoveryTask - Error recovering dataSerializers: java.lang.UnsupportedOperationException: Use Pool APIs for doing operations when multiuser-secure-mode-enabled is set to true. at org.apache.geode.cache.client.internal.PoolImpl.authenticateIfRequired(PoolImpl.java:1540) at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:816) at org.apache.geode.cache.client.internal.RegisterDataSerializersOp.execute(RegisterDataSerializersOp.java:40) at org.apache.geode.cache.client.internal.DataSerializerRecoveryListener$RecoveryTask.run2(DataSerializerRecoveryListener.java:116) at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1337) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds
[ https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9365. Fix Version/s: 1.15.0 Resolution: Fixed This fix includes changes to reduce the number of semaphores used in the HARegionQueue while waiting for permission to add more items into the queue. This reduced two semaphores to one and switched several fields to Atomics. > HARegionQueue over throttles when multiple threads attempt concurrent adds > -- > > Key: GEODE-9365 > URL: https://issues.apache.org/jira/browse/GEODE-9365 > Project: Geode > Issue Type: Bug > Components: client queues >Reporter: Darrel Schneider >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI, pull-request-available > Fix For: 1.15.0 > > > HARegionQueue.checkQueueSizeConstraint has some code that implements a > "throttle" on adds to a queue that is full. It is supposed to wait > "eventEnqueueWaitTime" before doing an add. But because this code does two > syncs (putGuard and permitMon) and only waits on one of them, it holds the > other sync for the duration of this thread's throttle. Any other concurrent > thread trying to add to the queue gets stuck on the putGuard sync that is > held by the first thread that is doing the timed wait. So it ends up waiting > "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting > again "eventEnqueueWaitTime" when it does its own timed wait. If you have 10 > concurrent threads trying to add, one of them will end up waiting 10 * > "eventEnqueueWaitTime". > A couple of ideas for how to fix this. Get rid of the putGuard and just use > permitMon. Then as soon as the first thread goes into its timed wait another > thread is allowed to sync on permitMon. But if this is done then we need to > think carefully about the code inside this sync block since it can not be > executed while one or more other threads are waiting in permitMon. 
> The other solution would be to compute the elapsed time it took to get into > the first sync and subtract that from the time we wait on permitMon. This > seems like a simple solution but does introduce at least one call of get time > (the second call is only needed if the queue is full). -- This message was sent by Atlassian Jira (v8.3.4#803005)
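The second idea in GEODE-9365 above (subtract the time already spent acquiring the first sync from the timed wait) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code from the actual fix, which according to the resolution note reduced two semaphores to one. The class `ThrottleBudget`, its method names, and the two monitor fields are hypothetical stand-ins for the `putGuard` and `permitMon` syncs described in the issue.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from HARegionQueue itself.
final class ThrottleBudget {

  // Stand-ins for the two syncs named in the issue description.
  private final Object putGuard = new Object();
  private final Object permitMon = new Object();

  /**
   * Returns the part of budgetMillis that has not yet been used up
   * between startNanos and nowNanos, never less than zero.
   */
  static long remainingWaitMillis(long budgetMillis, long startNanos, long nowNanos) {
    long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(nowNanos - startNanos);
    return Math.max(0L, budgetMillis - elapsedMillis);
  }

  /**
   * Waits at most budgetMillis in total, counting the time spent blocked
   * on putGuard against the budget instead of waiting the full time again.
   */
  void throttledWait(long budgetMillis) throws InterruptedException {
    long start = System.nanoTime();
    synchronized (putGuard) {
      synchronized (permitMon) {
        long remaining = remainingWaitMillis(budgetMillis, start, System.nanoTime());
        // Object.wait(0) means "wait forever", so skip the wait when nothing is left.
        if (remaining > 0) {
          permitMon.wait(remaining);
        }
      }
    }
  }
}
```

With this shape, a thread that already spent the whole budget blocked on the outer sync does not wait a second time, so the worst case for N concurrent adders stays near one `eventEnqueueWaitTime` rather than N times that value. The sketch ignores spurious wakeups and the queue-full check, both of which real code would need.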
[jira] [Created] (GEODE-9603) BlockingHARegionJUnitTest is in need of a refactor. It is poorly written by current standards.
Mark Hanson created GEODE-9603: -- Summary: BlockingHARegionJUnitTest is in need of a refactor. It is poorly written by current standards. Key: GEODE-9603 URL: https://issues.apache.org/jira/browse/GEODE-9603 Project: Geode Issue Type: Improvement Components: tests Affects Versions: 1.15.0 Reporter: Mark Hanson Both exception and thread handling could use some modernization... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail
[ https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9554. Fix Version/s: 1.15.0 1.14.1 1.13.5 1.12.5 Resolution: Fixed The fix for this issue was to ensure that we do not delete the last copy of a bucket in a redundancy zone. > Rebalancing a region with multiple redundancy zones can fail > > > Key: GEODE-9554 > URL: https://issues.apache.org/jira/browse/GEODE-9554 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > Fix For: 1.12.5, 1.13.5, 1.14.1, 1.15.0 > > > When attempting to rebalance a region with multiple redundancy zones, the > code does not distinguish between zones when deleting redundant bucket > copies. This can mean that a bucket from a different zone gets deleted > leaving the servers in a state of reduced redundancy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
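The rule stated in the GEODE-9554 resolution above (never delete the last copy of a bucket in a redundancy zone) can be illustrated with a small check. This is a sketch under assumptions, not the logic in Geode's rebalancing code: the class `ZoneAwareRemoval`, the method `canRemoveCopy`, and the representation of members and zones as plain strings are all hypothetical. It also assumes that a member with no configured zone is unconstrained, which the issue text does not specify.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: members and zones are modelled as plain strings.
final class ZoneAwareRemoval {

  /**
   * Returns true if the bucket copy hosted on candidate may be removed,
   * that is, if at least one other copy of the same bucket remains in
   * the candidate's redundancy zone afterwards.
   *
   * @param candidate member whose copy would be deleted
   * @param hostingMembers all members currently hosting a copy of the bucket
   * @param zoneOfMember redundancy zone of each member (absent = no zone)
   */
  static boolean canRemoveCopy(String candidate, List<String> hostingMembers,
      Map<String, String> zoneOfMember) {
    String zone = zoneOfMember.get(candidate);
    if (zone == null) {
      // Assumption: a member without a zone is not protected by this rule.
      return true;
    }
    long copiesInZone = hostingMembers.stream()
        .filter(member -> zone.equals(zoneOfMember.get(member)))
        .count();
    // More than one copy in the zone means one survives the removal.
    return copiesInZone > 1;
  }
}
```

For example, with copies on two members in zone A and one member in zone B, either zone A copy is removable, while the single zone B copy is not.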
[jira] [Commented] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail
[ https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412119#comment-17412119 ] Mark Hanson commented on GEODE-9554: I have a new fix that should address this issue better, plus some additional testing. The core change is to make sure that we don't delete the last copy in a redundancy zone. > Rebalancing a region with multiple redundancy zones can fail > > > Key: GEODE-9554 > URL: https://issues.apache.org/jira/browse/GEODE-9554 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > > When attempting to rebalance a region with multiple redundancy zones, the > code does not distinguish between zones when deleting redundant bucket > copies. This can mean that a bucket from a different zone gets deleted > leaving the servers in a state of reduced redundancy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-9584) PartitionedRegionLoadModel.createRedundantBucket
Mark Hanson created GEODE-9584: -- Summary: PartitionedRegionLoadModel.createRedundantBucket Key: GEODE-9584 URL: https://issues.apache.org/jira/browse/GEODE-9584 Project: Geode Issue Type: Improvement Components: tests Affects Versions: 1.15.0 Reporter: Mark Hanson PartitionedRegionLoadModel.createRedundantBucket needs unit testing. It does not currently have any JUnit tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail
[ https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408994#comment-17408994 ] Mark Hanson commented on GEODE-9554: I have a new fix for this that will address all of the known cases, but the problem right now is getting the test to pass consecutively. I am going to get some help on this. > Rebalancing a region with multiple redundancy zones can fail > > > Key: GEODE-9554 > URL: https://issues.apache.org/jira/browse/GEODE-9554 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > > When attempting to rebalance a region with multiple redundancy zones, the > code does not distinguish between zones when deleting redundant bucket > copies. This can mean that a bucket from a different zone gets deleted > leaving the servers in a state of reduced redundancy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail
[ https://issues.apache.org/jira/browse/GEODE-9554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9554: -- Assignee: Mark Hanson > Rebalancing a region with multiple redundancy zones can fail > > > Key: GEODE-9554 > URL: https://issues.apache.org/jira/browse/GEODE-9554 > Project: Geode > Issue Type: Bug > Components: core >Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0 >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > > When attempting to rebalance a region with multiple redundancy zones, the > code does not distinguish between zones when deleting redundant bucket > copies. This can mean that a bucket from a different zone gets deleted > leaving the servers in a state of reduced redundancy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-9554) Rebalancing a region with multiple redundancy zones can fail
Mark Hanson created GEODE-9554: -- Summary: Rebalancing a region with multiple redundancy zones can fail Key: GEODE-9554 URL: https://issues.apache.org/jira/browse/GEODE-9554 Project: Geode Issue Type: Bug Components: core Affects Versions: 1.13.4, 1.12.4, 1.14.0, 1.15.0 Reporter: Mark Hanson When attempting to rebalance a region with multiple redundancy zones, the code does not distinguish between zones when deleting redundant bucket copies. This can mean that a bucket from a different zone gets deleted leaving the servers in a state of reduced redundancy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (GEODE-9520) create index gfsh command behavior seems inconsistent with other commands
[ https://issues.apache.org/jira/browse/GEODE-9520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson updated GEODE-9520: --- Description: The way the command currently works, it uses the group of the region to determine the target members to perform the operation on. When you create a region, it doesn't need the group or anything, it just works. Seems like create index should be able to find the region on a member without an issue. was: When you create a cluster from an XML with no group specified, then you try to create an index in a subregion that exists, the command will error out with "Region root/transRegion does not exist." The region exists, but in the background the command is using the group as a way to find the target members; because there is no group, there are no target members and the command cannot complete. The problem is that the error message is not good. I suggest changing the error message to say something more useful like "Could not find the region abc in a group. Please specify a target member or target members." Summary: create index gfsh command behavior seems inconsistent with other commands (was: Error message is not useful when trying to create index after using a cache xml for startup) > create index gfsh command behavior seems inconsistent with other commands > - > > Key: GEODE-9520 > URL: https://issues.apache.org/jira/browse/GEODE-9520 > Project: Geode > Issue Type: Bug > Components: gfsh >Affects Versions: 1.12.4, 1.13.4, 1.14.0, 1.15.0 >Reporter: Mark Hanson >Priority: Major > > The way the command currently works, it uses the group of the region > to determine the target members to perform the operation on. When you create > a region, it doesn't need the group or anything, it just works. Seems like > create index should be able to find the region on a member without an issue. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (GEODE-9520) Error message is not useful when trying to create index after using a cache xml for startup
Mark Hanson created GEODE-9520: -- Summary: Error message is not useful when trying to create index after using a cache xml for startup Key: GEODE-9520 URL: https://issues.apache.org/jira/browse/GEODE-9520 Project: Geode Issue Type: Bug Components: gfsh Affects Versions: 1.13.4, 1.12.4, 1.14.0, 1.15.0 Reporter: Mark Hanson When you create a cluster from an XML file with no group specified and then try to create an index in a subregion that exists, the command errors out with "Region root/transRegion does not exist." The region exists, but in the background the command uses the group to find the target members; because there is no group, there are no target members and the command cannot complete. The problem is that the error message is not helpful. I suggest changing the error message to say something more useful, like "Could not find the region abc in a group. Please specify a target member or target members."
[jira] [Commented] (GEODE-9490) CI failure: NativeRedisSessionAcceptanceTest > executionError
[ https://issues.apache.org/jira/browse/GEODE-9490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17397481#comment-17397481 ] Mark Hanson commented on GEODE-9490: Ignore ^ build 108 failure note. > CI failure: NativeRedisSessionAcceptanceTest > executionError > - > > Key: GEODE-9490 > URL: https://issues.apache.org/jira/browse/GEODE-9490 > Project: Geode > Issue Type: Test > Components: redis, tests >Reporter: Jens Deppe >Assignee: Jens Deppe >Priority: Major > Labels: pull-request-available > > {noformat} > NativeRedisSessionAcceptanceTest > executionError FAILED > java.lang.AssertionError: Suspicious strings were written to the log > during this run. > Fix the strings or use IgnoredException.addIgnoredException to ignore. > --- > Found suspect string in 'dunit_suspect-local.log' at line 1611 > [error 2021/08/05 23:35:01.484 UTC tid=78] Failed to > return response on inboundChannel > io.netty.channel.StacklessClosedChannelException > at io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, > ChannelPromise)(Unknown Source) > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409) > at > org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) > at > org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > at > 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485) > at > org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82) > at > org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) > at > org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) > at > org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) > at > 
org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79) > at > org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProce
[jira] [Resolved] (GEODE-9194) Move PR clear related statistics to the appropriate classes
[ https://issues.apache.org/jira/browse/GEODE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson resolved GEODE-9194. Fix Version/s: 1.15.0 Assignee: Mark Hanson Resolution: Fixed This has been merged to feature/GEODE-7665 > Move PR clear related statistics to the appropriate classes > --- > > Key: GEODE-9194 > URL: https://issues.apache.org/jira/browse/GEODE-9194 > Project: Geode > Issue Type: New Feature > Components: statistics >Reporter: Mark Hanson >Assignee: Mark Hanson >Priority: Major > Labels: pull-request-available > Fix For: 1.15.0 > > > Currently there are PR clear statistics that are not a part of the > Partitioned Region Stats. This feature work is to track the movement of those > stats.
[jira] [Commented] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds
[ https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378311#comment-17378311 ] Mark Hanson commented on GEODE-9365: I am testing the "other solution", which seems like a simple patch. I think the larger questions are interesting, and I will need to investigate them further. > HARegionQueue over throttles when multiple threads attempt concurrent adds > -- > > Key: GEODE-9365 > URL: https://issues.apache.org/jira/browse/GEODE-9365 > Project: Geode > Issue Type: Bug > Components: client queues >Reporter: Darrel Schneider >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI > > HARegionQueue.checkQueueSizeConstraint has some code that implements a > "throttle" on adds to a queue that is full. It is supposed to wait > "eventEnqueueWaitTime" before doing an add. But because this code does two > syncs (putGuard and permitMon) and only waits on one of them, it holds the > other sync for the duration of this thread's throttle. Any other concurrent > thread trying to add to the queue gets stuck on the putGuard sync that is > held by the first thread that is doing the timed wait. So it ends up waiting > "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting > "eventEnqueueWaitTime" again when it does its own timed wait. If you have 10 > concurrent threads trying to add, one of them will end up waiting 10 * > "eventEnqueueWaitTime". > Here are a couple of ideas for how to fix this. Get rid of the putGuard and just use > permitMon. Then, as soon as the first thread goes into its timed wait, another > thread is allowed to sync on permitMon. But if this is done, then we need to > think carefully about the code inside this sync block, since it currently cannot be > executed while one or more other threads are waiting in permitMon. > The other solution would be to compute the elapsed time it took to get into > the first sync and subtract that from the time we wait on permitMon. 
This > seems like a simple solution, but it does introduce at least one call to get the time > (the second call is only needed if the queue is full).
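The "other solution" described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual HARegionQueue code: the class ThrottleSketch, the helper remainingWaitMillis, and the capacity and size fields are all hypothetical, and only the names putGuard, permitMon, and eventEnqueueWaitTime come from the issue text. The idea is to record the time before taking the first sync and, only if the queue is full, read the clock a second time and shorten the timed wait by whatever was already spent blocked on putGuard.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of an elapsed-time-adjusted throttle; not Geode code. */
final class ThrottleSketch {
  private final Object putGuard = new Object();   // name taken from the issue text
  private final Object permitMon = new Object();  // name taken from the issue text
  private final long eventEnqueueWaitTimeMillis;  // assumed to be in milliseconds
  private final int capacity;                     // hypothetical queue limit
  private int size;                               // guarded by permitMon

  ThrottleSketch(int capacity, long eventEnqueueWaitTimeMillis) {
    this.capacity = capacity;
    this.eventEnqueueWaitTimeMillis = eventEnqueueWaitTimeMillis;
  }

  /** Wait budget left after subtracting the time already spent, never negative. */
  static long remainingWaitMillis(long budgetMillis, long elapsedNanos) {
    long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    return Math.max(0L, budgetMillis - elapsedMillis);
  }

  /** Adds one element, throttling for at most the configured wait time in total. */
  void add() throws InterruptedException {
    long start = System.nanoTime(); // first clock read, always needed
    synchronized (putGuard) {
      synchronized (permitMon) {
        if (size >= capacity) {
          // Second clock read, only needed when the queue is full.
          long remaining =
              remainingWaitMillis(eventEnqueueWaitTimeMillis, System.nanoTime() - start);
          if (remaining > 0) { // wait(0) would block indefinitely, so skip it
            permitMon.wait(remaining);
          }
        }
        size++; // the add proceeds after the throttle, as in the description
      }
    }
  }

  /** Removes one element and wakes one throttled adder, if any. */
  void remove() {
    synchronized (permitMon) {
      if (size > 0) {
        size--;
        permitMon.notify();
      }
    }
  }

  int size() {
    synchronized (permitMon) {
      return size;
    }
  }
}
```

With this shape, a thread that already spent most of the wait time queued behind putGuard waits only for the remainder, so ten concurrent adders would each be delayed by roughly one wait period rather than up to ten. The sketch still serializes adders on putGuard, so it does not address the larger questions raised by the first idea.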
[jira] [Commented] (GEODE-8064) DeploymentSemanticVersionJarDUnitTest.java (GEODE-7421) is failing.
[ https://issues.apache.org/jira/browse/GEODE-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378295#comment-17378295 ] Mark Hanson commented on GEODE-8064: Another issue with this test {noformat} org.apache.geode.management.internal.rest.DeploymentSemanticVersionJarDUnitTest > deploySameJarNameWithDifferentContent FAILED java.lang.AssertionError: Suspicious strings were written to the log during this run. Fix the strings or use IgnoredException.addIgnoredException to ignore. --- Found suspect string in 'dunit_suspect-vm0.log' at line 763 [binary archive bytes omitted; readable fragments: dunit/function/Def.class, timestamp] --KpEt0WRhuP7_uJjerp4keHy2JOGeQ6 Content-Disposition: form-data; name="config" Content-Type: application/json at org.junit.Assert.fail(Assert.java:89) at org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:409) at org.apache.geode.test.dunit.internal.DUnitLauncher.closeAndCheckForSuspects(DUnitLauncher.java:425) at org.apache.geode.test.dunit.rules.ClusterStartupRule.after(ClusterStartupRule.java:186) at org.apache.geode.test.dunit.rules.ClusterStartupRule.access$100(ClusterStartupRule.java:70) at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:141) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:566) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:119) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:566) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(Refle
[jira] [Assigned] (GEODE-9365) HARegionQueue over throttles when multiple threads attempt concurrent adds
[ https://issues.apache.org/jira/browse/GEODE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hanson reassigned GEODE-9365: -- Assignee: Mark Hanson > HARegionQueue over throttles when multiple threads attempt concurrent adds > -- > > Key: GEODE-9365 > URL: https://issues.apache.org/jira/browse/GEODE-9365 > Project: Geode > Issue Type: Bug > Components: client queues >Reporter: Darrel Schneider >Assignee: Mark Hanson >Priority: Major > Labels: GeodeOperationAPI > > HARegionQueue.checkQueueSizeConstraint has some code that implements a > "throttle" on adds to a queue that is full. It is supposed to wait > "eventEnqueueWaitTime" before doing an add. But because this code does two > syncs (putGuard and permitMon) and only waits on one of them, it holds the > other sync for the duration of this thread's throttle. Any other concurrent > thread trying to add to the queue gets stuck on the putGuard sync that is > held by the first thread that is doing the timed wait. So it ends up waiting > "eventEnqueueWaitTime" to acquire the first sync and then ends up waiting > "eventEnqueueWaitTime" again when it does its own timed wait. If you have 10 > concurrent threads trying to add, one of them will end up waiting 10 * > "eventEnqueueWaitTime". > Here are a couple of ideas for how to fix this. Get rid of the putGuard and just use > permitMon. Then, as soon as the first thread goes into its timed wait, another > thread is allowed to sync on permitMon. But if this is done, then we need to > think carefully about the code inside this sync block, since it currently cannot be > executed while one or more other threads are waiting in permitMon. > The other solution would be to compute the elapsed time it took to get into > the first sync and subtract that from the time we wait on permitMon. This > seems like a simple solution, but it does introduce at least one call to get the time > (the second call is only needed if the queue is full). 