[ 
https://issues.apache.org/jira/browse/GEODE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483980#comment-17483980
 ] 

Hale Bales edited comment on GEODE-9845 at 1/28/22, 9:37 PM:
-------------------------------------------------------------

This failure is not a 1.15 release blocker. It appears to be a test issue. 

The goal of this test is to fill the lead server's memory to the point where 
it still functions but cannot accept any new commands (except deletes). The 
critical heap percentage is set to 5%, so we should easily reach the point 
where nothing more can be added, well before approaching the true limit of 
what the server can handle. In this case we went beyond that 5% and ran both 
servers out of memory, which caused wake-up delays and failed heartbeat 
requests until the servers eventually became suspect and were booted from the 
distributed system. The servers bounced around for a while trying to restart 
and rejoin, but never got back into a stable state.
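
As a rough illustration of the behaviour the test expects once the critical 
threshold is crossed, here is a minimal Java sketch. The class 
CriticalHeapGate, its Command enum, and the choice of which commands stay 
allowed are hypothetical and are not Geode's implementation; only the 5% 
threshold and the "OOM command not allowed" message come from this ticket.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: rejects memory-consuming commands above a critical
// heap percentage while still allowing commands that free memory.
class CriticalHeapGate {
  enum Command { SET, PUBLISH, SUBSCRIBE, DEL, EXPIRE }

  // Assumption for illustration: deletes and expirations remain allowed.
  private static final Set<Command> ALWAYS_ALLOWED =
      EnumSet.of(Command.DEL, Command.EXPIRE);

  private final double criticalHeapPercentage;

  CriticalHeapGate(double criticalHeapPercentage) {
    this.criticalHeapPercentage = criticalHeapPercentage;
  }

  // True when used heap is at or above the configured percentage of max heap.
  boolean isCritical(long usedBytes, long maxBytes) {
    return 100.0 * usedBytes / maxBytes >= criticalHeapPercentage;
  }

  // Returns the error text seen in this ticket when a command is rejected.
  String execute(Command command, long usedBytes, long maxBytes) {
    if (isCritical(usedBytes, maxBytes) && !ALWAYS_ALLOWED.contains(command)) {
      return "OOM command not allowed";
    }
    return "OK";
  }

  public static void main(String[] args) {
    CriticalHeapGate gate = new CriticalHeapGate(5.0);
    // 10 of 100 bytes used is above the 5% threshold.
    System.out.println(gate.execute(Command.SET, 10, 100));
    System.out.println(gate.execute(Command.DEL, 10, 100));
  }
}
```

The gate only works if the usage figures it sees are current, which is where 
the GC configuration matters.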

This test uses tenured heap GC, but it should use concurrent mark sweep 
because we are configuring the critical heap percentage. The JVM can ignore 
the tenured heap GC, so with it we may never be notified that we have reached 
the critical heap percentage. We then keep doing more sets until the JVM 
finally decides to run a GC, and that collection is large enough to make the 
servers fail their availability checks. I have attached my notes on the 
series of events from the logs.
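
For context on why such a notification can arrive late: the standard 
java.lang.management API lets a program set a usage threshold on a heap 
memory pool, and the JVM typically checks that threshold around garbage 
collection time. The sketch below only illustrates that API; 
HeapThresholdWatcher and its method names are hypothetical, and this is not 
necessarily how Geode's resource manager is implemented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical sketch: arm JVM usage thresholds on heap pools and listen
// for the "threshold exceeded" notification.
class HeapThresholdWatcher {

  // Converts a percentage of the pool maximum into a byte threshold.
  static long thresholdBytes(long maxBytes, double percent) {
    return (long) (maxBytes * percent / 100.0);
  }

  // Sets the threshold on every heap pool that supports it and has a
  // defined maximum; returns the names of the pools that were armed.
  static List<String> arm(double percent, NotificationListener listener) {
    List<String> armed = new ArrayList<>();
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      MemoryUsage usage = pool.getUsage();
      if (usage == null || pool.getType() != MemoryType.HEAP) {
        continue;
      }
      long max = usage.getMax();
      if (pool.isUsageThresholdSupported() && max > 0) {
        pool.setUsageThreshold(thresholdBytes(max, percent));
        armed.add(pool.getName());
      }
    }
    // The platform MemoryMXBean emits the threshold notifications.
    NotificationEmitter emitter =
        (NotificationEmitter) ManagementFactory.getMemoryMXBean();
    emitter.addNotificationListener(listener, null, null);
    return armed;
  }

  public static void main(String[] args) {
    List<String> pools = arm(5.0, (notification, handback) -> {
      if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED
          .equals(notification.getType())) {
        System.out.println("threshold crossed: " + notification.getMessage());
      }
    });
    System.out.println("armed pools: " + pools);
  }
}
```

Which pools get armed depends on the collector in use, so the output of 
main varies between JVM configurations.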




> CI failure: Multiple tests in OutOfMemoryDUnitTest failed with ConnectException
> -------------------------------------------------------------------------------
>
>                 Key: GEODE-9845
>                 URL: https://issues.apache.org/jira/browse/GEODE-9845
>             Project: Geode
>          Issue Type: Bug
>          Components: redis
>    Affects Versions: 1.15.0
>            Reporter: Kamilla Aslami
>            Assignee: Hale Bales
>            Priority: Major
>              Labels: needsTriage
>         Attachments: PXL_20220114_205141950.MP.jpg, PXL_20220114_205147440.jpg, PXL_20220114_205152268.jpg, PXL_20220114_205156514.jpg
>
>
> Four tests in OutOfMemoryDUnitTest failed with `java.net.ConnectException: Connection refused`.
> {noformat}
> OutOfMemoryDUnitTest > shouldAllowDeleteOperations_afterThresholdReached FAILED
>     java.lang.AssertionError:
>     Expecting throwable message:
>       "No more cluster attempts left."
>     to contain:
>       "OOM command not allowed"
>     but did not.
>     Throwable that failed the check:
>     redis.clients.jedis.exceptions.JedisClusterMaxAttemptsException: No more cluster attempts left.
>       at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:156)
>       at redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:45)
>       at redis.clients.jedis.JedisCluster.set(JedisCluster.java:293)
>       at org.apache.geode.redis.OutOfMemoryDUnitTest.setRedisKeyAndValue(OutOfMemoryDUnitTest.java:228)
>       at org.apache.geode.redis.OutOfMemoryDUnitTest.lambda$addMultipleKeys$5(OutOfMemoryDUnitTest.java:212)
>       at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:62)
>       at org.assertj.core.api.AssertionsForClassTypes.catchThrowable(AssertionsForClassTypes.java:877)
>       at org.apache.geode.redis.OutOfMemoryDUnitTest.addMultipleKeys(OutOfMemoryDUnitTest.java:210)
>       at org.apache.geode.redis.OutOfMemoryDUnitTest.fillMemory(OutOfMemoryDUnitTest.java:201)
>       at org.apache.geode.redis.OutOfMemoryDUnitTest.shouldAllowDeleteOperations_afterThresholdReached(OutOfMemoryDUnitTest.java:166)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>       at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>       at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>       at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>       at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:138)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
>       at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
>       at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>       at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>       at java.util.Iterator.forEachRemaining(Iterator.java:116)
>       at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>       at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>       at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>       at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>       at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
>       at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
>       at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
>       at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
>       at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
>       at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
>       at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
>       at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
>       at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
>       at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.processAllTestClasses(JUnitPlatformTestClassProcessor.java:99)
>       at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor$CollectAllTestClassesExecutor.access$000(JUnitPlatformTestClassProcessor.java:79)
>       at org.gradle.api.internal.tasks.testing.junitplatform.JUnitPlatformTestClassProcessor.stop(JUnitPlatformTestClassProcessor.java:75)
>       at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>       at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
>       at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
>       at com.sun.proxy.$Proxy2.stop(Unknown Source)
>       at org.gradle.api.internal.tasks.testing.worker.TestWorker.stop(TestWorker.java:133)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>       at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:182)
>       at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:164)
>       at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:414)
>       at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
>       at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:48)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:56)
>       at java.lang.Thread.run(Thread.java:748)
>       Suppressed: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
>         at redis.clients.jedis.util.Pool.getResource(Pool.java:84)
>         at redis.clients.jedis.JedisPool.getResource(JedisPool.java:370)
>         at redis.clients.jedis.JedisSlotBasedConnectionHandler.getConnectionFromSlot(JedisSlotBasedConnectionHandler.java:129)
>         at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:118)
>         ... 86 more
>       Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Failed to create socket.
>         at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:110)
>         at redis.clients.jedis.Connection.connect(Connection.java:226)
>         at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:135)
>         at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:309)
>         at redis.clients.jedis.BinaryJedis.initializeFromClientConfig(BinaryJedis.java:87)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:292)
>         at redis.clients.jedis.Jedis.<init>(Jedis.java:167)
>         at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:177)
>         at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:918)
>         at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:431)
>         at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:356)
>         at redis.clients.jedis.util.Pool.getResource(Pool.java:75)
>         ... 89 more
>       Caused by: java.net.ConnectException: Connection refused (Connection refused)
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:607)
>         at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:80)
>         ... 100 more
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.addMultipleKeys(OutOfMemoryDUnitTest.java:218)
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.fillMemory(OutOfMemoryDUnitTest.java:201)
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.shouldAllowDeleteOperations_afterThresholdReached(OutOfMemoryDUnitTest.java:166)
> {noformat}
> {noformat}
> OutOfMemoryDUnitTest > shouldAllowExpiration_afterThresholdReached FAILED
>     redis.clients.jedis.exceptions.JedisConnectionException: Failed to create socket.
>         at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:110)
>         at redis.clients.jedis.Connection.connect(Connection.java:226)
>         at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:135)
>         at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:309)
>         at redis.clients.jedis.BinaryJedis.initializeFromClientConfig(BinaryJedis.java:87)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:82)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:77)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:147)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:132)
>         at redis.clients.jedis.Jedis.<init>(Jedis.java:72)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:132)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:127)
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.testSetup(OutOfMemoryDUnitTest.java:87)
>         Caused by:
>         java.net.ConnectException: Connection refused (Connection refused)
>             at java.net.PlainSocketImpl.socketConnect(Native Method)
>             at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>             at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>             at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>             at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>             at java.net.Socket.connect(Socket.java:607)
>             at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:80)
>             ... 12 more
> {noformat}
> {noformat}
> OutOfMemoryDUnitTest > shouldReturnOOMError_forSubscribe_whenThresholdReached FAILED
>     redis.clients.jedis.exceptions.JedisConnectionException: Failed to create socket.
>         at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:110)
>         at redis.clients.jedis.Connection.connect(Connection.java:226)
>         at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:135)
>         at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:309)
>         at redis.clients.jedis.BinaryJedis.initializeFromClientConfig(BinaryJedis.java:87)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:82)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:77)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:147)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:132)
>         at redis.clients.jedis.Jedis.<init>(Jedis.java:72)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:132)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:127)
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.testSetup(OutOfMemoryDUnitTest.java:87)
>         Caused by:
>         java.net.ConnectException: Connection refused (Connection refused)
>             at java.net.PlainSocketImpl.socketConnect(Native Method)
>             at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>             at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>             at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>             at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>             at java.net.Socket.connect(Socket.java:607)
>             at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:80)
>             ... 12 more
> {noformat}
> {noformat}
> OutOfMemoryDUnitTest > shouldReturnOOMError_forPublish_whenThresholdReached FAILED
>     redis.clients.jedis.exceptions.JedisConnectionException: Failed to create socket.
>         at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:110)
>         at redis.clients.jedis.Connection.connect(Connection.java:226)
>         at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:135)
>         at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:309)
>         at redis.clients.jedis.BinaryJedis.initializeFromClientConfig(BinaryJedis.java:87)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:82)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:77)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:147)
>         at redis.clients.jedis.BinaryJedis.<init>(BinaryJedis.java:132)
>         at redis.clients.jedis.Jedis.<init>(Jedis.java:72)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:132)
>         at org.apache.geode.test.dunit.rules.RedisClusterStartupRule.flushAll(RedisClusterStartupRule.java:127)
>         at org.apache.geode.redis.OutOfMemoryDUnitTest.testSetup(OutOfMemoryDUnitTest.java:87)
>         Caused by:
>         java.net.ConnectException: Connection refused (Connection refused)
>             at java.net.PlainSocketImpl.socketConnect(Native Method)
>             at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>             at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>             at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>             at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>             at java.net.Socket.connect(Socket.java:607)
>             at redis.clients.jedis.DefaultJedisSocketFactory.createSocket(DefaultJedisSocketFactory.java:80)
>             ... 12 more
> {noformat}



