[ https://issues.apache.org/jira/browse/CASSANDRA-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303505#comment-17303505 ]
Sam Tunnicliffe edited comment on CASSANDRA-16493 at 3/17/21, 3:09 PM:
-----------------------------------------------------------------------

With Java versions >= 9, Netty requires a system property to be set to enable the use of unsafe (i.e. without a {{Cleaner}}) direct buffer creation. Without this set, Netty buffer pools, which are currently still used for CQL Message encoding in both v4 and v5, fall back to using {{ByteBuffer.allocateDirect(int)}}, which has inferior performance and can lead to OOMs as seen here if we attempt to allocate more than the permitted maximum.

In this case, multiple instances are being started in the same JVM and each has its own PooledByteBufAllocator. Coupling this with the switch to protocol v5, which uses C*'s own buffer pool for the outer frame buffers, causes the direct memory allocation to exceed the limit when running this test.

Setting the system property {{io.netty.tryReflectionSetAccessible=true}} reverts to the same buffer creation method as Java 8, after which I don't see any OOMs when running the test. I've also inspected heap dumps taken while running the test under several configurations, which confirm this analysis.

|Patch| [here|https://github.com/apache/cassandra/compare/trunk...beobal:16493-trunk]|
|Circle CI| [here|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=16493-trunk]|
|Apache CI| [here|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/492/]|


was (Author: beobal):
With Java versions >= 9, Netty requires a system property to be set to enable the use of unsafe (i.e. without a {{Cleaner}}) direct buffer creation. Without this set, Netty buffer pools, which are currently still used for CQL Message encoding in both v4 and v5, fall back to using {{ByteBuffer.allocateDirect(int)}}, which has inferior performance and can lead to OOMs as seen here if we attempt to allocate more than the permitted maximum.

In this case, multiple instances are being started in the same JVM and each has its own PooledByteBufAllocator. Coupling this with the switch to protocol v5, which uses C*'s own buffer pool for the outer frame buffers, causes the direct memory allocation to exceed the limit when running this test.

Setting the system property {{io.netty.tryReflectionSetAccessible=true}} reverts to the same buffer creation method as Java 8, after which I don't see any OOMs when running the test. I've also inspected heap dumps taken while running the test under several configurations, which confirm this analysis.

|Patch| [here|https://github.com/apache/cassandra/compare/trunk...beobal:16493-trunk]|
|Circle CI| [here|https://app.circleci.com/pipelines/github/beobal/cassandra?branch=16493-trunk]|


> IPMembershipTest.sameIPFailWithoutReplace fails with timeout
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-16493
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16493
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: Adam Holmberg
>            Assignee: Sam Tunnicliffe
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>         Attachments: ci-failures.png
>
>
> https://ci-cassandra.apache.org/job/Cassandra-trunk/307/testReport/junit/org.apache.cassandra.distributed.test/IPMembershipTest/sameIPFailWithoutReplace/
> {noformat}
> java.lang.RuntimeException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/0:0:0:0:0:0:0:1:9042 (com.datastax.driver.core.exceptions.TransportException: [localhost/0:0:0:0:0:0:0:1:9042] Cannot connect), localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.OperationTimedOutException: [localhost/127.0.0.1:9042] Operation timed out))
>       at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:146)
>       at org.apache.cassandra.stress.settings.StressSettings.getJavaDriverClient(StressSettings.java:114)
>       at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:66)
>       at org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:154)
>       at org.apache.cassandra.stress.StressAction.run(StressAction.java:56)
>       at org.apache.cassandra.stress.Stress.run(Stress.java:155)
>       at org.apache.cassandra.stress.Stress.main(Stress.java:63)
> {noformat}
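
For readers who want to try the setting from the comment on CASSANDRA-16493, here is a minimal sketch of one way to apply {{io.netty.tryReflectionSetAccessible}} programmatically. The class and method names are hypothetical and are not taken from the linked patch. The sketch assumes that Netty reads the property when its platform classes are first initialized, so the call would have to run before any Netty class is touched. Passing {{-Dio.netty.tryReflectionSetAccessible=true}} on the JVM command line is the equivalent, and probably simpler, route.

```java
// Hypothetical helper for illustration; not part of Cassandra or Netty.
final class NettyReflectionAccess {
    // Property name as quoted in the Jira comment.
    static final String PROPERTY = "io.netty.tryReflectionSetAccessible";

    private NettyReflectionAccess() {}

    /**
     * Sets the property to "true" unless a value was already supplied
     * (for example via -Dio.netty.tryReflectionSetAccessible=false).
     * Returns the value that is in effect afterwards.
     */
    static String enableUnlessConfigured() {
        String current = System.getProperty(PROPERTY);
        if (current == null) {
            System.setProperty(PROPERTY, "true");
            return "true";
        }
        return current;
    }

    public static void main(String[] args) {
        // Assumption: this runs before any Netty class has been initialized.
        System.out.println(PROPERTY + "=" + enableUnlessConfigured());
    }
}
```

The helper leaves an explicitly configured value alone, so an operator who deliberately passed {{false}} on the command line keeps that choice.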