[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191450#comment-16191450 ]
Andrey Lataev commented on CASSANDRA-13931:
-------------------------------------------

As you can see in the attached cassandra-env.sh file, the line
{code:java}
JVM_OPTS="$JVM_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}
is present. I will try to add more RAM and increase the heap size to 16 GB.

Eclipse Memory Analyzer reports the following top 3 problem suspects for the heap dump:

*Problem Suspect 1*
{code:java}
The thread org.apache.cassandra.net.OutboundTcpConnection @ 0x6cd263100 MessagingService-Outgoing-p00skimnosql10.00.egov.local/172.20.4.148-Large keeps local variables with total size 306 114 312 (13,97%) bytes.
The memory is accumulated in one instance of "org.apache.cassandra.net.OutboundTcpConnection" loaded by "sun.misc.Launcher$AppClassLoader @ 0x6c0000000".
{code}

*Problem Suspect 2*
{code:java}
529 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by "sun.misc.Launcher$AppClassLoader @ 0x6c0000000" occupy 776 362 840 (35,43%) bytes.
Biggest instances:
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e1e0 epollEventLoopGroup-2-7 - 156 689 680 (7,15%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e7e0 epollEventLoopGroup-2-3 - 125 567 112 (5,73%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719da60 epollEventLoopGroup-2-12 - 119 599 160 (5,46%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6ceab17b0 epollEventLoopGroup-2-1 - 118 469 632 (5,41%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d7059b00 ReadStage-151 - 66 494 040 (3,03%) bytes.
{code}

*Problem Suspect 3*
{code:java}
126 instances of "byte[]", loaded by "<system class loader>" occupy 268 549 640 (12,26%) bytes.
These instances are referenced from one instance of "java.util.HashMap$Node[]", loaded by "<system class loader>"
Keywords
byte[]
java.util.HashMap$Node[]
{code}

> Cassandra JVM stop itself randomly
> ----------------------------------
>
> Key: CASSANDRA-13931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: RHEL 7.3
> JDK HotSpot 1.8.0_121-b13
> cassandra-3.11 cluster with 43 nodes in 9 datacenters
> 8vCPU, 32 GB RAM
> Reporter: Andrey Lataev
> Attachments: cassandra-env.sh, cassandra.yaml, system.log.2017-10-01.zip
>
>
> Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like:
> # # grep "Out of" /var/log/messages-20170918
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child
> With the -XX:MaxDirectMemorySize=5G limit set, I periodically get:
> HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof
> It seems the JVM kills itself when off-heap memory leaks occur.
> Typical errors in system.log before the JVM begins dumping:
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> Full stack traces:
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> java.lang.AssertionError: null
>         at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.0.jar:3.11.0]
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>         at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
> Heap dump file created
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> java.io.IOError: java.io.EOFException: Stream ended prematurely
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.0.jar:3.11.0]
> Caused by: java.io.EOFException: Stream ended prematurely
>         at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) ~[lz4-1.3.0.jar:na]
>         at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) ~[lz4-1.3.0.jar:na]
>         at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) ~[lz4-1.3.0.jar:na]
>         at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[na:1.8.0_121]
>         at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[na:1.8.0_121]
>         at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:639) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:604) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.Columns.apply(Columns.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.11.0.jar:3.11.0]
>         ... 11 common frames omitted
> I also tried setting -XX:+ExplicitGCInvokesConcurrent on some other nodes, but without success.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org