[ https://issues.apache.org/jira/browse/CASSANDRA-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17382253#comment-17382253 ]
Richard Hesse commented on CASSANDRA-16764:
-------------------------------------------

It was a good try, but no joy. We're getting the original heap error, but this time on the splitting side:
{noformat}
$ sstablesplit --size 1700 --no-snapshot md-5357-big-Data.db
WARN 17:55:50 Live sstable {REDACTED}/md-5357-big-Data.db from level 3 is not on corresponding level in the leveled manifest. This is not a problem per se, but may indicate an orphaned sstable due to a failed compaction not cleaned up properly.
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.cassandra.io.util.DataOutputBuffer.expandToFit(DataOutputBuffer.java:159)
	at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119)
	at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
	at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
	...
{noformat}
Going back to my scrub comment from earlier, it might be worth considering. If Cassandra can't compact or split the data, it's probably worth considering that data corrupt from an operational standpoint.

> Compaction repeatedly fails validateReallocation exception
> ----------------------------------------------------------
>
>          Key: CASSANDRA-16764
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-16764
>      Project: Cassandra
>   Issue Type: Bug
>   Components: Local/Compaction/LCS
>     Reporter: Richard Hesse
>     Priority: Normal
>
> I have a few nodes in my ring that are stuck repeatedly trying to compact the
> same tables over and over again. I've run through the usual trick of rolling
> restarts, and it doesn't seem to help.
> This exception is logged on the nodes:
> {code}
> ERROR [CompactionExecutor:6] 2021-06-25 20:28:30,001 CassandraDaemon.java:244 - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.RuntimeException: null
>     at org.apache.cassandra.io.util.DataOutputBuffer.validateReallocation(DataOutputBuffer.java:134) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.calculateNewSize(DataOutputBuffer.java:152) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.expandToFit(DataOutputBuffer.java:159) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:426) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.serializeValuesWithoutSize(ClusteringPrefix.java:323) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Clustering$Serializer.serialize(Clustering.java:131) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.serialize(ClusteringPrefix.java:266) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Serializers$NewFormatSerializer.serialize(Serializers.java:167) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Serializers$NewFormatSerializer.serialize(Serializers.java:154) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:103) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:82) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.addIndexBlock(ColumnIndex.java:216) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.add(ColumnIndex.java:264) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:111) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:136) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:98) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:143) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:204) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:272) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_292]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_292]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_292]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_292]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.10.jar:3.11.10]
>     at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]
> {code}
> Watching the compaction progress, it looks like it makes it through, dies,
> then starts over again. This process repeats forever.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org