[ https://issues.apache.org/jira/browse/CASSANDRA-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370776#comment-17370776 ]
David Capwell commented on CASSANDRA-16764:
-------------------------------------------

This is what I see in the code:
{code}
static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
...
long validateReallocation(long newSize)
{
    int saturatedSize = saturatedArraySizeCast(newSize);
    if (saturatedSize <= capacity())
        throw new RuntimeException();
    return saturatedSize;
}
{code}
This is called by
{code}
long calculateNewSize(long count)
{
    long capacity = capacity();
    // Both sides of this max expression need to use long arithmetic to avoid integer overflow;
    // count and capacity are longs, so that ensures it right now.
    long newSize = capacity + count;
    // For large buffers don't double, increase by 50%
    if (capacity > 1024L * 1024L * DOUBLING_THRESHOLD)
        newSize = Math.max((capacity * 3L) / 2L, newSize);
    else
        newSize = Math.max(capacity * 2L, newSize);
    return validateReallocation(newSize);
}
{code}
which in turn is called by
{code}
protected void expandToFit(long count)
{
    if (count <= 0)
        return;
    ByteBuffer newBuffer = ByteBuffer.allocate(checkedArraySizeCast(calculateNewSize(count)));
    buffer.flip();
    newBuffer.put(buffer);
    buffer = newBuffer;
}
{code}
This implies to me that the buffer is already at MAX_ARRAY_SIZE, at which point it can no longer expand (we should return a better error in this case). I say this because calculateNewSize should not be able to return a value < capacity on its own, so saturatedArraySizeCast trimming the result down to MAX_ARRAY_SIZE must be what triggers the exception. If that is the case, it means the clustering key has a column larger than {code:java}Integer.MAX_VALUE - 8{code} bytes; what type is used for this table?

[~richardchesse], if you could take a look at a heap dump while this is happening, it would help to see what's going on. If you have data around Integer.MAX_VALUE bytes in size, I would expect other issues as well (such as GC pressure).
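To make the failure mode above concrete, here is a minimal standalone sketch of the growth logic, not the Cassandra class itself: MAX_ARRAY_SIZE matches the quoted code, while DOUBLING_THRESHOLD, the saturating cast, and passing capacity as a parameter are assumptions made so the model is self-contained. It shows that once capacity reaches MAX_ARRAY_SIZE, the saturated cast clamps newSize back down to capacity and validateReallocation throws.

```java
import java.lang.Math;

// Standalone model (hypothetical names mirror the quoted DataOutputBuffer code)
// demonstrating why a buffer already at MAX_ARRAY_SIZE can never grow again.
public class GrowthModel
{
    // Same constant as in the quoted code: max JVM array length headroom.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    // Assumed threshold (in MiB) beyond which growth is 1.5x instead of 2x.
    static final long DOUBLING_THRESHOLD = 64L;

    // Clamp a long size to the largest allocatable array size.
    static int saturatedArraySizeCast(long size)
    {
        return (int) Math.min(size, MAX_ARRAY_SIZE);
    }

    // Model of validateReallocation; capacity is a parameter here because
    // there is no backing buffer instance in this sketch.
    static long validateReallocation(long capacity, long newSize)
    {
        int saturatedSize = saturatedArraySizeCast(newSize);
        if (saturatedSize <= capacity)
            throw new RuntimeException("cannot expand past MAX_ARRAY_SIZE");
        return saturatedSize;
    }

    static long calculateNewSize(long capacity, long count)
    {
        long newSize = capacity + count;
        if (capacity > 1024L * 1024L * DOUBLING_THRESHOLD)
            newSize = Math.max((capacity * 3L) / 2L, newSize); // grow 50%
        else
            newSize = Math.max(capacity * 2L, newSize);        // double
        return validateReallocation(capacity, newSize);
    }

    public static void main(String[] args)
    {
        // A small buffer doubles normally: 1024 -> 2048.
        System.out.println(calculateNewSize(1024, 1));

        // A buffer already at MAX_ARRAY_SIZE cannot grow: newSize is clamped
        // back to capacity by the saturated cast, so the check throws -- the
        // same bare RuntimeException seen in the reported stack trace.
        try
        {
            calculateNewSize(MAX_ARRAY_SIZE, 1);
        }
        catch (RuntimeException e)
        {
            System.out.println("threw: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that calculateNewSize always computes a value strictly greater than capacity in long arithmetic; only the saturating cast can drag it back to <= capacity, which is why hitting this exception points at a value near the array-size limit.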
> Compaction repeatedly fails validateReallocation exception
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-16764
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16764
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction/LCS
>            Reporter: Richard Hesse
>            Priority: Normal
>
> I have a few nodes in my ring that are stuck repeatedly trying to compact the
> same tables over and over again. I've run through the usual trick of rolling
> restarts, and it doesn't seem to help. This exception is logged on the nodes:
> {code}
> ERROR [CompactionExecutor:6] 2021-06-25 20:28:30,001 CassandraDaemon.java:244 - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.RuntimeException: null
>     at org.apache.cassandra.io.util.DataOutputBuffer.validateReallocation(DataOutputBuffer.java:134) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.calculateNewSize(DataOutputBuffer.java:152) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.expandToFit(DataOutputBuffer.java:159) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:119) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:426) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.serializeValuesWithoutSize(ClusteringPrefix.java:323) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Clustering$Serializer.serialize(Clustering.java:131) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ClusteringPrefix$Serializer.serialize(ClusteringPrefix.java:266) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Serializers$NewFormatSerializer.serialize(Serializers.java:167) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.Serializers$NewFormatSerializer.serialize(Serializers.java:154) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:103) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:82) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.addIndexBlock(ColumnIndex.java:216) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.add(ColumnIndex.java:264) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:111) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:136) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:98) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:143) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:204) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:85) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:272) ~[apache-cassandra-3.11.10.jar:3.11.10]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_292]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_292]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_292]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_292]
>     at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.10.jar:3.11.10]
>     at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]
> {code}
> Watching the compaction progress, it looks like it makes it through, dies,
> then starts over again. This process repeats forever.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)