[ https://issues.apache.org/jira/browse/CASSANDRA-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764383#comment-17764383 ]
Sebastian Marsching commented on CASSANDRA-15284:
-------------------------------------------------

I did a bit more digging and I think that this bug was originally introduced in Cassandra 2.1.0: before that release, {{uncompressedSize}} (which was called {{originalSize}} at that point in time) was only used for calculating compression statistics, but since 2.1.0 it has also been used when calling methods of the {{MetadataWriter}}, where it matters whether this number is correct.

As this bug is only triggered when using {{resetAndTruncate}}, and this method is only used by {{SSTableRewriter}} (which in turn is only used by {{Scrubber}}), it only appears when scrubbing SSTables. In addition, the rewind caused by {{resetAndTruncate}} has to be large enough for the wrong data size to extend beyond the last chunk.

Interestingly, I do not see a good reason why {{uncompressedSize}} is even needed: when not hitting this bug, it should always be the same as {{bufferOffset}}, the only exception being inside {{flushData}} between the update of {{uncompressedSize}} and the return, because {{bufferOffset}} is only updated after {{flushData}} returns. Therefore, the simplest and least invasive fix is using {{bufferOffset}} instead of {{uncompressedSize}} in {{CompressedSequentialWriter.open}}. However, {{uncompressedSize}} is also used when updating {{lastFlushOffset}} and for the compression statistics. Luckily, it seems like {{lastFlushOffset}} is only used by {{BigTableWriter.IndexWriter}}, which never uses the {{CompressedSequentialWriter}}, so this shouldn’t cause any actual problems.

BTW, {{compressedSize}} has the same problem (it is not rolled back in {{resetAndTruncate}}), but as it is only used for the compression statistics, it does not cause any issues.

The nicer, but more invasive, fix would be removing {{compressedSize}} and {{uncompressedSize}} completely. As written earlier, {{uncompressedSize}} can easily be replaced with {{bufferOffset}}. {{compressedSize}} is very similar to {{chunkOffset}}, the only difference being that {{chunkOffset}} also counts the size occupied by the checksum of each chunk, while {{compressedSize}} does not. Therefore, I think that it would be okay to replace {{compressedSize}} with {{chunkOffset}}. It would mean that the compressed size used for the statistics would be a bit larger, but it would also ensure that the statistics remain correct after {{resetAndTruncate}} has been called. If the old method of counting should be retained, we could use {{chunkOffset - chunkCount * 4L}} for the compression statistics.

Reproducing the bug is quite easy: simply go to {{CompressedSequentialWriterTest.resetAndTruncate}} and insert the following two lines anywhere after {{writer.resetAndTruncate(pos)}}:

{code:java}
CompressionMetadata metadata = ((CompressedSequentialWriter) writer).open(0);
metadata.chunkFor(metadata.dataLength - 1);
{code}

This causes the {{AssertionError}} when running the test. When inserting the two lines before {{resetAndTruncate}}, there is no exception, which shows that the problem is indeed caused by this method.
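To make the {{uncompressedSize}} / {{bufferOffset}} divergence concrete, here is a tiny self-contained model of the two counters. The field names mirror the real ones, but everything else is a simplified stand-in for illustration, not the actual writer code:

{code:java}
// Simplified stand-in for CompressedSequentialWriter's bookkeeping, only
// meant to illustrate how the two counters diverge after a rewind.
public class ResetAndTruncateModel
{
    long bufferOffset = 0;      // rolled back by resetAndTruncate()
    long uncompressedSize = 0;  // NOT rolled back, so it goes stale

    void write(long length)
    {
        uncompressedSize += length; // updated inside flushData()...
        bufferOffset += length;     // ...bufferOffset only after it returns
    }

    void resetAndTruncate(long mark)
    {
        bufferOffset = mark;        // uncompressedSize is left untouched
    }

    public static void main(String[] args)
    {
        ResetAndTruncateModel w = new ResetAndTruncateModel();
        w.write(65536);             // first 64 KiB chunk
        w.write(65536);             // second 64 KiB chunk
        w.resetAndTruncate(65536);  // rewind to the end of the first chunk

        // A data length derived from uncompressedSize now claims 128 KiB,
        // but only one 64 KiB chunk remains, so chunkFor() on the last
        // byte computes a chunk length <= 0 and the assertion fires.
        System.out.println("bufferOffset     = " + w.bufferOffset);     // 65536
        System.out.println("uncompressedSize = " + w.uncompressedSize); // 131072
    }
}
{code}

With the fix proposed above, {{open}} would derive the data length from {{bufferOffset}} (65536 here), which matches the single remaining chunk.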
Unfortunately, I think that I found another bug while investigating this one: the {{CompressedSequentialWriter.crcMetadata}} object gets updated with the uncompressed data as it is written. The consequence is that the checksum written to the “-Digest.crc32” file will be wrong when {{resetAndTruncate}} has been called.

In order for the checksum to be correct, the state of the {{ChecksumWriter}} would have to be reset in {{resetAndTruncate}}, but this is not easy: the underlying {{CRC32}} class does not provide any constructor or method for initializing its internal state, so there is no way to reset it to a previous state. So, in order to fix this problem, we would either have to use our own implementation of the CRC-32 algorithm, or we would have to calculate the byte sequence that gets us back to a certain state (which is possible, but not a nice solution; see https://stackoverflow.com/questions/38981523/crc32-change-initial-value for a discussion).

Until this problem is fixed, scrubbing has to be considered broken.
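For the first of those two options, here is a minimal sketch of what a CRC-32 implementation with an accessible register could look like. All names are hypothetical (nothing here is from the Cassandra tree); the idea is that the writer would snapshot the register when a mark is taken and restore it in {{resetAndTruncate}}:

{code:java}
import java.util.zip.CRC32;

// Hypothetical sketch: a CRC-32 whose internal register can be captured and
// restored, which java.util.zip.CRC32 does not allow.
public final class ResettableCRC32
{
    private static final int[] TABLE = new int[256];
    static
    {
        for (int n = 0; n < 256; n++)
        {
            int c = n;
            for (int k = 0; k < 8; k++)
                c = (c & 1) != 0 ? 0xEDB88320 ^ (c >>> 1) : c >>> 1;
            TABLE[n] = c;
        }
    }

    private int state = 0xFFFFFFFF; // standard CRC-32 initial register

    public void update(byte[] buf, int off, int len)
    {
        int c = state;
        for (int i = off; i < off + len; i++)
            c = TABLE[(c ^ buf[i]) & 0xFF] ^ (c >>> 8);
        state = c;
    }

    // The two methods java.util.zip.CRC32 is missing: snapshot the register
    // at mark time, restore it on resetAndTruncate.
    public int getState()           { return state; }
    public void setState(int state) { this.state = state; }

    public long getValue()
    {
        return (state ^ 0xFFFFFFFF) & 0xFFFFFFFFL; // final XOR, as unsigned
    }

    public static void main(String[] args)
    {
        byte[] data = "hello world".getBytes();
        ResettableCRC32 crc = new ResettableCRC32();
        crc.update(data, 0, 5);         // checksum "hello"
        int mark = crc.getState();      // snapshot at the mark
        crc.update(data, 5, 6);         // checksum " world"
        crc.setState(mark);             // roll back, like resetAndTruncate
        crc.update(data, 5, 6);         // rewrite the truncated tail

        CRC32 reference = new CRC32();  // sanity check against the JDK class
        reference.update(data, 0, data.length);
        System.out.println(crc.getValue() == reference.getValue()); // true
    }
}
{code}

The downside of rolling our own implementation would be losing the (possibly hardware-accelerated) JDK intrinsics, but it avoids the byte-sequence trick from the Stack Overflow discussion.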
> AssertionError while scrubbing sstable
> --------------------------------------
>
>                 Key: CASSANDRA-15284
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15284
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Compression
>            Reporter: Gianluigi Tiesi
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x, 4.1.x
>
>         Attachments: assert-comp-meta.diff
>
>
> I've got a damaged data file, but while trying to run scrub (online or offline) I always get this error:
>
> {code:java}
> -- StackTrace --
> java.lang.AssertionError
> at org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:474)
> at org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:239)
> at org.apache.cassandra.io.util.MmappedRegions.updateState(MmappedRegions.java:163)
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:73)
> at org.apache.cassandra.io.util.MmappedRegions.<init>(MmappedRegions.java:61)
> at org.apache.cassandra.io.util.MmappedRegions.map(MmappedRegions.java:104)
> at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:362)
> at org.apache.cassandra.io.util.FileHandle$Builder.complete(FileHandle.java:331)
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinal(BigTableWriter.java:336)
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter.openFinalEarly(BigTableWriter.java:318)
> at org.apache.cassandra.io.sstable.SSTableRewriter.switchWriter(SSTableRewriter.java:322)
> at org.apache.cassandra.io.sstable.SSTableRewriter.doPrepare(SSTableRewriter.java:370)
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:184)
> at org.apache.cassandra.io.sstable.SSTableRewriter.finish(SSTableRewriter.java:357)
> at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:291)
> at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:1010)
> at org.apache.cassandra.db.compaction.CompactionManager.access$200(CompactionManager.java:83)
> at org.apache.cassandra.db.compaction.CompactionManager$3.execute(CompactionManager.java:391)
> at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:312)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> at java.lang.Thread.run(Thread.java:748)
> {code}
>
> At the moment I've moved away the corrupted file. If you need more info, feel free to ask.
>
> According to the source
> [https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java#L474]
> it looks like the requested chunk length is <= 0.