[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784518#comment-17784518 ] Alex Petrov edited comment on CASSANDRA-18932 at 11/9/23 3:44 PM: -- Yup, I can confirm that CASSANDRA-18993 is also fixed by this. I have been running Harry with many seeds, and do not see these failures anymore (neither exceptions, nor data masking). Thank you for working on this! was (Author: ifesdjeen): Yup, I can confirm that 18993 is also fixed by this. I have been running Harry with many seeds, and do not see these failures anymore (neither exceptions, nor data masking). Thank you for working on this! > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-beta1 > > Attachments: node1_.zip, operation.log.zip, screenshot-1.png > > Time Spent: 10m > Remaining Estimate: 0h > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:201) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:186) > at > org.apache.cassandra.db.ReadResponse.cr
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784487#comment-17784487 ] Michael Semb Wever edited comment on CASSANDRA-18932 at 11/9/23 2:49 PM: - bq. it looks like we should enforce a contract related to the `seekToPosition` method usage so that we never have to face the same issue again. assertion (or pre-condition) over comment please. assert and pre-conditions are _always_ better than comments. (if possible, I'm not looking at the code…) was (Author: michaelsembwever): bq. it looks like we should enforce a contract related to the `seekToPosition` method usage so that we never have to face the same issue again. assertion (or pre-condition) over comment please. assert and pre-conditions are _always_ better than comments. > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-beta, 5.0.x, 5.x > > Attachments: node1_.zip, operation.log.zip, screenshot-1.png > > Time Spent: 10m > Remaining Estimate: 0h > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(Re
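The request above ("assertion (or pre-condition) over comment") could look roughly like the sketch below. This is only an illustration: the class `SketchReader`, its fields, and the two rules it checks (target position within the data length, reader still open) are hypothetical and are not taken from the Cassandra code base or from the actual patch. Only the method name `seekToPosition` comes from the comment.

```java
public class SeekContractSketch {

    /** Hypothetical stand-in for a positioned reader; NOT Cassandra's actual class. */
    public static final class SketchReader {
        private final long dataLength;
        private long position;
        private boolean closed;

        public SketchReader(long dataLength) {
            this.dataLength = dataLength;
        }

        public void seekToPosition(long target) {
            // Pre-condition: always enforced, independent of JVM flags.
            // (Assumed rule for this sketch: target must lie in [0, dataLength].)
            if (target < 0 || target > dataLength)
                throw new IllegalArgumentException(
                    "position " + target + " outside [0, " + dataLength + "]");

            // Assertion: enforced only when the JVM runs with -ea.
            // (Assumed rule for this sketch: seeking on a closed reader is a bug.)
            assert !closed : "seekToPosition called on a closed reader";

            position = target;
        }

        public long position() {
            return position;
        }

        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        SketchReader reader = new SketchReader(100);
        reader.seekToPosition(42);
        System.out.println("position after seek: " + reader.position());

        try {
            reader.seekToPosition(101); // violates the assumed range pre-condition
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Either form fails at the call site that breaks the contract, whereas a comment only helps a reader who happens to see it. The explicit check stays active in production; the `assert` costs nothing unless assertions are enabled with `-ea`, for example in a test run.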
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783620#comment-17783620 ] Alex Petrov edited comment on CASSANDRA-18932 at 11/7/23 1:45 PM: -- Smallest (so far) harry-generated repro: {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2,v1) VALUES (?, ?, ?, ?, ?) USING TIMESTAMP 2796;", QUORUM, pk1, pk2, "XhzszszswKvvPril160", "XhzszszsBXILetSZ244040129141106771024413922120319382156215212185199207812444823206114747416170251137776288", "PrqcApcjVlLUPnJu17810922624911817178133246422120325130151891112101812515729917023921424472071312417514176882215010117825115533215"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) 
USING TIMESTAMP 9673;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) USING TIMESTAMP 2253;", QUORUM, pk1, pk2, "XhzszszsukgWaIRt119213342532141149997211120", "XhzszszsKYwrbwjS4423671233187241841253408287917304019919636221282271552511886262224501472195223445254234164552508615225511022103174183165108145"); cluster.coordinator(1).execute("DELETE FROM distributed_test_keyspace.table_0 USING TIMESTAMP 713 WHERE pk1 = ? AND pk2 = ? AND ck1 = ? AND ck2 >= ? AND ck2 < ?;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104", "XhzszszsuAPqXhcW651655711811739125249255921633060"); cluster.get(1).nodetool("flush", "distributed_test_keyspace", "table_0"); Iterator iter = cluster.coordinator(1).executeWithPaging("SELECT * FROM distributed_test_keyspace.table_0 WHERE pk1 = ? 
AND pk2 = ?;", QUORUM, 1, pk1, pk2); while (iter.hasNext()) iter.next(); } } {code} was (Author: ifesdjeen): Smallest (so far) harry-generated repro: {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1)
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783620#comment-17783620 ] Alex Petrov edited comment on CASSANDRA-18932 at 11/7/23 1:41 PM: -- Smallest (so far) harry-generated repro: {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2,v1) VALUES (?, ?, ?, ?, ?) USING TIMESTAMP 2796;", QUORUM, pk1, pk2, "XhzszszswKvvPril160", "XhzszszsBXILetSZ244040129141106771024413922120319382156215212185199207812444823206114747416170251137776288", "PrqcApcjVlLUPnJu17810922624911817178133246422120325130151891112101812515729917023921424472071312417514176882215010117825115533215"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) 
USING TIMESTAMP 9673;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) USING TIMESTAMP 2253;", QUORUM, pk1, pk2, "XhzszszsukgWaIRt119213342532141149997211120", "XhzszszsKYwrbwjS4423671233187241841253408287917304019919636221282271552511886262224501472195223445254234164552508615225511022103174183165108145"); cluster.coordinator(1).execute("DELETE FROM distributed_test_keyspace.table_0 USING TIMESTAMP 713 WHERE pk1 = ? AND pk2 = ? AND ck1 = ? AND ck2 >= ? AND ck2 < ?;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104", "XhzszszsuAPqXhcW651655711811739125249255921633060"); cluster.get(1).nodetool("flush", "distributed_test_keyspace", "table_0"); Iterator iter = cluster.coordinator(1).executeWithPaging("SELECT * FROM distributed_test_keyspace.table_0 WHERE pk1 = ? 
AND pk2 = ?;", QUORUM, 1, pk1, pk2); while (iter.hasNext()) iter.next(); } } {code} was (Author: ifesdjeen): Smallest (so far) harry-generated repro: {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1).execu
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783620#comment-17783620 ] Alex Petrov edited comment on CASSANDRA-18932 at 11/7/23 1:18 PM: -- Smallest (so far) harry-generated repro: {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2,v1) VALUES (?, ?, ?, ?, ?) USING TIMESTAMP 2796;", QUORUM, pk1, pk2, "XhzszszswKvvPril160", "XhzszszsBXILetSZ244040129141106771024413922120319382156215212185199207812444823206114747416170251137776288", "PrqcApcjVlLUPnJu17810922624911817178133246422120325130151891112101812515729917023921424472071312417514176882215010117825115533215"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) 
USING TIMESTAMP 9673;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104"); cluster.coordinator(1).execute("INSERT INTO distributed_test_keyspace.table_0 (pk1,pk2,ck1,ck2) VALUES (?, ?, ?, ?) USING TIMESTAMP 2253;", QUORUM, pk1, pk2, "XhzszszsukgWaIRt119213342532141149997211120", "XhzszszsKYwrbwjS4423671233187241841253408287917304019919636221282271552511886262224501472195223445254234164552508615225511022103174183165108145"); cluster.coordinator(1).execute("DELETE FROM distributed_test_keyspace.table_0 USING TIMESTAMP 713 WHERE pk1 = ? AND pk2 = ? AND ck1 = ? AND ck2 >= ? AND ck2 < ?;", QUORUM, pk1, pk2, "XhzszszsvezsgWfS171636154115230246180242212216218139135122313202531842495455392072408614387177166104798029248115180250227", "XhzszszsPMEZrail5620525123714624177220143195554265108125193170145424203102391711193243114123751720220922122616813119189727170246204240262481152314517235365183217118227135119132104", "XhzszszsuAPqXhcW651655711811739125249255921633060"); cluster.get(1).nodetool("flush", "distributed_test_keyspace", "table_0"); Iterator iter = cluster.coordinator(1).executeWithPaging("SELECT * FROM distributed_test_keyspace.table_0 WHERE pk1 = ? 
AND pk2 = ?;", QUORUM, 1, pk1, pk2); while (iter.hasNext()) iter.next(); } } {code} was (Author: ifesdjeen): Smallest (so far harry-generated repro): {code:java} @Test public void corruptedSStableDuringReadTest() throws Throwable { try (Cluster cluster = builder().withNodes(1) .withConfig((cfg) -> { cfg.set("memtable_heap_space", "512MiB") .set("column_index_size", "1KiB"); }) .start()) { String pk1 = "ZinzDdUuABgDknItABgDknItABgDknItLhDJFhdNPpEzbCrpSdhqYCfuFeXzSfHt528523179230134153120"; long pk2 = 1607590537L; cluster.schemaChange("CREATE KEYSPACE IF NOT EXISTS distributed_test_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"); cluster.schemaChange(" CREATE TABLE IF NOT EXISTS distributed_test_keyspace.table_0 (pk1 ascii,pk2 bigint,ck1 ascii,ck2 ascii,v1 ascii, PRIMARY KEY ((pk1,pk2), ck1, ck2)) WITH CLUSTERING ORDER BY (ck1 DESC,ck2 DESC);"); cluster.coordinator(1).execute
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783334#comment-17783334 ] Maxim Muzafarov edited comment on CASSANDRA-18932 at 11/6/23 6:50 PM: -- I seem to have boiled the reproducer down to 100 lines. [~jlewandowski] explained to me the cause of the problem since it was related to his patch, so I'll try to take a look at it and figure out how to fix it, but it might take a while since I'm not familiar with that part of the code at all. If this is urgent and you want to fix it anyway I think my investigation might hopefully be useful in the issue review phase :-) was (Author: mmuzaf): I seem to have boiled the reproducer down to 100 lines. [~jlewandowski] explained to me the cause of the problem since it was related to his patch, so I'll try to take a look at it and figure out how to fix it, but it might take a while since I'm not familiar with that part of the code at all. If this is urgent and you want to fix it anyway I think my investigation might hopefully be helpful in the issue review phase :-) > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.x > > Attachments: node1_.zip, operation.log.zip, screenshot-1.png > > Time Spent: 10m > Remaining Estimate: 0h > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783197#comment-17783197 ] Jacek Lewandowski edited comment on CASSANDRA-18932 at 11/6/23 11:51 AM: - For the provided sstables this is a single query which reproduces the problem without paging at all: {code:sql} SELECT * FROM harry.table_1 WHERE pk0003 = 'ZinzDdUuABgDknItABgDknItABgDknItXEFrgBnOmPmPylWrwXHqjBHgeQrGfnZd1124124583' AND pk0004 = 'ZinzDdUuABgDknItABgDknItABgDknItABgDknItABgDknItzHqchghqCXLhVYKM22215251' AND pk0005 = 3.2758E-41 AND ck0002 = -1110871748 AND ck0003 = 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyaFfLoPrEzlMDvLfXY18918213101196160' AND ck0004 < 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyachTAyMjmsZMUPCzi23819065184175'; {code} was (Author: jlewandowski): For the provided sstables this is a single query which reproduces the problem without paging at al: {code:sql} SELECT * FROM harry.table_1 WHERE pk0003 = 'ZinzDdUuABgDknItABgDknItABgDknItXEFrgBnOmPmPylWrwXHqjBHgeQrGfnZd1124124583' AND pk0004 = 'ZinzDdUuABgDknItABgDknItABgDknItABgDknItABgDknItzHqchghqCXLhVYKM22215251' AND pk0005 = 3.2758E-41 AND ck0002 = -1110871748 AND ck0003 = 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyaFfLoPrEzlMDvLfXY18918213101196160' AND ck0004 < 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyachTAyMjmsZMUPCzi23819065184175'; {code} > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.x > > Attachments: node1_.zip, operation.log.zip > > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. 
> Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. > {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIterato
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783197#comment-17783197 ] Jacek Lewandowski edited comment on CASSANDRA-18932 at 11/6/23 11:50 AM: - For the provided sstables this is a single query which reproduces the problem without paging at al: {code:sql} SELECT * FROM harry.table_1 WHERE pk0003 = 'ZinzDdUuABgDknItABgDknItABgDknItXEFrgBnOmPmPylWrwXHqjBHgeQrGfnZd1124124583' AND pk0004 = 'ZinzDdUuABgDknItABgDknItABgDknItABgDknItABgDknItzHqchghqCXLhVYKM22215251' AND pk0005 = 3.2758E-41 AND ck0002 = -1110871748 AND ck0003 = 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyaFfLoPrEzlMDvLfXY18918213101196160' AND ck0004 < 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyachTAyMjmsZMUPCzi23819065184175'; {code} was (Author: jlewandowski): For the provided sstables this is a single query which reproduces the problem without paging at al: {code:sql} SELECT * FROM harry.table_1 WHERE pk0003 = 'ZinzDdUuABgDknItABgDknItABgDknItXEFrgBnOmPmPylWrwXHqjBHgeQrGfnZd1124124583' AND pk0004 = 'ZinzDdUuABgDknItABgDknItABgDknItABgDknItABgDknItzHqchghqCXLhVYKM22215251' AND pk0005 = 3.2758E-41 AND ck0002 = -1110871748 AND ck0003 = 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyaFfLoPrEzlMDvLfXY18918213101196160' AND ck0004 < 'ZYFiYEUkzcKOhdyazcKOhdyazcKOhdyazcKOhdyazcKOhdyachTAyMjmsZMUPCzi23819065184175' LIMIT 128 ALLOW FILTERING; {code} > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.x > > Attachments: node1_.zip, operation.log.zip > > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. 
> Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. > {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.seri
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783160#comment-17783160 ] Alex Petrov edited comment on CASSANDRA-18932 at 11/6/23 9:52 AM: -- Talked to [~jlewandowski] and [~mmuzaf] off-jira, looks like pagesize 5 does not repro with this particular set of sstables. It does with 1 though. Modified the description accordingly. was (Author: ifesdjeen): Talked to [~jlewandowski] and [~mmuzaf] off-jira, looks like pagesize 5 does not repro with this particular set of sstables. It does with 1 though. > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Alex Petrov >Assignee: Maxim Muzafarov >Priority: Normal > Fix For: 5.x > > Attachments: node1_.zip, operation.log.zip > > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:201) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:186) > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48) > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:346) > at > org.apache.cassandra.s
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775878#comment-17775878 ] Alex Petrov edited comment on CASSANDRA-18932 at 10/16/23 6:29 PM: --- [~mmuzaf] sounds good; feel free to close this one if you can attach a stable repro. I've attached a node state that can help anyone who has cycles to work on this to reproduce this. I also have a Harry script that reproduces if anyone wants to understand a sequence of events that leads to this, but it's about 16k lines long, so one would need to shrink for a minimal/committable repro. was (Author: ifesdjeen): [~mmuzaf] sounds good; feel free to close this one if you can attach a stable repro. I've attached a node state that can help anyone who has cycles to work on this to reproduce this. I also have a Harry script that reproduces if anyone wants to understand a sequence of events, but it's about 16k lines long, so one would need to shrink for a minimal/committable repro. > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Priority: Normal > Attachments: node1_.zip, operation.log.zip > > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.j
[jira] [Comment Edited] (CASSANDRA-18932) Harry-found CorruptSSTableException / RT Closer issue when reading entire partition
[ https://issues.apache.org/jira/browse/CASSANDRA-18932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775878#comment-17775878 ] Alex Petrov edited comment on CASSANDRA-18932 at 10/16/23 6:28 PM: --- [~mmuzaf] sounds good; feel free to close this one if you can attach a stable repro. I've attached a node state that can help anyone who has cycles to work on this to reproduce this. I also have a Harry script that reproduces if anyone wants to understand a sequence of events, but it's about 16k lines long, so one would need to shrink for a minimal/committable repro. was (Author: ifesdjeen): [~mmuzaf] sounds good; feel free to close this one if you can attach a stable repro. > Harry-found CorruptSSTableException / RT Closer issue when reading entire > partition > --- > > Key: CASSANDRA-18932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18932 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Priority: Normal > Attachments: node1_.zip > > > While testing some new machinery for Harry, I have encountered a new RT > closer / SSTable Corruption issue. I have grounds to believe this was > introduced during the last year. > Issue seems to happen because of intricate interleaving of flushes with > writes and deletes. 
> {code:java} > ERROR [ReadStage-2] 2023-10-16 18:47:06,696 JVMStabilityInspector.java:76 - > Exception in thread Thread[ReadStage-2,5,SharedPool] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > RandomAccessReader:BufferManagingRebufferer.Aligned:CompressedChunkReader.Mmap(/Users/ifesdjeen/foss/java/apache-cassandra-4.0/data/data1/harry/table_1-07c35a606c0a11eeae7a4f6ca489eb0c/nc-5-big-Data.db > - LZ4Compressor, chunk length 16384, data length 232569) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator$AbstractReader.hasNext(AbstractSSTableIterator.java:381) > at > org.apache.cassandra.io.sstable.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:242) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:376) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:188) > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:157) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:402) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:151) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:101) > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:86) > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:343) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:201) > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:186) > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48) > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:346) > at > org.apache.cassandra.s