[ https://issues.apache.org/jira/browse/CASSANDRA-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839797#comment-17839797 ]
Stefan Miklosovic commented on CASSANDRA-19572:
-----------------------------------------------

I think I am onto something. I checked the logs from the build for plain 4.0 (without the patch in 19401), and the first failure is very interesting (testImportCorruptWithCopying) (1).

It fails on:
{code:java}
junit.framework.AssertionFailedError: expected:<[/tmp/importtest7641524017208283450/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360]> but was:<[/tmp/importtest7641524017208283450/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360, /tmp/importtest916153905487802965/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360]>
    at org.apache.cassandra.db.ImportTest.testCorruptHelper(ImportTest.java:341)
    at org.apache.cassandra.db.ImportTest.testImportCorruptWithCopying(ImportTest.java:384)
{code}
That test expects the import of exactly one sstable directory to fail and the other one to load just fine, but here we clearly see that it failed to import {_}both{_}.

I was checking the raw logs and was lucky enough to find it; it is in this one (2).
Grep for exactly this timestamp:
{code}
ERROR [main] 2024-04-18 15:04:49,454 SSTableImporter.java:102
{code}
There you see that it failed to import the directory which it is not supposed to; that is the first stack trace. Below it, there is another one:
{code}
[junit-timeout] ERROR [main] 2024-04-18 15:04:49,469 SSTableImporter.java:147 - Failed importing sstables in directory /tmp/importtest916153905487802965/cql_test_keyspace/table_15-005af720fd9511ee865eef8364010360
[junit-timeout] java.lang.AssertionError: null
[junit-timeout]     at org.apache.cassandra.utils.concurrent.Ref$State.assertNotReleased(Ref.java:196)
[junit-timeout]     at org.apache.cassandra.utils.concurrent.Ref.ref(Ref.java:152)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy.get(SSTableReader.java:2196)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.setup(SSTableReader.java:2028)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReader.setup(SSTableReader.java:1971)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReaderBuilder$ForRead.build(SSTableReaderBuilder.java:370)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:501)
[junit-timeout]     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:372)
[junit-timeout]     at org.apache.cassandra.db.SSTableImporter.getTargetDirectory(SSTableImporter.java:211)
[junit-timeout]     at org.apache.cassandra.db.SSTableImporter.importNewSSTables(SSTableImporter.java:135)
[junit-timeout]     at org.apache.cassandra.db.ImportTest.testCorruptHelper(ImportTest.java:340)
[junit-timeout]     at org.apache.cassandra.db.ImportTest.testImportCorruptWithCopying(ImportTest.java:384)
[junit-timeout]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
{code}
Here we see that it was asserting that the sstable
reader is not released, but the assertion failed because it seems it already is. That release is happening here (3).

I ran the multiplexer on this (4) test 3000 times (5) and all runs passed.

I think we should just call "SSTableReader.resetTidying();". That method is actually annotated with VisibleForTesting. I think that resetting the tidying will clear the underlying map of references, so it will not complain afterwards. It is probably some concurrency issue or similar ...

(1) [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]
(2) [https://circleci.com/api/v1.1/project/github/instaclustr/cassandra/223621/output/103/11?file=true&allocation-id=662134c47c6ecf4bb1db4681-11-build%2FABCDEFGH]
(3) [https://github.com/apache/cassandra/blob/cassandra-4.0/test/unit/org/apache/cassandra/db/ImportTest.java#L235]
(4) https://github.com/apache/cassandra/pull/3264/commits/d934e1c0f40353a12cd7588fc8a15a23d35d6a30
(5) https://app.circleci.com/pipelines/github/instaclustr/cassandra/4210/workflows/eea52e61-b670-4dc9-86b6-b07bf1030b09/jobs/224285

> Test failure: org.apache.cassandra.db.ImportTest flakiness
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-19572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19572
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tool/bulk load
>            Reporter: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> As discovered on CASSANDRA-19401, the tests in this class are flaky, at least
> the following:
>  * testImportCorruptWithoutValidationWithCopying
>  * testImportInvalidateCache
>  * testImportCorruptWithCopying
>  * testImportCacheEnabledWithoutSrcDir
>  * testImportInvalidateCache
> [https://app.circleci.com/pipelines/github/instaclustr/cassandra/4199/workflows/a70b41d8-f848-4114-9349-9a01ac082281/jobs/223621/tests]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
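
The hypothesis in the comment above (a released reference still sitting in a static lookup map, so the next lookup for the same sstable trips a "not released" check until the map is cleared) can be illustrated with a toy model. This is only a sketch and not Cassandra's code: ToyRef, LOOKUP, get and reset are hypothetical names standing in for Ref, the map of references, SSTableReader$GlobalTidy.get and SSTableReader.resetTidying(), and it throws IllegalStateException where the real code uses an assert.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TidyRegistrySketch {
    /** Hypothetical stand-in for a reference-counted handle (not Cassandra's Ref). */
    static final class ToyRef {
        private int count = 1;
        private boolean released = false;

        /** Takes another reference; refuses if the handle was already fully released. */
        synchronized ToyRef ref() {
            if (released) {
                // The real code asserts here; an exception is used because
                // Java assertions are disabled by default.
                throw new IllegalStateException("reference already released");
            }
            count++;
            return this;
        }

        /** Drops one reference; the handle becomes unusable when the count hits zero. */
        synchronized void release() {
            if (--count == 0) {
                released = true;
            }
        }
    }

    /** Hypothetical static registry keyed by an sstable descriptor string. */
    static final Map<String, ToyRef> LOOKUP = new ConcurrentHashMap<>();

    /** Returns a reference for the descriptor, reusing the registered one if present. */
    static ToyRef get(String descriptor) {
        ToyRef existing = LOOKUP.get(descriptor);
        if (existing != null) {
            return existing.ref(); // fails if a stale, released entry was left behind
        }
        ToyRef fresh = new ToyRef();
        LOOKUP.put(descriptor, fresh);
        return fresh;
    }

    /** Toy analogue of resetting the tidying: forget every registered reference. */
    static void reset() {
        LOOKUP.clear();
    }

    public static void main(String[] args) {
        ToyRef first = get("table_15/sstable-1");
        first.release(); // fully released, but the entry is still in LOOKUP

        boolean rejected = false;
        try {
            get("table_15/sstable-1"); // same descriptor opened again
        } catch (IllegalStateException e) {
            rejected = true;
        }
        System.out.println("stale entry rejected: " + rejected);

        reset(); // clear the registry between test cases
        ToyRef second = get("table_15/sstable-1");
        System.out.println("fresh reference after reset: " + (second != first));
    }
}
```

In this model the second open fails only because the stale entry was never removed from the map; once the map is cleared, the same descriptor gets a fresh reference. Whether that is the exact sequence in ImportTest is still the assumption stated in the comment.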