[ https://issues.apache.org/jira/browse/CASSANDRA-20829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18013723#comment-18013723 ]
Stefan Miklosovic commented on CASSANDRA-20829:
-----------------------------------------------

[~blambov] OK, pushed the branch.

My approach is to run through all expired rows just once. I iterate over the SSTableReaders first; then, for each row, I iterate over all indexes, create an Indexer for that row, and remove it. If I iterated over the indexes first, each index would trigger its own pass over the expired SSTables, so they would be read multiple times. Reading everything in one pass is better.

> Secondary index implementations do not integrate with IndexGCTransaction when
> compaction contains fully expired SSTables
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20829
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index, Local/Compaction, Local/Compaction/TWCS
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a test (1) which ensures that when data are TTLed and compacted,
> IndexGCTransaction is aware of that and eventually invokes the
> Indexer.removeRow() method.
>
> However, this does not work properly when we have fully expired SSTables,
> e.g. as the result of a table using TWCS with a TTL on its data.
>
> The reason is that CompactionTask filters out fully expired SSTables (2).
> They therefore never enter the compaction process and are never handled by
> listener() (3), which contains this logic (4). On that path, onRowMerge in
> IndexGCTransaction computes the diff, and on commit indexer.removeRow(row);
> notifies the 2i about the row's removal.
>
> This integration is missing, and it is quite a big problem: if there are
> custom secondary index implementations, the fact that SSTables were fully
> expired is not propagated to them, which means that data are never removed
> from whatever backend they use.
>
> The solution is to send fully expired SSTables through compaction as well,
> _but only if we detect that the respective column family has some indexes_.
>
> (1) [https://github.com/apache/cassandra/blob/cassandra-4.1/test/unit/org/apache/cassandra/index/CustomIndexTest.java#L583-L607]
> (2) [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L174]
> (3) [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java#L130]
> (4) [https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/db/compaction/CompactionIterator.java#L235-L252]
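
The single-pass ordering described in the comment can be illustrated with a minimal Java sketch. Everything in it is a hypothetical stand-in chosen for illustration (ExpiredSSTable, RowIndex, CountingIndex, sample() and both method names); it does not use Cassandra's SSTableReader, Index or Indexer APIs and is not the code on the branch. It only counts how many rows each loop order reads from the expired SSTables.

```java
import java.util.List;

final class ExpiredRowCleanupSketch {

    /** Hypothetical stand-in for a secondary index; not Cassandra's Index interface. */
    interface RowIndex {
        void removeRow(String row);
    }

    /** Hypothetical index that only counts the removals it is told about. */
    static final class CountingIndex implements RowIndex {
        int removed = 0;

        @Override
        public void removeRow(String row) {
            removed++;
        }
    }

    /** Hypothetical stand-in for a fully expired SSTable: a list of rows plus a read counter. */
    static final class ExpiredSSTable {
        private final List<String> rows;
        int rowsRead = 0;

        ExpiredSSTable(List<String> rows) {
            this.rows = rows;
        }

        /** Simulates a full scan; every call counts as reading all rows again. */
        List<String> scan() {
            rowsRead += rows.size();
            return rows;
        }
    }

    /** SSTables in the outer loop: each expired SSTable is scanned exactly once. */
    static void removeSSTablesFirst(List<ExpiredSSTable> sstables, List<? extends RowIndex> indexes) {
        for (ExpiredSSTable sstable : sstables)
            for (String row : sstable.scan())
                for (RowIndex index : indexes)
                    index.removeRow(row);
    }

    /** Indexes in the outer loop: each expired SSTable is scanned once per index. */
    static void removeIndexesFirst(List<ExpiredSSTable> sstables, List<? extends RowIndex> indexes) {
        for (RowIndex index : indexes)
            for (ExpiredSSTable sstable : sstables)
                for (String row : sstable.scan())
                    index.removeRow(row);
    }

    static int totalRowsRead(List<ExpiredSSTable> sstables) {
        int total = 0;
        for (ExpiredSSTable sstable : sstables)
            total += sstable.rowsRead;
        return total;
    }

    /** Two expired SSTables holding five rows in total. */
    static List<ExpiredSSTable> sample() {
        return List.of(new ExpiredSSTable(List.of("a", "b", "c")),
                       new ExpiredSSTable(List.of("d", "e")));
    }

    public static void main(String[] args) {
        List<ExpiredSSTable> first = sample();
        removeSSTablesFirst(first, List.of(new CountingIndex(), new CountingIndex()));
        System.out.println("sstables first: " + totalRowsRead(first) + " rows read"); // 5

        List<ExpiredSSTable> second = sample();
        removeIndexesFirst(second, List.of(new CountingIndex(), new CountingIndex()));
        System.out.println("indexes first: " + totalRowsRead(second) + " rows read"); // 10
    }
}
```

With two indexes, both orderings deliver the same five removals to every index, but the index-first ordering reads the expired rows twice. In this toy model the extra reads grow linearly with the number of indexes, which is the cost the SSTable-first ordering avoids.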