Yes, it does. Consider what would happen if it didn't: if you kept writing to the same partition, you would never be able to remove any tombstones for that partition.

On Tue., 6 Nov. 2018, 19:40 DuyHai Doan <doanduy...@gmail.com> wrote:

> Hello all
>
> I have tried to sum up all the rules related to tombstone removal:
>
> ----------------------------------------------------------------------------------
>
> Given a tombstone written at timestamp (t) for a partition key (P) in
> SSTable (S1). This tombstone will be removed:
>
> 1) after the gc_grace_seconds period has passed
> 2) at the next compaction round, if SSTable S1 is selected (not at all
>    guaranteed because compaction is not deterministic)
> 3) if the partition key (P) is not present in any other SSTable that is
>    NOT picked by the current round of compaction
>
> Rule 3) is quite complex to understand, so here is the detailed explanation:
>
> If Partition Key (P) also exists in another SSTable (S2) that is NOT
> compacted together with SSTable (S1), and we remove the tombstone, some
> data in S2 may be resurrected.
>
> Precisely, at compaction time, Cassandra does not have ANY detail about
> the part of Partition (P) that stays in S2, so it cannot remove the
> tombstone right away.
>
> Now, for each SSTable, we have some metadata, namely minTimestamp and
> maxTimestamp.
>
> I wonder if the current compaction optimization does use/leverage this
> metadata for tombstone removal. Indeed, if we know that the tombstone
> timestamp (t) < minTimestamp, it can be safely removed.
>
> Does anyone have the info?
>
> Regards
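
For anyone following along, here is a minimal sketch of the check discussed in this thread, written in Python rather than Cassandra's Java. Everything in it is an assumption made for illustration: `SSTable`, `may_contain`, and `can_drop_tombstone` are hypothetical names, not Cassandra's API, and the `keys` set stands in for whatever lookup (for example a bloom filter) tells compaction that an SSTable might hold the partition. It only models rules 1) and 3) plus the minTimestamp refinement; rule 2) is assumed, since the function is only called for a tombstone in an SSTable that is being compacted.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class SSTable:
    """Hypothetical stand-in for per-SSTable metadata."""
    name: str
    min_timestamp: int  # smallest write timestamp of any data in the file
    max_timestamp: int  # largest write timestamp of any data in the file
    keys: Set[str] = field(default_factory=set)  # stand-in for a key lookup

    def may_contain(self, key: str) -> bool:
        return key in self.keys


def can_drop_tombstone(
    key: str,
    tombstone_ts: int,         # write timestamp (t) of the tombstone
    deletion_time: int,        # wall-clock seconds when the delete happened
    now: int,                  # current wall-clock seconds
    gc_grace_seconds: int,
    not_compacting: Iterable[SSTable],  # SSTables outside this compaction
) -> bool:
    # Rule 1: the gc_grace_seconds period must have passed.
    if deletion_time >= now - gc_grace_seconds:
        return False

    # Rule 3, refined with minTimestamp: an SSTable outside this compaction
    # only blocks removal if it may hold the partition AND could contain
    # data at least as old as the tombstone (data the tombstone shadows).
    for sstable in not_compacting:
        if sstable.may_contain(key) and sstable.min_timestamp <= tombstone_ts:
            return False

    return True
```

With this rule, an S2 that contains (P) but whose minTimestamp is greater than (t) does not prevent the tombstone from being dropped, which is why continuous writes to the same partition do not pin tombstones forever.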