Blake Eggleston created CASSANDRA-14861:
-------------------------------------------

             Summary: Inaccurate sstable min/max metadata can cause data loss
                 Key: CASSANDRA-14861
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
             Project: Cassandra
          Issue Type: Bug
            Reporter: Blake Eggleston
            Assignee: Blake Eggleston
             Fix For: 3.0.18, 3.11.4, 4.0


There’s a bug in the way we filter sstables in the read path that can cause 
sstables containing relevant range tombstones to be excluded from reads. This 
can cause data resurrection for an individual read, and if compaction timing is 
right, permanent resurrection via read repair. 

We track the min and max clustering values when writing an sstable so we can 
avoid reading from sstables that don’t contain the clustering values we’re 
looking for in a given read. The min and max for each clustering column are 
updated for each row / RT marker we write. In the case of range tombstone 
markers though, we only update the min and max for the clustering values they 
contain, which is almost never the full set of clustering values. This leaves 
a min that is above, and a max that is below, the real range covered by the 
range tombstones contained in the sstable.
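
To make the per-column update concrete, here is a minimal sketch in Python. It is a simplified model and not Cassandra's actual code: clusterings are modelled as tuples of ints, a range tombstone bound is a shorter tuple (a prefix), and the function name update_min_per_column is hypothetical.

```python
# Simplified, hypothetical model of per-column min tracking.
# A full row clustering has 3 components; an RT bound may have fewer.

def update_min_per_column(current_min, clustering):
    """Lower the tracked min, touching only the columns the clustering contains."""
    updated = list(current_min)
    for i, value in enumerate(clustering):
        updated[i] = min(updated[i], value)
    # Columns beyond len(clustering) keep their old values, which is the problem.
    return tuple(updated)


current_min = (5, 6, 7)   # min clustering tracked so far
rt_open_bound = (4,)      # open marker of a tombstone covering every row starting with 4

tracked_min = update_min_per_column(current_min, rt_open_bound)
real_min = rt_open_bound  # the tombstone really starts at the bare prefix

print(tracked_min)             # (4, 6, 7)
print(real_min < tracked_min)  # True: the real lower bound sorts before the tracked one
```

The stale trailing components (6 and 7) make the tracked min sort after the point where the tombstone actually begins.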

For instance, assume we’re writing an sstable for a table with 3 clustering 
columns. The current min clustering is 5:6:7. We write an RT marker for a range 
tombstone that deletes any row with the value 4 in the first clustering column, 
so the open marker is [4:]. This would make the new min clustering 4:6:7 when 
it should really be 4:. If we do a read for clustering values of 4:5 and lower, 
we’ll exclude this sstable and its range tombstone, resurrecting any data 
that this tombstone would have deleted.
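
The effect on sstable filtering can be sketched the same way. This is an illustrative model only, assuming clusterings are tuples compared lexicographically with a bare prefix sorting before anything it prefixes; may_skip_sstable is a hypothetical name and does not correspond to a Cassandra method.

```python
# Hypothetical read-path filter: skip an sstable when the whole requested
# slice lies below the sstable's tracked min clustering.

def may_skip_sstable(slice_end, sstable_min):
    """Return True if nothing at or below slice_end can be in the sstable."""
    return slice_end < sstable_min


slice_end = (4, 5)         # read for clustering values of 4:5 and lower

tracked_min = (4, 6, 7)    # min recorded after the per-column update
real_min = (4,)            # lower bound the range tombstone actually covers

print(may_skip_sstable(slice_end, tracked_min))  # True: sstable and its tombstone are skipped
print(may_skip_sstable(slice_end, real_min))     # False: sstable would be read
```

With the inaccurate min the sstable is filtered out, so rows in other sstables that the tombstone shadows are returned as if they were live.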



