[ https://issues.apache.org/jira/browse/CASSANDRA-12991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stefan Podkowinski updated CASSANDRA-12991:
-------------------------------------------
    Comment: was deleted

(was: My assumption is that validation compaction works as follows:
* involved nodes receive a ValidationRequest message
* affected keyspace is being flushed
* validation is started using SSTable candidates determined right after the flush

I don't see why you would need "SSTables created after that timestamp to be filtered when doing a validation compaction". Any SSTable created after the validation compaction was started should not be involved in the validation process anyway.
)

> Inter-node race condition in validation compaction
> --------------------------------------------------
>
>                 Key: CASSANDRA-12991
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12991
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
>            Priority: Minor
>
> Problem:
> When a validation compaction is triggered by a repair, in-flight mutations
> can cause the merkle trees to differ even though the data is consistent.
> Example:
> t = 10000:
> Repair starts, triggers validations
> Node A starts validation
> t = 10001:
> Mutation arrives at Node A
> t = 10002:
> Mutation arrives at Node B
> t = 10003:
> Node B starts validation
> Hashes of node A+B will differ, but the data is consistent as of a view
> (think of it like a snapshot) at t = 10000.
> Impact:
> Unnecessary streaming happens. This may not have a big impact on low-traffic
> CFs or partitions, but on high-traffic CFs and possibly very big partitions
> it may have a bigger impact and is a waste of resources.
> Possible solution:
> Build hashes based upon a snapshot timestamp.
> This requires SSTables created after that timestamp to be filtered when doing
> a validation compaction:
> - Cells with timestamp > snapshot time have to be removed
> - Tombstone range markers have to be handled
>   - Bounds have to be removed if delete timestamp > snapshot time
>   - Boundary markers have to be either changed to a bound or completely
>     removed, depending on whether the start, the end, or both are affected
> Probably this is a known behaviour. Have there been any discussions about
> this in the past? I did not find a matching issue, so I created this one.
> I am happy about any feedback whatsoever.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
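
The race in the example timeline, and the effect of hashing against a snapshot timestamp, can be illustrated with a minimal Python sketch. This is not Cassandra code: `Cell`, `digest`, and `snapshot_ts` are hypothetical names, a flat SHA-256 over sorted cells stands in for the merkle tree, tombstones and range markers are not modeled, and the mutation's write timestamp is assumed to equal its arrival time at Node A (10001).

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a cell: key, value, and write timestamp."""
    key: str
    value: str
    timestamp: int


def digest(cells, snapshot_ts=None):
    """Hash cells in a stable order.

    If snapshot_ts is given, cells written after it are skipped, which
    mirrors the "cells with timestamp > snapshot time have to be removed"
    rule from the issue description.
    """
    h = hashlib.sha256()
    for cell in sorted(cells, key=lambda c: (c.key, c.timestamp)):
        if snapshot_ts is not None and cell.timestamp > snapshot_ts:
            continue
        h.update(f"{cell.key}={cell.value}@{cell.timestamp};".encode())
    return h.hexdigest()


# Data both nodes already hold when the repair starts at t = 10000.
base = [Cell("k1", "a", 9000), Cell("k2", "b", 9500)]
# Assumption: the in-flight mutation carries write timestamp 10001.
mutation = Cell("k3", "c", 10001)

# Node A starts validating at t = 10000, before the mutation arrives.
node_a = list(base)
# Node B starts validating at t = 10003, after the mutation arrived.
node_b = base + [mutation]

# Without a snapshot timestamp the digests differ although both nodes
# will hold the same data once the mutation is applied everywhere.
plain_mismatch = digest(node_a) != digest(node_b)
# With the snapshot timestamp the late cell is ignored on both nodes.
snapshot_match = digest(node_a, 10000) == digest(node_b, 10000)

print(plain_mismatch, snapshot_match)
```

The sketch only covers the cell-filtering rule. The range tombstone bound and boundary handling from the list would need additional logic that is not attempted here.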