Repair: compare all trees together (for a given range/cf) instead of by pair in 
isolation
-----------------------------------------------------------------------------------------

                 Key: CASSANDRA-3200
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne
            Priority: Minor
             Fix For: 1.0.1


Currently, repair compares Merkle trees pair by pair, each pair in isolation
from every other tree. Concretely, if I have three nodes A, B and C (RF=3) with
A and B in sync, but C having some range r inconsistent with both A and B (both,
since A and B are consistent with each other), we will do the following
transfers of r: A -> C, C -> A, B -> C, C -> B.
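
To make the pairwise behaviour concrete, here is a minimal Python sketch, not
the actual Cassandra code. It assumes each node's Merkle tree for range r can be
stood in for by a single digest string; the function name pairwise_transfers and
the digest values are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def pairwise_transfers(digests):
    """List the (source, destination) transfers of range r produced when
    every pair of nodes is compared in isolation.

    `digests` maps a node name to a digest standing in for that node's
    Merkle tree on r (an assumption made for this sketch).
    """
    transfers = []
    for a, b in combinations(sorted(digests), 2):
        if digests[a] != digests[b]:
            # We cannot tell which side is more up to date, so stream both ways.
            transfers.append((a, b))
            transfers.append((b, a))
    return transfers


if __name__ == "__main__":
    # A and B are in sync; C differs from both.
    for src, dst in pairwise_transfers({"A": "x", "B": "x", "C": "y"}):
        print(f"{src} -> {dst}")
```

For the three-node case above this lists the same four transfers, including the
redundant B -> C.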

The fact that we do both A -> C and C -> A is fine, because we cannot know
which of A or C is more up to date. However, the transfer B -> C is useless
provided we do A -> C, since A and B are in sync. Not doing that transfer would
be a 25% improvement in that case (3 transfers instead of 4). With RF=5 and only
one node inconsistent with all the others, that is almost a 40% improvement
(5 transfers instead of 8), etc.
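
The arithmetic above can be checked with a small Python sketch. This is one
possible reading of "compare all trees together", not the implementation in the
ticket: nodes whose trees are identical are grouped, and each node outside a
group receives that group's version of r from a single representative. The
names grouped_transfers and pairwise_count, and the use of one digest string per
tree, are assumptions made for illustration.

```python
from collections import defaultdict


def grouped_transfers(digests):
    """Transfers of range r when all trees are compared together.

    Nodes with the same digest form a group; every node outside the group
    receives the group's data from one representative only.
    """
    groups = defaultdict(list)
    for node in sorted(digests):
        groups[digests[node]].append(node)

    transfers = []
    for digest, members in groups.items():
        source = members[0]  # any member would do, they are in sync
        for node in sorted(digests):
            if digests[node] != digest:
                transfers.append((source, node))
    return transfers


def pairwise_count(digests):
    """Number of transfers when each differing pair streams both ways."""
    nodes = list(digests)
    return sum(
        1 for a in nodes for b in nodes
        if a != b and digests[a] != digests[b]
    )


if __name__ == "__main__":
    for rf in (3, 5):
        # RF - 1 nodes in sync, plus one node C that differs from all of them.
        digests = {f"N{i}": "x" for i in range(rf - 1)}
        digests["C"] = "y"
        p = pairwise_count(digests)
        g = len(grouped_transfers(digests))
        print(f"RF={rf}: pairwise={p}, grouped={g}, saved={1 - g / p:.1%}")
```

Under these assumptions the out-of-sync node receives r from a single source
rather than from every in-sync replica, while it still streams its own version
to each of the others.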

Given that this situation of one node being out of sync while the others are in
sync is probably fairly common (one node died, so it is behind), this could be a
fair reduction in what is transferred. In the case where we use repair to
completely rebuild a node, this will be a dramatic improvement, because it will
keep the rebuilt node from receiving RF times the data it should get.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
