[ 
https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104361#comment-13104361
 ] 

Sylvain Lebresne commented on CASSANDRA-3200:
---------------------------------------------

Yes, having a not-bulky/continuous/incremental/ponies-powered repair would be 
nice. It's worth looking into it and I'm not even saying I won't help with that.

That being said, I've heard a number of ideas on that (including the discussion 
on CASSANDRA-2699) and I have yet to be fully convinced by any of those ideas. I 
do think it's not a simple problem. So until proved otherwise, the ETA for 
CASSANDRA-2699 is unknown and unlikely in the very near future. In the 
meantime, repair is there and used by people.

Besides, while I understand that the past suckiness of the repair process may 
push one to think that "we should throw everything away and use something 
completely new", I think it would be wise to first ask ourselves whether we 
can improve/build on what we have to make it good enough. In particular, 
repair is already able to work on any token range, so it would be relatively 
easy, for instance, to run more repairs on smaller ranges. Combined with the 
fact that both (validation) compaction and streaming can now be throttled, 
that could make repair much less bulky at very little cost (in development 
time and in potential new bugs).
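
To make the "more repairs on smaller ranges" idea concrete, here is a minimal 
Python sketch that splits one token range into contiguous subranges, each of 
which could then be repaired on its own. The name split_range and the 
non-wrapping (start, end] convention are assumptions made for illustration; 
this is not Cassandra code and it does not trigger a repair itself.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges.

    Hypothetical helper: assumes integer tokens and a range that does not
    wrap around the ring (start < end).
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if end <= start:
        raise ValueError("wrapping or empty ranges are not handled here")
    width = end - start
    # Integer arithmetic keeps the bounds exact even for very large tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    # Pair consecutive bounds: (b0, b1], (b1, b2], ...
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    for sub_start, sub_end in split_range(0, 2**127, 4):
        print(sub_start, sub_end)
```

The subranges share their boundaries, so together they cover exactly the 
original range; if parts is larger than the range width, some subranges come 
out empty.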

And to get back to the issue at hand, it's actually not a complicated patch 
(given how repair works nowadays) and a very isolated one in what it will 
touch, so I see no reason why it wouldn't make it during the 1.0 series, while 
any potential replacement solution is almost guaranteed to not make it before 
1.1 *at best*.

> Repair: compare all trees together (for a given range/cf) instead of by pair 
> in isolation
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3200
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: repair
>             Fix For: 1.0.1
>
>
> Currently, repair compares Merkle trees by pair, in isolation from any other 
> tree. What that means concretely is that if I have three nodes A, B and C 
> (RF=3) with A and B in sync, but C having some range r inconsistent with both 
> A and B (which are consistent with each other), we will do the following 
> transfers of r: A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know 
> which of A or C is more up to date. However, the transfer B -> C is 
> useless provided we do A -> C, if A and B are in sync. Not doing that transfer 
> will be a 25% improvement in that case (3 transfers instead of 4). With RF=5 
> and only one node inconsistent with all the others, that's almost a 40% 
> improvement, etc...
> Given that this situation of one node not in sync while the others are is 
> probably fairly common (one node died so it is behind), this could be a fair 
> improvement over what is transferred. In the case where we use repair to 
> completely rebuild a node, this will be a dramatic improvement, because it 
> will keep the rebuilt node from getting RF times the data it should get.
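
The percentages in the description above can be checked with a short Python 
sketch. It assumes the scenario described there: RF replicas, exactly one of 
them out of sync on range r, and all the others in sync with each other. The 
function names are hypothetical, and the "grouped" count reflects one reading 
of the proposal (the out-of-sync node still sends r to every other replica, 
but receives it from only one of them).

```python
from fractions import Fraction


def pairwise_transfers(rf):
    """Transfers of r when trees are compared by pair, in isolation.

    The out-of-sync node differs from each of the rf - 1 other replicas,
    and each differing pair streams r in both directions.
    """
    return 2 * (rf - 1)


def grouped_transfers(rf):
    """Transfers of r when all trees are compared together (assumed reading).

    The out-of-sync node sends r to the rf - 1 others and receives it once.
    """
    return (rf - 1) + 1


def saving(rf):
    """Fraction of transfers avoided by comparing all trees together."""
    if rf < 2:
        raise ValueError("need at least 2 replicas to have anything to repair")
    before = pairwise_transfers(rf)
    return Fraction(before - grouped_transfers(rf), before)


if __name__ == "__main__":
    for rf in (3, 5):
        print(rf, pairwise_transfers(rf), grouped_transfers(rf), float(saving(rf)))
```

Under these assumptions RF=3 goes from 4 transfers to 3 (a 25% saving) and 
RF=5 goes from 8 to 5 (37.5%, i.e. "almost 40%").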

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
