[ https://issues.apache.org/jira/browse/CASSANDRA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866698#comment-13866698 ]
Lyuben Todorov commented on CASSANDRA-6440:
-------------------------------------------

bq. On failure, we could list the valid neighbors to make their life a little easier.

Great idea; I had to do that when verifying the patch to get a good idea of what is going on. I'd add that on the v1 patch it was pointed out that it would be better to fail a repair with invalid hosts rather than ignore errors and try to continue the repair.

> Repair should allow repairing particular endpoints to reduce WAN usage.
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6440
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: sankalp kohli
>            Assignee: sankalp kohli
>            Priority: Minor
>         Attachments: 6440_repair.log, JIRA-6440-v2.diff, JIRA-6440.diff
>
>
> The way we send out data that does not match over WAN can be improved.
> Example: Say there are four nodes (A, B, C, D) which are replicas of a range
> we are repairing. A and B are in DC1, and C and D are in DC2. If A does not
> have the data which the other replicas have, then we will have the following
> streams:
> 1) A to B and back
> 2) A to C and back (goes over WAN)
> 3) A to D and back (goes over WAN)
> One way to reduce WAN traffic is this:
> 1) Repair A and B only with each other, and C and D only with each other,
> starting at the same time t.
> 2) Once these repairs have finished, A,B and C,D are in sync with respect to
> time t.
> 3) Now run a repair between A and C. The streams which are exchanged as a
> result of the diff will also be streamed to B and D via A and C (A and C
> behave like proxies for the streams).
> For a replication of DC1:2,DC2:2, the WAN traffic will be reduced by 50%, and
> even more for higher replication factors.
> Another easy way to do this is to have the repair command take the nodes
> you want to repair with. Then we can do something like this:
> 1) Run repair between (A and B) and (C and D)
> 2) Run repair between (A and C)
> 3) Run repair between (A and B) and (C and D)
> But this will increase the traffic inside the DC, as we won't be proxying.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
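
The 50% figure in the issue description can be checked with a rough count of cross-DC stream pairs. The sketch below is only an illustration under a simplifying assumption: the node that is missing data exchanges one stream pair with each replica it is repaired against. The function names and the replica-to-DC mapping are hypothetical and are not taken from the Cassandra code base or from the attached patches.

```python
# Rough model (an assumption, not Cassandra's streaming logic): the node that
# is missing data exchanges one stream pair with every replica it repairs with.


def wan_pairs_full_repair(replica_dc, node):
    """Cross-DC stream pairs when `node` repairs against every other replica."""
    home = replica_dc[node]
    return sum(1 for other, dc in replica_dc.items()
               if other != node and dc != home)


def wan_pairs_staged_repair(replica_dc, node):
    """Cross-DC stream pairs when `node` repairs with one replica per remote DC."""
    home = replica_dc[node]
    return len({dc for dc in replica_dc.values() if dc != home})


def wan_reduction(replica_dc, node):
    """Fraction of cross-DC stream pairs saved by the staged scheme."""
    full = wan_pairs_full_repair(replica_dc, node)
    if full == 0:
        return 0.0  # single-DC cluster: there is no WAN traffic to save
    return 1 - wan_pairs_staged_repair(replica_dc, node) / full


# The example from the description: DC1:2, DC2:2, with A missing data.
rf2 = {"A": "DC1", "B": "DC1", "C": "DC2", "D": "DC2"}
print(wan_pairs_full_repair(rf2, "A"),
      wan_pairs_staged_repair(rf2, "A"),
      wan_reduction(rf2, "A"))  # prints: 2 1 0.5

# A hypothetical DC1:3, DC2:3 layout, to see the effect of a higher RF.
rf3 = {"A": "DC1", "B": "DC1", "E": "DC1", "C": "DC2", "D": "DC2", "F": "DC2"}
print(wan_reduction(rf3, "A"))
```

Under this model the DC1:2,DC2:2 case goes from two WAN stream pairs (A with C, A with D) to one (A with C), which matches the 50% quoted above. The three-replica layout saves roughly two thirds, in line with the remark about higher replication factors. The count ignores the extra intra-DC traffic mentioned for the second approach.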