[ https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sankalp kohli updated CASSANDRA-6218:
-------------------------------------
    Summary: Repair should allow repairing particular data centers to reduce WAN usage  (was: Reduce WAN traffic while doing repairs)

> Repair should allow repairing particular data centers to reduce WAN usage
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6218
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core, Tools
>            Reporter: sankalp kohli
>            Assignee: Jimmy Mårdell
>            Priority: Minor
>             Fix For: 2.0.4
>
>         Attachments: trunk-6218-v2.txt, trunk-6218-v3.patch, trunk-6218.txt
>
>
> The way repair sends mismatched data over the WAN can be improved.
> Example: Say there are four nodes (A, B, C, D) that are replicas of a range
> we are repairing. A and B are in DC1, and C and D are in DC2. If A is missing
> data that the other replicas have, the repair produces the following streams:
> 1) A to B and back
> 2) A to C and back (goes over WAN)
> 3) A to D and back (goes over WAN)
> One way to reduce the WAN traffic is the following:
> 1) Repair A and B only with each other, and C and D only with each other,
> both starting at the same time t.
> 2) Once these repairs have finished, A and B are in sync with each other, and
> so are C and D, with respect to time t.
> 3) Now run a repair between A and C. The streams exchanged as a result of the
> diff are also forwarded to B and D via A and C (A and C act as proxies for
> the streams).
> For a replication of DC1:2,DC2:2, the WAN traffic is reduced by 50%, and by
> even more for higher replication factors.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
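
The 50% figure in the description can be checked with a small counting sketch. This is only an illustration, not Cassandra code: the function name wan_exchanges and the replicas mapping are hypothetical, it covers only the two-data-center case from the example, and it counts cross-DC stream exchanges rather than bytes (so it assumes every exchange carries a similar amount of data).

```python
def wan_exchanges(replicas, stale):
    """Return (naive, proposed) counts of cross-DC stream exchanges.

    replicas: hypothetical mapping of node name -> data center name.
    stale:    the node that is missing data the other replicas have.
    """
    dcs = set(replicas.values())
    if len(dcs) != 2:
        raise ValueError("sketch covers the two-data-center case only")

    # Naive repair: the stale node exchanges streams with every other
    # replica, so each replica in the remote DC costs one WAN exchange.
    naive = sum(1 for dc in replicas.values() if dc != replicas[stale])

    # Proposed scheme: each DC is repaired locally first (no WAN traffic),
    # then a single pair of nodes, one per DC, exchanges over the WAN and
    # forwards the streams to its local peers.
    proposed = 1
    return naive, proposed


# The example from the description: A, B in DC1 and C, D in DC2.
replicas = {"A": "DC1", "B": "DC1", "C": "DC2", "D": "DC2"}
naive, proposed = wan_exchanges(replicas, "A")
reduction = 1 - proposed / naive
print(naive, proposed, reduction)  # 2 1 0.5
```

With three replicas per data center the same count gives 3 naive WAN exchanges against 1, which is consistent with the remark that the saving grows with the replication factor.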