[ https://issues.apache.org/jira/browse/CASSANDRA-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lyuben Todorov updated CASSANDRA-6440:
--------------------------------------
    Attachment: 6440_repair.log

Say we have a 4-node cluster composed of 2 DCs. Nodes 1 & 4 are in DC1 and nodes 2 & 3 are in DC2. We carry out a repair on DC1 (nodes 1 & 4) via {{./nodetool repair -hosts 10.0.0.1,10.0.0.4}}. All goes well, except for the system_traces keyspace, where the non-neighbour error gets thrown. I'll attach a log with a few System.out statements added to show the keyspace being repaired, the hosts supplied and the neighbours.

> Repair should allow repairing particular endpoints to reduce WAN usage.
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6440
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6440
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: sankalp kohli
>            Assignee: sankalp kohli
>            Priority: Minor
>         Attachments: 6440_repair.log, JIRA-6440.diff
>
>
> The way we send out data that does not match over WAN can be improved.
> Example: Say there are four nodes (A, B, C, D) which are replicas of a range we
> are repairing. A and B are in DC1, and C and D are in DC2. If A does not have
> the data which the other replicas have, then we will have the following streams:
> 1) A to B and back
> 2) A to C and back (goes over WAN)
> 3) A to D and back (goes over WAN)
> One way to reduce WAN traffic is this:
> 1) Repair A and B only with each other and C and D only with each other,
> starting at the same time t.
> 2) Once these repairs have finished, A,B and C,D are each in sync with respect
> to time t.
> 3) Now run a repair between A and C. The streams exchanged as a result of the
> diff will also be streamed to B and D via A and C (A and C behave as proxies
> for the streams).
> For a replication of DC1:2,DC2:2, the WAN traffic will be reduced by 50%, and
> even more for higher replication factors.
> Another easy way to do this is to have the repair command take the nodes you
> want to repair with. Then we can do something like this:
> 1) Run repair between (A and B) and (C and D)
> 2) Run repair between (A and C)
> 3) Run repair between (A and B) and (C and D)
> But this will increase the traffic inside the DC, as we won't be doing the
> proxying.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
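
The 50% figure in the description can be checked with a back-of-the-envelope count. This is only a sketch of the arithmetic, not anything from the attached patch: it assumes that only one node (A) is missing data, that every "to X and back" exchange costs the same, and that with proxying exactly one exchange crosses the WAN. The function names are made up for illustration.

```python
def wan_exchanges_direct(remote_replicas: int) -> int:
    """WAN exchanges when the out-of-sync node streams with every remote replica."""
    return remote_replicas


def wan_exchanges_proxied() -> int:
    """WAN exchanges when a single node pair (A and C) carries the diff across DCs."""
    return 1


def wan_reduction(remote_replicas: int) -> float:
    """Fraction of WAN exchanges saved by proxying, under the assumptions above."""
    direct = wan_exchanges_direct(remote_replicas)
    return 1 - wan_exchanges_proxied() / direct


if __name__ == "__main__":
    # Remote DC replication factors 2, 3 and 5 (hypothetical examples).
    for rf in (2, 3, 5):
        print(f"DC2:{rf} -> {wan_reduction(rf):.0%} fewer WAN exchanges")
```

For DC1:2,DC2:2 this gives the 50% from the description (2 WAN exchanges drop to 1), and the saving grows as the remote replication factor goes up.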