This related issue might be of interest: https://issues.apache.org/jira/browse/CASSANDRA-7450

In 1.2 the "-pr" option does make cross-DC repairs, but you must ensure that all nodes from all datacenters execute repair; otherwise some ranges will be missed. This fix enables -pr and -local together, which was disabled in 2.0 because it didn't work (it also does not work in 1.2).
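As an illustration of "all nodes from all datacenters execute repair", here is a minimal sketch, not from the original thread, that loops over every node in every DC and runs a plain primary-range repair (the host names are hypothetical placeholders):

# Minimal sketch: run "nodetool repair -pr" on every node in every DC.
# Host names are hypothetical placeholders, not from the thread.
import subprocess

DATACENTERS = {
    "DC_NSW": ["nsw1", "nsw2"],
    "DC_VIC": ["vic1", "vic2"],
}

for dc, hosts in DATACENTERS.items():
    for host in hosts:
        # -pr repairs only this node's primary range, so skipping any
        # node in any DC leaves that range unrepaired cluster-wide.
        print("Repairing primary range of %s in %s" % (host, dc))
        subprocess.check_call(["nodetool", "-h", host, "repair", "-pr"])

If any node is skipped, its primary range is never repaired anywhere, which is the failure mode described above. With the CASSANDRA-7450 fix applied, -local can additionally be combined with -pr to confine each pass to one datacenter.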
In 1.2 "-pr" option does make cross DC repairs, but you must ensure that all nodes from all datacenter execute repair, otherwise some ranges will be missing. This fix enables -pr and -local together, which was disabled in 2.0 because it didn't work (it also does not work in 1.2). On Tue, Oct 7, 2014 at 5:46 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote: > Hi guys, sorry about digging this up, but, is this bug also affecting > 1.2.x versions ? I can't see this being backported to 1.2 on the Jira. Was > this bug introduced in 2.0 ? > > Anyway, how does nodetool repair -pr behave on a multi DC env, does it > make cross DC repairs or not ? Should we remove the "pr" option in a multi > DC context to remove entropy between DCs ? I mean a repair -pr is supposed > to repair the primary range for the current node, does it also repair > corresponding primary range in other DCs ? > > Thanks for insight around this. > > 2014-06-03 8:06 GMT+02:00 Nick Bailey <n...@datastax.com>: > >> See https://issues.apache.org/jira/browse/CASSANDRA-7317 >> >> >> On Mon, Jun 2, 2014 at 8:57 PM, Matthew Allen <matthew.j.al...@gmail.com> >> wrote: >> >>> Hi Rameez, Chovatia, (sorry I initially replied to Dwight individually) >>> >>> SN_KEYSPACE and MY_KEYSPACE are just typos (was try to mask out >>> identifiable information), they are same keyspace. >>> >>> Keyspace: SN_KEYSPACE: >>> Replication Strategy: >>> org.apache.cassandra.locator.NetworkTopologyStrategy >>> Durable Writes: true >>> Options: [DC_VIC:2, DC_NSW:2] >>> >>> In a nutshell, replication is working as expected, I'm just confused >>> about token range assignments in a Multi-DC environment and how repairs >>> should work >>> >>> From >>> http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configGenTokens_c.html, >>> it specifies >>> >>> * "Multiple data center deployments: calculate the tokens for >>> each data center so that the hash range is evenly divided for the nodes in >>> each data center"* >>> >>> Given that nodetool -repair isn't multi-dc aware, in our production 18 >>> node cluster (9 nodes in each DC), which of the following token ranges >>> should be used (Murmur3 Partitioner) ? >>> >>> Token range divided evenly over the 2 DC's/18 nodes as below ? >>> >>> Node DC_NSW DC_VIC >>> 1 '-9223372036854775808' '-8198552921648689608' >>> 2 '-7173733806442603408' '-6148914691236517208' >>> 3 '-5124095576030431008' '-4099276460824344808' >>> 4 '-3074457345618258608' '-2049638230412172408' >>> 5 '-1024819115206086208' '-8' >>> 6 '1024819115206086192' '2049638230412172392' >>> 7 '3074457345618258592' '4099276460824344792' >>> 8 '5124095576030430992' '6148914691236517192' >>> 9 '7173733806442603392' '8198552921648689592' >>> >>> Or An offset used for DC_VIC (i.e. DC_NSW + 100) ? >>> >>> Node DC_NSW DC_VIC >>> 1 '-9223372036854775808' '-9223372036854775708' >>> 2 '-7173733806442603407' '-7173733806442603307' >>> 3 '-5124095576030431006' '-5124095576030430906' >>> 4 '-3074457345618258605' '-3074457345618258505' >>> 5 '-1024819115206086204' '-1024819115206086104' >>> 6 '1024819115206086197' '1024819115206086297' >>> 7 '3074457345618258598' '3074457345618258698' >>> 8 '5124095576030430999' '5124095576030431099' >>> 9 '7173733806442603400' '7173733806442603500' >>> >>> It's too late for me to switch to vnodes, hope that makes sense, thanks >>> >>> Matt >>> >>> >>> >>> On Thu, May 29, 2014 at 12:01 AM, Rameez Thonnakkal <ssram...@gmail.com> >>> wrote: >>> >>>> as Chovatia mentioned, the keyspaces seems to be different. 
>>>> try "Describe keyspace SN_KEYSPACE" and "describe keyspace MY_KEYSPACE" >>>> from CQL. >>>> This will give you an idea about how many replicas are there for these >>>> keyspaces. >>>> >>>> >>>> >>>> On Wed, May 28, 2014 at 11:49 AM, chovatia jaydeep < >>>> chovatia_jayd...@yahoo.co.in> wrote: >>>> >>>>> What is your partition type? Is >>>>> it org.apache.cassandra.dht.Murmur3Partitioner? >>>>> In your repair command i do see there are two different KeySpaces >>>>> "MY_KEYSPACE" >>>>> and "SN_KEYSPACE", are these two separate key spaces or typo? >>>>> >>>>> -jaydeep >>>>> >>>>> >>>>> On Tuesday, 27 May 2014 10:26 PM, Matthew Allen < >>>>> matthew.j.al...@gmail.com> wrote: >>>>> >>>>> >>>>> Hi, >>>>> >>>>> Am a bit confused regarding data ownership in a multi-dc environment. >>>>> >>>>> I have the following setup in a test cluster with a keyspace with >>>>> (placement_strategy = 'NetworkTopologyStrategy' and strategy_options = >>>>> {'DC_NSW':2,'DC_VIC':2};) >>>>> >>>>> Datacenter: DC_NSW >>>>> ========== >>>>> Replicas: 2 >>>>> Address Rack Status State Load >>>>> Owns Token >>>>> >>>>> 0 >>>>> nsw1 rack1 Up Normal 1007.43 MB 100.00% >>>>> -9223372036854775808 >>>>> nsw2 rack1 Up Normal 1008.08 MB 100.00% 0 >>>>> >>>>> >>>>> Datacenter: DC_VIC >>>>> ========== >>>>> Replicas: 2 >>>>> Address Rack Status State Load >>>>> Owns Token >>>>> >>>>> 100 >>>>> vic1 rack1 Up Normal 1015.1 MB 100.00% >>>>> -9223372036854775708 >>>>> vic2 rack1 Up Normal 1015.13 MB 100.00% >>>>> 100 >>>>> >>>>> My understanding is that both Datacenters have a complete copy of the >>>>> data, but when I run a repair -pr on each of the nodes, the vic hosts only >>>>> take a couple of seconds, while the nsw nodes take about 5 minutes each. >>>>> >>>>> Does this mean that nsw nodes "own" the majority of the data given >>>>> their key ranges and that repairs will need to cross datacenters ? 
--
Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200