Re: Multi-DC Repairs and Token Questions
Ok got it. Thanks.

2014-10-07 14:56 GMT+02:00 Paulo Ricardo Motta Gomes <paulo.mo...@chaordicsystems.com>:
> This related issue might be of interest: https://issues.apache.org/jira/browse/CASSANDRA-7450 [...]
Re: Multi-DC Repairs and Token Questions
Hi guys, sorry about digging this up, but is this bug also affecting the 1.2.x versions? I can't see it being backported to 1.2 in Jira. Was the bug introduced in 2.0?

Anyway, how does nodetool repair -pr behave in a multi-DC environment: does it perform cross-DC repairs or not? Should we drop the -pr option in a multi-DC context to reduce entropy between DCs? I mean, repair -pr is supposed to repair the primary range of the current node; does it also repair the corresponding primary range in the other DCs?

Thanks for any insight on this.

2014-06-03 8:06 GMT+02:00 Nick Bailey <n...@datastax.com>:
> See https://issues.apache.org/jira/browse/CASSANDRA-7317 [...]
Re: Multi-DC Repairs and Token Questions
This related issue might be of interest: https://issues.apache.org/jira/browse/CASSANDRA-7450

In 1.2 the -pr option does perform cross-DC repairs, but you must ensure that every node in every datacenter runs repair, otherwise some ranges will be missed. This fix enables -pr and -local together, which was disabled in 2.0 because it didn't work (it does not work in 1.2 either).

On Tue, Oct 7, 2014 at 5:46 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> Hi guys, sorry about digging this up, but is this bug also affecting the 1.2.x versions? [...]
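To make the "every node in every datacenter" point concrete, here is a minimal sketch of a full -pr sweep. It is illustrative only: the hostnames and keyspace reuse the test cluster from later in the thread, and driving nodetool from Python's subprocess module is just one way to script it.

# Illustrative only: run 'nodetool repair -pr' on every node in every DC.
# With -pr each node repairs just its own primary range, so skipping any
# node -- in either datacenter -- leaves that range unrepaired.
import subprocess

DCS = {
    "DC_NSW": ["nsw1", "nsw2"],
    "DC_VIC": ["vic1", "vic2"],
}

for dc, hosts in DCS.items():
    for host in hosts:
        print(f"Repairing primary range of {host} ({dc})")
        subprocess.run(
            ["nodetool", "-h", host, "repair", "-pr", "SN_KEYSPACE"],
            check=True,  # stop on failure so no range is silently skipped
        )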
Re: Multi-DC Repairs and Token Questions
See https://issues.apache.org/jira/browse/CASSANDRA-7317

On Mon, Jun 2, 2014 at 8:57 PM, Matthew Allen <matthew.j.al...@gmail.com> wrote:
> Hi Rameez, Chovatia, (sorry I initially replied to Dwight individually) [...]
Re: Multi-DC Repairs and Token Questions
Hi Rameez, Chovatia (sorry, I initially replied to Dwight individually),

SN_KEYSPACE and MY_KEYSPACE are just typos (I was trying to mask out identifiable information); they are the same keyspace.

Keyspace: SN_KEYSPACE:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [DC_VIC:2, DC_NSW:2]

In a nutshell, replication is working as expected; I'm just confused about token range assignments in a multi-DC environment and how repairs should work.

http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configGenTokens_c.html specifies: "Multiple data center deployments: calculate the tokens for each data center so that the hash range is evenly divided for the nodes in each data center."

Given that nodetool repair -pr isn't multi-DC aware, in our production 18-node cluster (9 nodes in each DC), which of the following token layouts should be used (Murmur3Partitioner)?

Token range divided evenly over the 2 DCs / 18 nodes, as below?

Node  DC_NSW                  DC_VIC
1     -9223372036854775808    -8198552921648689608
2     -7173733806442603408    -6148914691236517208
3     -5124095576030431008    -4099276460824344808
4     -3074457345618258608    -2049638230412172408
5     -1024819115206086208    -8
6     1024819115206086192     2049638230412172392
7     3074457345618258592     4099276460824344792
8     5124095576030430992     6148914691236517192
9     7173733806442603392     8198552921648689592

Or an offset used for DC_VIC (i.e. DC_NSW + 100)?

Node  DC_NSW                  DC_VIC
1     -9223372036854775808    -9223372036854775708
2     -7173733806442603407    -7173733806442603307
3     -5124095576030431006    -5124095576030430906
4     -3074457345618258605    -3074457345618258505
5     -1024819115206086204    -1024819115206086104
6     1024819115206086197     1024819115206086297
7     3074457345618258598     3074457345618258698
8     5124095576030430999     5124095576030431099
9     7173733806442603400     7173733806442603500

It's too late for me to switch to vnodes. Hope that makes sense, thanks.

Matt

On Thu, May 29, 2014 at 12:01 AM, Rameez Thonnakkal <ssram...@gmail.com> wrote:
> As Chovatia mentioned, the keyspaces seem to be different. [...]
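For reference, both candidate layouts fall out of the standard even-division token formula. A small sketch (plain Python, not from the thread) that reproduces the two tables above:

# Reproduce the two candidate layouts for the Murmur3Partitioner,
# whose tokens span -2**63 .. 2**63 - 1.
def even_tokens(num_nodes, offset=0):
    """Divide the full 2**64 ring evenly among num_nodes, shifted by offset."""
    return [(2**64 // num_nodes) * i - 2**63 + offset for i in range(num_nodes)]

# Option 1: one even 18-way division, alternating between the two DCs.
both = even_tokens(18)
nsw_1, vic_1 = both[0::2], both[1::2]

# Option 2: each DC divided evenly 9 ways, with DC_VIC offset by +100
# so no two nodes share a token.
nsw_2 = even_tokens(9)
vic_2 = even_tokens(9, offset=100)

print(nsw_1[1], vic_1[1])  # -7173733806442603408 -6148914691236517208
print(nsw_2[1], vic_2[1])  # -7173733806442603407 -7173733806442603307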
Re: Multi-DC Repairs and Token Questions
What is your partitioner type? Is it org.apache.cassandra.dht.Murmur3Partitioner? In your repair commands I see two different keyspaces, MY_KEYSPACE and SN_KEYSPACE; are these two separate keyspaces, or is that a typo?

-jaydeep

On Tuesday, 27 May 2014 10:26 PM, Matthew Allen <matthew.j.al...@gmail.com> wrote:
> Hi, I'm a bit confused regarding data ownership in a multi-DC environment. [...]
Re: Multi-DC Repairs and Token Questions
As Chovatia mentioned, the keyspaces seem to be different. Try DESCRIBE KEYSPACE SN_KEYSPACE and DESCRIBE KEYSPACE MY_KEYSPACE from CQL; this will give you an idea of how many replicas there are for these keyspaces.

On Wed, May 28, 2014 at 11:49 AM, chovatia jaydeep <chovatia_jayd...@yahoo.co.in> wrote:
> What is your partitioner type? Is it org.apache.cassandra.dht.Murmur3Partitioner? [...]
Multi-DC Repairs and Token Questions
Hi,

I'm a bit confused regarding data ownership in a multi-DC environment. I have the following setup in a test cluster, with a keyspace using (placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {'DC_NSW':2,'DC_VIC':2};):

Datacenter: DC_NSW
==========
Replicas: 2
Address  Rack   Status  State   Load        Owns     Token
                                                     0
nsw1     rack1  Up      Normal  1007.43 MB  100.00%  -9223372036854775808
nsw2     rack1  Up      Normal  1008.08 MB  100.00%  0

Datacenter: DC_VIC
==========
Replicas: 2
Address  Rack   Status  State   Load        Owns     Token
                                                     100
vic1     rack1  Up      Normal  1015.1 MB   100.00%  -9223372036854775708
vic2     rack1  Up      Normal  1015.13 MB  100.00%  100

My understanding is that both datacenters have a complete copy of the data, but when I run a repair -pr on each of the nodes, the vic hosts take only a couple of seconds, while the nsw nodes take about 5 minutes each. Does this mean that the nsw nodes own the majority of the data, given their key ranges, and that repairs will need to cross datacenters?

Thanks

Matt

command: nodetool -h vic1 repair -pr (takes seconds)
Starting NodeTool
[2014-05-28 15:11:02,783] Starting repair command #1, repairing 1 ranges for keyspace MY_KEYSPACE
[2014-05-28 15:11:03,110] Repair session 76d170f0-e626-11e3-af4e-218541ad23a1 for range (-9223372036854775808,-9223372036854775708] finished
[2014-05-28 15:11:03,110] Repair command #1 finished
[2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system'
[2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system_traces'

command: nodetool -h vic2 repair -pr (takes seconds)
Starting NodeTool
[2014-05-28 15:11:28,746] Starting repair command #1, repairing 1 ranges for keyspace MY_KEYSPACE
[2014-05-28 15:11:28,840] Repair session 864b14a0-e626-11e3-9612-07b0c029e3c7 for range (0,100] finished
[2014-05-28 15:11:28,840] Repair command #1 finished
[2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system'
[2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system_traces'

command: nodetool -h nsw1 repair -pr (takes minutes)
Starting NodeTool
[2014-05-28 15:11:32,579] Starting repair command #1, repairing 1 ranges for keyspace SN_KEYSPACE
[2014-05-28 15:14:07,187] Repair session 88966430-e626-11e3-81eb-c991646ac2bf for range (100,-9223372036854775808] finished
[2014-05-28 15:14:07,187] Repair command #1 finished
[2014-05-28 15:14:11,393] Nothing to repair for keyspace 'system'
[2014-05-28 15:14:11,440] Nothing to repair for keyspace 'system_traces'

command: nodetool -h nsw2 repair -pr (takes minutes)
Starting NodeTool
[2014-05-28 15:14:18,670] Starting repair command #1, repairing 1 ranges for keyspace SN_KEYSPACE
[2014-05-28 15:17:27,300] Repair session eb936ce0-e626-11e3-81e2-8790242f886e for range (-9223372036854775708,0] finished
[2014-05-28 15:17:27,300] Repair command #1 finished
[2014-05-28 15:17:32,017] Nothing to repair for keyspace 'system'
[2014-05-28 15:17:32,064] Nothing to repair for keyspace 'system_traces'
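As a footnote on the lopsided -pr timings above: a node's primary range runs from its predecessor's token (exclusive) to its own token (inclusive), so with these four tokens the vic nodes each own only 100 tokens of the ring while the nsw nodes each own roughly half of it. A quick sanity check (plain Python; the node-to-token mapping is taken from the ring output above):

# Primary-range sizes implied by the ring output (Murmur3 ring size 2**64).
RING = 2**64
owners = {
    -9223372036854775808: "nsw1",
    -9223372036854775708: "vic1",
    0:                    "nsw2",
    100:                  "vic2",
}
toks = sorted(owners)
for prev, cur in zip([toks[-1]] + toks[:-1], toks):
    print(f"{owners[cur]}: {(cur - prev) % RING} tokens")
# vic1/vic2 each own just 100 tokens, while nsw1/nsw2 each own about half
# the ring -- matching repairs taking seconds on vic and minutes on nsw.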