Re: Multi-DC Repairs and Token Questions

2014-10-09 Thread Alain RODRIGUEZ
Ok got it.

Thanks.

Re: Multi-DC Repairs and Token Questions

2014-10-07 Thread Alain RODRIGUEZ
Hi guys, sorry for digging this up, but is this bug also affecting 1.2.x
versions? I can't see it being backported to 1.2 in Jira. Was this bug
introduced in 2.0?

Anyway, how does nodetool repair -pr behave in a multi-DC environment: does it
perform cross-DC repairs or not? Should we drop the -pr option in a multi-DC
context to reduce entropy between DCs? I mean, repair -pr is supposed to repair
the primary range of the current node; does it also repair the corresponding
primary ranges in the other DCs?

Thanks for any insight on this.

Re: Multi-DC Repairs and Token Questions

2014-10-07 Thread Paulo Ricardo Motta Gomes
This related issue might be of interest:
https://issues.apache.org/jira/browse/CASSANDRA-7450

In 1.2 the -pr option does perform cross-DC repairs, but you must ensure that
all nodes in all datacenters run repair, otherwise some ranges will be missed.
This fix enables -pr and -local together, which was disabled in 2.0 because it
didn't work (it also does not work in 1.2).
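
As an illustrative aside, a minimal Python sketch of the "every node in every
datacenter must run repair -pr" requirement described above. It assumes
nodetool is on the PATH; the host names are placeholders borrowed from the
test cluster later in this thread.

import subprocess

# Placeholder host lists; substitute the real nodes of each datacenter.
DC_HOSTS = {
    "DC_NSW": ["nsw1", "nsw2"],
    "DC_VIC": ["vic1", "vic2"],
}

for dc, hosts in DC_HOSTS.items():
    for host in hosts:
        # In 1.2, skipping any node here leaves that node's primary range unrepaired.
        print(f"Repairing primary range of {host} ({dc})")
        subprocess.run(["nodetool", "-h", host, "repair", "-pr"], check=True)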

Re: Multi-DC Repairs and Token Questions

2014-06-03 Thread Nick Bailey
See https://issues.apache.org/jira/browse/CASSANDRA-7317


Re: Multi-DC Repairs and Token Questions

2014-06-02 Thread Matthew Allen
Hi Rameez, Chovatia (sorry, I initially replied to Dwight individually),

SN_KEYSPACE and MY_KEYSPACE are just typos (I was trying to mask out
identifiable information); they are the same keyspace.

Keyspace: SN_KEYSPACE:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [DC_VIC:2, DC_NSW:2]

In a nutshell, replication is working as expected; I'm just confused about
token range assignments in a multi-DC environment and how repairs should work.

From
http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configGenTokens_c.html,
it specifies

*Multiple data center deployments: calculate the tokens for each
data center so that the hash range is evenly divided for the nodes in each
data center*

Given that nodetool repair isn't multi-DC aware, in our production 18-node
cluster (9 nodes in each DC), which of the following token assignments should
be used (Murmur3Partitioner)?

Token range divided evenly over the 2 DCs / 18 nodes, as below?

Node  DC_NSW                   DC_VIC
1     '-9223372036854775808'   '-8198552921648689608'
2     '-7173733806442603408'   '-6148914691236517208'
3     '-5124095576030431008'   '-4099276460824344808'
4     '-3074457345618258608'   '-2049638230412172408'
5     '-1024819115206086208'   '-8'
6     '1024819115206086192'    '2049638230412172392'
7     '3074457345618258592'    '4099276460824344792'
8     '5124095576030430992'    '6148914691236517192'
9     '7173733806442603392'    '8198552921648689592'

Or an offset applied to DC_VIC (i.e. DC_NSW + 100)?

Node  DC_NSW                   DC_VIC
1     '-9223372036854775808'   '-9223372036854775708'
2     '-7173733806442603407'   '-7173733806442603307'
3     '-5124095576030431006'   '-5124095576030430906'
4     '-3074457345618258605'   '-3074457345618258505'
5     '-1024819115206086204'   '-1024819115206086104'
6     '1024819115206086197'    '1024819115206086297'
7     '3074457345618258598'    '3074457345618258698'
8     '5124095576030430999'    '5124095576030431099'
9     '7173733806442603400'    '7173733806442603500'
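
As an illustrative aside, a small Python sketch that reproduces both candidate
layouts above from the Murmur3 token space; it assumes 9 nodes per DC and the
+100 offset for DC_VIC described earlier, and prints one row per node
(option 1 NSW/VIC, then option 2 NSW/VIC).

RING = 2**64        # size of the Murmur3 token space
MIN_TOKEN = -2**63  # lowest Murmur3 token

def evenly_spaced(num_tokens):
    """Evenly spaced initial tokens across the full ring."""
    step = RING // num_tokens
    return [MIN_TOKEN + i * step for i in range(num_tokens)]

nodes_per_dc = 9

# Option 1: divide the ring over all 18 nodes and alternate the two DCs.
all_18 = evenly_spaced(2 * nodes_per_dc)
option1_nsw, option1_vic = all_18[0::2], all_18[1::2]

# Option 2: divide the ring over 9 nodes per DC and offset DC_VIC by +100.
option2_nsw = evenly_spaced(nodes_per_dc)
option2_vic = [t + 100 for t in option2_nsw]

for n in range(nodes_per_dc):
    print(n + 1, option1_nsw[n], option1_vic[n], option2_nsw[n], option2_vic[n])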

It's too late for me to switch to vnodes. Hope that makes sense, thanks.

Matt



Re: Multi-DC Repairs and Token Questions

2014-05-28 Thread chovatia jaydeep
What is your partitioner type? Is it org.apache.cassandra.dht.Murmur3Partitioner?
In your repair commands I see two different keyspaces, MY_KEYSPACE and
SN_KEYSPACE; are these two separate keyspaces, or is that a typo?

-jaydeep



Re: Multi-DC Repairs and Token Questions

2014-05-28 Thread Rameez Thonnakkal
As Chovatia mentioned, the keyspaces seem to be different.
Try DESCRIBE KEYSPACE SN_KEYSPACE and DESCRIBE KEYSPACE MY_KEYSPACE from CQL;
this will give you an idea of how many replicas each of these keyspaces has.
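
As an aside, a quick way to script that check (a sketch only: it assumes cqlsh
is on the PATH, can reach the placeholder host below, and accepts the DESCRIBE
statement on stdin).

import subprocess

HOST = "nsw1"  # placeholder contact point; any live node will do

for keyspace in ("SN_KEYSPACE", "MY_KEYSPACE"):
    # Prints the keyspace definition, including its replication options.
    subprocess.run(["cqlsh", HOST],
                   input=f"DESCRIBE KEYSPACE {keyspace};\n",
                   text=True, check=True)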



Multi-DC Repairs and Token Questions

2014-05-27 Thread Matthew Allen
Hi,

I'm a bit confused regarding data ownership in a multi-DC environment.

I have the following setup in a test cluster, with a keyspace created with
(placement_strategy = 'NetworkTopologyStrategy' and strategy_options =
{'DC_NSW':2,'DC_VIC':2};)

Datacenter: DC_NSW
==========
Replicas: 2
Address  Rack   Status  State   Load        Owns     Token
                                                     0
nsw1     rack1  Up      Normal  1007.43 MB  100.00%  -9223372036854775808
nsw2     rack1  Up      Normal  1008.08 MB  100.00%  0


Datacenter: DC_VIC
==========
Replicas: 2
Address  Rack   Status  State   Load        Owns     Token
                                                     100
vic1     rack1  Up      Normal  1015.1 MB   100.00%  -9223372036854775708
vic2     rack1  Up      Normal  1015.13 MB  100.00%  100

My understanding is that both datacenters have a complete copy of the data, but
when I run repair -pr on each of the nodes, the vic hosts take only a couple of
seconds, while the nsw nodes take about 5 minutes each.

Does this mean that the nsw nodes own the majority of the data, given their key
ranges, and that repairs will need to cross datacenters?

Thanks

Matt

command: nodetool -h vic1 repair -pr   (takes seconds)
Starting NodeTool
[2014-05-28 15:11:02,783] Starting repair command #1, repairing 1 ranges for keyspace MY_KEYSPACE
[2014-05-28 15:11:03,110] Repair session 76d170f0-e626-11e3-af4e-218541ad23a1 for range (-9223372036854775808,-9223372036854775708] finished
[2014-05-28 15:11:03,110] Repair command #1 finished
[2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system'
[2014-05-28 15:11:03,126] Nothing to repair for keyspace 'system_traces'

command: nodetool -h vic2 repair -pr   (takes seconds)
Starting NodeTool
[2014-05-28 15:11:28,746] Starting repair command #1, repairing 1 ranges for keyspace MY_KEYSPACE
[2014-05-28 15:11:28,840] Repair session 864b14a0-e626-11e3-9612-07b0c029e3c7 for range (0,100] finished
[2014-05-28 15:11:28,840] Repair command #1 finished
[2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system'
[2014-05-28 15:11:28,866] Nothing to repair for keyspace 'system_traces'

command: nodetool -h nsw1 repair -pr   (takes minutes)
Starting NodeTool
[2014-05-28 15:11:32,579] Starting repair command #1, repairing 1 ranges for keyspace SN_KEYSPACE
[2014-05-28 15:14:07,187] Repair session 88966430-e626-11e3-81eb-c991646ac2bf for range (100,-9223372036854775808] finished
[2014-05-28 15:14:07,187] Repair command #1 finished
[2014-05-28 15:14:11,393] Nothing to repair for keyspace 'system'
[2014-05-28 15:14:11,440] Nothing to repair for keyspace 'system_traces'

command: nodetool -h nsw2 repair -pr   (takes minutes)
Starting NodeTool
[2014-05-28 15:14:18,670] Starting repair command #1, repairing 1 ranges for keyspace SN_KEYSPACE
[2014-05-28 15:17:27,300] Repair session eb936ce0-e626-11e3-81e2-8790242f886e for range (-9223372036854775708,0] finished
[2014-05-28 15:17:27,300] Repair command #1 finished
[2014-05-28 15:17:32,017] Nothing to repair for keyspace 'system'
[2014-05-28 15:17:32,064] Nothing to repair for keyspace 'system_traces'
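
As an illustrative aside, a small Python sketch that computes the width of each
node's primary range from the four tokens listed in the ring output above. The
two vic primary ranges span only 100 tokens each (matching the ranges shown in
the repair logs), while each nsw primary range covers roughly half the ring,
which is consistent with the repair timings reported here.

RING = 2**64  # the Murmur3 token space wraps modulo 2**64

# Tokens exactly as listed in the ring output above.
tokens = {
    "nsw1": -9223372036854775808,
    "vic1": -9223372036854775708,
    "nsw2": 0,
    "vic2": 100,
}

ordered = sorted(tokens.items(), key=lambda kv: kv[1])
for i, (node, tok) in enumerate(ordered):
    prev_tok = ordered[i - 1][1]      # wraps around to the last token when i == 0
    width = (tok - prev_tok) % RING   # a node's primary range is (prev_tok, tok]
    print(f"{node}: primary range = {width} tokens "
          f"({100.0 * width / RING:.6g}% of the ring)")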