So we are currently running a 10-node ring in one DC, and we are going to add 5 more nodes in another DC. To keep the ring in each DC balanced, should I calculate the tokens for each DC independently and just make sure none of them are the same? Something like:

DC1 (RF 5):
 1: 0
 2: 17014118346046923173168730371588410572
 3: 34028236692093846346337460743176821144
 4: 51042355038140769519506191114765231716
 5: 68056473384187692692674921486353642288
 6: 85070591730234615865843651857942052860
 7: 102084710076281539039012382229530463432
 8: 119098828422328462212181112601118874004
 9: 136112946768375385385349842972707284576
10: 153127065114422308558518573344295695148

DC2 (RF 3):
 1: 1 (one off from DC1 node 1)
 2: 34028236692093846346337460743176821145 (one off from DC1 node 3)
 3: 68056473384187692692674921486353642290 (two off from DC1 node 5)
 4: 102084710076281539039012382229530463435 (three off from DC1 node 7)
 5: 136112946768375385385349842972707284580 (four off from DC1 node 9)

Originally I was thinking I should spread the DC2 nodes evenly in between every other DC1 node. Or does it not matter where they sit relative to the DC1 nodes, as long as they fall somewhere after every other DC1 node? So the ring order would be DC1-1, DC2-1, DC1-2, DC1-3, DC2-2, DC1-4, DC1-5...

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, May 03, 2011 9:14 AM
To: user@cassandra.apache.org
Subject: Re: Replica data distributing between racks

Right, when you are computing balanced RP tokens for NTS you need to compute the tokens for each DC independently.

On Tue, May 3, 2011 at 6:23 AM, aaron morton <aa...@thelastpickle.com> wrote:
> I've been digging into this and was able to reproduce something. Not
> sure if it's a fault, and I can't work on it any more tonight.
>
> To reproduce:
> - 2-node cluster on my MacBook
> - set the tokens as if they were nodes 3 and 4 in a 4-node cluster,
>   e.g. node 1 with 85070591730234615865843651857942052864 and node 2 with
>   127605887595351923798765477786913079296
> - set cassandra-topology.properties to put the nodes in DC1 on RAC1
>   and RAC2
> - create a keyspace using NTS and strategy_options = [{DC1:1}]
>
> I inserted 10 rows and they were distributed as:
> - node 1 - 9 rows
> - node 2 - 1 row
>
> I *think* the problem has to do with TokenMetadata.firstTokenIndex(). It
> often says the closest token to a key is node 1 because, in effect:
>
> - node 1 is responsible for 0 to
>   85070591730234615865843651857942052864
> - node 2 is responsible for 85070591730234615865843651857942052864 to
>   127605887595351923798765477786913079296
> - AND node 1 does the wrap-around from
>   127605887595351923798765477786913079296 to 0, as keys that would insert
>   past the last token in the ring array wrap to index 0 because insertMin
>   is false.
>
> Thoughts?
>
> Aaron
>
>
> On 3 May 2011, at 10:29, Eric tamme wrote:
>
>> On Mon, May 2, 2011 at 5:59 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>> My bad, I missed the way TokenMetadata.ringIterator() and firstTokenIndex()
>>> work.
>>>
>>> Eric, can you show the output from nodetool ring?
>>>
>>
>> Sorry if the previous paste was way too unformatted; here is a
>> pastie.org link with nicer formatting of the nodetool ring output than
>> plain-text email allows.
>>
>> http://pastie.org/private/50khpakpffjhsmgf66oetg
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
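
For concreteness, the per-DC calculation Jonathan describes can be scripted. Below is a minimal Python sketch: it spaces tokens evenly within each DC, then bumps any DC2 token that collides with a DC1 token by one, since tokens only need to be unique cluster-wide and a +1 shift does not affect balance within a DC. (The exact digits differ from the list above by a rounding choice; the even spacing is what matters.)

    RING = 2 ** 127  # RandomPartitioner token space is [0, 2**127)

    def balanced_tokens(n):
        # Ideal evenly spaced tokens for an n-node DC.
        return [i * RING // n for i in range(n)]

    dc1 = balanced_tokens(10)
    dc2 = balanced_tokens(5)

    # Bump any DC2 token already taken by DC1; the shift is tiny relative
    # to the spacing, so balance within DC2 is effectively unchanged.
    taken = set(dc1)
    dc2 = [t + 1 if t in taken else t for t in dc2]

    assert len(set(dc1) | set(dc2)) == 15, "tokens must be unique cluster-wide"

    for name, toks in (("DC1", dc1), ("DC2", dc2)):
        for n, t in enumerate(toks, 1):
            print("%s node %2d: %d" % (name, n, t))

With 10 and 5 nodes, every ideal DC2 token lands exactly on a DC1 token, so all five get shifted by one, which is the same "one off" idea as in the question.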
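
For anyone wanting to replay Aaron's two-node repro, the setup amounts to a topology file, a token per node, and a keyspace definition along these lines (a sketch; the IPs and keyspace name are placeholders):

    # cassandra-topology.properties (PropertyFileSnitch format)
    192.168.1.1=DC1:RAC1
    192.168.1.2=DC1:RAC2
    default=DC1:RAC1

    # in cassandra.yaml, give each node its token, e.g. on node 1:
    initial_token: 85070591730234615865843651857942052864

    # then from cassandra-cli:
    create keyspace Test
        with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
        and strategy_options = [{DC1:1}];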
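
The wrap-around Aaron describes is easy to model. Here is a toy version of the lookup in Python (an illustration of the behaviour, not the real TokenMetadata.firstTokenIndex() code): the owner of a key is the first node whose token is >= the key's token, and anything past the last token wraps back to the first node.

    import bisect
    import random

    RING = 2 ** 127
    tokens = [85070591730234615865843651857942052864,   # node 1, 0.50 * 2**127
              127605887595351923798765477786913079296]  # node 2, 0.75 * 2**127

    def owner(key_token):
        # First node whose token >= key_token; past the end, wrap to index 0.
        return bisect.bisect_left(tokens, key_token) % len(tokens)

    random.seed(0)
    counts = [0, 0]
    for _ in range(10000):
        counts[owner(random.randrange(RING))] += 1
    print(counts)  # roughly [7500, 2500]

Under this placement node 1 covers [0, 0.50] of the ring plus the wrapped (0.75, 1.0) slice, i.e. 75% of the token space, so the toy model lands near 7500/2500 -- consistent with the skew toward node 1 that Aaron saw.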
DC1 (RF 5): 1: 0 2: 17014118346046923173168730371588410572 3: 34028236692093846346337460743176821144 4: 51042355038140769519506191114765231716 5: 68056473384187692692674921486353642288 6: 85070591730234615865843651857942052860 7: 102084710076281539039012382229530463432 8: 119098828422328462212181112601118874004 9: 136112946768375385385349842972707284576 10: 153127065114422308558518573344295695148 DC2 (RF 3): 1: 1 (one off from DC1 node 1) 2: 34028236692093846346337460743176821145 (one off from DC1 node 3) 3: 68056473384187692692674921486353642290 (two off from DC1 node 5) 4: 102084710076281539039012382229530463435 (three off from DC1 node 7) 5: 136112946768375385385349842972707284580 (four off from DC1 node 9) Originally I was thinking I should spread the DC2 nodes evenly in between every other DC1 node. Or does it not matter where they are in respect to the DC1 nodes, and long as they fall somewhere after every other DC1 node? So it is DC1-1, DC2-1, DC1-2, DC1-3, DC2-2, DC1-4, DC1-5... -----Original Message----- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Tuesday, May 03, 2011 9:14 AM To: user@cassandra.apache.org Subject: Re: Replica data distributing between racks Right, when you are computing balanced RP tokens for NTS you need to compute the tokens for each DC independently. On Tue, May 3, 2011 at 6:23 AM, aaron morton <aa...@thelastpickle.com> wrote: > I've been digging into this and worked was able to reproduce something, not > sure if it's a fault and I can't work on it any more tonight. > > > To reproduce: > - 2 node cluster on my mac book > - set the tokens as if they were nodes 3 and 4 in a 4 node cluster, > e.g. node 1 with 85070591730234615865843651857942052864 and node 2 > 127605887595351923798765477786913079296 > - set cassandra-topology.properties to put the nodes in DC1 on RAC1 > and RAC2 > - create a keyspace using NTS and strategy_options = [{DC1:1}] > > Inserted 10 rows they were distributed as > - node 1 - 9 rows > - node 2 - 1 row > > I *think* the problem has to do with TokenMetadata.firstTokenIndex(). It > often says the closest token to a key is the node 1 because in effect... > > - node 1 is responsible for 0 to > 85070591730234615865843651857942052864 > - node 2 is responsible for 85070591730234615865843651857942052864 to > 127605887595351923798765477786913079296 > - AND node 1 does the wrap around from > 127605887595351923798765477786913079296 to 0 as keys that would insert past > the last token in the ring array wrap to 0 because insertMin is false. > > Thoughts ? > > Aaron > > > On 3 May 2011, at 10:29, Eric tamme wrote: > >> On Mon, May 2, 2011 at 5:59 PM, aaron morton <aa...@thelastpickle.com> wrote: >>> My bad, I missed the way TokenMetadata.ringIterator() and firstTokenIndex() >>> work. >>> >>> Eric, can you show the output from nodetool ring ? >>> >>> >> >> Sorry if the previous paste was way to unformatted, here is a >> pastie.org link with nicer formatting of nodetool ring output than >> plain text email allows. >> >> http://pastie.org/private/50khpakpffjhsmgf66oetg > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com