First, note that replication is done at the row level, not at the node level.

That line should look more like:

placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 1, DC2: 1, DC3: 1}

This means that each row will have one copy in each DC, and within each DC its 
placement will be determined by the partitioner, so it could be on any of the 
nodes in that DC.

So, don't think of it as nodes replicating, but rather as each row being 
copied to the specified number of nodes in each DC.
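
To make the mechanics concrete, here is a toy sketch in Python (illustrative 
only, not Cassandra's actual code - the hash, ring layout, and node names are 
made up, and real NTS also takes racks into account, which this ignores). It 
walks the ring clockwise from the row's token and, per DC, keeps the first 
nodes it meets until that DC's replication factor is satisfied:

import bisect, hashlib

def token_for(row_key):
    # RandomPartitioner-style: hash the row key into the 0..2**127 token range
    return int(hashlib.md5(row_key.encode()).hexdigest(), 16) % (2 ** 127)

def replicas(row_key, ring, strategy_options):
    # ring: list of (token, node, dc) tuples sorted by token
    # strategy_options: {dc_name: replication_factor}, e.g. {'DC1': 1, 'DC2': 1, 'DC3': 1}
    tokens = [t for t, _, _ in ring]
    # start at the first node whose token range covers the row's token
    start = bisect.bisect_right(tokens, token_for(row_key)) % len(ring)
    chosen = dict((dc, []) for dc in strategy_options)
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if dc in chosen and len(chosen[dc]) < strategy_options[dc]:
            chosen[dc].append(node)
    return chosen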

Also, replication is not related to the seed nodes. Seed nodes allow the 
nodes to find each other initially, but are not special otherwise - any node 
can be used as a seed node.
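
For example, in a cassandra.yaml of roughly this vintage the seed list is just 
a plain setting that any subset of nodes can appear in (the addresses here are 
made up for illustration):

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.1.0.1,10.2.0.1,10.3.0.1"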

So if you had a strategy like:

placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 3, DC2: 2, DC3: 1}

Each row would exist on 3 of the 4 nodes in DC1, on 2 of the 4 nodes in DC2, 
and on 1 of the 4 nodes in DC3. Again, the placement within each DC is 
determined by the partitioner, based on the row key.
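
Plugging those options into the toy sketch above, with a made-up ring of 12 
nodes spread evenly across the 3 DCs (4 per DC) and a hypothetical row key:

ring = sorted((i * 2 ** 127 // 12, 'node%d' % i, 'DC%d' % (i % 3 + 1)) for i in range(12))
print(replicas('some_row_key', ring, {'DC1': 3, 'DC2': 2, 'DC3': 1}))
# prints 3 node names for DC1, 2 for DC2 and 1 for DC3 - six copies of the row in total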

Jeff

On May 29, 2012, at 11:25 PM, David Fischer wrote:

> Ok, now I am confused :)
> 
> ok, if I have the following:
> placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1:R1,DC2:R1,DC3:R1}
> 
> this means that in each of my datacenters I will have one full replica,
> which can also be a seed node?
> if I have 3 nodes in addition to the DC replicas, then with normal token
> calculations a key can be in any datacenter plus on each of the
> replicas, right?
> It will show 12 nodes total in its ring.
> 
> On Thu, May 24, 2012 at 2:39 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> This is partly historical. NTS (as it is now) has not always existed and was 
>> not always the default. In days gone by used to be a fella could run a 
>> mighty fine key-value store using just a Simple Replication Strategy.
>> 
>> A different way to visualise it is a single ring with a Z axis for the DCs. 
>> When you look at the ring from the top you can see all the nodes. When you 
>> look at it from the side you can see the nodes are on levels that correspond 
>> to their DC. Simple Strategy looks at the ring from the top. NTS works 
>> through the layers of the ring.
>> 
>>> If the hierarchy is Cluster ->
>>> DataCenter -> Node, why exactly do we need globally unique node tokens
>>> even though nodes are at the lowest level in the hierarchy?
>> Nodes having a DC is a feature of *some* snitches and is utilised by *some* 
>> of the replication strategies (and by the messaging system for network 
>> efficiency). For background, mapping from row tokens to nodes is based on 
>> http://en.wikipedia.org/wiki/Consistent_hashing
>> 
>> Hope that helps.
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 24/05/2012, at 1:07 AM, java jalwa wrote:
>> 
>>> Thanks Aaron. That makes things clear.
>>> So I guess the 0 - 2^127 range for tokens corresponds to a
>>> cluster-level, top-level ring, and then you add some logic on top of
>>> that with NTS to logically segment that range into sub-rings as per
>>> the notion of data centers defined in NTS. What's the advantage of
>>> having a single top-level ring? Intuitively it seems like each
>>> replication group could have a separate ring, so that the same tokens
>>> could be assigned to nodes in different DCs. If the hierarchy is
>>> Cluster -> DataCenter -> Node, why exactly do we need globally unique
>>> node tokens even though nodes are at the lowest level in the hierarchy?
>>> 
>>> Thanks again.
>>> 
>>> 
>>> On Wed, May 23, 2012 at 3:14 AM, aaron morton <aa...@thelastpickle.com> 
>>> wrote:
>>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>>> will the node in DC3 still store the key as determined by the
>>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>>> and DC2?
>>>> No, only nodes in the DC's specified in the NTS configuration will be 
>>>> replicas.
>>>> 
>>>>> Or will the co-ordinator node be aware of the replica placement
>>>>> strategy, and override the partitioner's decision and walk the ring
>>>>> until it first encounters a node in DC1 or DC2, and then place the
>>>>> remaining replicas?
>>>> NTS considers each DC to have its own ring. This can make token 
>>>> selection in a multi-DC environment confusing at times. There is something 
>>>> in the DataStax docs about it.
>>>> 
>>>> Cheers
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 23/05/2012, at 3:16 PM, java jalwa wrote:
>>>> 
>>>>> Hi all,
>>>>>              I am a bit confused regarding the terms "replica" and
>>>>> "replication factor". Assume that I am using RandomPartitioner and
>>>>> NetworkTopologyStrategy for replica placement.
>>>>> From what I understand, with a RandomPartitioner, a row key will
>>>>> always be hashed and be stored on the node that owns the range to
>>>>> which the key is mapped.
>>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#networktopologystrategy.
>>>>> The example here talks about having 2 data centers and a replication
>>>>> factor of 4 with 2 replicas in each datacenter, so the strategy is
>>>>> configured as DC1:2 and DC2:2. Now suppose I add another datacenter
>>>>> DC3, and do not change the NetworkTopologyStrategy.
>>>>> Now if a row key hash is mapped to a range owned by a node in DC3,
>>>>> will the node in DC3 still store the key as determined by the
>>>>> partitioner and then walk the ring and store 2 replicas each in DC1
>>>>> and DC2? Will that mean that I will then have 5 replicas in the
>>>>> cluster and not 4? Or will the co-ordinator node be aware of the
>>>>> replica placement strategy, and override the partitioner's decision
>>>>> and walk the ring until it first encounters a node in DC1 or DC2,
>>>>> and then place the remaining replicas?
>>>>> 
>>>>> Thanks.
>>>> 
>> 
