Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Jeff Williams
First, note that replication is done at the row level, not at the node level. That line should look more like: placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1: 1, DC2: 1, DC3: 1}. This means that each row will have one copy in each DC and within each DC it's

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread David Fischer
Thanks! My misunderstanding was that the snitch names are broken up as DC1:RAC1 and the strategy_options takes only the first part of the snitch names? On Wed, May 30, 2012 at 12:14 PM, Jeff Williams je...@wherethebitsroam.com wrote: First, note that replication is done at the row level, not at

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Edward Capriolo
You can avoid the confusion by using the term natural endpoints. For example, with a replication factor of 3, the natural endpoints for key x are node1, node2, node11. The snitch does use the datacenter and the rack, but almost all deployments use a single rack per DC, because when you have more than
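
As an illustration of "natural endpoints", here is a minimal Python sketch of a simple, rack-unaware ring walk. It is not Cassandra's code: the function names, node names, and small integer tokens are made up for the example. The token function assumes RandomPartitioner derives a token from the absolute value of the MD5 digest of the key, and the ring assumes one token per node.

```python
import hashlib
from bisect import bisect_left


def token_for(key: bytes) -> int:
    """Approximate a RandomPartitioner-style token (assumption: abs of the
    MD5 digest read as a signed 128-bit integer, so 0 <= token <= 2**127)."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


def natural_endpoints(ring, key_token, rf):
    """Return the first `rf` distinct nodes found walking the ring clockwise.

    `ring` is a list of (token, node) pairs sorted by token. The walk starts
    at the first node whose token is >= key_token and wraps around at the end.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    endpoints = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in endpoints:
            endpoints.append(node)
        if len(endpoints) == rf:
            break
    return endpoints


# Hypothetical four-node ring with small tokens for readability.
ring = [(0, "node1"), (100, "node2"), (200, "node3"), (300, "node4")]
print(natural_endpoints(ring, 150, 3))
```

With a replication factor of 3, a key whose token is 150 lands on node3 first and is then replicated to the next two nodes on the ring. Every key has its own list of natural endpoints; no node is "the replica" of another node.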

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Jeff Williams
On May 30, 2012, at 10:32 PM, Edward Capriolo wrote: The snitch does use the datacenter and the rack, but almost all deployments use a single rack per DC, because when you have more than one rack in a data center the NTS snitch has some logic to spread the data between racks. (most people

Re: Confusion regarding the terms replica and replication factor

2012-05-30 Thread Edward Capriolo
http://answers.oreilly.com/topic/2408-replica-placement-strategies-when-using-cassandra/ As mentioned, it does this: The Network Topology Strategy places some replicas in another data center and the remainder in other racks in the first data center, as specified Which is not what most would

Re: Confusion regarding the terms replica and replication factor

2012-05-29 Thread David Fischer
OK, now I am confused :). If I have the following: placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1:R1,DC2:R1,DC3:R1 }, does this mean that in each of my datacenters I will have one full replica that can also be a seed node? If I have 3 nodes in addition to the DC replicas with

Re: Confusion regarding the terms replica and replication factor

2012-05-24 Thread aaron morton
This is partly historical. NTS (as it is now) has not always existed and was not always the default. In days gone by used to be a fella could run a mighty fine key-value store using just a Simple Replication Strategy. A different way to visualise it is a single ring with a Z axis for the DC's.

Re: Confusion regarding the terms replica and replication factor

2012-05-23 Thread aaron morton
"Now if a row key hash is mapped to a range owned by a node in DC3, will the node in DC3 still store the key as determined by the partitioner and then walk the ring and store 2 replicas each in DC1 and DC2?" No, only nodes in the DCs specified in the NTS configuration will be replicas. Or
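
A rough Python sketch of that answer follows. It is a simplified model, not the real NetworkTopologyStrategy: it ignores racks, assumes one token per node, and the function name, node names, and tokens are hypothetical. It walks the ring clockwise from the key's token and accepts a node only while its DC still needs replicas.

```python
from bisect import bisect_left


def nts_replicas(ring, key_token, strategy_options):
    """Pick replicas per data center by walking the ring clockwise.

    `ring` is a list of (token, node, dc) tuples sorted by token.
    `strategy_options` maps a DC name to its replica count. Nodes in DCs
    that are not listed are skipped. Racks are ignored (simplification).
    """
    tokens = [t for t, _, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    remaining = dict(strategy_options)
    replicas = []
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if remaining.get(dc, 0) > 0:
            replicas.append(node)
            remaining[dc] -= 1
    return replicas


# Hypothetical ring: two nodes in each of three DCs, interleaved by token.
ring = [
    (0, "dc1-a", "DC1"), (10, "dc2-a", "DC2"), (20, "dc3-a", "DC3"),
    (30, "dc1-b", "DC1"), (40, "dc2-b", "DC2"), (50, "dc3-b", "DC3"),
]

# Token 15 falls in the range owned by dc3-a, but DC3 is not configured,
# so no DC3 node ends up in the replica list.
print(nts_replicas(ring, 15, {"DC1": 2, "DC2": 2}))

# One replica per DC, as in strategy_options = {DC1: 1, DC2: 1, DC3: 1}.
print(nts_replicas(ring, 15, {"DC1": 1, "DC2": 1, "DC3": 1}))
```

In this model the node that owns the token range has no special claim on the row: whether it stores a copy depends only on whether its DC appears in the strategy options.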

Re: Confusion regarding the terms replica and replication factor

2012-05-23 Thread java jalwa
Thanks Aaron. That makes things clear. So I guess the 0 - 2^127 range for tokens corresponds to a cluster-level, top-level ring, and then you add some logic on top of that with NTS to logically segment that range into sub-rings as per the notion of data clusters defined in NTS. What's the advantage

Confusion regarding the terms replica and replication factor

2012-05-22 Thread java jalwa
Hi all, I am a bit confused regarding the terms replica and replication factor. Assume that I am using RandomPartitioner and NetworkTopologyStrategy for replica placement. From what I understand, with a RandomPartitioner, a row key will always be hashed and be stored on the node that