Thanks. Some follow-up questions:

1. How do reads use the strategy/snitch information? I am assuming
the reads can go to any of the replicas. Will Cassandra also use the
snitch/strategy info to find the 'R' replicas 'closest' to the
coordinator node?

2. In a single DC (with n racks and r replicas), what algorithm does
Cassandra use to place its replicas in the following scenarios:
a. n>r: I am assuming it places at most 1 replica in each rack.
b. n<r: ?? I am assuming it tries to distribute the replicas equally
across the racks.
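
To make the two assumptions above concrete, here is a rough sketch in Python. It only illustrates what I am assuming, not necessarily what Cassandra actually does. The Node record and all function names are made up for illustration and are not Cassandra APIs. The proximity ranking follows the same-node < same-rack < same-datacenter < different-datacenter ordering described further down in this thread.

```python
from collections import namedtuple
from itertools import cycle

# Hypothetical node record for illustration; not a Cassandra type.
Node = namedtuple("Node", ["name", "dc", "rack"])


def proximity(coordinator, node):
    """Rank a node relative to the coordinator; lower means closer."""
    if node == coordinator:
        return 0  # same node
    if node.dc == coordinator.dc and node.rack == coordinator.rack:
        return 1  # same rack in the same datacenter
    if node.dc == coordinator.dc:
        return 2  # same datacenter, different rack
    return 3      # different datacenter


def closest_replicas(coordinator, replicas, r):
    """Question 1 (assumed): pick the r replicas closest to the coordinator."""
    return sorted(replicas, key=lambda n: proximity(coordinator, n))[:r]


def spread_over_racks(racks, r):
    """Question 2 (assumed): hand out r replicas to racks round-robin.

    Returns a dict mapping each rack to its replica count. With n > r
    no rack gets more than one replica; with n < r the counts differ
    by at most one.
    """
    counts = {rack: 0 for rack in racks}
    for rack, _ in zip(cycle(racks), range(r)):
        counts[rack] += 1
    return counts
```

For example, spread_over_racks(["r1", "r2"], 5) would put 3 replicas on r1 and 2 on r2 under this assumption.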

-Thanks,
Prasenjit

On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs <ty...@datastax.com> wrote:
> I highly recommend specifying the same rack for all nodes (using
> cassandra-topology.properties) unless you really have a good reason not to
> (and you probably don't).  The way that replicas are chosen when multiple
> racks are in play can be fairly confusing and lead to a data imbalance if
> you don't catch it.
>
>
> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee <prasen....@gmail.com>
> wrote:
>>
>> > As far as I know there isn't any way to use the rack name in the
>> > strategy_options for a keyspace. You
>> > might want to look at the code to dig into that, perhaps.
>>
>> Aha, I was wondering if I could do that as well ( specify rack options )
>> :)
>>
>> Thanks for the pointer, I will dig into the code.
>>
>> -Thanks,
>> Prasenjit
>>
>> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe <richard.l...@arkivum.com>
>> wrote:
>> > If you then specify the parameters for the keyspace to use these, you
>> > can control exactly which set of nodes replicas end up on.
>> >
>> > For example, in cassandra-cli:
>> >
>> > create keyspace ks1 with placement_strategy =
>> > 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
>> > = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>> >
>> > As far as I know there isn't any way to use the rack name in the
>> > strategy_options for a keyspace. You might want to look at the code to dig
>> > into that, perhaps.
>> >
>> > Whichever snitch you use, the nodes are sorted in order of proximity to
>> > the client node. How this is determined depends on the snitch that's used
>> > but most (the ones that ship with Cassandra) will use the default ordering
>> > of same-node < same-rack < same-datacenter < different-datacenter. Each
>> > snitch has methods to tell Cassandra which rack and DC a node is in, so it
>> > always knows which node is closest. Used with the Bloom filters this can
>> > tell us where the nearest replica is.
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: prasenjit mukherjee [mailto:prasen....@gmail.com]
>> > Sent: 11 July 2012 06:33
>> > To: user
>> > Subject: How to come up with a predefined topology
>> >
>> > Quoting from
>> > http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
>> > :
>> >
>> > "Asymmetrical replication groupings are also possible depending on your
>> > use case. For example, you may want to have three replicas per data center
>> > to serve real-time application requests, and then have a single replica in a
>> > separate data center designated to running analytics."
>> >
>> > I have 2 questions:
>> > 1. Is there an example of how to configure a topology with 3 replicas in one DC
>> > (2 in one rack + 1 in another rack) and one replica in another DC?
>> > The default NetworkTopologyStrategy with RackInferringSnitch will only
>> > give me an equal distribution (2+2).
>> >
>> > 2. I am assuming the reads can go to any of the replicas. Is there a
>> > client which will send a query to the node (in the Cassandra ring) that is
>> > closest to the client?
>> >
>> > -Thanks,
>> > Prasenjit
>> >
>> >
>
>
>
>
> --
> Tyler Hobbs
> DataStax
>
