Thanks. Some follow-up questions:

1. How do reads use the strategy/snitch information? I am assuming reads can go to any of the replicas. Will Cassandra also use the snitch/strategy info to find the 'R' replicas 'closest' to the coordinator node?
2. In a single DC (with n racks and r replicas), what algorithm does Cassandra use to place its replicas in the following scenarios?
   a. n > r: I am assuming each replica goes in a different rack.
   b. n < r: ?? I am assuming it tries to distribute the replicas equally across the racks.

-Thanks,
Prasenjit

On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs <ty...@datastax.com> wrote:
> I highly recommend specifying the same rack for all nodes (using
> cassandra-topology.properties) unless you really have a good reason not to
> (and you probably don't). The way that replicas are chosen when multiple
> racks are in play can be fairly confusing and lead to a data imbalance if
> you don't catch it.
>
>
> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee <prasen....@gmail.com>
> wrote:
>>
>> > As far as I know there isn't any way to use the rack name in the
>> > strategy_options for a keyspace. You might want to look at the code
>> > to dig into that, perhaps.
>>
>> Aha, I was wondering if I could do that as well ( specify rack options )
>> :)
>>
>> Thanks for the pointer, I will dig into the code.
>>
>> -Thanks,
>> Prasenjit
>>
>> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe <richard.l...@arkivum.com>
>> wrote:
>> > If you then specify the parameters for the keyspace to use these, you
>> > can control exactly which set of nodes replicas end up on.
>> >
>> > For example, in cassandra-cli:
>> >
>> > create keyspace ks1 with placement_strategy =
>> > 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
>> > = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>> >
>> > As far as I know there isn't any way to use the rack name in the
>> > strategy_options for a keyspace. You might want to look at the code to dig
>> > into that, perhaps.
>> >
>> > Whichever snitch you use, the nodes are sorted in order of proximity to
>> > the client node. How this is determined depends on the snitch that's used
>> > but most (the ones that ship with Cassandra) will use the default ordering
>> > of same-node < same-rack < same-datacenter < different-datacenter. Each
>> > snitch has methods to tell Cassandra which rack and DC a node is in, so it
>> > always knows which node is closest. Used with the Bloom filters this can
>> > tell us where the nearest replica is.
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: prasenjit mukherjee [mailto:prasen....@gmail.com]
>> > Sent: 11 July 2012 06:33
>> > To: user
>> > Subject: How to come up with a predefined topology
>> >
>> > Quoting from
>> > http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
>> > :
>> >
>> > "Asymmetrical replication groupings are also possible depending on your
>> > use case. For example, you may want to have three replicas per data center
>> > to serve real-time application requests, and then have a single replica in
>> > a separate data center designated to running analytics."
>> >
>> > Have 2 questions :
>> > 1. Any example how to configure a topology with 3 replicas in one DC (
>> > with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
>> > The default networktopologystrategy with rackinferringsnitch will only
>> > give me equal distribution ( 2+2 )
>> >
>> > 2. I am assuming the reads can go to any of the replicas. Is there a
>> > client which will send query to a node ( in cassandra ring ) which is
>> > closest to the client ?
>> >
>> > -Thanks,
>> > Prasenjit
>> >
>
>
> --
> Tyler Hobbs
> DataStax
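
For reference, a minimal Python sketch of the proximity ordering quoted above (same-node < same-rack < same-datacenter < different-datacenter). The Node class, the proximity_rank and sort_by_proximity names, and the sample topology are hypothetical illustrations only; this is not Cassandra's snitch API (the real snitches are Java classes), and an actual snitch may take other factors into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a node: its name, datacenter and rack."""
    name: str
    dc: str
    rack: str


def proximity_rank(origin: Node, node: Node) -> int:
    """Rank a node relative to origin; a lower rank means 'closer'.

    0 = same node, 1 = same rack (in the same DC),
    2 = same datacenter, 3 = different datacenter.
    """
    if node == origin:
        return 0
    if node.dc == origin.dc and node.rack == origin.rack:
        return 1
    if node.dc == origin.dc:
        return 2
    return 3


def sort_by_proximity(origin: Node, replicas: list[Node]) -> list[Node]:
    """Return the replicas ordered from closest to farthest from origin.

    sorted() is stable, so replicas with equal rank keep their input order.
    """
    return sorted(replicas, key=lambda n: proximity_rank(origin, n))


# Assumed sample topology: two datacenters, rack names reused across them.
origin = Node("n1", "DC1", "RAC1")
replicas = [
    Node("n4", "DC2", "RAC1"),  # same rack name, but a different DC
    Node("n3", "DC1", "RAC2"),  # same DC, different rack
    Node("n2", "DC1", "RAC1"),  # same DC and rack
    origin,                     # the node doing the sorting
]
ordered = sort_by_proximity(origin, replicas)
```

Under these assumptions, a read needing R replicas would take the first R entries of the sorted list. Note that a rack only counts as "the same" when the datacenter also matches, since rack names can repeat across datacenters.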