Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Jonathan Colby
Indeed, I found the big flaw in my own logic. Even writing to the local
Cassandra nodes does not guarantee where the replicas will end up. The
decision about where to place the first replica is based on the token ring,
which is spread across all nodes regardless of datacenter. Right?
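
For illustration, here is a minimal sketch of why that is, assuming
RandomPartitioner: the first replica for a row is owned by whichever node's
token range covers the MD5 hash of the key, independent of the node the
client contacted. The class and method names below are hypothetical, not
actual Cassandra API:

    import java.math.BigInteger;
    import java.security.MessageDigest;

    // Hypothetical sketch of RandomPartitioner-style token assignment.
    public class TokenSketch {
        // RandomPartitioner maps a row key to a token in [0, 2^127) via MD5.
        static BigInteger token(byte[] rowKey) throws Exception {
            byte[] digest = MessageDigest.getInstance("MD5").digest(rowKey);
            return new BigInteger(1, digest).mod(BigInteger.ONE.shiftLeft(127));
        }

        public static void main(String[] args) throws Exception {
            System.out.println(token("some-row-key".getBytes("UTF-8")));
        }
        // The primary replica is the first node walking clockwise around the
        // ring from this token; which coordinator the client picked plays no
        // part in replica placement.
    }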

On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:

 Hi -
 
 Our cluster is spread between 2 datacenters. We have a straightforward IP
 assignment so that OldNetworkTopologyStrategy (with the RackInferringSnitch)
 works well. We have Cassandra clients written with Hector in each of those
 datacenters. The Hector clients all have a list of all Cassandra nodes
 across both datacenters. RF=3.

 Is there an order as to which datacenter gets the first write? In other
 words, would (or can) the Hector client do its first write to the Cassandra
 nodes in its own datacenter?

 It would be ideal if Hector chose the local Cassandra nodes. That way, if
 one datacenter is unreachable, a quorum of replicas is still reached
 (because the data was written to the working datacenter first).

 Otherwise, if the Cassandra writes are really random from the Hector
 client's point of view, a datacenter outage would result in a read failure
 for any data that has 2 replicas in the lost datacenter.

 Is anyone doing this? Is there a flaw in my logic?
 
 



Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Nate McCall
We have a load balancing policy which selects the best host based on
latency and uses a Phi-accrual ("phi convict") algorithm in a manner
similar to DynamicSnitch. Using this policy, you would inherently get the
closest replica whenever possible, as that would most likely be the best
performing.

This policy is currently only in trunk and the 0.7.0 tip. We should have a
new release out containing the above in the next few days.
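
For anyone wanting to try this from client code once it ships, a minimal
setup sketch follows. The DynamicLoadBalancingPolicy class and package names
are taken from later Hector releases and may differ in your version, and the
host names are placeholders:

    import me.prettyprint.cassandra.connection.DynamicLoadBalancingPolicy;
    import me.prettyprint.cassandra.service.CassandraHostConfigurator;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.factory.HFactory;

    public class LatencyAwareSetup {
        public static void main(String[] args) {
            // List hosts from both datacenters; the policy routes each
            // request to the best-performing (usually the closest) host.
            CassandraHostConfigurator conf =
                new CassandraHostConfigurator("dc1-node1:9160,dc2-node1:9160");
            conf.setLoadBalancingPolicy(new DynamicLoadBalancingPolicy());
            Cluster cluster = HFactory.getOrCreateCluster("MyCluster", conf);
        }
    }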

On Thu, Mar 24, 2011 at 8:46 AM, Jonathan Colby
jonathan.co...@gmail.com wrote:
 Indeed, I found the big flaw in my own logic. Even writing to the local
 Cassandra nodes does not guarantee where the replicas will end up. The
 decision about where to place the first replica is based on the token ring,
 which is spread across all nodes regardless of datacenter. Right?

 On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:

 Hi -

 Our cluster is spread between 2 datacenters. We have a straightforward IP
 assignment so that OldNetworkTopologyStrategy (with the RackInferringSnitch)
 works well. We have Cassandra clients written with Hector in each of those
 datacenters. The Hector clients all have a list of all Cassandra nodes
 across both datacenters. RF=3.

 Is there an order as to which datacenter gets the first write? In other
 words, would (or can) the Hector client do its first write to the Cassandra
 nodes in its own datacenter?

 It would be ideal if Hector chose the local Cassandra nodes. That way, if
 one datacenter is unreachable, a quorum of replicas is still reached
 (because the data was written to the working datacenter first).

 Otherwise, if the Cassandra writes are really random from the Hector
 client's point of view, a datacenter outage would result in a read failure
 for any data that has 2 replicas in the lost datacenter.

 Is anyone doing this? Is there a flaw in my logic?






Re: Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Patricio Echagüe
Doesn't CL=LOCAL_QUORUM solve your problem?
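
For reference, a minimal sketch of what that looks like with Hector's
ConfigurableConsistencyLevel. Note that LOCAL_QUORUM only makes sense when
the keyspace uses a datacenter-aware replication strategy such as
NetworkTopologyStrategy; the keyspace name here is a placeholder:

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class LocalQuorumSetup {
        static Keyspace localQuorumKeyspace(Cluster cluster) {
            ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
            // Succeed once a quorum of replicas in the *local* datacenter
            // responds, so a remote-DC outage does not block reads or writes.
            ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
            ccl.setDefaultWriteConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
            return HFactory.createKeyspace("MyKeyspace", cluster, ccl);
        }
    }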

On Thu, Mar 24, 2011 at 9:33 AM, jonathan.co...@gmail.com wrote:

 Hi Nate -

 That sounds really promising and I'm looking forward to trying that out.

 My original question came up while thinking about how to achieve quorum
 (with RF=3) despite the loss of 1 of 2 datacenters. My logic was that if you
 had 2 replicas in the same datacenter the client originally wrote to, then
 that client is guaranteed to be able to satisfy quorum, even if the other
 datacenter is unreachable.

 But I think there is no way to guarantee where the first replica is
 written. That is determined by the token range, which could very well be in
 any datacenter.

 Jon
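
To make the failure mode concrete, a small illustrative calculation of the
quorum arithmetic (plain Java; no Cassandra API involved):

    public class QuorumMath {
        public static void main(String[] args) {
            int rf = 3;
            int quorum = rf / 2 + 1;   // QUORUM needs 2 of 3 replicas
            int replicasInLostDc = 2;  // worst case: key placed 2+1 across DCs
            int surviving = rf - replicasInLostDc;
            // Only 1 replica survives, but 2 are required: QUORUM reads and
            // writes fail for any key whose 2 replicas sat in the lost DC.
            System.out.println(surviving >= quorum);  // prints: false
        }
    }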




 On Mar 24, 2011 3:05pm, Nate McCall n...@datastax.com wrote:
  We have a load balancing policy which selects the best host based on
  latency and uses a Phi-accrual ("phi convict") algorithm in a manner
  similar to DynamicSnitch. Using this policy, you would inherently get the
  closest replica whenever possible, as that would most likely be the best
  performing.

  This policy is currently only in trunk and the 0.7.0 tip. We should have
  a new release out containing the above in the next few days.