Re: Re: Re: Re: Quorum, Hector, and datacenter preference

2011-03-25 Thread Patricio Echagüe
Glad it worked

2011/3/25 

> very cool. thanks for the info. this is exactly what we need.
>
>
> On Mar 25, 2011 8:22am, Patricio Echagüe  wrote:
> >
> > It's a cassandra consistency level
> >
> > On Mar 24, 2011 11:44 PM, jonathan.co...@gmail.com> wrote:
> > > Patricio -
> > >
> > > I haven't heard of local_quorum. Is that a cassandra setting or
> > > something you specify in the client?
> > >
> > > On Mar 24, 2011 8:44pm, Patricio Echagüe patric...@gmail.com> wrote:
> > >> Doesn't CL=LOCAL_QUORUM solve your problem?


Re: Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Patricio Echagüe
Doesn't CL=LOCAL_QUORUM solve your problem?
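
For completeness, a minimal sketch of selecting LOCAL_QUORUM from Hector. The class and
method names follow the Hector 0.7-era API and may differ between releases; the cluster
name, contact point, and keyspace name are placeholders:

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class LocalQuorumExample {
        public static void main(String[] args) {
            // Placeholder cluster name and contact point.
            Cluster cluster = HFactory.getOrCreateCluster("TestCluster", "10.1.1.10:9160");

            // LOCAL_QUORUM is a Cassandra consistency level: a quorum of the replicas
            // in the local datacenter only, so a remote-DC outage does not block the request.
            ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
            ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);
            ccl.setDefaultWriteConsistencyLevel(HConsistencyLevel.LOCAL_QUORUM);

            // All operations through this Keyspace handle now default to LOCAL_QUORUM.
            Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster, ccl);
            System.out.println("Keyspace configured: " + keyspace.getKeyspaceName());
        }
    }

Note that LOCAL_QUORUM only makes sense when the keyspace uses a datacenter-aware
replication strategy (NetworkTopologyStrategy), so each datacenter holds a known number
of replicas.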

On Thu, Mar 24, 2011 at 9:33 AM,  wrote:

> Hi Nate -
>
> That sounds really promising and I'm looking forward to trying that out.
>
> My original question came up while thinking how to achieve quorum (with
> rf=3) with a loss of 1 of 2 data centers. My logic was that if you had 2
> replicas in the same data center that the client originally wrote to,
> then that client is guaranteed to be able to satisfy quorum, even if the
> other data center is unreachable.
>
> But I think there is no way to guarantee where the first write is written
> to. That would be based on the token range, which could very well be in any
> data center.
>
> Jon
>
>
>
>
> On Mar 24, 2011 3:05pm, Nate McCall  wrote:
> > We have a load balancing policy which selects the host based on latency
> >
> > and uses a Phi convict algorithm in a method similar to DynamicSnitch.
> >
> > Using this policy, you would inherently get the closest replica
> >
> > whenever possible as that would most likely be the best performing.
> >
> >
> >
> > This policy is still in trunk and 0.7.0 tip. We should have a new
> >
> > release out containing the above in the next few days.
> >
> >
> >
> > On Thu, Mar 24, 2011 at 8:46 AM, Jonathan Colby
> >
> > jonathan.co...@gmail.com> wrote:
> >
> > > Indeed I found the big flaw in my own logic.   Even writing to the
> "local" cassandra nodes does not guarantee where the replicas will end up.
> The decision where to write the first replica is based on the token ring,
> which is spread out on all nodes regardless of datacenter.   right ?
> >
> > >
> >
> > > On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:
> >
> > >
> >
> > >> Hi -
> >
> > >>
> >
> > >> Our cluster is spread between 2 datacenters.   We have a
> straight-forward IP assignment so that OldNetworkTopology (rack-inferring
> snitch) works well. We have cassandra clients written in Hector in each
> of those data centers.   The Hector clients all have a list of all cassandra
> nodes across both data centers.  RF=3.
> >
> > >>
> >
> > >> Is there an order as to which data center gets the first write? In
> other words, would (or can) the Hector client do its first write to the
> cassandra nodes in its own data center?
> >
> > >>
> >
> > >> It would be ideal if Hector chose the "local" cassandra nodes.  That
> way, if one data center is unreachable, the Quorum of replicas in cassandra
> is still reached (because it was written to the working data center first).
> >
> > >>
> >
> > >> Otherwise, if the cassandra writes are really random from the Hector
> client point-of-view, a data center outage would result in a read failure
> for any data that has 2 replicas in the lost data center.
> >
> > >>
> >
> > >> Is anyone doing this?  Is there a flaw in my logic?
> >
> > >>
> >
> > >>
> >
> > >
> >
> > >
> >
>


Re: Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread jonathan . colby

Hi Nate -

That sounds really promising and I'm looking forward to trying that out.

My original question came up while thinking how to achieve quorum (with  
rf=3) with a loss of 1 of 2 data centers. My logic was that if you had 2  
replicas in the same data center that the client originally wrote to,  
then that client is guaranteed to be able to satisfy quorum, even if the  
other data center is unreachable.
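
To make the arithmetic explicit: with RF=3, a quorum is floor(3/2) + 1 = 2 replicas, so
two replicas in the surviving datacenter are exactly enough. A small illustrative sketch
of that check (plain Java, not Hector code; the replica counts are the hoped-for
placement described above):

    public class QuorumArithmetic {
        public static void main(String[] args) {
            int rf = 3;                  // replication factor
            int quorum = (rf / 2) + 1;   // QUORUM for RF=3 is 2 replicas

            // Hoped-for placement: 2 replicas in the client's datacenter, 1 remote.
            int replicasInLocalDc = 2;

            // If the remote datacenter is lost, only the local replicas remain.
            boolean quorumStillReachable = replicasInLocalDc >= quorum;

            System.out.println("quorum = " + quorum);                                 // 2
            System.out.println("survives remote DC loss = " + quorumStillReachable);  // true
        }
    }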


But I think there is no way to guarantee where the first write is written  
to. That would be based on the token range, which could very well be in any  
data center.


Jon



On Mar 24, 2011 3:05pm, Nate McCall  wrote:

> We have a load balancing policy which selects the host based on latency
> and uses a Phi convict algorithm in a method similar to DynamicSnitch.
> Using this policy, you would inherently get the closest replica
> whenever possible as that would most likely be the best performing.
>
> This policy is still in trunk and 0.7.0 tip. We should have a new
> release out containing the above in the next few days.
>
> On Thu, Mar 24, 2011 at 8:46 AM, Jonathan Colby
> jonathan.co...@gmail.com> wrote:
> > Indeed I found the big flaw in my own logic. Even writing to the
> > "local" cassandra nodes does not guarantee where the replicas will
> > end up. The decision where to write the first replica is based on the
> > token ring, which is spread out on all nodes regardless of datacenter.
> > right ?
> >
> > On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:
> >
> >> Hi -
> >>
> >> Our cluster is spread between 2 datacenters. We have a
> >> straight-forward IP assignment so that OldNetworkTopology
> >> (rack-inferring snitch) works well. We have cassandra clients written
> >> in Hector in each of those data centers. The Hector clients all have
> >> a list of all cassandra nodes across both data centers. RF=3.
> >>
> >> Is there an order as to which data center gets the first write? In
> >> other words, would (or can) the Hector client do its first write to
> >> the cassandra nodes in its own data center?
> >>
> >> It would be ideal if Hector chose the "local" cassandra nodes. That
> >> way, if one data center is unreachable, the Quorum of replicas in
> >> cassandra is still reached (because it was written to the working
> >> data center first).
> >>
> >> Otherwise, if the cassandra writes are really random from the Hector
> >> client point-of-view, a data center outage would result in a read
> >> failure for any data that has 2 replicas in the lost data center.
> >>
> >> Is anyone doing this? Is there a flaw in my logic?




Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Nate McCall
We have a load balancing policy which selects the host based on latency
and uses a Phi convict algorithm in a method similar to DynamicSnitch.
Using this policy, you would inherently get the closest replica
whenever possible as that would most likely be the best performing.

This policy is still in trunk and 0.7.0 tip. We should have a new
release out containing the above in the next few days.
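
As a rough illustration of the idea only (not Hector's actual implementation, whose
class names and internals live in the Hector source), such a policy can keep a smoothed
response-time score per host and route each request to the best-scoring live host:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch: latency-aware host selection via an exponentially
    // weighted moving average of observed response times per host.
    public class LatencyAwareSelector {
        private static final double ALPHA = 0.75; // weight given to the newest sample
        private final Map<String, Double> ewmaMillis = new ConcurrentHashMap<String, Double>();

        /** Record an observed response time for a host. */
        public void recordLatency(String host, double millis) {
            Double previous = ewmaMillis.get(host);
            double updated = (previous == null) ? millis : ALPHA * millis + (1 - ALPHA) * previous;
            ewmaMillis.put(host, updated);
        }

        /** Pick the live host with the lowest smoothed latency; unknown hosts score 0 so they get tried. */
        public String selectHost(Iterable<String> liveHosts) {
            String best = null;
            double bestScore = Double.MAX_VALUE;
            for (String host : liveHosts) {
                Double score = ewmaMillis.get(host);
                double s = (score == null) ? 0.0 : score;
                if (s < bestScore) {
                    bestScore = s;
                    best = host;
                }
            }
            return best;
        }
    }

Because the Hector clients in each datacenter would naturally observe the lowest
latencies for nodes in their own datacenter, this kind of scoring tends to keep traffic
local without any explicit datacenter configuration.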

On Thu, Mar 24, 2011 at 8:46 AM, Jonathan Colby
 wrote:
> Indeed I found the big flaw in my own logic.   Even writing to the "local" 
> cassandra nodes does not guarantee where the replicas will end up.   The 
> decision where to write the first replica is based on the token ring, which 
> is spread out on all nodes regardless of datacenter.   right ?
>
> On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:
>
>> Hi -
>>
>> Our cluster is spread between 2 datacenters.   We have a straight-forward IP 
>> assignment so that OldNetworkTopology (rack-inferring snitch) works well.
>> We have cassandra clients written in Hector in each of those data centers.   
>> The Hector clients all have a list of all cassandra nodes across both data 
>> centers.  RF=3.
>>
>> Is there an order as to which data center gets the first write?    In other 
>> words, would (or can) the Hector client do its first write to the cassandra 
>> nodes in its own data center?
>>
>> It would be ideal if Hector chose the "local" cassandra nodes.  That way, if 
>> one data center is unreachable, the Quorum of replicas in cassandra is still 
>> reached (because it was written to the working data center first).
>>
>> Otherwise, if the cassandra writes are really random from the Hector client 
>> point-of-view, a data center outage would result in a read failure for any 
>> data that has 2 replicas in the lost data center.
>>
>> Is anyone doing this?  Is there a flaw in my logic?
>>
>>
>
>


Re: Quorum, Hector, and datacenter preference

2011-03-24 Thread Jonathan Colby
Indeed I found the big flaw in my own logic.   Even writing to the "local" 
cassandra nodes does not guarantee where the replicas will end up.   The 
decision where to write the first replica is based on the token ring, which is 
spread out on all nodes regardless of datacenter.   right ?
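
A small sketch of why that is: with a simple, non topology-aware placement strategy, the
first replica for a key goes to the node owning the token range the key hashes into, and
the ring interleaves nodes from both datacenters, so the "first" replica can land in
either one. The tokens and node names below are made up for illustration:

    import java.util.TreeMap;

    // Illustrative only: find the primary replica by walking the token ring.
    public class TokenRingSketch {
        public static void main(String[] args) {
            // token -> node, with nodes from both datacenters interleaved on the ring.
            TreeMap<Long, String> ring = new TreeMap<Long, String>();
            ring.put(0L,  "dc1-node1");
            ring.put(25L, "dc2-node1");
            ring.put(50L, "dc1-node2");
            ring.put(75L, "dc2-node2");

            long keyToken = 60L; // pretend hash of the row key

            // Primary replica = first node whose token is >= the key's token, wrapping around.
            Long owner = ring.ceilingKey(keyToken);
            String primary = (owner != null) ? ring.get(owner) : ring.firstEntry().getValue();

            System.out.println("primary replica for token " + keyToken + " -> " + primary); // dc2-node2
        }
    }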

On Mar 24, 2011, at 2:02 PM, Jonathan Colby wrote:

> Hi -
> 
> Our cluster is spread between 2 datacenters.   We have a straight-forward IP 
> assignment so that OldNetworkTopology (rack-inferring snitch) works well.
> We have cassandra clients written in Hector in each of those data centers.   
> The Hector clients all have a list of all cassandra nodes across both data 
> centers.  RF=3.
> 
> Is there an order as to which data center gets the first write? In other 
> words, would (or can) the Hector client do its first write to the cassandra 
> nodes in its own data center?
> 
> It would be ideal if Hector chose the "local" cassandra nodes.  That way, if 
> one data center is unreachable, the Quorum of replicas in cassandra is still 
> reached (because it was written to the working data center first).
> 
> Otherwise, if the cassandra writes are really random from the Hector client 
> point-of-view, a data center outage would result in a read failure for any 
> data that has 2 replicas in the lost data center.
> 
> Is anyone doing this?  Is there a flaw in my logic?
> 
>
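
On the "straight-forward IP assignment so that the rack-inferring snitch works well"
point in the quoted question above: Cassandra's RackInferringSnitch derives the
datacenter from the second octet of each node's IP address and the rack from the third
octet. A minimal sketch of that inference (illustrative, not the actual snitch code; the
addresses are made up):

    // Sketch of the RackInferringSnitch convention:
    // datacenter = 2nd octet of the IP, rack = 3rd octet.
    public class RackInference {
        public static void main(String[] args) {
            String[] nodes = { "10.1.1.10", "10.1.2.11", "10.2.1.12" };
            for (String ip : nodes) {
                String[] octets = ip.split("\\.");
                System.out.println(ip + " -> DC " + octets[1] + ", rack " + octets[2]);
            }
        }
    }

With an IP plan like that, the snitch sees the two datacenters without extra
configuration, which is what lets a datacenter-aware strategy (and LOCAL_QUORUM) behave
per datacenter.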