[RELEASE] Apache Cassandra 2.2.9 released

2017-02-21 Thread Michael Shuler
The Cassandra team is pleased to announce the release of Apache
Cassandra version 2.2.9.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.2 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: (CHANGES.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.2.9
[2]: (NEWS.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.2.9
[3]: https://issues.apache.org/jira/browse/CASSANDRA





[RELEASE] Apache Cassandra 3.0.11 released

2017-02-21 Thread Michael Shuler
The Cassandra team is pleased to announce the release of Apache
Cassandra version 3.0.11.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 3.0 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: (CHANGES.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-3.0.11
[2]: (NEWS.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-3.0.11
[3]: https://issues.apache.org/jira/browse/CASSANDRA





[RELEASE] Apache Cassandra 2.1.17 released

2017-02-21 Thread Michael Shuler
The Cassandra team is pleased to announce the release of Apache
Cassandra version 2.1.17.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 2.1 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: (CHANGES.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-2.1.17
[2]: (NEWS.txt)
http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=NEWS.txt;hb=refs/tags/cassandra-2.1.17
[3]: https://issues.apache.org/jira/browse/CASSANDRA





Re: One thread pool per repair in nodetool tpstats

2017-02-21 Thread Vincent Rischmann
Ok, thanks Matija.







Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali
It looks like there is ordering within one client (based on timestamp),
and this *order is preserved across all replicas*. However, the benefits
of async, given that ordering restriction, are still a bit blurry to me.



Re: One thread pool per repair in nodetool tpstats

2017-02-21 Thread Matija Gobec
They appear for each repair run and disappear when repair run finishes.



Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali
Agreed, with multiple clients one cannot guarantee the order. However, with
multiple clients the client-side timestamps can overlap as well, so even
in the case of LWTs and multiple clients the order is not guaranteed,
right? By multiple clients I mean multiple C* sessions on the driver side.
If LWTs from different clients carry the same timestamp, I would assume
that the LWT from one client can be overwritten by the LWT from another
client with the same timestamp.






One thread pool per repair in nodetool tpstats

2017-02-21 Thread Vincent Rischmann
Hi,



I upgraded to Cassandra 2.2.8 and noticed something weird in
nodetool tpstats:


Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0      116265693         0                 0
ReadStage                         1         0       56132474         0                 0
RequestResponseStage              0         0      163640931         0                 0
ReadRepairStage                   0         0        3152856         0                 0
CounterMutationStage              0         0         630690         0                 0
Repair#26                         1         4              1         0                 0
Repair#48                         1         2              3         0                 0
HintedHandoff                     1         1           1198         0                 0
MiscStage                         0         0              0         0                 0
CompactionExecutor                0         0         111438         0                 0
Repair#45                         1         4              1         0                 0
MemtableReclaimMemory             0         0           3399         0                 0
Repair#30                         1         4              1         0                 0
PendingRangeCalculator            0         0             37         0                 0
Repair#61                         1         4              1         0                 0


There are multiple "pools" named Repair# which weren't
there with Cassandra 2.1.16. These appear in the JMX metrics too.

Do they go away eventually? I ask because this makes the tpstats output
harder to read in my opinion
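
While repairs are running, one way to keep the output readable is to filter the per-repair pools out before looking at it. The following is only a minimal sketch: it assumes one row per line (as nodetool prints it), that the pool name is the first whitespace-separated token, and that every per-repair pool name starts with "Repair#" as in the output above. `hide_repair_pools` and `SAMPLE` are hypothetical names for illustration, not part of nodetool.

```python
def hide_repair_pools(tpstats_output: str, prefix: str = "Repair#") -> str:
    """Return tpstats-style text without the per-repair thread pool rows."""
    kept = []
    for line in tpstats_output.splitlines():
        tokens = line.split()
        # Drop rows whose pool name (first token) starts with the repair prefix.
        if tokens and tokens[0].startswith(prefix):
            continue
        kept.append(line)
    return "\n".join(kept)


# A few rows copied from the output above (assumed layout: one pool per line).
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0      116265693         0                 0
Repair#26                         1         4              1         0                 0
HintedHandoff                     1         1           1198         0                 0
Repair#48                         1         2              3         0                 0
"""

print(hide_repair_pools(SAMPLE))
```

The same function could be fed the captured output of a real `nodetool tpstats` run; the header and all non-repair pools pass through unchanged.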


Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Benjamin Roth
For eventual consistency, it does not matter whether replication is sync or
async. LWW (last write wins) always works as long as clocks are synchronized.
That's a design pattern of CS (Cassandra) and of EC (eventually consistent)
databases in general. Every write has a timestamp, and no matter when it
arrives, the last write wins, even if a write with an earlier timestamp
arrives late because of network latency or because an unavailable server
receives a hint an hour later.
Doing replication synchronously would kill all the benefits of CS's design:
- low latency
- partition tolerance
- high availability

Sync replication would also not guarantee a state, as another client could
interfere with your write. So you still have no linearizability.
Only LWT (lightweight transactions) provides that.
You cannot rely on ordering in CS, no matter how replication works. You can
only rely on it "eventually"; there is never a point in time at which you
can tell with 100% certainty that your system is completely consistent.

What you could try, if you are talking about "orders" and that pointer
thing you mentioned earlier, is something similar to what MVs (materialized
views) do.
Create a trigger, operate on your local dataset, read the order based on the
PK (locally) and update "the pointer" on every write (also locally). If you
then store your pointer with the last known timestamp of your base data,
you also get LWW on your pointer, so the last pointer wins as well when
reading with a CL higher than ONE.
But that will probably harm your write performance.
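
To illustrate the last-write-wins point made at the top of this message, here is a minimal sketch of a single cell resolved by timestamp. It is a toy model, not Cassandra's storage code: `Write`, `Replica` and `final_value` are hypothetical names, and the tie-break on the value when timestamps are equal is an assumption made here so that the toy replicas converge.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass(frozen=True)
class Write:
    timestamp: int  # assigned by the client or the coordinator
    value: str


class Replica:
    """Toy replica holding a single cell resolved by last-write-wins."""

    def __init__(self) -> None:
        self.cell: Optional[Write] = None

    def apply(self, write: Write) -> None:
        # Keep the write with the highest timestamp, whatever the arrival order.
        # Tie-break on the value (an assumption of this sketch) so that
        # replicas still agree when two timestamps collide.
        current = self.cell
        if current is None or (write.timestamp, write.value) > (
            current.timestamp,
            current.value,
        ):
            self.cell = write


def final_value(arrival_order) -> str:
    """Apply the writes in the given arrival order and return the surviving value."""
    replica = Replica()
    for write in arrival_order:
        replica.apply(write)
    return replica.cell.value


writes = [Write(1, "a"), Write(2, "b"), Write(3, "c")]
# Try every possible arrival order, including the "late hint" case.
results = {final_value(order) for order in permutations(writes)}
print(results)
```

Every arrival order ends with the value carrying the highest timestamp, which is why the replicas do not need to receive writes in the same order to end up in the same state.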


Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali
@Benjamin I am looking more for how C* replication works underneath. There
are a few things here on which I would need some clarification.

1. Does C* use sync or async replication? If it is async replication, how
can one get performance, especially when there is an ordering constraint
among requests to comply with LWW? Also, below is a statement from the C*
website, so how can one choose between sync and async replication? Is there
a configuration parameter that needs to be passed in?

"Choose between synchronous or asynchronous replication for each update."

http://cassandra.apache.org/

2. Is it guaranteed that the C* coordinator writes data in the same order to
all the replicas (either sync or async)?

Thanks,
kant


Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Benjamin Roth
To me that sounds like a completely different design pattern and a
different use case.
CS was not designed to guarantee order. It was built to be linearly
scalable, highly concurrent and eventually consistent.
To me it sounds like an ACID DB would better serve what you are asking for.


Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali
Agreed that async performs better than sync in general, but the catch here,
to me, is the "order".

The whole point of async is out-of-order processing. Say request 1 comes in
at time t1 and request 2 comes in at time t2, where t1 < t2, and request 1
takes longer to process than request 2. Then request 2 should get its
response first, followed by the response for request 1. This is where I
would imagine all the benefits of async come in. But the moment you
introduce ordering, by saying that for Last Write Wins all the async
requests must be processed in order, I would imagine all those benefits
are lost.

Let's see if anyone can comment about how it works inside C*.

Thanks!
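
For reference, the coordinator behaviour described in the quoted message below (write requests sent to all replicas asynchronously, reply to the client once the CL is reached) can be modelled roughly as follows. This is only a toy sketch under stated assumptions, not how Cassandra is implemented: `replica_write`, `coordinate_write`, the replica names and the delays are all hypothetical.

```python
import asyncio


async def replica_write(name: str, delay: float, applied: list) -> str:
    """Pretend to persist a mutation on one replica after `delay` seconds."""
    await asyncio.sleep(delay)
    applied.append(name)
    return name


async def coordinate_write(delays: dict, required_acks: int, applied: list):
    """Send the write to every replica at once; return after `required_acks` acks."""
    tasks = [
        asyncio.create_task(replica_write(name, delay, applied))
        for name, delay in delays.items()
    ]
    acked = []
    for finished in asyncio.as_completed(tasks):
        acked.append(await finished)
        if len(acked) >= required_acks:
            break
    # The remaining replicas keep working in the background;
    # the client is not kept waiting for them.
    return acked, tasks


async def main():
    applied = []
    delays = {"replica-1": 0.01, "replica-2": 0.05, "replica-3": 0.30}
    acked, tasks = await coordinate_write(delays, required_acks=2, applied=applied)
    acked_at_reply = list(acked)
    await asyncio.gather(*tasks)  # let the slow replica finish before exiting
    return acked_at_reply, applied


acked_at_reply, applied = asyncio.run(main())
print(acked_at_reply, applied)
```

In this model the slowest replica does not delay the reply, and nothing forces replicas to apply different writes in the same arrival order; according to the thread, it is the write timestamps that settle the final state.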



On Mon, Feb 20, 2017 at 10:54 PM, Dor Laor  wrote:

> Could be. Let's stay tuned to see if someone else picks it up.
> Anyway, if it's synchronous, you'll have a large penalty for latency.
>
> On Mon, Feb 20, 2017 at 10:11 PM, Kant Kodali wrote:
>
>> Thanks again for the response! If they mean it between client and server,
>> I am not sure why they would use the word "replication" in the statement
>> below, since there is no replication between client and server
>> (coordinator).
>>
>> "Choose between synchronous or asynchronous replication for each update."
>>
>> Sent from my iPhone
>>
>> On Feb 20, 2017, at 5:30 PM, Dor Laor wrote:
>>
>> I think they mean the client to server and not among the servers
>>
>> On Mon, Feb 20, 2017 at 5:28 PM, Kant Kodali wrote:
>>
>>> Also here is a statement from C* website
>>>
>>> "Choose between synchronous or asynchronous replication for each update."
>>>
>>> http://cassandra.apache.org/
>>>
>>> Looks like we can choose either sync or async then?
>>>
>>> On Mon, Feb 20, 2017 at 5:25 PM, Kant Kodali wrote:
>>>
>>>> Hi Dor,
>>>>
>>>> Great response! My comments are inline.
>>>>
>>>> Thanks a lot,
>>>> kant
>>>>
>>>> On Mon, Feb 20, 2017 at 4:41 PM, Dor Laor wrote:
>>>>
>>>>> I sent this answer but it bounced off the user@apache.
>>>>> Here is the email anyway:
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Dor Laor
>>>>> Date: Mon, Feb 20, 2017 at 4:37 PM
>>>>> Subject: Re: Does C* coordinator writes to replicas in same order or
>>>>> different order?
>>>>> To: d...@cassandra.apache.org
>>>>> Cc: user@cassandra.apache.org
>>>>>
>>>>> + The C* coordinator sends async write requests to the replicas.
>>>>>   This is very important since it allows it to return a low-latency
>>>>>   reply to the client once the CL is reached. You wouldn't want
>>>>>   to serialize the replicas one after the other.
>>>>
>>>> *So the coordinator won't wait until a CL is reached before it
>>>> processes another request?*
>>>>
>>>>> + The client <-> server sync/async isn't related to the coordinator
>>>>>   in this case.
>>>>>
>>>>> + In the case of concurrent writes (always the case...), the timestamp
>>>>>   sets the order. Note that it's possible to work with client
>>>>>   timestamps or server timestamps. The client ones are usually the
>>>>>   best choice.
>>>>
>>>> *In theory, when we say concurrent writes, they should have the same
>>>> timestamp, right? What I am really looking for is this: if I send write
>>>> requests concurrently for record 1 and record 2, are they guaranteed to
>>>> be inserted in the same order across replicas? (Whatever order the
>>>> coordinator chooses is fine, but I want the same order across all
>>>> replicas, and with async replication I am not sure how that is possible.
>>>> For example, if a request arrives with timestamp t1 and another arrives
>>>> with timestamp t2, where t1 < t2, what if with async replication one
>>>> replica executes t2 first and then t1, simply because t1 is slow, while
>>>> another replica executes t1 first and then t2? How would that work?)*
>>>>
>>>>> Note that in C* each node can be a coordinator (one per request), and
>>>>> that's the desired case in order to load balance the incoming requests.
>>>>> Once again, timestamps determine the order among the requests.
>>>>>
>>>>> Cheers,
>>>>> Dor
>>>>>
>>>>> On Mon, Feb 20, 2017 at 4:12 PM, Kant Kodali wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> When the C* coordinator writes to replicas, does it write in the same
>>>>>> order or a different order? In other words, does the replication happen
>>>>>> synchronously or asynchronously? Also, does this depend on a sync or
>>>>>> async client? What happens in the case of concurrent writes to a
>>>>>> coordinator?
>>>>>>
>>>>>> Thanks,
>>>>>> kant