Agreed, Tyler. I think it's a problem in our application. If the client reports
a failed write in spite of retries, the application must have a rollback
mechanism to make sure the old state is restored. A write may fail because the
CL was not met even though one node wrote successfully. Cassandra won't clean up
or roll back the write on that node, so you need to do it yourself to maintain
data integrity if strong consistency is a requirement. Right?
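
To make the failure mode above concrete, here is a minimal sketch of a write
that fails at QUORUM while one replica keeps the new value. It is a toy
in-memory model, not Cassandra or driver code: `Replica`, `drops_writes` and
`quorum_write` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy in-memory replica; `drops_writes` simulates dropped mutations."""
    name: str
    drops_writes: bool = False
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        if self.drops_writes:
            return False  # mutation dropped, no acknowledgement
        self.store[key] = (timestamp, value)
        return True


def quorum_write(replicas, key, value, timestamp):
    """Send the write to every replica; succeed only if a majority acks."""
    acks = sum(r.write(key, value, timestamp) for r in replicas)
    needed = len(replicas) // 2 + 1
    return acks >= needed


replicas = [
    Replica("node1"),
    Replica("node2", drops_writes=True),
    Replica("node3", drops_writes=True),
]
ok = quorum_write(replicas, "k", "new", timestamp=2)
print(ok)                 # the write is reported as failed to the client
print(replicas[0].store)  # node1 still holds the value; nothing rolls it back
```

In this model the only way to restore the old state is for the application to
write it again, which is the point being asked about.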


We use Hector, by the way, and are planning to switch to the CQL driver.



Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:"Tyler Hobbs" <ty...@datastax.com>
Date:Tue, 30 Jun, 2015 at 10:42 pm
Subject:Re: Read Consistency


I think these scenarios are still possible even when we are writing at QUORUM,
if we have dropped mutations in our cluster.

It was very strange in our case. We had RF=3 and READ/WRITE CL=QUORUM. We had
dropped mutations for a long time, but we never faced a situation like scenario
1, where the READ went to nodes 2 and 3 and didn't return any data. Any comments
on this are welcome.


They are not possible if you write at QUORUM, because QUORUM guarantees that at 
least two of the nodes will have the most recent version of the data.  If fewer 
than two replicas respond successfully (meaning two replicas dropped 
mutations), you will get an error on the write.
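
The overlap argument can be checked by brute force. The sketch below is plain
Python, not Cassandra code, and the node names are placeholders. It enumerates
every 2-node write set and every 2-node read set for RF=3 and confirms they
always share a node, while a write acknowledged by only one node can be missed.

```python
from itertools import combinations

replicas = ("node1", "node2", "node3")  # RF = 3
quorum = len(replicas) // 2 + 1         # 2 of 3

write_sets = list(combinations(replicas, quorum))
read_sets = list(combinations(replicas, quorum))

# Every read quorum shares at least one node with every write quorum.
always_overlap = all(set(w) & set(r) for w in write_sets for r in read_sets)

# A write acknowledged by a single node (as with CL=ONE) can be missed entirely
# by a read quorum made of the other two nodes.
single_node_writes = list(combinations(replicas, 1))
one_can_miss = any(
    not (set(w) & set(r)) for w in single_node_writes for r in read_sets
)

print(always_overlap, one_can_miss)
```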

All of the drivers and cqlsh default to consistency level ONE, so I would 
double check that your application is setting the consistency level correctly.

 


On Sun, Jun 28, 2015 at 12:55 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Sorry for the typo in your name, Owen!


Anuj


From:"Anuj Wadehra" <anujw_2...@yahoo.co.in>
Date:Sun, 28 Jun, 2015 at 11:11 pm
Subject:Re: Read Consistency

Agreed, Owem! The response in both scenarios would depend on the 2 replicas
chosen to meet QUORUM. But the intent is to get the tricky part of scenario 1
answered, i.e. when the 2 nodes selected are one with and one without the data.


As per my understanding of the read path and the documentation at
https://wiki.apache.org/cassandra/ArchitectureInternals:

1. Data would be read from the closest node and a digest would be received from
one more replica.

2. If a mismatch is found between the digests, a blocking read happens on the
same 2 replicas (not on all replicas, so in scenario 2, if 2 nodes didn't have
the latest data and the third node has it, stale data would still be returned).
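
The two steps above can be sketched as follows. This models only the behaviour
as described in this message, not Cassandra's actual read path; replicas are
plain dicts mapping a key to a `(timestamp, value)` pair, and `digest` and
`quorum_read` are hypothetical helpers.

```python
import hashlib


def digest(row):
    """Hash of a (timestamp, value) row; None means the replica has no row."""
    return hashlib.md5(repr(row).encode()).hexdigest()


def quorum_read(data_replica, digest_replica, key):
    """Read full data from one replica and only a digest from the other.

    On a digest mismatch, fetch the full row from the same two replicas
    and keep the one with the newest timestamp.
    """
    row = data_replica.get(key)
    if digest(row) == digest(digest_replica.get(key)):
        return row
    candidates = [r for r in (row, digest_replica.get(key)) if r is not None]
    return max(candidates, key=lambda r: r[0])


# Scenario 2: three replicas, three different versions of the same key.
node1 = {"k": (1, "a")}
node2 = {"k": (2, "b")}
node3 = {"k": (3, "c")}

print(quorum_read(node1, node2, "k"))  # node3 is never consulted
print(quorum_read(node1, {}, "k"))     # one replica has the row, one does not
```

Under these assumptions, a read that touches only node1 and node2 returns the
newer of their two versions, even though node3 holds a later one.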


I think these scenarios are still possible even when we are writing at QUORUM,
if we have dropped mutations in our cluster.

It was very strange in our case. We had RF=3 and READ/WRITE CL=QUORUM. We had
dropped mutations for a long time, but we never faced a situation like scenario
1, where the READ went to nodes 2 and 3 and didn't return any data. Any comments
on this are welcome.


Thanks for clarifying further, as the discussion could have misled a few people.


Thanks

Anuj




On Sunday, 28 June 2015 6:16 AM, Owen Kim <ohech...@gmail.com> wrote:



Sorry. I have to jump in and disagree. Data is not guaranteed to be returned in
scenario 1. Since two nodes do not have the data and two nodes may be the only
nodes queried at that CL, the read query may or may not return data.


Similarly, in scenario 2, the query may not return the most recent data because 
the node with that data may not be queried at all (the other two may).


Keep in mind, these scenarios seem to generally assume you are not consistently
writing data at QUORUM CL, so your reads may be inconsistent.
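
To see how the answer depends on which two replicas are queried, the sketch
below lists the result for every possible pair in both scenarios. It is a
simplified model (newest timestamp among the responders wins), and the names
`resolve`, `outcomes`, `scenario1` and `scenario2` are made up for this
illustration.

```python
from itertools import combinations


def resolve(responses):
    """Return the row with the newest timestamp among the responders."""
    rows = [r for r in responses if r is not None]
    return max(rows, key=lambda r: r[0]) if rows else None


def outcomes(cluster, key):
    """Map each possible pair of queried replicas to the row the client sees."""
    return {
        pair: resolve(cluster[node].get(key) for node in pair)
        for pair in combinations(sorted(cluster), 2)
    }


# Scenario 1: only node1 has the row.
scenario1 = {"node1": {"k": (1, "v")}, "node2": {}, "node3": {}}
# Scenario 2: all three replicas hold different versions.
scenario2 = {
    "node1": {"k": (1, "a")},
    "node2": {"k": (2, "b")},
    "node3": {"k": (3, "c")},
}

for pair, row in outcomes(scenario1, "k").items():
    print("scenario 1", pair, row)
for pair, row in outcomes(scenario2, "k").items():
    print("scenario 2", pair, row)
```

In this model two of the three pairs return the row in scenario 1 and one pair
returns nothing, while in scenario 2 one pair returns an older version.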


On Tuesday, June 23, 2015, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Thanks. So all of us agree that in scenario 1, data would be returned, which
was my initial understanding.



Anuj





From:"Anuj Wadehra" <anujw_2...@yahoo.co.in>
Date:Wed, 24 Jun, 2015 at 12:15 am
Subject:Re: Read Consistency

I'm more confused now: different responses. Can anyone explain with 100%
certainty?


Thanks

Anuj




From:"arun sirimalla" <arunsi...@gmail.com>
Date:Wed, 24 Jun, 2015 at 12:00 am
Subject:Re: Read Consistency



Thanks, good to know.


On Tue, Jun 23, 2015 at 11:27 AM, Philip Thompson 
<philip.thomp...@datastax.com> wrote:

Yes, that is what he means. CL is for how many nodes need to respond, not agree.


On Tue, Jun 23, 2015 at 2:26 PM, arun sirimalla <arunsi...@gmail.com> wrote:

So do you mean that with CL set to QUORUM, if the data is only on one node, the
query still succeeds?


On Tue, Jun 23, 2015 at 11:21 AM, Philip Thompson 
<philip.thomp...@datastax.com> wrote:

Anuj,

In the first scenario, the data from the single node holding data is returned. 
The query will not fail if the consistency level is met, even if the read was 
inconsistent.


On Tue, Jun 23, 2015 at 2:16 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Why would it fail, and with what Thrift error? What if the data didn't exist on
any of the nodes? The query won't fail if it doesn't find data.


Not convinced.


From:"arun sirimalla" <arunsi...@gmail.com>
Date:Tue, 23 Jun, 2015 at 11:39 pm
Subject:Re: Read Consistency

Scenario 1: Read query is fired for a key, data is found on one node and not
found on the other two nodes that are responsible for the token corresponding
to the key.


Your read query will fail, as it expects to receive data from 2 nodes with RF=3.



Scenario 2: Read query is fired and all 3 replicas have different data with 
different timestamps.


The read query will return the data with the most recent timestamp and trigger
a read repair in the backend.


On Tue, Jun 23, 2015 at 10:57 AM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Hi,


I need to validate my understanding.


RF=3 , Read CL = Quorum


What would be returned to the client in the following scenarios:


Scenario 1: Read query is fired for a key, data is found on one node and not
found on the other two nodes that are responsible for the token corresponding
to the key.


Options: no data is returned OR data from the only node having data is returned?


Scenario 2: Read query is fired and all 3 replicas have different data with 
different timestamps.


Options: data with the latest timestamp is returned OR something else?


Thanks

Anuj







-- 

Arun 

Senior Hadoop/Cassandra Engineer

Cloudwick



2014 Data Impact Award Winner (Cloudera)

http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html







-- 

Tyler Hobbs
DataStax
