[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142227#comment-14142227
 ] 

Peter Bailis commented on CASSANDRA-7056:
-----------------------------------------

bq. Let's assume we query from partition A and B, and we see the results don't 
match timestamps, we would pull the latest batchlog assuming they are from the 
same batch but let's say they in fact are not. In this case we wasted a lot of 
time so my question is should we only do this if the user supplies a new CL 
type?

If you set the same, unique (e.g., UUID) write timestamp for all writes in a 
batch, then you know that any results with different timestamps are part of 
different batches. So, given mismatched timestamps, should you check the 
batchlog for pending writes? One solution is to always check (as in 
RAMP-Small). This doesn't require any extra metadata, but, as you point out, 
also requires 2 RTTs. To cut down on these RTTs, you could also attach a 
Bloom filter of the items in each batch and only check any possibly missing 
writes (as in RAMP-Hybrid). (I can go into more detail if you want.) However, I 
agree that you might not want to pay these costs *all* of the time for reads. 
Would a BATCH_READ or other modifier to CQL SELECT statements make sense?
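
To make the Bloom filter idea concrete, here is a rough Python sketch of the 
read-side check. It is only an illustration, not Cassandra code or the exact 
RAMP-Hybrid algorithm: the names (BloomFilter, Version, keys_to_recheck) are 
hypothetical, timestamps are assumed to be comparable integers, and the filter 
size is arbitrary.

```python
import hashlib
from dataclasses import dataclass


class BloomFilter:
    """Tiny Bloom filter over string keys (sizes are illustrative assumptions)."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # No false negatives; false positives are possible.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


@dataclass
class Version:
    key: str
    timestamp: int            # assumed: shared by every write in one batch
    batch_keys: BloomFilter   # assumed: keys written by that batch


def keys_to_recheck(first_round):
    """Map each key to the newer batch timestamps it may be missing.

    first_round maps key -> Version (or None if nothing was returned).
    Only the keys in the result need a second round trip.
    """
    recheck = {}
    for key, seen in first_round.items():
        seen_ts = seen.timestamp if seen is not None else -1
        for other in first_round.values():
            if other is None or other.key == key:
                continue
            # A newer batch that may also have written `key` means the
            # version we saw for `key` could be stale.
            if other.timestamp > seen_ts and other.batch_keys.might_contain(key):
                recheck.setdefault(key, set()).add(other.timestamp)
    return recheck
```

With matching timestamps the result is empty and no second round is needed. 
With mismatched timestamps from unrelated batches, the result is also empty 
unless the filter reports a false positive, which is the case the question 
above worries about.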

bq. In the case of a global index we plan on reading the data after reading the 
index. The data query might reveal the indexed value is stale. We would need to 
apply the batchlog and fix the index, would we then restart the entire query? 
or maybe overquery assuming some index values will be stale? Either way this 
query looks different than the above scenario.

I think there are a few options. The easiest is to simply filter out the 
out-of-date rows, and then you are guaranteed to see a subset of the index 
entries. 
Alternatively, you could provide a "snapshot index read" where you read the 
older, overwritten values from the data node. If you want a "read latest and 
read snapshot" mode, there are some options I can describe, but they generally 
entail either more metadata or, otherwise, using locks/blocking coordination, 
which I don't think you want.
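
As a small illustration of the first option (filtering), the following Python 
sketch drops index hits whose data row no longer carries the indexed value. 
The function name and the dict-based row layout are assumptions made for the 
example, not anything in Cassandra.

```python
def filter_stale_index_hits(index_hits, rows, column, wanted_value):
    """Keep only row keys whose current data still matches the index.

    index_hits: row keys returned by the (possibly stale) index lookup.
    rows: hypothetical mapping of row key -> {column: value} read from
          the data nodes after the index lookup.
    """
    return [
        key
        for key in index_hits
        if key in rows and rows[key].get(column) == wanted_value
    ]
```

The result is always a subset of the index hits, so the reader never sees a 
row that contradicts the predicate, though it may miss rows whose index entry 
has not been written yet.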


> Add RAMP transactions
> ---------------------
>
>                 Key: CASSANDRA-7056
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Tupshin Harper
>            Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
