Sylvain Lebresne created CASSANDRA-6178:
-------------------------------------------

             Summary: Consider allowing timestamp at the protocol level ... and 
deprecating server side timestamps
                 Key: CASSANDRA-6178
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6178
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne


Generating timestamps server side by default for CQL has been done for 
convenience, so that end users don't have to provide one with every query.  
However, doing it server side has the downside that updates made sequentially 
by a single client (thread) are not guaranteed to have sequentially increasing 
timestamps. That guarantee would only hold if a client thread were always 
pinned to one specific server connection, but no good client driver out there 
(Thrift drivers included) does that, because it contradicts abstracting fault 
tolerance away from the driver user (and goes against most sane load balancing 
strategies).

Very concretely, this means that if you write a trivial test program that 
sequentially inserts a value and then erases it (or overwrites it), then, if 
you let CQL pick the timestamp server side, the deletion might not erase the 
just-inserted value (because the delete might reach a different coordinator 
than the insert and thus get a lower timestamp). From the user's point of view, 
this is very confusing behavior, and understandably so: if timestamps are 
optional, you'd hope that they at least respect the sequentiality of operations 
from a single client thread.
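
To make the scenario above concrete, here is a toy Python model of 
last-write-wins reconciliation. It is only a sketch, not Cassandra code: the 
Cell and Coordinator classes, the fixed clock values and the 500 microsecond 
skew are all assumptions made for illustration, and the tie-breaking rule is 
simplified.

```python
# Toy model of last-write-wins reconciliation; NOT Cassandra code.
# Cell, Coordinator and the clock values are hypothetical.

TOMBSTONE = object()  # stands in for a deletion marker


class Cell:
    def __init__(self):
        self.value = None
        self.timestamp = -1

    def apply(self, value, timestamp):
        # Keep only the write carrying the highest timestamp.
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp


class Coordinator:
    def __init__(self, clock_micros):
        # Each coordinator stamps writes with its own (fixed, toy) clock.
        self.clock_micros = clock_micros

    def write(self, cell, value):
        cell.apply(value, self.clock_micros)


def run_insert_then_delete():
    cell = Cell()
    node_a = Coordinator(clock_micros=1_000_500)  # clock slightly ahead
    node_b = Coordinator(clock_micros=1_000_000)  # clock slightly behind
    node_a.write(cell, "v1")        # the insert lands on node A
    node_b.write(cell, TOMBSTONE)   # the later delete lands on node B
    return cell.value


result = run_insert_then_delete()
print("deleted" if result is TOMBSTONE else f"still visible: {result!r}")
```

Although the client issued the delete after the insert, the delete carries the 
lower timestamp and loses, so the inserted value stays visible.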

Of course we do support client-side assigned timestamps, so it's not like the 
test above is not fixable. And you could argue that it's not a bug per se.  
Still, it's a very confusing "default" behavior for something very simple, 
which suggests it's not the best default.

You could also argue that inserting a value and deleting/overwriting it right 
away (in the same thread) is not something real programs often do. And indeed, 
it's likely that in practice server-side timestamps work fine for most real 
applications. Still, it's too easy to get counter-intuitive behavior with 
server-side timestamps and I think we should consider moving away from them.

So what I'd suggest is that we push the job of providing the timestamp back to 
the client side. But to make it easy for the driver (rather than the end user) 
to generate it, we should allow providing said timestamp at the protocol level.
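
As a rough idea of what driver-side generation could look like, here is a 
minimal Python sketch of a generator that hands out strictly increasing 
microsecond timestamps within one client process. The class name 
MonotonicTimestampGenerator and its interface are assumptions for 
illustration, not an existing driver API.

```python
import threading
import time


class MonotonicTimestampGenerator:
    """Hypothetical driver-side helper: strictly increasing microsecond
    timestamps, even if the wall clock stalls or steps backwards."""

    def __init__(self, clock=None):
        # The clock is injectable so the behavior can be tested.
        if clock is None:
            clock = lambda: time.time_ns() // 1_000
        self._clock = clock
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now = self._clock()
            # Never return a value at or below the previous one.
            self._last = now if now > self._last else self._last + 1
            return self._last


gen = MonotonicTimestampGenerator()
insert_ts = gen.next()
delete_ts = gen.next()
print(delete_ts > insert_ts)
```

With a generator like this shared by all threads of a client, an insert 
followed by a delete always carries increasing timestamps, whichever 
coordinator each request reaches.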

As a side note, letting the client provide the timestamp would also have the 
advantage of making it easy for the driver to retry failed operations with 
their initial timestamp, so that retries are truly idempotent.
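
The idempotence point can be illustrated with another small Python sketch. 
Again this is a toy last-write-wins model under assumed values, not driver or 
server code: the apply and replay functions and the timestamps 10, 20 and 30 
are made up for the example.

```python
def apply(cell, value, timestamp):
    """Toy last-write-wins merge on a (value, timestamp) tuple."""
    return (value, timestamp) if timestamp > cell[1] else cell


def replay(retry_timestamp):
    cell = (None, -1)
    cell = apply(cell, "a", 10)  # first attempt: applied, but the ack is lost
    cell = apply(cell, "b", 20)  # a later write from another client
    cell = apply(cell, "a", retry_timestamp)  # the driver retries the write
    return cell


# Retrying with the initial timestamp changes nothing.
print(replay(retry_timestamp=10))
# Retrying with a fresh, later timestamp clobbers the newer write.
print(replay(retry_timestamp=30))
```

Reusing the initial timestamp leaves ("b", 20) in place, whereas a fresh 
timestamp on retry makes the stale value "a" win over the later write.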




--
This message was sent by Atlassian JIRA
(v6.1#6144)
