In most datacenters you're going to see significant variance in your server times. Likely > 20ms between servers in the same rack. Even google, using atomic clocks, has 1-7ms variance. [1]
I would +1 Tyler's advice here, as using the clocks is only valid if clocks are perfectly sync'ed, which they are not, and likely never will be in our lifetime. [1] http://queue.acm.org/detail.cfm?id=2745385 On Thu, Mar 12, 2015 at 7:04 AM Eric Stevens <migh...@gmail.com> wrote: > > It's possible, but you'll end up with problems when attempting to > overwrite or delete entries > > I'm wondering if you can elucidate on that a little bit, do you just mean > that it's easy to forget to always set your timestamp correctly, and if you > goof it up, it makes it difficult to recover from (i.e. you issue a delete > with system timestamp instead of document version, and that's way larger > than your document version would ever be, so you can never write that > document again)? Or is there some bug in write timestamps that can cause > the wrong entry to win the write contention? > > We're looking at doing something similar to keep a live max value column > in a given table, our setup is as follows: > > CREATE TABLE a ( > id <whatever>, > time timestamp, > max_b_foo int, > PRIMARY KEY (id) > ); > CREATE TABLE b ( > b_id <whatever>, > a_id <whatever>, > a_timestamp timestamp, > foo int, > PRIMARY KEY (a_id, b_id) > ); > > The idea being that there's a one-to-many relationship between *a* and *b*. > We want *a* to know what the maximum value is in *b* for field *foo* so > we can avoid reading *all* *b* when we want to resolve *a*. You can see > that we can't just use *b*'s clustering key to resolve that with LIMIT 1; > also this is for DSE Solr, which wouldn't be able to query a by max b.foo > anyway. So when we write to *b*, we also write to *a* with something > like > > UPDATE a USING TIMESTAMP ${b.a_timestamp.toMicros + b.foo} SET max_b_foo = > ${b.foo} WHERE id = ${b.a_id} > > Assuming that we don't run afoul of related antipatterns such as > repeatedly overwriting the same value indefinitely, this strikes me as > sound if unorthodox practice, as long as conflict resolution in Cassandra > isn't broken in some subtle way. We also designed this to be safe from > getting write timestamps greatly out of sync with clock time so that > non-timestamped operations (especially delete) if done accidentally will > still have a reasonable chance of having the expected results. > > So while it may not be the intended use case for write timestamps, and > there are definitely gotchas if you are not careful or misunderstand the > consequences, as far as I can see the logic behind it is sound but does > rely on correct conflict resolution in Cassandra. I'm curious if I'm > missing or misunderstanding something important. > > On Wed, Mar 11, 2015 at 4:11 PM, Tyler Hobbs <ty...@datastax.com> wrote: > >> Don't use the version as your timestamp. It's possible, but you'll end >> up with problems when attempting to overwrite or delete entries. >> >> Instead, make the version part of the primary key: >> >> CREATE TABLE document_store (document_id bigint, version int, document >> text, PRIMARY KEY (document_id, version)) WITH CLUSTERING ORDER BY (version >> desc) >> >> That way you don't have to worry about overwriting higher versions with a >> lower one, and to read the latest version, you only have to do: >> >> SELECT * FROM document_store WHERE document_id = ? LIMIT 1; >> >> Another option is to use lightweight transactions (i.e. UPDATE ... SET >> docuement = ?, version = ? WHERE document_id = ? IF version < ?), but >> that's going to make writes much more expensive. >> >> On Wed, Mar 11, 2015 at 12:45 AM, Sachin Nikam <skni...@gmail.com> wrote: >> >>> I am planning to use the Update...USING TIMESTAMP... statement to make >>> sure that I do not overwrite fresh data with stale data while having to >>> avoid doing at least LOCAL_QUORUM writes. >>> >>> Here is my table structure. >>> >>> Table=DocumentStore >>> DocumentID (primaryKey, bigint) >>> Document(text) >>> Version(int) >>> >>> If the service receives 2 write requests with Version=1 and Version=2, >>> regardless of the order of arrival, the business requirement is that we end >>> up with Version=2 in the database. >>> >>> Can I use the following CQL Statement? >>> >>> Update DocumentStore using <versionValue> >>> SET Document=<documentValue>, >>> Version=<versionValue> >>> where DocumentID=<documentIDValue>; >>> >>> Has anybody used something like this? If so was the behavior as expected? >>> >>> Regards >>> Sachin >>> >> >> >> >> -- >> Tyler Hobbs >> DataStax <http://datastax.com/> >> > >