Hi Robert,

Just to clarify a bit, there's nothing inherently wrong with a read-modify-write cycle of the kind you would use for a document store. The read-before-write antipattern refers to depending on a read immediately before a write, as was being done in the original post. Generally, such a read is done either (a) to verify that the underlying record hasn't changed immediately before updating it, or (b) to merge the updated parts of the document with the parts that were left out of the original read. Both can be problematic if concurrent modifications are being performed on the same record, or if the operations that make up the update are themselves executed concurrently.
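
To make case (a) concrete, here is a small Python sketch. It is my own illustration and not Cassandra driver code: `VersionedStore` and its methods are hypothetical names, the in-memory object stands in for a single row, and the lock is only there to make the check-and-write step atomic, the way a conditional write on the database side would be. Two writers both read version A; the blind write silently loses one update, while the conditional write rejects the stale one.

```python
import threading
import uuid


class VersionedStore:
    """Hypothetical in-memory stand-in for a single-row document store."""

    def __init__(self, text):
        self._lock = threading.Lock()
        self._text = text
        self._version = uuid.uuid4()

    def read(self):
        # Return the document together with the version token seen at read time.
        with self._lock:
            return self._text, self._version

    def blind_write(self, text):
        # Unconditional write: last writer wins, whatever was read earlier.
        with self._lock:
            self._text = text
            self._version = uuid.uuid4()

    def conditional_write(self, text, expected_version):
        # Compare-and-set: apply the write only if nobody wrote since our read.
        with self._lock:
            if self._version != expected_version:
                return False
            self._text = text
            self._version = uuid.uuid4()
            return True


def demo():
    # Blind writes: both writers start from "A"; the second overwrites the first.
    store = VersionedStore("A")
    text1, _ = store.read()
    text2, _ = store.read()
    store.blind_write(text1 + "+B")
    store.blind_write(text2 + "+C")
    lost_update_result = store.read()[0]

    # Conditional writes: the second writer's stale version token is rejected.
    store = VersionedStore("A")
    text1, v1 = store.read()
    text2, v2 = store.read()
    first_ok = store.conditional_write(text1 + "+B", v1)
    second_ok = store.conditional_write(text2 + "+C", v2)
    cas_result = store.read()[0]
    return lost_update_result, first_ok, second_ok, cas_result


if __name__ == "__main__":
    print(demo())
```

Here `demo()` returns `('A+C', True, False, 'A+B')`: with blind writes the "+B" change disappears, and with the conditional write the second writer is told to re-read and retry instead. In a real cluster the check has to happen on the database side rather than in the client, which is the point of the antipattern.
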
The original post was problematic for a different reason - updating the same column very rapidly with the read-before-write antipattern built into the update. This fails occasionally because the database is not yet consistent by the time the next read is performed. The result is an update that mostly, but not always, succeeds.

Using Lightweight Transactions and BatchStatements can address many of these problems in a normal OLTP environment such as a document store, and is unlikely to have a negative impact on performance, but rapidly updated time series data is a different animal, and requires its own strategies and patterns.

Steve

On Fri, Jan 10, 2014 at 3:24 PM, Todd Carrico <todd.carr...@match.com> wrote:

> I’ve solved this for other systems, and it might work here.
>
> Add a Guid as a field to the record.
>
> When you update the document, check to make sure the Guid hasn’t changed
> since you read it. If the Guid is the same, go ahead and save the document
> along with a new Guid.
>
> This keeps you from locking the document if you just want to read it while
> still keeping you from overwriting someone else’s changes. In this other
> system, it was easy enough to add the guid check as part of the where
> clause:
>
> Update doc
> Set Text = Text
> Where key = ?
> And Guid = ?
>
> If the row failed to update, then it was removed, or the Guids didn’t
> match.
>
> Not sure if C* has some magic that can make this better, timestamp should
> do the same thing I think.
>
> “There are a multitude of methods whereby a feline might be divested of
> its epidermal layer”..
>
> *From:* Tupshin Harper [mailto:tups...@tupshin.com]
> *Sent:* Friday, January 10, 2014 5:13 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Read/Write consistency issue
>
> It is bad because of the risk of concurrent modifications. If you don't
> have some kind of global lock on the document/row, then 2 readers might
> read version A, reader 1 writes version B based on A, and reader 2 writes
> version C based on A, overwriting the changes in B. This is *inherent* to
> the notion of distributed systems and multiple writers, and can only be
> fixed by:
>
> 1) Having a global lock, either in the form of a DB lock (CAS for
> Cassandra 2.0 and above), or some higher level business mechanism that is
> ensuring only one concurrent reader/writer for a given document
>
> 2) Idempotent writes by appending at write and aggregate on read. For
> time-series and possibly counter style information, this is often the ideal
> strategy, but usually not so good for documents.
>
> For the counters scenario, idempotent writes, or the rewrite of counters
> (which use idempotent writes behind the scenes) are probably good solutions.
>
> Concurrent editing of documents, on the other hand, is almost the ideal
> scenario for lightweight transactions.
>
> -Tupshin
>
> On Fri, Jan 10, 2014 at 5:51 PM, Robert Wille <rwi...@fold3.com> wrote:
>
> Interested in knowing more on why read-before-write is an anti-pattern.
> In the next month or so, I intend to use Cassandra as a doc store. One very
> common operation will be to read the document, make a change, and write it
> back. These would be interactive users modifying their own documents, so
> rapid repeated writing is not an issue. Why would this be bad?
>
> Robert
>
> *From: *Steven A Robenalt <srobe...@stanford.edu>
> *Reply-To: *<user@cassandra.apache.org>
> *Date: *Friday, January 10, 2014 at 3:41 PM
> *To: *<user@cassandra.apache.org>
> *Subject: *Re: Read/Write consistency issue
>
> My understanding is that it's generally a Cassandra anti-pattern to do
> read-before-write in any case, not just because of this issue. I'd agree
> with Robert's suggestion earlier in this thread of writing each update
> independently and aggregating on read.
>
> Steve

--
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063
srobe...@stanford.edu
http://highwire.stanford.edu