We've been using CouchDB in our new web startup for the past few months to represent a very rich set of interconnected data. We have tons of code modeling different types of objects in JavaScript, and a run-time serialization layer that serializes objects to CouchDB (including serializing 'pointers' to other CouchDB docs, instantiating JavaScript classes on load, etc.), all done in the client-side run-time.
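To make the pointer idea concrete, here's a minimal sketch of what such a serialization layer might look like. All of the names here (serialize, deserialize, the $ref marker) are hypothetical illustrations of the approach, not CouchDB or actual startup code:

```javascript
// Hypothetical sketch: on save, replace references to other documents with
// { $ref: docId } 'pointers'; on load, resolve those pointers back into
// live objects via a fetch callback. A real layer would also re-instantiate
// the proper JavaScript classes and handle cycles.

function serialize(obj) {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    if (value && typeof value === 'object' && value._id) {
      // Store a pointer to the other document instead of inlining it.
      out[key] = { $ref: value._id };
    } else {
      out[key] = value;
    }
  }
  return out;
}

function deserialize(doc, fetchById) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (value && typeof value === 'object' && value.$ref) {
      // Resolve the pointer (recursively, so nested refs work too).
      out[key] = deserialize(fetchById(value.$ref), fetchById);
    } else {
      out[key] = value;
    }
  }
  return out;
}
```

So a blog post referencing an author doc would round-trip as { title: ..., author: { $ref: 'author:1' } } in storage, and come back with the author object reattached on load.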
The recent removal of the ability to treat a bulk_write as a single atomic transaction (one that is rejected if any conflicts occur) is pretty much a deal killer for us, though. There are plenty of times when two separate documents need to be changed atomically: sorted lists, reference-counted shared objects, and so on. Our whole reason to use CouchDB over a traditional database was that it let us model our documents around the way we're actually using them, from a UI and design perspective - not being forced into bizarre contortions to deal with arbitrary limitations of the underlying storage system.

I understand the motivations for removing the transactional semantics - they're hard to do right, and can be very expensive in a sharded solution. But there are times when it's almost essential to have them, even if the cost is higher. That said, it may be possible to come up with some compromises on the transactional qualities that make them easier to implement. How about something like the following?

During a 'transactional bulk write':

- For each document in the transaction, attempt to create a new rev of the document, specially flagged so it doesn't show up in views, document fetches, or replications (the last committed version of the document is returned instead).
- This flagged rev SHOULD still be examined when checking for conflicts with subsequent writes, so no new writes to the document can happen while the transaction is in progress - they return with a conflict error.
- If all of the documents in the transaction are created successfully (they pass document validation and don't conflict with earlier revs), remove the flags from these revisions, which thereby become standard revs in the database. If not, remove the newly created revs.

This would give a loose form of two-phase commit that should work over sharded databases.
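The proposal above can be sketched as an in-memory simulation. Everything here - the store shape, the 'pending' flag, the function name - is a hypothetical illustration of the protocol, not an existing CouchDB API:

```javascript
// In-memory sketch of the proposed two-phase bulk write.
// Phase 1 stages a flagged ('pending') rev per document; any conflict
// aborts the whole batch and prunes the staged revs. Phase 2 clears the
// flags, turning the staged revs into ordinary committed revs.

function transactionalBulkWrite(store, updates) {
  const staged = [];
  // Phase 1: stage a flagged rev for each document.
  for (const doc of updates) {
    const current = store[doc._id];
    // Conflict: the doc changed since we read it, or another
    // transaction already holds a pending rev on it.
    if (current && (current.pending || current._rev !== doc._rev)) {
      // Abort: prune the revs we already staged.
      for (const id of staged) {
        store[id] = store[id].prev;
      }
      return { ok: false, error: 'conflict', id: doc._id };
    }
    store[doc._id] = { ...doc, _rev: (doc._rev || 0) + 1, pending: true, prev: current };
    staged.push(doc._id);
  }
  // Phase 2: everything staged cleanly - drop the flags to commit.
  for (const id of staged) {
    const { pending, prev, ...committed } = store[id];
    store[id] = committed;
  }
  return { ok: true };
}
```

A batch with a stale _rev on any document leaves the store exactly as it was, which is the all-or-nothing behavior being asked for; readers would see only non-pending revs throughout.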
There's obviously a higher performance cost - it's the equivalent of two writes (one to create the 'shadow' revisions, and another to remove the flags on success or prune the revisions on failure). And clients won't be able to write to any of the documents in the transaction until it completes. But it would probably achieve the effect most people want. And naturally, it wouldn't be the default bulk_write behavior, given the cost and limitations. Would this approach work?

Scott
