On 24/02/2009, at 12:15 PM, Chris Anderson wrote:

Would it be overly difficult to just add in the ability to keep a full rev
history based on a config setting?

This would be a pretty big change. As Antony says, once you go down
that path a little, you end up at something that is not really much
like Couch.

I don't want to re-open a dead issue, but to clarify: there are other models of replication that provide stronger weak-consistency guarantees. I urge you to read a few of the Bayou papers if you are interested. Using such replication would still be very close to Couch, so I don't agree with the strength of Chris's comment.

The issue, however, is that Couch's identity is, and has always been, largely determined by its replication model. There's so much more to Couch that is independent of that, such as map/reduce views, forms, futon, an HTTP API, JSON etc, that it's not immediately obvious that it's the *replication model* that makes this product 'CouchDB'. The project founder and the PMC are all committed to that replication model, which is derived from Notes.

You can add all of the other Couch features, and in fact reuse all of the Couch code, with a different replication model, but it's unlikely it would be accepted into the Couch code base. If you want that, you need to fork and call it something different (which is what I'm doing). It's important to note however that the Couch replication model has some characteristics that cannot be achieved using any stronger form of consistency. In fact, technically speaking, Couch provides coherence, but NO consistency.

Given all of that, it would be good to have a very clear 'What is Couch' that emphasizes the primacy of the replication model (and its implications, both pro and con), because none of the other things IMO are as central to the identity, as consequential, or as confusing (except maybe reduce/re-reduce) as the operational semantics of the replication model.

As an aside to this (and I'm not being bolshy), looking further ahead, Eventual Consistency, which seems to be promoted as an article of faith, is not *strictly* achievable in a partial replication environment. Achieving Eventual Consistency is also dependent on some other constraints, so depending on your deployment model, it can be more theoretical than practical. At the end of the day however, dealing with non-Monotonic Writes subsumes dealing with Eventual Consistency in all but asymptotic senses.

These are all points that I think should be made clearly and up front in the documentation, because a failure to understand Couch's replication model, and its implications for applications, both pro and con, will IMO lead to failures that will be blamed on Couch, but are in fact due to misunderstanding. You don't want a 'Couch is a piece of shit' meme to establish itself. IMO the bulk of Couch users will not think this through themselves, because they will be tool users, not tool builders.

There's yet to be a really clear reference for how to do
application-versioned documents in CouchDB. Hopefully we'll address
the topic in the book, but we haven't gotten that far yet.
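
As a rough illustration of what "application-versioned documents" could mean, here is a minimal sketch in Python. It is not an established CouchDB pattern and uses no CouchDB API: the store is a plain dict, and the names `save_version`, `history`, `logical_id` and `version` are all hypothetical. The idea shown is that the application writes every version as its own document instead of relying on _rev.

```python
def save_version(store, logical_id, body):
    """Write a new version as a separate document; old versions are never overwritten."""
    # Count the versions already stored for this logical document.
    existing = [d for d in store.values() if d["logical_id"] == logical_id]
    number = len(existing) + 1
    doc_id = f"{logical_id}:v{number}"  # hypothetical id scheme
    store[doc_id] = {"logical_id": logical_id, "version": number, "body": dict(body)}
    return doc_id


def history(store, logical_id):
    """Return every stored version of a logical document, oldest first."""
    docs = [d for d in store.values() if d["logical_id"] == logical_id]
    return sorted(docs, key=lambda d: d["version"])
```

This sketch assumes a single writer. With replication, two nodes could both create the same version number, so a real design would need ids that cannot collide.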

The way I see it, the salient options are:

A) leave it as _rev and answer the versioning question every week forever
B) rename it to _mvcc or _lock or _token or something else that
doesn't confuse people

The main drawback of B is that when we start renaming _rev, someone
else comes along and tries to take the opportunity to change _id, or
otherwise change the whole system. If we can stick to just renaming to
something clearer, I'm happy to go ahead with this.

Orthogonally, I still think the id and rev should be wrapped in a _meta tag, but modulo that ...

It's not a _lock. Saying it's a _token has nothing to do with its function - it would be like calling a car a 'construct of metal'. It's not _mvcc because that's the name of a technique, not a thing.

Maybe _mvcc_commit_id - although in the current implementation it isn't, it philosophically is and could be implemented that way. But really, it is a document version/revision identifier. Maybe put 'couch' in there to emphasize the internal nature of it e.g. '_couch_rev_id' i.e. something which, at the limit might be '_couch_private_revision_id_which_you_should_treat_as_opaque'.
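
To make the "opaque revision identifier" point concrete, here is a toy in-memory sketch in Python. It is not Couch's implementation and does not use its HTTP API; `ToyStore` and `ConflictError` are made-up names, and the random token merely stands in for whatever the server generates. It shows the revision id being used only to reject writes based on a stale read, with no history kept.

```python
import uuid


class ConflictError(Exception):
    """Raised when an update carries a stale or missing revision id."""


class ToyStore:
    def __init__(self):
        self._docs = {}  # doc id -> (revision id, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return {"_id": doc_id, "_rev": rev, **body}

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The write is accepted only if it names the revision it was based on.
        if rev != current_rev:
            raise ConflictError(doc_id)
        new_rev = uuid.uuid4().hex  # opaque: callers must not interpret it
        # The previous body is overwritten, so no version history survives.
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev
```

A client would read a document, change it, and pass back the `_rev` it read. A second writer still holding the old `_rev` gets a `ConflictError` and has to re-read, which is all the field is for in this sketch.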

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The intuitive mind is a sacred gift and the rational mind is a faithful servant. We have created a society that honours the servant and has forgotten the gift.
  -- Albert Einstein

