On 24/02/2009, at 9:32 AM, Dean Landolt wrote:

Can you suggest how we improve the wiki docs to satisfy this? In my
opinion, the docs are clear* and the term is overloaded and confusing.

* http://wiki.apache.org/couchdb/Document_revisions has
"You cannot rely on document revisions for any other purpose
than concurrency control." in bold letters.
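
For illustration only, here is a minimal sketch of what "revisions are for concurrency control" means in practice. It is a toy in-memory model, not CouchDB's API: the names ToyStore, Conflict, put and get are hypothetical, and the revision token is an arbitrary opaque string. A write must name the current revision or it is rejected, and only the latest revision is kept, so a revision token cannot be used to fetch history.

```python
import uuid


class Conflict(Exception):
    """Raised when an update names a revision that is no longer current."""


class ToyStore:
    """Hypothetical in-memory store; models only the revision check."""

    def __init__(self):
        # doc_id -> (rev, body); only the latest revision is retained
        self._docs = {}

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The revision is compared, never interpreted: it is a token
        # that proves the writer saw the latest state of the document.
        if rev != current_rev:
            raise Conflict(f"expected rev {current_rev!r}, got {rev!r}")
        new_rev = uuid.uuid4().hex  # opaque token (an assumption of this sketch)
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = ToyStore()
rev1 = store.put("doc", {"n": 1})
rev2 = store.put("doc", {"n": 2}, rev=rev1)

# A second writer still holding rev1 is rejected instead of silently
# overwriting the update that produced rev2.
try:
    store.put("doc", {"n": 3}, rev=rev1)
    stale_write_rejected = False
except Conflict:
    stale_write_rejected = True
```

Nothing in this model lets a client ask for the body that went with rev1 once rev2 exists, which is the behaviour people used to version-control "revisions" tend not to expect.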

I stated this in earlier discussions as well: Even if our documentation
were perfect, we don't control how people learn about CouchDB. We
only control the API and we should work hard to get it right.

As it stands now, a lot of people new to CouchDB get it wrong
because "revision" is a familiar term and they expect CouchDB
revisions to behave the way that term suggests. That's how humans
learn. In this case we make the learning hard.

Firstly, I completely agree that one should consider the implications of using certain terms: the baggage and context such terms bring with them.

<flamesuit on>
OTOH, one should use the correct term and not redefine existing terms to suit one's own purpose. In a tangentially related way, the use of the term RESTful wrt CouchDB is a marketing abomination.
</flamesuit off>

The documentation about replication, the role of revisions, and the lack of inter-document consistency guarantees (including, crucially for the operation model, the lack of Monotonic Write guarantees) really needs to be expanded.

The consequences of CouchDB's underlying model aren't immediately obvious, and should be spelled out, as I started to do here: http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c0fddc57c-db78-4241-86de-549fecc8b...@gmail.com%3e - which was obviously in the context of changing that mechanism, but still the explanation and references are useful.

I couldn't agree more with this sentiment, but revision still strikes me as the right term. Perhaps the easiest way to fix this misconception is for
there to actually be a way to keep old revisions around for good :)

Would it be overly difficult to just add the ability to keep a full rev
history based on a config setting? The replication API would need to
accommodate this, of course, and if the machine you're replicating from doesn't also keep old revisions around you're SOL, but is there any other compelling reason not to offer this option? If it wouldn't complicate the code base, this seems like a helpful feature. Sure, it could be wasteful and should be off by default, but if your dataset is relatively small, this config flag would be pretty nice to have, and it could help clear up this
confusion.

Danger Will Robinson!

The problem here is that you then need to make certain guarantees about revisions to make them at all useful, and you get into a discussion like the above email thread.

IMO, discussing these issues without having read the relevant literature on replication models is a waste of time. Serious research has been done into this, and (once again, IMO) it is more productive to advance that understanding than to try (and possibly fail) to reinvent the wheel.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A priest, a minister and a rabbi walk into a bar. The bartender says "What is this, a joke?"

