Re: Fail on a simple case on replication

Dean Landolt Mon, 23 Feb 2009 15:03:16 -0800

On Mon, Feb 23, 2009 at 10:30 AM, Jan Lehnardt <j...@apache.org> wrote:

>
> On 23 Feb 2009, at 16:11, Patrick Antivackis wrote:
>
>  For a reminder :
>>
>> revision  (n)
>> 1. the act or process of revising,
>> 2. a corrected or new version of a book, article, etc.
>>
>> For me this term is correct with the use in Couch
>>
>
> Damien is not saying the usage is wrong in CouchDB, but people
> associate more with "revision" than he'd like. Hence the proposal.
>
>
>  I think a good explanation of what a compaction/replication are doing (ie
>> removing  old rev, or replicating only current rev) is the right solution
>> to
>> this misunderstanding
>>
>
> Can you suggest how we improve the wiki docs to satisfy this? In my
> opinion, the docs are clear* and the term is overloaded and confusing.
>
> * http://wiki.apache.org/couchdb/Document_revisions has
> "You cannot rely on document revisions for any other purpose
> than concurrency control." in bold letters.
>
> I stated this in earlier discussions as well: Even if our documentation
> were perfect, we don't control how people learn about CouchDB. We
> only control the API and we should work hard to get it right.
>
> The way it stands now, a lot of people new to CouchDB get it wrong
> because "revision" is a familiar term and they carry over the behaviour
> they associate with it. That's how humans learn. In this case
> we make the learning hard.

I couldn't agree more with this sentiment, but revision still strikes me as
the right term. Perhaps the easiest way to fix this misconception is for
there to actually be a way to keep old revisions around for good :)

Would it be overly difficult to just add in the ability to keep a full rev
history based on a config setting? The replication api would need to
accommodate this, of course, and if the machine you're replicating from
doesn't also keep old revisions around you're SOL, but is there any other
compelling reason to not offer this option? If it wouldn't complicate the
code base, this seems like a helpful feature. Sure, it could be wasteful and
should be off by default, but if your dataset is relatively small, this
config flag would be pretty nice to have, and it could help clear up this
confusion.
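
To make the idea concrete, here is a rough Python sketch of the behaviour I
have in mind. It is a toy in-memory model, not CouchDB code or its API: the
ToyStore class and the keep_full_history flag are names I made up for
illustration, and the real storage and replication layers are far more
involved.

```python
class ToyStore:
    """Toy in-memory model of revisioned documents (NOT CouchDB code)."""

    def __init__(self, keep_full_history=False):
        # Hypothetical config flag: off by default, as suggested above.
        self.keep_full_history = keep_full_history
        # doc_id -> list of (rev_number, body) tuples, oldest first.
        self.docs = {}

    def put(self, doc_id, body):
        """Store a new revision and return its revision number."""
        revs = self.docs.setdefault(doc_id, [])
        rev = revs[-1][0] + 1 if revs else 1
        revs.append((rev, body))
        return rev

    def get(self, doc_id, rev=None):
        """Return the body of the given revision (latest if rev is None).

        Returns None when the document or revision is no longer stored.
        """
        revs = self.docs.get(doc_id, [])
        if not revs:
            return None
        if rev is None:
            return revs[-1][1]
        for number, body in revs:
            if number == rev:
                return body
        return None

    def compact(self):
        """Drop old revisions unless the store is set to keep them."""
        if self.keep_full_history:
            return
        for doc_id, revs in self.docs.items():
            self.docs[doc_id] = [revs[-1]]

    def replicate_to(self, target):
        """Copy documents to another store.

        Full history travels only when both sides keep it; otherwise
        only the current revision is copied.
        """
        for doc_id, revs in self.docs.items():
            if self.keep_full_history and target.keep_full_history:
                target.docs[doc_id] = list(revs)
            else:
                target.docs[doc_id] = [revs[-1]]


# Usage: with the flag on, an old revision survives compaction.
archive = ToyStore(keep_full_history=True)
archive.put("doc", {"title": "first draft"})
archive.put("doc", {"title": "second draft"})
archive.compact()
print(archive.get("doc", rev=1))
```

With the flag off, the same compact() call leaves only the latest revision,
which is the behaviour that surprises people today.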
