On Mon, Feb 23, 2009 at 10:30 AM, Jan Lehnardt <j...@apache.org> wrote:
> On 23 Feb 2009, at 16:11, Patrick Antivackis wrote:
>
>> For a reminder :
>>
>> revision (n)
>> 1. the act or process of revising,
>> 2. a corrected or new version of a book, article, etc.
>>
>> For me this term is correct with the use in Couch
>
> Damien is not saying the usage is wrong in CouchDB, but people
> associate more with "revision" than he'd like. Hence the proposal.
>
>> I think a good explanation of what a compaction/replication are doing (ie
>> removing old rev, or replicating only current rev) is the right solution
>> to this misunderstanding
>
> Can you suggest how we improve the wiki docs to satisfy this? In my
> opinion, the docs are clear* and the term is overloaded and confusing.
>
> * http://wiki.apache.org/couchdb/Document_revisions has
> "You cannot rely on document revisions for any other purpose
> than concurrency control." in bold letters.
>
> I stated this in earlier discussions as well: Even if our documentation
> were perfect, we don't control how people learn about CouchDB. We
> only control the API and we should work hard to get it right.
>
> The way it stands now, a lot of people new to CouchDB get it wrong
> because "revision" is a familiar term and they associate the behaviour
> they associate with it to them. That's how humans learn. In this case
> we make the learning hard.

I couldn't agree more with this sentiment, but "revision" still strikes me as the right term. Perhaps the easiest way to fix this misconception is for there to actually be a way to keep old revisions around for good :)

Would it be overly difficult to just add the ability to keep a full rev history based on a config setting? The replication API would need to accommodate this, of course, and if the machine you're replicating from doesn't also keep old revisions around, you're SOL, but is there any other compelling reason not to offer this option? If it wouldn't complicate the code base, this seems like a helpful feature.
Sure, it could be wasteful and should be off by default, but if your dataset is relatively small, this config flag would be pretty nice to have, and it could help clear up this confusion.
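
To make the idea a bit more concrete, here is a toy sketch in Python. It is not CouchDB code and does not reflect how CouchDB stores anything; the `ToyStore` class, the `keep_all_revisions` flag and the method names are all made up for illustration. It only shows the behaviour I have in mind: revisions still gate writes for concurrency control, and the flag decides whether compaction throws the old ones away.

```python
class ToyStore:
    """In-memory toy model of a document store (hypothetical, not CouchDB)."""

    def __init__(self, keep_all_revisions=False):
        # Hypothetical config flag; off by default because it costs space.
        self.keep_all_revisions = keep_all_revisions
        self.docs = {}  # doc id -> list of (rev number, body), oldest first

    def put(self, doc_id, body, rev=None):
        history = self.docs.get(doc_id, [])
        current = history[-1][0] if history else None
        # Concurrency control: a writer must name the current revision.
        if rev != current:
            raise ValueError("conflict: expected rev %r, got %r" % (current, rev))
        new_rev = (current or 0) + 1
        self.docs[doc_id] = history + [(new_rev, body)]
        return new_rev

    def get(self, doc_id, rev=None):
        history = self.docs[doc_id]
        if rev is None:
            return history[-1][1]  # latest body
        for number, body in history:
            if number == rev:
                return body
        raise KeyError("rev %r of %r is no longer stored" % (rev, doc_id))

    def compact(self):
        # With the flag on, compaction leaves the full history alone.
        if self.keep_all_revisions:
            return
        # Otherwise only the current revision of each document survives.
        for doc_id in list(self.docs):
            self.docs[doc_id] = self.docs[doc_id][-1:]
```

With the flag off, asking for an old rev after `compact()` fails, which is the surprise people keep running into; with it on, the old body is still there. Replication is left out entirely, and that is where the real work would be.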