Heya Ralf,
Thanks for your input and engaging in this discussion!
On Apr 12, 2008, at 04:36, Ralf Nieuwenhuijsen wrote:
Hi,
I've joined this mailing list because I wanted to reply to this discussion specifically.
I was hoping you could clear a number of things up for me.
1. Why make compacting the default? Isn't it more likely that in this day and age, most will prefer revisions for all data?
Because the storage system is pretty wasteful and you'd end up with
several Gigabytes of database files for just a few hundred Megabytes
of actual data. So we do need compaction in one form or another. A
compaction that retains revisions is a lot harder to write. Also,
dealing with revisions in a distributed setup is less than trivial and
would complicate the replication system quite a bit.
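To illustrate the waste compaction removes, here is a minimal sketch (in Python, purely for illustration; CouchDB itself is written in Erlang and compacts by copying live data into a new database file): only the latest revision of each document is carried over, and the bodies of older revisions are dropped.

```python
# Sketch only: a toy store maps doc_id -> list of revisions (oldest
# first). Compaction keeps just the most recent revision per document.

def compact(store):
    compacted = {}
    for doc_id, revisions in store.items():
        # retain only the newest revision's body
        compacted[doc_id] = [revisions[-1]]
    return compacted

store = {
    "doc1": [{"_rev": "1-a", "v": 1}, {"_rev": "2-b", "v": 2}],
    "doc2": [{"_rev": "1-c", "v": 9}],
}
print(compact(store))
```

The older bodies of "doc1" are what pile up into gigabytes of file for megabytes of live data.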
2. Compacting seems like very specific behavior; wouldn't a built-in cron-like system be much more generic? It could allow for all kinds of background processing, like replication, full-text search using JavaScript, compacting, searching for dead URLs, etc.
Compacting is a manual process at the moment. If we introduced a scheduling mechanism, it would certainly be more general-purpose and you could hook in all sorts of operations, including compaction.
3. Is support for some sort of reduce behavior, as part of the views, planned, and if so, what can we expect?
See http://damienkatz.net/2008/02/incremental_map.html
and http://damienkatz.net/2008/02/incremental_map_1.html
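Conceptually, the map/reduce behavior described in those posts amounts to something like the following sketch (in Python for illustration; actual CouchDB view functions are written in JavaScript and are computed incrementally): the map emits key/value pairs per document, and the reduce folds the values per key.

```python
# Conceptual sketch of a map/reduce view, not CouchDB's API.

def map_fun(doc):
    # emit one (key, value) row per tag on the document
    for tag in doc.get("tags", []):
        yield (tag, 1)

def reduce_fun(values):
    return sum(values)

docs = [
    {"_id": "a", "tags": ["couchdb", "erlang"]},
    {"_id": "b", "tags": ["couchdb"]},
]

rows = [kv for doc in docs for kv in map_fun(doc)]
index = {}
for key, value in rows:
    index.setdefault(key, []).append(value)
result = {key: reduce_fun(vals) for key, vals in index.items()}
print(result)  # {'couchdb': 2, 'erlang': 1}
```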
4. What is the default conflict behavior? Most recent version wins?
There's no 'recent' in a distributed system. At the moment, the
revision with the most changes wins, if I remember correctly.
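A rough sketch of why such a rule works without clocks (hedged: this is an illustrative simplification, not CouchDB's exact algorithm): pick the conflicting revision with the longer edit history, breaking ties by comparing the revision ids, so every node independently arrives at the same winner.

```python
# Deterministic winner picking among conflicting leaf revisions.
# Each revision is a (num_edits, rev_id) tuple; tuple comparison
# orders by edit count first, then rev id as the tie-breaker.

def pick_winner(revisions):
    return max(revisions)

conflicts = [(3, "aaa"), (2, "zzz"), (3, "bbb")]
print(pick_winner(conflicts))  # (3, 'bbb')
```

Because the rule depends only on the revisions themselves, no "most recent" timestamp is needed.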
5. Is it possible to merge on conflicts, or if not, how could attachments possibly model revisions properly? Wouldn't we lose a whole revision tree?
You don't merge, at least at the moment, but declare one revision to
be the winner when resolving the conflict. Since this is a manual
process, you can make sure you don't lose revision trees. Merging might come in at some point, but no thought (at least publicly) has gone into that.
6. Without merging, we need to store revisions in separate documents, thereby prohibiting useful doc-is for documents under revision.
I don't understand what you mean here :) What is 'doc-is' in this
context?
7. What added benefit do manual revisions have when we can just store extra revision data in each document anyway?
I'm quite sure my understanding of CouchDB can be lacking, but to me it seems like guaranteed revisions are the killer feature.
The revisions are not, at least at this point, meant to implement revision control systems. Rather, they exist for the optimistic concurrency control that allows any number of parallel readers while serialised writes are happening, and to power replication.
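The optimistic concurrency check boils down to this (a hypothetical in-memory stand-in for what CouchDB does over HTTP with the _rev field): a write must name the revision it is based on, and a stale revision is rejected rather than silently overwriting a parallel update.

```python
# Sketch of optimistic concurrency control via revision checking.

class Conflict(Exception):
    pass

class Store:
    def __init__(self):
        self.docs = {}  # doc_id -> (rev_number, body)

    def put(self, doc_id, body, based_on_rev=None):
        current = self.docs.get(doc_id)
        current_rev = current[0] if current else None
        if based_on_rev != current_rev:
            # writer was working from an outdated revision
            raise Conflict(f"stale rev {based_on_rev}, current is {current_rev}")
        new_rev = (current_rev or 0) + 1
        self.docs[doc_id] = (new_rev, body)
        return new_rev

s = Store()
rev1 = s.put("doc", {"n": 1})          # create: based on no revision
rev2 = s.put("doc", {"n": 2}, rev1)    # update against current rev: ok
try:
    s.put("doc", {"n": 3}, rev1)       # stale rev: rejected
except Conflict as e:
    print("conflict:", e)
```

Readers never block on this check; only conflicting writers are turned away.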
The alternative of a cron-like system could work much like the view documents. These documents could contain a source URL (possibly local), a schedule parameter, and a function that maps a document to an array of documents that is treated as a batch put. This way we could easily set up replication, but also all kinds of delayed and/or scheduled processing of data.
Indeed. No planning has gone into such a thing at the moment. You might
want to open a feature request at https://issues.apache.org/jira/browse/COUCHDB
or come up with a patch.
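For concreteness, such a hypothetical job document could be sketched like this (nothing of the sort exists in CouchDB; every name and field below is invented for illustration): a job names a source, a schedule, and a map from one input document to an array of output documents, which is then applied as a batch put.

```python
# Sketch of the proposed scheduled-map job documents (hypothetical).

def run_job(job, fetch, batch_put):
    """fetch(source) yields input docs; batch_put(docs) stores outputs."""
    out = []
    for doc in fetch(job["source"]):
        out.extend(job["map"](doc))
    batch_put(out)
    return len(out)

job = {
    "source": "local/mydb",
    "schedule": "every 10 minutes",  # would be honoured by a scheduler
    "map": lambda doc: [{"_id": doc["_id"] + ":copy", "v": doc["v"]}],
}

stored = []
n = run_job(job,
            fetch=lambda src: [{"_id": "a", "v": 1}, {"_id": "b", "v": 2}],
            batch_put=stored.extend)
print(n, stored)
```

With an identity map and a remote source, the same machinery would amount to simple replication.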
Likewise, being able to define a conflict function that could merge data or decide who wins seems like a much better alternative to the 'atomic' batch-put operations, which break down when distributed (thereby no longer guaranteeing scalability; another killer feature).
Conflict resolution and merge functions do sound interesting; I don't understand the "not guaranteeing scalability" remark, though. In the current implementation, this feature actually makes CouchDB scalable by ensuring that all nodes participating in a cluster eventually end up with the same data. If you really do need two-phase commit (if I understand correctly, you want that), it would need to be part of your application or an intermediate storage layer.
Cheers
Jan
--