On Aug 3, 2009, at 11:55 AM, Jason Davies wrote:
Hi Damien,
On 3 Aug 2009, at 16:39, Damien Katz wrote:
On Aug 2, 2009, at 3:29 PM, Chris Anderson wrote:
On Sat, Aug 1, 2009 at 3:29 AM, Jason Davies <[email protected]> wrote:
On 31 Jul 2009, at 14:42, Benoit Chesneau wrote:
2009/7/31 Jason Davies <[email protected]>:
The main points of this proposal are:
1. Store the historical versions of documents in a separate database. This is for a number of reasons: a) keeping it separate means we don't clog up the main database with historical data; b) history-specific views can be kept here; c) a non-intrusive implementation is easier.
2. The change will be made at the couch_db layer so that *any* change to any document in the target database will be mirrored to the history database.
Seems good.
3. Each and every change to a document will result in a new document being created in the history database (with a new ID) containing an exact copy of that document, e.g. {_id: <new ID>, doc: <exact copy of doc>}.
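For example, a history entry might look something like this (the ID and field values are purely illustrative):

    {
      "_id": "6f2c1b0a9d8e7f00",   // freshly generated ID, unrelated to the source doc's _id
      "doc": {                     // exact copy of the document as it was written
        "_id": "invoice-42",
        "_rev": "3-7c5e",
        "total": 99.95
      }
    }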
How would you handle the case of attachments? If attachments are copied for each revision of a doc, it would take a lot of space. Maybe storing attachments in their own docs could be a solution though. So storing a revision would be:

- store attachments in different docs
- create a doc {_id: <id>, doc: <doc>, attachments: [<id1>, ...]}
- attachments will be compared across revisions based on their signature
- if the signature changes, a new attachment doc is created

Just a thought anyway.
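A rough sketch of that idea (the doc shapes and the use of an MD5-style signature are just an illustration, not an existing CouchDB feature):

    // one doc per unique attachment body, keyed by a signature of its content
    { "_id": "att-9e107d9d372bb6826bd81d3542a419d6", "content_type": "image/png" }

    // a history entry then references the signatures instead of copying the bytes
    { "_id": "<new ID>",
      "doc": { "_id": "invoice-42", "total": 99.95 },
      "attachments": ["att-9e107d9d372bb6826bd81d3542a419d6"] }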
Good idea, the disk space issue would be quite important for larger databases with a larger number of changes. I wonder if some kind of alternative storage layer supporting diffs would help here. Probably something to consider as a future improvement.
4. Adding meta-data to changes can be handled by a custom _update handler (yet to be developed) to set fields such as "last_modified" and "last_modified_user".
I've been quiet on this thread as I'm largely in agreement with the proposal.

I think the best route for implementation is to allow Erlang callbacks on changes. This way we can write a simple history function that copies off each change to a backup db, setting timestamps and userCtx metadata on the way.
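The entry such a callback writes to the backup db could be as simple as the following (the shape is a guess, nothing here is implemented yet):

    { "_id": "<generated id>",
      "timestamp": 1249314900,                       // when the change was recorded
      "userCtx": { "name": "jason", "roles": [] },   // who made the change
      "doc": { "_id": "invoice-42", "_rev": "3-7c5e", "total": 99.95 } }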
The user interface could surface this function's activation in the node config as a check box, and applications wouldn't need to know about it at all. It should be possible to develop a generic futon-like interface for browsing old documents to revert individual changes, so users can work with non-backup-aware applications.
As far as keeping track of time ranges when backups are turned off, the user interface could record a timestamped metadata document to the backup db whenever the switch is flipped.
Some comments about the proposal
1. The callbacks must be synchronous. Queueing them for writing
later means the queue can get overloaded and changes lost.
2. Changes can still get lost. We don't have commits across dbs, so
it's possible a crash during update will put the main and history
dbs out of sync.
3. Replicated changes get lost. If a client makes 5 edits to a local replica of a document, then replicates it to a server db, only the most recent change gets recorded in the history.
I would prefer to store the history as attachments to the main
document.
Can you expand on your last sentence in a bit more detail? I assume
you mean you would rather each document in the history db mirrored
each document in the target db, with attachments storing historical
versions?
No, I mean the earlier revisions of the document, stored as
attachments to the current revision.
The history then replicates with the document, and is always available.
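Concretely, the current revision would carry its earlier revisions in its own _attachments, something like this (attachment names and content are illustrative):

    { "_id": "invoice-42",
      "_rev": "4-d41d8c",
      "total": 103.50,
      "_attachments": {
        "history/3-7c5e": { "content_type": "application/json", "data": "<base64 of the rev 3 body>" },
        "history/2-1f0a": { "content_type": "application/json", "data": "<base64 of the rev 2 body>" }
      } }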
To solve #3 we could also allow the history database to be
replicated for use-cases where the entire history is desirable on
all peers.
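That would just be an ordinary replication of the history db alongside the main one, e.g. (db names illustrative):

    POST /_replicate
    { "source": "invoices_history", "target": "http://peer:5984/invoices_history" }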
The problem with the history database is that there are a lot of edge cases where the history gets out of sync, especially with distributed edits. The system breaks easily in the face of network and security errors.
-Damien
--
Jason Davies
www.jasondavies.com