On 31 Jul 2009, at 14:42, Benoit Chesneau wrote:
2009/7/31 Jason Davies <[email protected]>:
The main points of this proposal are:
1. Store the historical versions of documents in a separate database.
This is for a number of reasons: a) keeping it separate means we don't
clog up the main database with historical data; b) history-specific
views can be kept here; c) a non-intrusive implementation of this is
easier.
2. The change will be made at the couch_db layer so that *any* change
to any document in the target database will be mirrored to the history
database.
Seems good.
3. Each and every change to a document will result in a new document
being created in the history database (with a new ID) containing an
exact copy of that document, e.g.
{_id: <new ID>, doc: <exact copy of doc>}.
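To make point 3 concrete, here is a minimal Python sketch of the wrapping step. It is only an illustration, not the proposed couch_db change: the names make_history_entry, record_change and history_db are hypothetical, and a plain dict stands in for the history database.

```python
import copy
import uuid

# Hypothetical in-memory stand-in for the history database.
history_db = {}


def make_history_entry(doc):
    """Wrap an exact copy of `doc` in a new history document with a fresh ID."""
    return {"_id": uuid.uuid4().hex, "doc": copy.deepcopy(doc)}


def record_change(doc):
    """Mirror one change of `doc` into the history database."""
    entry = make_history_entry(doc)
    history_db[entry["_id"]] = entry
    return entry
```

Because the copy is deep, later edits to the live document do not alter entries already written to the history.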
How would you handle attachments? If attachments are copied for each
revision of a doc, they would take up a lot of space. Storing
attachments in their own docs could be a solution, though. Storing a
revision would then be:
  store attachments in separate docs
  create a doc {_id: <id>, doc: <doc>, attachments: [<id1>, ...]}
  attachments are compared across revisions by their signature
  if a signature changes, a new attachment doc is created
Just a thought anyway.
Good idea, the disk space issue would be quite important for larger
databases with a large number of changes. I wonder if some kind of
alternative storage layer supporting diffs would help here. Probably
something to consider as a future improvement.
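The attachment idea above could look roughly like the following Python sketch. It assumes the "signature" is a SHA-1 digest of the attachment bytes and that the digest doubles as the attachment doc ID; both are assumptions for illustration, as are the names store_revision, attachment_docs and history_db.

```python
import hashlib
import uuid

# Hypothetical in-memory stand-ins for the two kinds of stored docs.
attachment_docs = {}  # signature -> attachment doc
history_db = {}       # history doc ID -> history doc


def store_revision(doc, attachments):
    """Store one revision; `attachments` maps attachment name -> bytes.

    An attachment doc is created only when its signature has not been
    seen before, so unchanged attachments are shared across revisions.
    """
    attachment_ids = []
    for name, data in sorted(attachments.items()):
        signature = hashlib.sha1(data).hexdigest()  # assumed signature scheme
        if signature not in attachment_docs:
            attachment_docs[signature] = {"_id": signature, "name": name, "data": data}
        attachment_ids.append(signature)
    entry = {"_id": uuid.uuid4().hex, "doc": dict(doc), "attachments": attachment_ids}
    history_db[entry["_id"]] = entry
    return entry
```

With this layout, two revisions carrying the same attachment bytes reference one attachment doc, so disk usage grows only when an attachment actually changes.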
4. Adding meta-data to changes can be handled by a custom _update
handler (yet to be developed) to set fields such as "last_modified"
and "last_modified_user".
Why not add metadata when storing a revision? I mean the obvious
ones: userCtx and date.
My idea was that userCtx and date could be stored using _update, or do
you think this should be done automatically? It's certainly a
possibility but I wouldn't want to add unnecessary data if the user
doesn't need it, although I imagine in 99% of cases they would need
the "date/time" of the change in the history.
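As a rough illustration of the metadata step, the Python sketch below stamps a copy of a document with the two fields named in point 4. It is not an _update handler; stamp_change is a hypothetical name, and the sketch assumes the user context is a dict with a "name" key.

```python
from datetime import datetime, timezone


def stamp_change(doc, user_ctx, now=None):
    """Return a copy of `doc` with last_modified and last_modified_user set.

    `user_ctx` is assumed to be a dict with a "name" key; `now` can be
    passed explicitly to make the result reproducible.
    """
    now = now or datetime.now(timezone.utc)
    stamped = dict(doc)
    stamped["last_modified"] = now.isoformat()
    stamped["last_modified_user"] = user_ctx.get("name")
    return stamped
```

Making the stamping a separate, optional step keeps the history free of extra fields for users who do not need them.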
One use case we'd like to support is effectively (from the point of
view of the user) being able to "roll back" a view to a specific point
in time, but how this would look in the history database has me
stumped so far. Rolling back a specific doc is easy, but multiple
docs, not so easy it seems. Any suggestions welcome!
Could rolling back be handled by a view based on date in the history
database?
Indeed, but I haven't been able to come up with such a view without
blowing the reduce limitations. I want to do something like fetch all
the latest history docs that were changed before some particular
date. As Jan pointed out though, this could be solved using snapshot
databases instead.
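For reference, the query I have in mind is sketched below in plain Python, outside of any view. It assumes each history entry has the {_id, doc} shape from point 3, that each copied doc carries a "last_modified" ISO 8601 string in a single consistent format and timezone (so string comparison matches time order), and the name state_at is hypothetical.

```python
def state_at(history, cutoff):
    """Return {doc ID: doc} holding, for each document, the latest copy
    whose last_modified is strictly before `cutoff`.

    `history` is an iterable of history entries {"_id": ..., "doc": {...}}.
    """
    latest = {}
    for entry in history:
        doc = entry["doc"]
        if doc["last_modified"] >= cutoff:
            continue  # changed at or after the cutoff, so not visible yet
        current = latest.get(doc["_id"])
        if current is None or doc["last_modified"] > current["last_modified"]:
            latest[doc["_id"]] = doc
    return latest
```

This keeps one entry per document in memory, which is the kind of per-key result that does not fit well into a reduce.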
--
Jason Davies
www.jasondavies.com