Hi Ronny,
sorry for the late reply.
One way to re-introduce optimistic locking is saving the new revision over the latest one and then copying the previous revision of the doc into a new doc. You can't run compaction in between, but since you control it, JUST DON'T CALL IT ;-).
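A minimal in-memory sketch of that scheme, with a plain Python dict standing in for the database (the function name, archive id format, and conflict check are all made up for illustration; real code would issue HTTP PUTs against CouchDB):

```python
import copy
import itertools

# In-memory stand-in for a CouchDB database: doc id -> document dict.
db = {}
_seq = itertools.count(1)

def save_with_history(doc_id, new_body, expected_rev):
    """Save new_body over the latest revision, archiving the old one first."""
    current = db.get(doc_id)
    if current is not None:
        if current["_rev"] != expected_rev:
            # Optimistic locking: refuse the write if someone else got there first.
            raise ValueError("conflict: document was updated concurrently")
        # Copy the previous revision of the doc into a new, separate document.
        archive_id = "%s::rev-%s" % (doc_id, current["_rev"])
        db[archive_id] = copy.deepcopy(current)
    # Overwrite the main document with the new body and a fresh revision marker.
    new_rev = str(next(_seq))
    db[doc_id] = dict(new_body, _rev=new_rev)
    return new_rev

rev1 = save_with_history("bond-123", {"price": 99.5}, None)
rev2 = save_with_history("bond-123", {"price": 99.7}, rev1)
```

The archive copies are ordinary documents, which is why compaction must not run between the overwrite and the copy.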
Cheers
Jan
--
On Sep 14, 2008, at 23:49, Ronny Hanssen wrote:
Thanks for your reply, Jan.

I do remember the discussion on the mailing list, but at the time I didn't understand the argumentation. Maybe because I really didn't have time to dive into the matter back then. But it has seriously puzzled me since. Then this post appears and I jump at the chance to get this cleared up (sorry for being slow - which makes me the opposite of arrogant, I guess :D).
But, I don't have a solution. I guess you are right in that sense. I just fail to see that making new docs makes life easier? I believe it makes the single node case worse and probably equally difficult (or worse) for the distributed multiple node architecture. Reading from what you say, there is "evil" lurking in the replication process no matter which way we handle this. I mean, for multiple nodes the replication would probably be slower than the time it takes for users changing the same doc on two different nodes to be informed. This would result in multiple versions of the same doc being around, at least until replication - when couchdb would find out that two competing versions exist. I might be wrong about this, but the users can't be left waiting for an "ok-saved" reply from couchdb "forever", right? So, couchdb would have to decide which version "wins" during replication, right?
Considering the effects you are hinting at, I'd personally want a single node couchdb for writes, with extra nodes for reading and serving views... Maybe additional write-nodes for different doc-types (one write-node per doc-type)... Just to "ensure" that there cannot be two+ docs updated at two+ nodes simultaneously. That is, in the beginning I'd really rather go for a single node, with a replicated backup/failover. As (if) system stress increases I'd opt for splitting writes and reads across nodes and/or creating write-nodes designated for different doc-types. This is still not perfect, but distributed never will be, really.
Unless... If the couchdb data was stored in a distributed file-system (NAS or SAN), each copy of the couchdb process would be operating on the same disk. This doesn't mean more data-reliability, and it also imposes delays in reads and writes. But it would mean that couchdb would be scalable (multiple "virtual" nodes working on the same physical disk). Other "physical" nodes could be created that would replicate as couchdb is set up to do already. So, allowing "virtual" nodes could work out as a nice addition, I think.
But then again, my knowledge of distributed file-systems (NAS or SAN) is really limited... And I might have missed out on a lot more than that - so all this might of course just be stupid :)
Thanks for reading.
~Ronny
2008/9/14 Jan Lehnardt <[EMAIL PROTECTED]>
Hi Ronny,
On Sep 14, 2008, at 11:45, Ronny Hanssen wrote:
Or have I seriously missed out on some vital information? Because, based on the above I still feel very confused about why we cannot use the built-in rev-control mechanism.
You correctly identify that adding revision control to a single node instance of CouchDB is not that hard (a quick search through the archives would have told you, too :-) Making all that work in a distributed environment with replication conflict detection and all is mighty hard. If you can come up with a nice and clean solution to make proper revision control work with CouchDB's replication, including all the weird edge cases I don't even know about (aren't I arrogant this morning? :), we are happy to hear about it.
Cheers
Jan
--
~Ronny
2008/9/14 Jeremy Wall <[EMAIL PROTECTED]>
Two reasons.
* First, as I understand it, the revisions are not changes between documents. They are actual full copies of the document.
* Second, revisions get blown away when doing a database compact. Something you will more than likely want to do, since it eats up database space fairly quickly (see above for the reason why).

That said, there is nothing preventing you from storing revisions in CouchDB. You could store a changeset for each document revision in a separate revision document that accompanies your main document. It would be really easy, and designing views to take advantage of them to show a revision history for your document would be really easy too.

I suppose you could use the revisions that CouchDB stores, but that wouldn't be very efficient since each one is a complete copy of the document. And you couldn't depend on that "feature" not changing behaviour on you in later versions, since it's not intended for revision history as a feature.
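A sketch of what such a changeset and companion revision document could look like - the diff format, field names, and `type`/`seq` convention here are just one possible choice, not anything CouchDB prescribes:

```python
def changeset(old, new):
    """Field-level diff between two document bodies (one possible format)."""
    changed = {k: new[k] for k in new if old.get(k) != new[k]}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}

# The companion "revision document" that would accompany the main doc:
delta = changeset({"price": 99.5, "rating": "AA"},
                  {"price": 99.7, "rating": "AA"})
revision_doc = {"type": "revision", "doc_id": "bond-123",
                "seq": 1, "delta": delta}

# A CouchDB view (JavaScript, shown as a string) could then emit these
# by document id and sequence number to list the history in order:
map_fun = '''function(doc) {
  if (doc.type == "revision") emit([doc.doc_id, doc.seq], doc.delta);
}'''
```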
On Sat, Sep 13, 2008 at 7:24 PM, Ronny Hanssen <[EMAIL PROTECTED]> wrote:
Why is the revision control system in couchdb inadequate for, well, revision control? I thought that this feature indeed was a feature, not just an internal mechanism for resolving conflicts?
Ronny
2008/9/14 Calum Miller <[EMAIL PROTECTED]>
Hi Chris,
Many thanks for your prompt response.
Storing a complete new version of each bond/instrument every day seems a tad excessive. You can imagine how fast the database will grow over time if a unique version of each instrument must be saved, rather than just the individual changes. This must be a common pattern, not confined to investment banking. Any ideas how this pattern can be accommodated within CouchDB?
Calum Miller
Chris Anderson wrote:
Calum,
CouchDB should be easily able to handle this load.

Please note that the built-in revision system is not designed for document history. Its sole purpose is to manage conflicting documents that result from edits done in separate copies of the DB, which are subsequently replicated into a single DB.

If you allow CouchDB to create a new document for each daily import of each security, and create a view which makes these documents available by security and date, you should be able to access securities history fairly simply.
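A sketch of that pattern - the `security`/`date` field names are assumptions, and the view itself would be JavaScript; the Python below just mimics the map step to show how the composite key sorts:

```python
# The CouchDB view (JavaScript, shown as a string for reference):
map_fun = '''function(doc) {
  if (doc.security && doc.date) emit([doc.security, doc.date], null);
}'''

# Pure-Python equivalent of the map step, to show the resulting key order.
def map_doc(doc):
    if "security" in doc and "date" in doc:
        yield ([doc["security"], doc["date"]], None)

docs = [
    {"security": "BOND-A", "date": "2008-09-12", "price": 99.1},
    {"security": "BOND-A", "date": "2008-09-13", "price": 99.4},
    {"security": "BOND-B", "date": "2008-09-13", "price": 101.0},
]
rows = sorted(k for doc in docs for k, _ in map_doc(doc))
# Querying the view with startkey=["BOND-A"] and endkey=["BOND-A", {}]
# would return BOND-A's daily history in date order.
```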
Chris
On Sat, Sep 13, 2008 at 12:31 PM, Calum Miller <[EMAIL PROTECTED]> wrote:
Hi,
I'm trying to evaluate CouchDB for use within investment banking, yes some of these banks still exist. I want to load 500,000 bonds into the database, with each bond containing around 100 fields. I would be looking to bulk load a similar amount of these bonds every day whilst maintaining a history via the revision feature. Are there any bulk load features available for CouchDB, and any tips on how to manage regular loads of this volume?
Many thanks in advance and best of luck with this project.
Calum Miller
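On the bulk load question: CouchDB does expose a bulk endpoint, `POST /dbname/_bulk_docs`, which accepts many documents in one request as `{"docs": [...]}`. A minimal sketch of building such a payload - the bond ids and fields below are invented for illustration, and the actual POST is left out:

```python
import json

# Build a _bulk_docs payload: CouchDB accepts many docs in a single
# POST to /dbname/_bulk_docs with a JSON body of the form {"docs": [...]}.
bonds = [
    {"_id": "bond-%06d" % i, "isin": "XS%010d" % i, "price": 100.0}
    for i in range(3)
]
payload = json.dumps({"docs": bonds})
```

Batching the daily import into a few large `_bulk_docs` requests avoids one HTTP round-trip per bond.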