>> That would be a big problem when replicating huge databases. Everything
>> must come over in one transaction.
>
> You could still do that incrementally, e.g. it wouldn't have to load in a
> single request. The key is that the replication shows MVCC boundaries,
> i.e. add a marker in the replication stream to indicate when you passed
> an MVCC commit point. The current model would ignore such markers --
> nothing else is required, I think. You could even cycle as long as there
> were new MVCC states, which would give the same
> 'includes-updates-as-they-come-in' form of replication, but with somewhat
> more consistency. If these restart points were included in the
> replication stream, then systems that wanted to allow replication
> rollback (see below) could reset the rollback MVCC state when they get an
> end-of-MVCC-state marker.
>
> If you cared, however, and if your application model allowed it, then you
> could guarantee consistency by only accepting completed replications. I
> imagine you would need to be able to 'undo' an incomplete replication,
> which would be a matter of allowing the db to be rolled back to the MVCC
> state that was in effect when the replication started. This would prevent
> permanent lockup, and I'm sure you'd want this facility to be
> enabled/disabled in configuration.
>
> I want to stress that I know this is only useful for a certain class of
> use, but I don't think it negatively impacts other uses, so only those
> uses that want it would pay for it.
>
> Also, this doesn't resolve the cluster-ACID issue, but I'm confident
> there is a solution there that doesn't impact the
> clustered/non-exclusive-replication/constant-conflict model.
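For what it's worth, the marker-plus-rollback idea being proposed above can be sketched in a few lines. This is purely illustrative -- CouchDB's replication stream has no such marker today, and the names here (COMMIT, apply_stream, plain dicts standing in for databases) are all invented:

```python
# Toy model of a replication stream that carries end-of-MVCC-state
# markers.  The consumer keeps a rollback point at the last marker it
# saw, so an interrupted replication never leaves the target showing a
# partial MVCC state.  None of this exists in CouchDB; it's a sketch of
# the proposal only.

COMMIT = object()  # hypothetical "you just passed an MVCC commit point" marker

def apply_stream(target, stream):
    """Apply replicated (doc_id, doc) events to `target` (a dict).

    On each COMMIT marker, advance the rollback checkpoint.  When the
    stream ends, roll back to the last checkpoint, discarding any docs
    from an incomplete MVCC state."""
    checkpoint = dict(target)          # last consistent state seen
    for event in stream:
        if event is COMMIT:
            checkpoint = dict(target)  # passed a commit point: advance
        else:
            doc_id, doc = event
            target[doc_id] = doc
    # Stream exhausted: undo anything written after the last marker.
    target.clear()
    target.update(checkpoint)
    return target
```

A stream that dies after the marker loses nothing; one that dies mid-state is rolled back, which is the "undo an incomplete replication" behavior described above -- at the cost of keeping the old state around, exactly the compaction concern raised later in this thread.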
There is no concept of an "MVCC boundary" anywhere in the code that I'm
aware of. These types of operations would require major re-engineering of
the database format and of the current single-writer/multi-reader
semantics, which are pretty well baked in.

>> If you want consistency on the target, you'll have to write lock the
>> database (not a concept couchdb really has)
>
> I think that's an application-level concept, but I can imagine a patch
> that allows it in CouchDB. Still, I'd do that at an application level,
> because I'm in user-triggered exclusive replication mode.

I think the bigger point here is that what you're asking for violates a
huge swath of assumptions baked into the core of CouchDB. Asking CouchDB
to do consistent inter-document writes is going to require you either to
change a large amount of internal code or to write some very specific app
code to get what you want. You may be able to get atomic inter-document
updates on a single node, but that guarantee is violated as soon as you
try to replicate. IMO, it would be better not to support _bulk_docs for
exactly this reason: people who use _bulk_docs will end up assuming that
its atomic properties carry over into places where they don't actually
hold.

>> until the whole replication completes, or write all the docs in one big
>> bulk transaction, which won't work for large databases. And while that
>> transaction is occurring, neither the source nor the target database
>> file can be compacted (the compaction will take place, but the old file
>> can't be freed until the full transaction completes).
>
> Yes, but once again I think there are valid use-cases that allow that,
> e.g. mine at least.

It occurs to me that once you get to the point of writing source and
target database locking, you no longer need _bulk_docs. You'd have enough
code to do all the atomic inter-document writes you need. Though it'd be
rather un-couchy.
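To make the "atomicity doesn't survive replication" point concrete, here's a toy sketch. The names (bulk_docs, replicate_docs, fail_after) and the dicts-as-databases are invented for illustration; the real replicator works doc-at-a-time over HTTP, which is the property being modeled:

```python
# Sketch of why single-node bulk-write atomicity does not carry over to
# a replica.  The replicator copies documents individually, so a bulk
# update that was atomic on the source can arrive partially on the
# target if replication is interrupted.

def bulk_docs(db, docs):
    """Atomic on a single node: all docs land together or not at all."""
    db.update({d["_id"]: d for d in docs})

def replicate_docs(source, target, fail_after=None):
    """Doc-at-a-time copy.  `fail_after` simulates the network dying
    mid-replication; the target simply keeps whatever it has so far."""
    for i, (doc_id, doc) in enumerate(source.items()):
        if fail_after is not None and i >= fail_after:
            return
        target[doc_id] = doc

source, target = {}, {}
bulk_docs(source, [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}])
replicate_docs(source, target, fail_after=2)
# target now holds only part of the "atomic" three-document update
```

The target ends up with two of the three documents -- a state that never existed as a committed whole on the source, which is exactly the assumption-violation being warned about.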
HTH,
Paul Davis

The trouble with the world is that the stupid are cocksure and the
intelligent are full of doubt. -- Bertrand Russell
