FYI I moved this to an RFC at https://github.com/apache/couchdb-documentation/pull/401
Adam

> On Mar 18, 2019, at 10:47 PM, Adam Kocoloski <kocol...@apache.org> wrote:
>
>> On Mar 18, 2019, at 9:03 PM, Alex Miller <alexmil...@apple.com.INVALID> wrote:
>>
>>> On Mar 5, 2019, at 4:04 PM, Adam Kocoloski <kocol...@apache.org> wrote:
>>> With the incarnation and branch count in place we’d be looking at a design
>>> where the KV pairs have the structure
>>>
>>> (“changes”, Incarnation, Versionstamp) = (ValFormat, DocID, RevFormat,
>>> RevPosition, RevHash, BranchCount)
>>>
>>> where ValFormat is an enumeration enabling schema evolution of the value
>>> format in the future, and RevFormat, RevPosition, RevHash are associated
>>> with the winning edit branch for the document (not necessarily the edit
>>> that occurred at this version, matching current CouchDB behavior) and carry
>>> the meanings defined in the revision storage RFC[2].
>>
>> Do note that with versionstamped keys, and atomic operations in general,
>> it’s important to keep in mind that committing a transaction might return
>> `commit_unknown_result`. Transaction loops will retry a
>> `commit_unknown_result` error by default. (Or they will, if your
>> Erlang/Elixir bindings copy the behavior of the rest of the bindings.) So
>> you’ll need some way of making an insert into `changes` an idempotent
>> operation.
>>
>> I’ll volunteer three possible options:
>>
>> 1. The easiest case is if you happen to be inserting a known, fixed key (and
>> preferably one that contains a versionstamped value) in the same transaction
>> as a versionstamped key, as then you have a key to check in your database to
>> tell if your commit happened or not.
>>
>> 2. If you’re doing an insert of just this key in a transaction, and your key
>> space has relatively infrequent writes, then you might be able to get away
>> with remembering the initial read version of your transaction, issuing a
>> range scan from (“changes”, Incarnation, InitialReadVersion) -> (“changes”,
>> infinity, infinity), and filtering through looking for a value equal to what
>> you tried to write.
>>
>> 3. Accept that you might write duplicate values at different versionstamped
>> keys, and write your client code such that it will skip repeated values that
>> it has already seen.
>>
>> I had filed an internal bug long ago to complain about this before, which
>> I’ve now copied over to GitHub[1]. So if this becomes absurdly difficult to
>> work around, feel free to show up there to complain.
>>
>> [1]: https://github.com/apple/foundationdb/issues/1321
>
> Hi Alex, thanks for that comment and for taking a close read. Option 1 could
> almost work here; we will be inserting up to two keys in a “revisions”
> subspace as part of the same transaction that we could read and that would
> include both the RevHash and the Versionstamp. The latest design for that
> subspace is here:
>
> https://github.com/apache/couchdb-documentation/blob/5197cdffe1e2c08a7640dd646dd02909c0cf51ef/rfcs/001-fdb-revision-metadata-model.md
>
> If I understand correctly, I think the edge case regarding
> `commit_unknown_result` that we’re not adequately guarding against is the
> following series of events:
>
> 1) Txn A tries to commit an edit and gets `commit_unknown_result`; in
> reality, the transaction failed
> 2) Txn B tries to commit an *identical* edit (save for the versionstamp) and
> succeeds
> 3) Txn A retries and finds that the entry in “revisions” for this `RevHash`
> exists and that the `Versionstamp` in “changes” for this DocID is higher
> than the one initially attempted
>
> In this scenario we should report an edit conflict failure back to the client
> for Txn A, but the end result is indistinguishable from the case where
>
> 1) Txn A tries to commit an edit and gets `commit_unknown_result`; in
> reality, the transaction *succeeded*
> 2) Txn B tries to edit a *different* branch of the document and succeeds
> (thereby replacing Txn A’s entry in “changes”)
>
> which is a scenario where we need to report success for both Txn A and Txn B.
>
> We could close this loophole by storing the Versionstamp alongside the
> RevHash for every edit in the “revisions” subspace, rather than only storing
> the Versionstamp of the latest edit to the document. Not cheap though. Will
> give it some thought. Thanks!
>
> Adam
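
For anyone skimming the thread, here is a rough Python sketch of the idempotency problem and of Alex's option 1 (check a fixed key written in the same transaction before retrying). It is a toy in-memory model, not FoundationDB code: `ToyStore`, `CommitUnknownResult`, `write_edit` and the per-request `request_id` marker are all hypothetical names made up for illustration. Using a per-request token as the fixed key is an assumption of the sketch and not part of the design discussed above; as the edge case in the quoted reply shows, a key derived from RevHash alone cannot tell a retry apart from an identical edit made by another transaction.

```python
import itertools


class CommitUnknownResult(Exception):
    """Toy stand-in for FoundationDB's commit_unknown_result error."""


class ToyStore:
    """In-memory stand-in for a KV store with versionstamped keys."""

    def __init__(self):
        self.data = {}
        self._versions = itertools.count(1)
        # When True, the next commit is applied but the client sees an error.
        self.lose_next_ack = False

    def commit(self, fixed_writes, stamped_writes):
        """Atomically apply fixed-key and versionstamped-key writes."""
        version = next(self._versions)
        for key, value in fixed_writes.items():
            # Fixed keys store their value together with the commit version.
            self.data[key] = (value, version)
        for prefix, value in stamped_writes:
            # Versionstamped keys get the commit version appended to the key.
            self.data[prefix + (version,)] = value
        if self.lose_next_ack:
            self.lose_next_ack = False
            raise CommitUnknownResult()
        return version


def changes_entries(store):
    """All entries in the toy "changes" subspace, ordered by versionstamp."""
    return sorted((k, v) for k, v in store.data.items() if k[0] == "changes")


def write_edit(store, request_id, doc_id, rev_hash, check_marker):
    """Retry loop for one edit; request_id is a hypothetical unique token."""
    marker_key = ("marker", request_id)
    while True:
        if check_marker and marker_key in store.data:
            # An earlier attempt did commit; return its version, do not rewrite.
            return store.data[marker_key][1]
        try:
            return store.commit(
                fixed_writes={marker_key: rev_hash},
                stamped_writes=[(("changes", 1), (doc_id, rev_hash))],
            )
        except CommitUnknownResult:
            continue  # mimic a transaction loop that retries by default


# Naive retry: the lost acknowledgement leads to a duplicate "changes" entry.
naive = ToyStore()
naive.lose_next_ack = True
write_edit(naive, "req-1", "doc-a", "abc123", check_marker=False)

# Guarded retry: the marker shows that the first attempt already committed.
guarded = ToyStore()
guarded.lose_next_ack = True
write_edit(guarded, "req-1", "doc-a", "abc123", check_marker=True)

print(len(changes_entries(naive)))    # 2
print(len(changes_entries(guarded)))  # 1
```

In a real transaction the marker read would happen inside the retried transaction itself, so that the check and the write stay atomic; the toy skips that detail.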