> On Sep 26, 2019, at 1:38 PM, Joan Touzet <woh...@apache.org> wrote:
>
> On 2019-09-26 13:14, Adam Kocoloski wrote:
>> Hi Joan, no need for apologies! Snipping out a few bits:
>>
>>> One alternative is to always keep just one around, and constantly update
>>> it every 5s, whether it's used or not (idle server).
>>
>> Agreed, I see no reason to keep multiple old read versions around on a given
>> CouchDB node. Updating it every second or two could be a nice thing to do (I
>> wouldn’t wait 5 seconds because handing out a read version 4.95 seconds old
>> isn’t very useful to anyone ;).
>>
>>> This second option seems better, but as mentioned later we don't want it
>>> to be a transparent FDB token (or convertible into one). This parallels
>>> the nonce approach we use in _changes feeds to ensure a stable feed, yeah?
>>
>> In our current design we _do_ expose FDB versions pretty directly as
>> database update sequences (there’s a small prefix to allow for _changes to
>> stay monotonically increasing when relocating a database to a new FDB
>> cluster). I believe it’s worth us thinking about expanding the use of
>> sequences to other places in the API as those are a concept that’s already
>> pretty familiar to our users
>
> Did users ever craft their own 2.x db update sequence tokens to abuse
> the system? Probably not, because our clustering code was hard to
> understand. Did users ever craft their own 1.x db update sequence
> values? Yes, and it caused lots of problems.
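(To make the risk concrete before I answer that: the FDB client will happily pin a transaction to whatever read version a caller hands it. A rough sketch of both the legitimate reuse case and the "too-clever" case, using the Python fdb bindings rather than erlfdb; the key and the version arithmetic are invented, purely illustrative:)

    # Python fdb bindings, illustrative only. A transaction normally fetches its
    # own read version; the client API also lets you pin one explicitly, which is
    # what a crafted "sequence" would amount to if we expose FDB versions directly.
    import fdb

    fdb.api_version(620)
    db = fdb.open()

    # Acquire a legitimate, recent read version (this is the round trip we'd
    # like to avoid paying on every request).
    tr = db.create_transaction()
    recent_version = tr.get_read_version().wait()

    # Reuse it in a later transaction: reads come from a consistent, slightly
    # stale snapshot, as long as the version is recent enough.
    tr2 = db.create_transaction()
    tr2.set_read_version(recent_version)
    value = tr2[b'some/key'].wait()

    # Nothing stops a caller from supplying a number that never corresponded to
    # a committed state; FDB only promises causal consistency if the version
    # you pass in actually has that property.
    tr3 = db.create_transaction()
    tr3.set_read_version(recent_version + 12345)  # crafted, not a real snapshot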
I don't remember the problems that this caused in 1.x, but I can certainly
imagine a too-clever user generating a sequence that doesn’t correspond to any
consistent FDB version and supplying it. FoundationDB allows for this sort of
thing with the ominous caveat: “The database cannot guarantee causal
consistency if this method is used (the transaction’s reads will be causally
consistent only if the provided read version has that property).” So … yeah.

> Does this prevent implementing the CouchDB API on any other backend? In
> which case, I'd be -1.... In other words, at the very least we need to
> reinforce that the token is opaque and that manipulating it can produce
> both undefined errors as well as potentially lead to (perceived?) data loss.

I mean, we’re already down the path where we are using various specific
features of FoundationDB (versionstamps, atomic operations, and of course
transactions) that would not necessarily be in an arbitrary key-value store.
I suppose adding this enhancement would add to the list of requirements on an
underlying storage engine, but if a storage engine couldn’t support
transactions with snapshot isolation I’m not sure it’d be a good choice for us
anyway. Even something as basic as atomic maintenance of the _all_docs and
_changes indexes becomes a heroic effort without that.

> If we eschew API changes for 4.0 then we need to decide on the default.

>>> And if we're voting, I'd say making RYWs the default (never hanging onto a
>>> handle) and then (ab-)using stale=ok or whatever state we have lying
>>> around might be sufficient.
>>
>> I definitely agree. We should not be using old read versions without the
>> client’s knowledge unless it's for some internal process where we know all
>> the tradeoffs.
>>
>>> This is the really important data point here for me. While Cloudant
>>> cares about 2-3 extra ms on the server side, many many MANY CouchDB
>>> users don't. Can we benchmark what this looks like when running
>>> FDB+CouchDB on a teeny platform like a RasPi? Is it still 2-3ms? What
>>> about the average laptop/desktop? Or is it only 2-3ms on a beefy
>>> Cloudant-sized server?
>>
>> I don’t have hard performance numbers, but I expect that acquiring a read
>> version in a small-scale deployment is faster than the same operation
>> against a big FoundationDB deployment spanning zones in a cloud region.
>> When you scale down e.g. to a single FDB process that process ends up
>> playing all the roles that need to collaborate to decide on a read version
>> and so the network latency gets taken out of the picture.
>
> Then I'm concerned this is premature optimization.

A fair concern. What I really like about this is that the way to more
efficient operations is exposing richer transactional semantics to users. How
often do you get a deal like that!

Cheers, Adam
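P.S. If anyone wants to gather the RasPi / laptop / big-server numbers Joan is
asking about, the read version round trip can be probed on its own, with no
CouchDB in the picture at all. A rough sketch with the Python fdb bindings
(sample count and percentiles picked arbitrarily):

    # Quick-and-dirty probe of read version acquisition latency.
    # Nothing CouchDB-specific here; it talks straight to FDB.
    import time
    import fdb

    fdb.api_version(620)
    db = fdb.open()

    samples = []
    for _ in range(1000):
        tr = db.create_transaction()
        start = time.perf_counter()
        tr.get_read_version().wait()  # the round trip we're debating
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    print("read version latency (ms): p50=%.2f p99=%.2f" % (samples[500], samples[990]))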