Hi Mike,

Some really nice ideas there. I agree that having both a revId and an FDB
read version is not ideal. When we introduce this, I would like a plan that
leads to merging the two into one consistent interface. Obviously this would
be really tricky, but worth it in the long run.

Cheers
Garren

On Mon, Sep 23, 2019 at 5:24 PM Adam Kocoloski <kocol...@apache.org> wrote:

> Oh, now _that’s_ a fun can of worms! Document revision identifiers have
> some clear weaknesses and I think there have been several notable
> advancements in the literature in the last decade that would be great to
> incorporate. Evolving those revisions is challenging because of their
> critical role in replication and our use of replication as a fallback
> mechanism for all manner of upgrades and migrations. Nevertheless, I agree
> that it feels like there is space for read versions / database sequences to
> play a larger role than they do today (i.e., zero) in the optimistic
> concurrency control aspect of working with a single CouchDB database.
>
> Garren, I definitely agree that any new functionality here needs to be
> opt-in.
>
> Adam
>
> > On Sep 23, 2019, at 10:34 AM, Mike Rhodes <couc...@dx13.co.uk> wrote:
> >
> > Adam,  Garren, All,
> >
> > As Garren says, I think exposing the raw read version to users as
> > explicitly a read version (i.e., pretty much exposing the raw FDB
> > functionality direct to users) will be confusing and make it easy for
> > people to make mistakes given there are some hidden semantics to it, not
> > least of which it currently can't be more than 5s old :)
> >
> > I do, however, think there is a large amount of value in enabling
> > clients to avoid the overhead of making an FDB read version request for
> > every CouchDB request.
> >
> > For me, the way to expose read versions is to use them as a building
> > block for higher level concepts like bounded staleness and
> > read-your-writes type guarantees that make sense at an application level.
> > Adding these abstractions as concepts in the HTTP API (even if they end up
> > essentially being an opaque token which is really the FDB read version)
> > makes sense to me, because there are a lot of people who like CouchDB's
> > HTTP API in itself and eschew client libraries.
> >
> > A second big thing to me is how this functionality interacts with the
> > current MVCC logic. I think there's a large potential for overlap. What
> > I'm pondering is how and whether the names we use in the API can make the
> > "MVCC document rev ID" and something like a "database rev ID" (or even
> > "CouchDB instance rev ID"!) concept feel like one consistent interface
> > rather than two patterns bolted together.
> >
> > An overlap example:
> >
> > - User issues GET /db/document?stale=true -- stale=true allows the node
> >   that serves the request to use a cached read version. The response
> >   returns a header with an opaque token A (basically the read version).
> >
> > - User issues PUT /db/document -- returns a token=A+1, but another
> >   question is how we avoid conflicts for the write in the MVCC sense:
> >   - We could use the existing rev value MVCC mechanics, and so the client
> >     sends the rev ID and the server reads the current rev ID, checks it
> >     and allows or denies the write.
> >   - Instead, the write could include token A in the request, and on the
> >     server side we use the "token" as the read version in the transaction,
> >     and add the document's keys to the FDB transaction's read set,
> >     ensuring the transaction fails if the document has changed without
> >     needing to read the document's current rev ID.
> >
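To make the second option above concrete, here is a minimal sketch using a toy in-memory store, not the real FDB bindings. The names `ToyStore`, `ConflictError`, `get` and `put` are made up for illustration, and the "new token is exactly A+1" behaviour is an artifact of the toy counter; a real commit version would simply be some later version.

```python
class ConflictError(Exception):
    """Raised when a key changed after the caller's read version."""


class ToyStore:
    """In-memory stand-in for a versioned store (hypothetical, not the FDB API)."""

    def __init__(self):
        self.version = 0      # last committed version
        self.docs = {}        # key -> current value
        self.last_write = {}  # key -> version of the last commit that touched it

    def get(self, key):
        # Return the value plus the current version as an opaque token.
        return self.docs.get(key), self.version

    def put(self, key, value, token):
        # Treat the key as part of the read conflict set: reject the write
        # if the key was committed at a version newer than the token.
        if self.last_write.get(key, 0) > token:
            raise ConflictError(key)
        self.version += 1
        self.docs[key] = value
        self.last_write[key] = self.version
        return self.version   # new token, usable for read-your-writes


store = ToyStore()
_, a = store.get("doc")                  # token A
b = store.put("doc", {"n": 1}, token=a)  # succeeds and returns a newer token
try:
    store.put("doc", {"n": 2}, token=a)  # stale token: doc changed since A
    outcome = "written"
except ConflictError:
    outcome = "conflict"
```

The point of the sketch is that the server never reads the document's current rev ID: the comparison against the token stands in for the conflict check that the store itself would perform at commit time.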
> > It would be nice if we didn't allow two ways to do this, but it'd also
> > be nice if the client didn't have to cope with several rev-ID like
> > things.
> >
> > ---
> >
> > Obviously we can enable read-your-writes with this:
> >
> > - User issues a POST /db/_find?token=A+1. We can use A+1 as the read
> >   version to ensure we see the previous write.
> >   - I guess if we send a read version that's too old, FDB will have some
> >     way to tell us that?
> >
> > ---
> >
> > A separate question. Can we / are we looking at embedding the read
> > version into the document rev ID? I wondered if that could be used to
> > avoid a read request to FDB to read the current rev ID in some cases,
> > because we could leverage FDB's semantics as above.
> >
> > --
> > Mike.
> >
> > On Mon, 23 Sep 2019, at 13:31, Garren Smith wrote:
> >> Hi Adam,
> >>
> >> In general, I like this idea especially with the future possibility of
> >> adding transactions to CouchDB. What makes me a little nervous is that
> >> this requires a fair amount of knowledge of CouchDB and FDB for a user to
> >> fully understand what is happening and could be a potential place where a
> >> user could get it horribly wrong or cause unnecessary issues. I would
> >> prefer that a user has to explicitly opt into this functionality, either
> >> by changing config or via adding another field in the HTTP header or a
> >> query parameter.
> >>
> >> Cheers
> >> Garren
> >>
> >> On Fri, Sep 20, 2019 at 12:11 AM Adam Kocoloski <kocol...@apache.org> wrote:
> >>
> >>> Hi all,
> >>>
> >>> As we’ve gotten more familiar with FoundationDB we’ve come to realize
> >>> that acquiring a read version at the beginning of a transaction is a
> >>> relatively expensive[*] operation. It’s also a challenging one to scale
> >>> given the amount of communication required between proxies and tlogs in
> >>> order to agree on a good version. The prototype CouchDB layer we’ve been
> >>> working on (i.e., the beginnings of CouchDB 4.0) uses a separate FDB
> >>> transaction with a new read version for every request made to CouchDB. I
> >>> wanted to start a discussion about ways we might augment that approach
> >>> while preserving (or even enhancing) the semantics that we can expose to
> >>> CouchDB users.
> >>>
> >>> One thing we can do is cache known versions that FDB has supplied in
> >>> the past second in the CouchDB layer and reuse those when a client
> >>> permits us to do so. If you like, this is the modern version of
> >>> `?stale=ok`, but now applicable to all types of requests. One big
> >>> downside of this approach is that if you scale out the members of the
> >>> CouchDB layer they’ll have different views of recent FDB versions, and a
> >>> client whose requests are load-balanced across members won’t have any
> >>> guarantee that time moves forward from request to request. You could
> >>> imagine gossiping versions between layer members, but now you’re
> >>> basically redoing the work that FoundationDB is doing itself.
> >>>
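As a rough illustration of the caching idea in the paragraph above, the sketch below keeps the most recent read version for up to one second and reuses it only when the caller allows staleness. `ReadVersionCache`, `fetch_version` and `allow_stale` are hypothetical names, and the fetch callable is a stand-in for whatever actually asks FDB for a read version.

```python
import time


class ReadVersionCache:
    """Reuse a recently fetched read version for up to max_age seconds."""

    def __init__(self, fetch_version, max_age=1.0, clock=time.monotonic):
        self._fetch = fetch_version  # hypothetical callable that asks the store
        self._max_age = max_age
        self._clock = clock          # injectable so the demo can fake time
        self._version = None
        self._fetched_at = None

    def get(self, allow_stale=False):
        now = self._clock()
        fresh_enough = (
            self._version is not None
            and now - self._fetched_at <= self._max_age
        )
        if allow_stale and fresh_enough:
            return self._version     # no round trip to the store
        # Either the caller wants a fresh version or the cache has expired.
        self._version = self._fetch()
        self._fetched_at = now
        return self._version


# Demo with a fake clock and a counter standing in for the store.
fake_now = [0.0]
calls = []


def fetch():
    calls.append(1)
    return 100 + len(calls)


cache = ReadVersionCache(fetch, max_age=1.0, clock=lambda: fake_now[0])
v1 = cache.get(allow_stale=True)  # cache is empty, so this fetches
fake_now[0] = 0.5
v2 = cache.get(allow_stale=True)  # within a second: reuses v1
fake_now[0] = 2.0
v3 = cache.get(allow_stale=True)  # too old: fetches again
```

Each layer member would hold its own cache like this, which is exactly where the downside described above comes from: two members can hand the same client different versions, with no guarantee that the second is newer than the first.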
> >>> Another approach is to communicate the FDB version as part of the
> >>> response to each request, and allow the client to set an FDB version as
> >>> part of a submitted request. Clients that do this will experience lower
> >>> latencies for requests 2..N that share a version, will have the benefit
> >>> of a consistent snapshot of the database for all the reads that are
> >>> executed using the same version, and can guarantee they read their own
> >>> writes when interleaving those operations (assuming any reads following
> >>> a write use the new FDB version associated with the write).
> >>>
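The client-supplied version approach can be sketched with a toy multi-version store, shown below. This is an assumption-laden model rather than FDB itself: `ToySnapshotStore`, `VersionTooOld` and the explicit `now` arguments are invented for the example, and the 5 second window is modelled simply as the age of the commit that produced the version.

```python
class VersionTooOld(Exception):
    """Requested version is outside the retention window."""


class ToySnapshotStore:
    """Keeps every committed value so reads can run as of an older version."""

    def __init__(self, window=5.0):
        self.window = window         # seconds a version stays readable
        self.version = 0
        self.history = {}            # key -> list of (version, value), ascending
        self.commit_time = {0: 0.0}  # version -> time it was committed

    def write(self, key, value, now):
        self.version += 1
        self.history.setdefault(key, []).append((self.version, value))
        self.commit_time[self.version] = now
        return self.version          # client sends this back to read its write

    def read(self, key, version, now):
        if now - self.commit_time[version] > self.window:
            raise VersionTooOld(version)
        # Newest value committed at or before the requested version.
        visible = [val for ver, val in self.history.get(key, []) if ver <= version]
        return visible[-1] if visible else None


s = ToySnapshotStore()
v1 = s.write("a", 1, now=0.0)
v2 = s.write("b", 2, now=1.0)
v3 = s.write("a", 3, now=2.0)

# Two reads sharing version v2 see one consistent snapshot, ignoring v3.
snapshot = (s.read("a", v2, now=2.5), s.read("b", v2, now=2.5))
# Reading at the version returned by the write guarantees read-your-writes.
own_write = s.read("a", v3, now=2.5)

try:
    s.read("a", v1, now=6.0)         # v1 is now more than 5 seconds old
    too_old = False
except VersionTooOld:
    too_old = True
```

In this model a stale version fails loudly instead of returning old data, which is the behaviour a client would need in order to fall back to requesting a fresh version.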
> >>
> >>> These techniques are not mutually exclusive; a client could acquire a
> >>> slightly stale FDB version and then use that for a collection of read
> >>> requests that would all observe the same consistent snapshot of the
> >>> database. Also, recall that a CouchDB sequence is now essentially the
> >>> same as an FDB version, with a little extra metadata to ensure sequences
> >>> are always monotonically increasing even when moving a database to a
> >>> different FDB cluster. So if you like, this is about allowing requests
> >>> to be executed as of a certain sequence (provided that sequence is no
> >>> more than 5 seconds old).
> >>>
> >>> I’m refraining from proposing any specific API extensions at this
> >>> point, partly because that’s an easy bikeshed and partly because I think
> >>> whatever API we’d add would be a primitive that client libraries would
> >>> use to construct richer semantics around. I’m also biting my tongue and
> >>> avoiding any detailed discussion of the transactional capabilities that
> >>> CouchDB could offer by surfacing these versions to clients -- but that’s
> >>> definitely an interesting topic in its own right!
> >>>
> >>> Curious to hear what you all think. Thanks, Adam
> >>>
> >>> [*]: I don’t want to come off as alarmist; when I say this operation is
> >>> “expensive” I mean it might take a couple of milliseconds depending on
> >>> FDB configuration, and FDB can execute 10s of thousands of these per
> >>> second without much tuning. But it’s always good to be looking for the
> >>> next bottleneck :)
> >>>
> >>>
> >>
>
>
