Re: [DISCUSS] couchdb 4.0 transactional semantics
Apologies for resurrecting this thread after so long. I've looked over the thread again today and it seems there is general consensus on the desired semantics. I will start a vote thread.

B.

> On 24 Jul 2020, at 18:27, Nick Vatamaniuc wrote:
>
> Great discussion everyone!
>
> For normal replications, I think it might be nice to make an exception and allow server-side pagination for compatibility at first, with a new option to explicitly enable the strict snapshot behavior. Then, in a later release, make it the default to match _all_docs and _view reads. In other words, for a short while we'd support bi-directional replications between 4.x and 1/2/3.x on any replicator and document that fact; after a while we'd switch that capability off, and users would have to run replications on a 4.x replicator only, or on specially updated 3.x replicators.
>
>> I'd rather support this scenario than have to support explaining why the "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, is returning results "ahead" of the time at which the one-shot replication was started.
>
> Ah, that won't happen in the current fdb prototype branch implementation. What might happen is that the changes feed would contain changes that happened _after_ the request started. That is no different than if the node where the replication runs restarts, or there is a network glitch: the changes feed would proceed from the last checkpoint, see changes that happened after the initial starting sequence, and apply them in order (document "a" was deleted, then updated again, then deleted again; every change is applied incrementally to the target, etc.).
>
> We'd have to document the fact that a single-snapshot replication from 4.x -> 1/2/3.x is impossible anyway (unless we do the trick where we compare the update sequence and confirm the db was not updated in the meantime, or the new FDB storage engine allows it). The question then becomes whether we allow the pagination to happen on the client or the server. In the case of normal replication I think it would be nice to allow it to happen on the server for a bit, to allow for maximum initial replication interoperability.
>
>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>
> Good point, Adam. We could optimize that part, yeah. Fetch a GRV after 4.9 seconds or so and keep it ready to go, for example. So far we have tried to react to the transaction_too_old exception, rather than starting a timer, in order to use the maximum time a tx is alive and save a few seconds or milliseconds. That required some tricks, such as handling the exception bubbling up from either the range read itself or from the user's callback (say, if user code in the callback fetched a doc body which blew up with a transaction_too_old exception). As an interesting aside, in quick experiments I noticed we were able to stream about 100-150k rows from a single tx snapshot, which I thought wasn't too bad.
>
> Speaking of replication, I am trying to see what the replicator might look like in 4.x in https://github.com/apache/couchdb/pull/3015 (the prototype/fdb-replicator branch). It's very much a WIP and a hot mess currently. I will issue an RFC once I have a better handle on the general shape of it. So far it's based on couch_jobs with a global queue, and it looks like it might be smaller overall, as it leverages the scheduling capabilities already present in couch_jobs; but once started, an individual replication job's process hierarchy is largely the same as before.
>
> Cheers,
> -Nick
>
> On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát wrote:
>>
>> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt wrote:
>>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>>
>>> Do you have an example?
>>>
>>> Best
>>> Jan
>>> —
>>
>> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java methods (like [2] out of spec [3]) per path per verb. Java's type safety and the way methods are currently generated don't really provide an easy way to retrieve multiple kinds of responses, so having them separate would help a lot there.
>>
>> Donat
>>
>> PS. I'm getting self-conscious about discussing this in this thread. Should I open a new one?
>>
>> [1] https://openapi-generator.tech/
>> [2] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
>> [3] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208
Re: [DISCUSS] couchdb 4.0 transactional semantics
Great discussion everyone!

For normal replications, I think it might be nice to make an exception and allow server-side pagination for compatibility at first, with a new option to explicitly enable the strict snapshot behavior. Then, in a later release, make it the default to match _all_docs and _view reads. In other words, for a short while we'd support bi-directional replications between 4.x and 1/2/3.x on any replicator and document that fact; after a while we'd switch that capability off, and users would have to run replications on a 4.x replicator only, or on specially updated 3.x replicators.

> I'd rather support this scenario than have to support explaining why the "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, is returning results "ahead" of the time at which the one-shot replication was started.

Ah, that won't happen in the current fdb prototype branch implementation. What might happen is that the changes feed would contain changes that happened _after_ the request started. That is no different than if the node where the replication runs restarts, or there is a network glitch: the changes feed would proceed from the last checkpoint, see changes that happened after the initial starting sequence, and apply them in order (document "a" was deleted, then updated again, then deleted again; every change is applied incrementally to the target, etc.).

We'd have to document the fact that a single-snapshot replication from 4.x -> 1/2/3.x is impossible anyway (unless we do the trick where we compare the update sequence and confirm the db was not updated in the meantime, or the new FDB storage engine allows it). The question then becomes whether we allow the pagination to happen on the client or the server. In the case of normal replication I think it would be nice to allow it to happen on the server for a bit, to allow for maximum initial replication interoperability.
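[Editor's note: the resume-from-checkpoint behavior described above can be sketched as follows. This is an illustrative model only, not CouchDB source; the function and names are hypothetical.]

```python
# Illustrative sketch (not CouchDB source): a replicator resuming the
# _changes feed from its last checkpoint. Changes that landed after the
# original request started simply show up later in the feed and are
# applied to the target in sequence order.

def resume_changes(feed, checkpoint_seq):
    """Yield every change with a sequence greater than the checkpoint."""
    for seq, doc_id, event in feed:
        if seq > checkpoint_seq:
            yield seq, doc_id, event

# Document "a" is deleted, updated again, then deleted again; each change
# after the checkpoint is applied incrementally, in order.
feed = [(1, "a", "created"), (2, "a", "deleted"),
        (3, "a", "updated"), (4, "a", "deleted")]
resumed = list(resume_changes(feed, checkpoint_seq=2))
```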
> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.

Good point, Adam. We could optimize that part, yeah. Fetch a GRV after 4.9 seconds or so and keep it ready to go, for example. So far we have tried to react to the transaction_too_old exception, rather than starting a timer, in order to use the maximum time a tx is alive and save a few seconds or milliseconds. That required some tricks, such as handling the exception bubbling up from either the range read itself or from the user's callback (say, if user code in the callback fetched a doc body which blew up with a transaction_too_old exception). As an interesting aside, in quick experiments I noticed we were able to stream about 100-150k rows from a single tx snapshot, which I thought wasn't too bad.

Speaking of replication, I am trying to see what the replicator might look like in 4.x in https://github.com/apache/couchdb/pull/3015 (the prototype/fdb-replicator branch). It's very much a WIP and a hot mess currently. I will issue an RFC once I have a better handle on the general shape of it. So far it's based on couch_jobs with a global queue, and it looks like it might be smaller overall, as it leverages the scheduling capabilities already present in couch_jobs; but once started, an individual replication job's process hierarchy is largely the same as before.

Cheers,
-Nick

On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát wrote:
>
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt wrote:
>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>
>> Do you have an example?
>>
>> Best
>> Jan
>> —
>
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java methods (like [2] out of spec [3]) per path per verb. Java's type safety and the way methods are currently generated don't really provide an easy way to retrieve multiple kinds of responses, so having them separate would help a lot there.
>
> Donat
>
> PS. I'm getting self-conscious about discussing this in this thread. Should I open a new one?
>
> [1] https://openapi-generator.tech/
> [2] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
> [3] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208
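[Editor's note: the "fetch a GRV after 4.9 seconds" idea Nick mentions could look roughly like the sketch below. It is a hedged illustration in Python rather than the Erlang layer code; `Transaction` is a stand-in for an fdb transaction handle, not the real binding, and the 4.9 s threshold simply mirrors the number in the message.]

```python
import time

# FDB transactions expire at roughly 5 seconds; swap to a fresh one just
# before that instead of waiting for a transaction_too_old exception.
TX_PREFETCH_AFTER = 4.9

class Transaction:
    """Stand-in for an FDB transaction; real code would fetch a read
    version (GRV) asynchronously so the swap is essentially free."""
    def __init__(self):
        self.started = time.monotonic()

    def age(self):
        return time.monotonic() - self.started

def stream_rows(rows):
    """Stream rows, proactively replacing the transaction near expiry."""
    tx = Transaction()
    for row in rows:
        if tx.age() >= TX_PREFETCH_AFTER:
            tx = Transaction()  # real code: swap to the pre-fetched GRV
        yield row               # real code: read the next batch via tx
```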
Re: Supporting API client generation tools (Was: [DISCUSS] couchdb 4.0 transactional semantics)
For the specific case of _changes I think proper handling of the Accept header would make a lot of sense, as it is the HTTP way of changing the content type. As such it is usually much better supported by API tooling than format switching on query parameters.

The eventsource stream is half-way there in that it returns a `content-type: text/event-stream`, but you have to specify `feed=eventsource` - it isn't enabled by passing the `Accept: text/event-stream` header (and you don't get a 406 Not Acceptable if you try; you just get a normal feed response in application/json).

The continuous feed is a little more complicated: it isn't valid as `application/json`, but using `text/plain` as a switch is problematic because I think one could reasonably expect to accept other feed types, like normal or longpoll feeds, with `Accept: text/plain` as an alternative to `application/json`. The output does conform to the http://jsonlines.org/ and https://github.com/ndjson/ndjson-spec formats, and they have proposed various mime types (`application/x-jsonlines`, `application/x-ndjson`, etc.), but given that they are not standardized I don't know how much advantage there is over using, say, something like `application/x-couch-continuous-json` or similar. The mime type `application/json-seq` is backed by https://tools.ietf.org/html/rfc7464 but would involve adding a record separator character to each line, which may come with a host of problems.

Anyway, I think with some consideration of a suitable mime type for continuous it would potentially be possible to use Accept/Content-Type to correctly switch between the different feed formats and improve the API without adding new endpoints or necessarily breaking anything. The feed parameter could be left operating as it currently does, but deprecated.

From a more general perspective, I think that most of the issues I've come across when working with Couch and OpenAPI are related to places where Couch switches type in a schema.
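[Editor's note: a minimal sketch of what the Accept-driven switching proposed above could look like, assuming the strawman `application/x-couch-continuous-json` type. The mapping and `negotiate_feed` function are illustrations of the proposal, not current CouchDB behavior.]

```python
# Hypothetical mapping from Accept media types to _changes feed modes.
FEED_BY_ACCEPT = {
    "application/json": "normal",
    "text/event-stream": "eventsource",
    # strawman, non-standard type for the continuous feed:
    "application/x-couch-continuous-json": "continuous",
}

def negotiate_feed(accept_header, feed_param=None):
    """Pick a feed type from the Accept header. The legacy feed= query
    parameter still wins (but would be deprecated). Returning None
    signals a 406 Not Acceptable to the caller."""
    if feed_param:
        return feed_param
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in FEED_BY_ACCEPT:
            return FEED_BY_ACCEPT[media_type]
    return None
```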
An example related to _changes would be the `heartbeat` parameter, which can be either a number or a boolean. While it is technically possible to declare multiple types in OpenAPI, from a generation perspective this inevitably leads to complications in strongly typed languages. I think it would often be possible to resolve that type of problem in a non-breaking way by continuing to allow both types in the existing parameter, but formally declaring parameters with single types (e.g. heartbeat=number, heartbeatOn=boolean), thus effectively deprecating the use of multiple types within a parameter.

Another example, this time unrelated to _changes, would be field sorts in index definitions, which can be either a string "field" for default ascending or an object like {"field": "asc"}. When listing indexes, IIRC, they are returned in the form they were supplied in, but if we could agree, for example, to continue to accept both forms when defining indexes but always return the "expanded" object form, it would facilitate deserialization in generated tooling. I think that would be considered a breaking change because under some circumstances the response schema changes. I guess the risk is related to how many people rely on being able to parse field sorts from index listings in the same format that they passed them in as - especially since, as soon as there is a descending field, at least some would be in object form. I expect not many people rely on that, but surprise me! Of course there are ways to resolve this without being breaking too, e.g. adding a header to toggle the behaviour or something, but I guess there is a complexity trade-off to be made against the impact.

Rich

From: Jan Lehnardt
To: dev@couchdb.apache.org
Date: 22/07/2020 17:23
Subject: [EXTERNAL] Supporting API client generation tools (Was: [DISCUSS] couchdb 4.0 transactional semantics)

> On 22. Jul 2020, at 14:48, Bessenyei Balázs Donát wrote:
>
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt wrote:
>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>
>> Do you have an example?
>>
>> Best
>> Jan
>> —
>
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java methods (like [2] out of spec [3]) per path per verb. Java's type safety and the way methods are currently generated don't really provide an easy way to retrieve multiple kinds of responses, so having them separate would help a lot there.

My argument would be that API generation tools that try to abstract over HTTP but aren’t able to really abstract over HTTP aren’t our place to fix ;P

But I wouldn’t be averse to adding endpoints that make this easier for these tools, although I’m sceptical they can deal with our continuous modes anyway. Adding endpoints is not a BC break, and I would not support removing the original versions. We should identify all places that would be problematic before deciding either way.
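[Editor's note: Rich's index-sort suggestion - accept both forms on input, always return the expanded object form - could be sketched like this. `expand_sort` is a hypothetical helper for illustration, not a CouchDB function.]

```python
def expand_sort(fields):
    """Normalize index sort fields: a bare "field" string means ascending,
    so return every entry in the expanded {"field": direction} form. This
    gives generated clients a single schema to deserialize."""
    expanded = []
    for field in fields:
        if isinstance(field, str):
            expanded.append({field: "asc"})
        else:
            expanded.append(field)  # already {"field": "asc"} or {"field": "desc"}
    return expanded

# Mixed input, uniform output.
normalized = expand_sort(["name", {"age": "desc"}])
```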
Supporting API client generation tools (Was: [DISCUSS] couchdb 4.0 transactional semantics)
> On 22. Jul 2020, at 14:48, Bessenyei Balázs Donát wrote:
>
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt wrote:
>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>
>> Do you have an example?
>>
>> Best
>> Jan
>> —
>
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java methods (like [2] out of spec [3]) per path per verb. Java's type safety and the way methods are currently generated don't really provide an easy way to retrieve multiple kinds of responses, so having them separate would help a lot there.

My argument would be that API generation tools that try to abstract over HTTP but aren’t able to really abstract over HTTP aren’t our place to fix ;P

But I wouldn’t be averse to adding endpoints that make this easier for these tools, although I’m sceptical they can deal with our continuous modes anyway. Adding endpoints is not a BC break, and I would not support removing the original versions. We should identify all places that would be problematic before deciding either way. I know a few Cloudant folks have looked at this previously.

I also don’t feel too strongly about this, but I’m happy to have a discussion on it.

> Donat
>
> PS. I'm getting self-conscious about discussing this in this thread. Should I open a new one?

Done.

Best
Jan
—

> [1] https://openapi-generator.tech/
> [2] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
> [3] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208
Re: [DISCUSS] couchdb 4.0 transactional semantics
On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt wrote:
> I’m not sure why a URL parameter vs. a path makes a big difference?
>
> Do you have an example?
>
> Best
> Jan
> —

Oh, sure! OpenAPI Generator [1] et al., for example, generate Java methods (like [2] out of spec [3]) per path per verb. Java's type safety and the way methods are currently generated don't really provide an easy way to retrieve multiple kinds of responses, so having them separate would help a lot there.

Donat

PS. I'm getting self-conscious about discussing this in this thread. Should I open a new one?

[1] https://openapi-generator.tech/
[2] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
[3] https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208
Re: [DISCUSS] couchdb 4.0 transactional semantics
> On 21. Jul 2020, at 18:29, Bessenyei Balázs Donát wrote:
>
> On Tue, 21 Jul 2020 at 17:42, Jan Lehnardt wrote:
>> We rather don’t like to break things just because we can :)
>>
>> Do you have anything specific in mind?
>>
>> Best
>> Jan
>> —
>
> I'm not suggesting that breaking changes should be introduced just for the fun of it :)
>
> Anyway, an example could be the changes feed [1]: it returns JSON, line-by-line JSON or EventSource responses (for `normal`, `continuous` and `eventsource` modes, respectively). This makes integration and tooling around it difficult. One potential fix for that could be separating the feed into different URLs (such as `_changes`, `_changes/_continuous` and `_changes/_eventsource`).
>
> Let me know what you think.

I’m not sure why a URL parameter vs. a path makes a big difference?

Do you have an example?

Best
Jan
—

> Donat
>
> [1] https://docs.couchdb.org/en/stable/api/database/changes.html
Re: [DISCUSS] couchdb 4.0 transactional semantics
On Tue, 21 Jul 2020 at 17:42, Jan Lehnardt wrote:
> We rather don’t like to break things just because we can :)
>
> Do you have anything specific in mind?
>
> Best
> Jan
> —

I'm not suggesting that breaking changes should be introduced just for the fun of it :)

Anyway, an example could be the changes feed [1]: it returns JSON, line-by-line JSON or EventSource responses (for `normal`, `continuous` and `eventsource` modes, respectively). This makes integration and tooling around it difficult. One potential fix for that could be separating the feed into different URLs (such as `_changes`, `_changes/_continuous` and `_changes/_eventsource`).

Let me know what you think.

Donat

[1] https://docs.couchdb.org/en/stable/api/database/changes.html
Re: [DISCUSS] couchdb 4.0 transactional semantics
> On 21. Jul 2020, at 17:24, Bessenyei Balázs Donát wrote:
>
> I think being able to leverage FoundationDB's serializability is an awesome idea! +1 (non-binding) on all 4 points.
>
> I also support the idea of changing the API in backwards-incompatible ways if that makes things more convenient / streamlined. I wonder, does this mean other backwards-incompatible changes are also welcome in the next major? (Given that replicator compatibility (from later on in this thread) is expected.)

We rather don’t like to break things just because we can :)

Do you have anything specific in mind?

Best
Jan
—

> Thank you,
>
> Donat
>
> On Thu, 16 Jul 2020 at 18:26, Paul Davis wrote:
>>
>> From what I'm reading it sounds like we have general consensus on a few things:
>>
>> 1. A single CouchDB API call should map to a single FDB transaction.
>> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption).
>> 3. We're willing to change the API requirements so that 2 is not an issue.
>> 4. None of this applies to continuous changes, since that API call was never a single snapshot.
>>
>> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread, and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to the semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs`, I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>>
>> The only thing I'm not 100% on is how to deal with non-continuous replications, i.e. the older single-shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>>
>> Paul
>>
>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>>
>>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>>
>>> Adam
>>>
>>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>>
>>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>>
>>>> As this is a pretty big change to the isolation guarantees provided by the database, Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>>
>>>> Cheers, Adam
>>>>
>>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>>
>>>>> I'm having trouble following the thread...
>>>>>
>>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>>>
>>>>> How does _changes work right now in the proposed 4.0 code?
>>>>>
>>>>> -Joan
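[Editor's note: the "restart_tx" behavior Adam describes can be modeled as below. This is a hedged sketch: `TransactionTooOld` and the `read_range` callable stand in for the real fdb machinery, and `read_range(after_key, end_key)` is assumed to return keys strictly greater than `after_key`, so a restart resumes past the last emitted row.]

```python
class TransactionTooOld(Exception):
    """Stand-in for FDB's transaction_too_old error (~5 s limit)."""

def stream_with_restart(read_range, end_key, restart_tx=True):
    """Stream (key, value) pairs. If the snapshot expires mid-stream and
    restart_tx is set, acquire a fresh snapshot and resume after the last
    key already emitted; otherwise surface the expiry to the caller."""
    last_key = None
    while True:
        try:
            for key, value in read_range(last_key, end_key):
                yield key, value
                last_key = key  # checkpoint so a restart can resume here
            return
        except TransactionTooOld:
            if not restart_tx:
                raise
            # Loop around: the next read_range call represents a new
            # snapshot, which is the slight change to isolation semantics
            # under discussion in this thread.
```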
Re: [DISCUSS] couchdb 4.0 transactional semantics
I think being able to leverage FoundationDB's serializability is an awesome idea! +1 (non-binding) on all 4 points.

I also support the idea of changing the API in backwards-incompatible ways if that makes things more convenient / streamlined. I wonder, does this mean other backwards-incompatible changes are also welcome in the next major? (Given that replicator compatibility (from later on in this thread) is expected.)

Thank you,

Donat

On Thu, 16 Jul 2020 at 18:26, Paul Davis wrote:
>
> From what I'm reading it sounds like we have general consensus on a few things:
>
> 1. A single CouchDB API call should map to a single FDB transaction.
> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption).
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes, since that API call was never a single snapshot.
>
> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread, and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to the semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs`, I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>
> The only thing I'm not 100% on is how to deal with non-continuous replications, i.e. the older single-shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>
> Paul
>
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>
>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>
>> Adam
>>
>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>
>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>
>>> As this is a pretty big change to the isolation guarantees provided by the database, Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>
>>> Cheers, Adam
>>>
>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>
>>>> I'm having trouble following the thread...
>>>>
>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>>
>>>> How does _changes work right now in the proposed 4.0 code?
>>>>
>>>> -Joan
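[Editor's note: for contrast with a blanket `GET /db/_all_docs`, a client of the pagination-only model Paul describes might loop like this. The `bookmark`-style cursor and `fetch_page` callable are assumptions for illustration; the real API shape is whatever the pagination RFC specifies.]

```python
def fetch_all(fetch_page, limit=100):
    """Drain a paginated endpoint page by page, each page small enough to
    fit comfortably inside one FDB transaction, instead of relying on a
    single long-running streaming response."""
    rows, bookmark = [], None
    while True:
        page = fetch_page(limit=limit, bookmark=bookmark)
        rows.extend(page["rows"])
        bookmark = page.get("bookmark")
        if not bookmark or not page["rows"]:
            return rows
```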
Re: [DISCUSS] couchdb 4.0 transactional semantics
No, I would not. I was thinking only of the previous major release, so a 3.x.y that adds bidirectional replication compatibility with 4.0.0.

B.

> On 16 Jul 2020, at 21:50, Joan Touzet wrote:
>
> On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:
>> Agreed on all 4 points. On the final point, it's worth noting that a continuous changes feed was two-phase: the first phase is indeed over a snapshot of the db as of the start of the _changes request, while the second phase is an endless series of subsequent snapshots. The 4.0 behaviour won't exactly match that, but it's definitely in the same spirit.
>>
>> Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.
>>
>> On compatibility, there's precedent for a minor release of old branches just to add replicator compatibility. For example, the replicator could call _changes again if it received a complete _changes response (i.e., one that ended with a } that completes the JSON object) that did not include a "last_seq" row. The 4.0 replicator would always do this.
>
> I wouldn't really want to release a new 1.x, would you? Augh.
>
> If we're going to change how replication works, wouldn't it be better to simply say "there is no guaranteed one-shot replication back from 4.x to 1.x"? Or, intentionally break backward compatibility so one-shot replication to un-upgraded old Couches refuses to work at all? This would prevent the confusion by making it clear - you can't do things this way anymore.
>
> We could do a point release of 3.x, sure.
>
> -Joan
>
>> B.
>>
>>> On 16 Jul 2020, at 17:25, Paul Davis wrote:
>>>
>>> From what I'm reading it sounds like we have general consensus on a few things:
>>>
>>> 1. A single CouchDB API call should map to a single FDB transaction.
>>> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption).
>>> 3. We're willing to change the API requirements so that 2 is not an issue.
>>> 4. None of this applies to continuous changes, since that API call was never a single snapshot.
>>>
>>> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread, and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to the semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs`, I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>>>
>>> The only thing I'm not 100% on is how to deal with non-continuous replications, i.e. the older single-shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>>>
>>> Paul
>>>
>>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>>>
>>>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>>>
>>>> Adam
>>>>
>>>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>>>
>>>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>>>
>>>>> As this is a pretty big change to the isolation guarantees provided by the database, Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>>>
>>>>> Cheers, Adam
>>>>>
>>>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>>>
>>>>>> I'm having trouble following the thread...
>>>>>>
>>>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it.
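[Editor's note: Bob's compatibility heuristic quoted above - a syntactically complete _changes response without a "last_seq" member means the server paginated, so the replicator should call _changes again - could be sketched as follows. `needs_another_fetch` is a hypothetical helper, not actual replicator code.]

```python
import json

def needs_another_fetch(body):
    """True if the response parsed as a complete JSON object but carried
    no "last_seq" row, i.e. the server ended the page early and the
    replicator should request _changes again from the last sequence it
    saw. A truncated body raises instead: that is a transport failure,
    not a pagination signal."""
    response = json.loads(body)  # raises ValueError if incomplete JSON
    return "last_seq" not in response

# A paginated (partial) page vs. a final page.
partial = '{"results": [{"seq": 7, "id": "a", "changes": []}]}'
complete = '{"results": [], "last_seq": 42}'
```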
Re: [DISCUSS] couchdb 4.0 transactional semantics
On 2020-07-16 4:50 p.m., Joan Touzet wrote:
> On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:
>> Agreed on all 4 points.
>>
>> On the final point, it's worth noting that a continuous changes feed was two-phase, the first is indeed over a snapshot of the db as of the start of the _changes request, the second phase is an endless series of subsequent snapshots. the 4.0 behaviour won't exactly match that but it's definitely in the same spirit.
>>
>> Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.
>>
>> On compatibility, there's precedent for a minor release of old branches just to add replicator compatibility. for example, the replicator could call _changes again if it received a complete _changes response (i.e, one that ended with a } that completes the json object) that did not include a "last_seq" row. The 4.0 replicator would always do this.
>
> I wouldn't really want to release a new 1.x, would you? Augh.
>
> If we're going to change how replication works, wouldn't it better to simply say "there is no guaranteed one-shot replication back from 4.x to 1.x?" Or, intentionally break backward compatibility so one-shot replication to un-upgraded old Couches refuses to work at all? This would prevent the confusion by making it clear - you can't do things this way anymore.

Sorry, meant to say we publish that the workaround is you need either a "push" replication from 4.x -> 1.x, or must use a hypothetically patched 3.x+ replicator as a "third party" to replicate successfully from 4.x -> non-patched older CouchDBs.

I'd rather support this scenario than have to support explaining why the "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, is returning results "ahead" of the time at which the one-shot replication was started.

> We could do a point release of 3.x, sure.
>
> -Joan
>
>> B.
>>
>>> On 16 Jul 2020, at 17:25, Paul Davis wrote:
>>>
>>> From what I'm reading it sounds like we have general consensus on a few things:
>>>
>>> 1. A single CouchDB API call should map to a single FDB transaction
>>> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption)
>>> 3. We're willing to change the API requirements so that 2 is not an issue.
>>> 4. None of this applies to continuous changes since that API call was never a single snapshot.
>>>
>>> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs` I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>>>
>>> The only thing I'm not 100% on is how to deal with non-continuous replications. I.e., the older single shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>>>
>>> Paul
>>>
>>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>>>
>>>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>>>
>>>> Adam
>>>>
>>>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>>>
>>>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>>>
>>>>> As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>>>
>>>>> Cheers, Adam
>>>>>
>>>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>>>
>>>>>> I'm having trouble following the thread...
>>>>>>
>>>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it.
Re: [DISCUSS] couchdb 4.0 transactional semantics
On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:
> Agreed on all 4 points.
>
> On the final point, it's worth noting that a continuous changes feed was two-phase, the first is indeed over a snapshot of the db as of the start of the _changes request, the second phase is an endless series of subsequent snapshots. the 4.0 behaviour won't exactly match that but it's definitely in the same spirit.
>
> Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.
>
> On compatibility, there's precedent for a minor release of old branches just to add replicator compatibility. for example, the replicator could call _changes again if it received a complete _changes response (i.e, one that ended with a } that completes the json object) that did not include a "last_seq" row. The 4.0 replicator would always do this.

I wouldn't really want to release a new 1.x, would you? Augh.

If we're going to change how replication works, wouldn't it be better to simply say "there is no guaranteed one-shot replication back from 4.x to 1.x"? Or, intentionally break backward compatibility so one-shot replication to un-upgraded old Couches refuses to work at all? This would prevent the confusion by making it clear - you can't do things this way anymore.

We could do a point release of 3.x, sure.

-Joan

> B.
>
>> On 16 Jul 2020, at 17:25, Paul Davis wrote:
>>
>> From what I'm reading it sounds like we have general consensus on a few things:
>>
>> 1. A single CouchDB API call should map to a single FDB transaction
>> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption)
>> 3. We're willing to change the API requirements so that 2 is not an issue.
>> 4. None of this applies to continuous changes since that API call was never a single snapshot.
>>
>> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs` I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>>
>> The only thing I'm not 100% on is how to deal with non-continuous replications. I.e., the older single shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>>
>> Paul
>>
>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>>
>>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>>
>>> Adam
>>>
>>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>>
>>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>>
>>>> As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>>
>>>> Cheers, Adam
>>>>
>>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>>
>>>>> I'm having trouble following the thread...
>>>>>
>>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>>>
>>>>> How does _changes work right now in the proposed 4.0 code?
>>>>>
>>>>> -Joan
Re: [DISCUSS] couchdb 4.0 transactional semantics
Agreed on all 4 points.

On the final point, it's worth noting that a continuous changes feed was two-phase: the first phase is indeed over a snapshot of the db as of the start of the _changes request; the second phase is an endless series of subsequent snapshots. The 4.0 behaviour won't exactly match that, but it's definitely in the same spirit.

Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.

On compatibility, there's precedent for a minor release of old branches just to add replicator compatibility. For example, the replicator could call _changes again if it received a complete _changes response (i.e., one that ended with a } that completes the JSON object) that did not include a "last_seq" row. The 4.0 replicator would always do this.

B.

> On 16 Jul 2020, at 17:25, Paul Davis wrote:
>
> From what I'm reading it sounds like we have general consensus on a few things:
>
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was never a single snapshot.
>
> If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs` I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.
>
> The only thing I'm not 100% on is how to deal with non-continuous replications. I.e., the older single shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?
>
> Paul
>
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>>
>> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>>
>> Adam
>>
>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>>
>>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>>
>>> As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>>
>>> Cheers, Adam
>>>
>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>>
>>>> I'm having trouble following the thread...
>>>>
>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>>
>>>> How does _changes work right now in the proposed 4.0 code?
>>>>
>>>> -Joan
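The compatibility rule Bob proposes above, treat a syntactically complete _changes response that lacks a "last_seq" member as a signal to call _changes again, can be sketched in a few lines. This is an illustrative Python sketch only; the real replicator is written in Erlang, and the function name here is hypothetical.

```python
import json

def should_call_changes_again(raw_body):
    """Bob's proposed rule: if a _changes response parses as complete
    JSON but carries no "last_seq" member, the source ended the batch
    early (e.g. at an FDB transaction boundary) and the replicator
    should issue another _changes request from the last sequence it
    processed. Illustrative sketch, not actual replicator code."""
    try:
        body = json.loads(raw_body)
    except ValueError:
        # Unclean termination: invalid JSON already tells a competent
        # HTTP client the response is incomplete, so it retries from
        # its checkpoint anyway.
        return True
    return "last_seq" not in body

# Complete response: last_seq present, nothing more to fetch.
assert should_call_changes_again('{"results": [], "last_seq": "42-x"}') is False
# Complete JSON object, but no last_seq row: call _changes again.
assert should_call_changes_again('{"results": []}') is True
```

A 4.0 replicator would apply this check unconditionally; a patched 3.x replicator would apply it only when talking to a 4.x source.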
Re: [DISCUSS] couchdb 4.0 transactional semantics
From what I'm reading it sounds like we have general consensus on a few things:

1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any streaming API that hit a transaction boundary (because data loss/corruption)
3. We're willing to change the API requirements so that 2 is not an issue.
4. None of this applies to continuous changes since that API call was never a single snapshot.

If everyone generally agrees with that summarization, my suggestion would be that we just revisit the new pagination APIs and make them the only behavior rather than having them be opt-in. I believe those APIs already address all the concerns in this thread and the only reason we kept the older versions with `restart_tx` was to maintain API backwards compatibility at the expense of a slight change to semantics of snapshots. However, if there's a consensus that the semantics are more important than allowing a blanket `GET /db/_all_docs` I think it'd make the most sense to just embrace the pagination APIs that already exist and were written to cover these issues.

The only thing I'm not 100% on is how to deal with non-continuous replications. I.e., the older single shot replication. Do we go back with patches to older replicators to allow 4.0 compatibility? Just declare that you have to mediate a replication on the newer of the two CouchDB deployments? Sniff the replicator's UserAgent and behave differently on 4.x for just that special case?

Paul

On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski wrote:
>
> Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
>
> Adam
>
>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>>
>> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>>
>> As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>>
>> Cheers, Adam
>>
>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>>
>>> I'm having trouble following the thread...
>>>
>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>
>>> How does _changes work right now in the proposed 4.0 code?
>>>
>>> -Joan
Re: [DISCUSS] couchdb 4.0 transactional semantics
Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.

Adam

> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski wrote:
>
> Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
>
> As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
>
> Cheers, Adam
>
>> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>>
>> I'm having trouble following the thread...
>>
>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>
>> How does _changes work right now in the proposed 4.0 code?
>>
>> -Joan
Re: [DISCUSS] couchdb 4.0 transactional semantics
Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.

As this is a pretty big change to the isolation guarantees provided by the database Bob volunteered to elevate the issue to the mailing list for a deeper discussion.

Cheers, Adam

> On Jul 15, 2020, at 11:38 AM, Joan Touzet wrote:
>
> I'm having trouble following the thread...
>
> On 14/07/2020 14:56, Adam Kocoloski wrote:
>> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>
> How does _changes work right now in the proposed 4.0 code?
>
> -Joan
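The restart_tx behaviour Adam describes, catch the expired-transaction error mid-stream, acquire a fresh snapshot, and resume after the last delivered row, might look roughly like this. This is a simulated Python sketch (the actual implementation is Erlang); FakeSnapshot is a stand-in for an FDB read version that expires after a few reads, and all names are hypothetical.

```python
class TransactionTooOld(Exception):
    """Stand-in for FoundationDB's transaction_too_old error."""

class FakeSnapshot:
    """Serves two-row batches and 'expires' after two reads, mimicking
    the ~5 second FDB transaction lifetime."""
    def __init__(self, data):
        self.data, self.reads = data, 0

    def read_range(self, start_key):
        self.reads += 1
        if self.reads > 2:
            raise TransactionTooOld()
        return [k for k in self.data if k >= start_key][:2]

def stream_rows(open_snapshot, start_key=0, restart_tx=True):
    """With restart_tx set, an expired snapshot is silently replaced by
    a fresh one and streaming resumes after the last emitted row. The
    new snapshot has a newer read version, which is exactly the change
    to isolation semantics under discussion in this thread."""
    rows, key = [], start_key
    snap = open_snapshot()
    while True:
        try:
            batch = snap.read_range(key)
        except TransactionTooOld:
            if not restart_tx:
                raise
            snap = open_snapshot()  # new snapshot; newer writes may now be visible
            continue
        if not batch:
            return rows
        rows.extend(batch)
        key = batch[-1] + 1  # resume after the last delivered row

# The full range is streamed across several snapshots.
assert stream_rows(lambda: FakeSnapshot(list(range(6)))) == list(range(6))
```

Without restart_tx, the same expiry surfaces as an error and the response terminates uncleanly, which is the behaviour the rest of the thread debates.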
Re: [DISCUSS] couchdb 4.0 transactional semantics
I'm having trouble following the thread...

On 14/07/2020 14:56, Adam Kocoloski wrote:
> For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.

How does _changes work right now in the proposed 4.0 code?

-Joan
Re: [DISCUSS] couchdb 4.0 transactional semantics
> On 15. Jul 2020, at 16:12, Robert Newson wrote:
>
> Thanks Jan
>
> I would prefer not to have the configuration switch, instead remove what we don’t want. As you said there’ll be a 3 / 4 split for a while (and not just for this reason).

I’d support an effort for folks to ease into 4.x, as long as it is not the default behaviour. I haven’t thought about this enough to have a definite opinion about what that looks like.

Best
Jan
—

> --
> Robert Samuel Newson
> rnew...@apache.org
>
> On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
>>
>>> On 14. Jul 2020, at 18:00, Adam Kocoloski wrote:
>>>
>>> I think there’s tremendous value in being able to tell our users that each response served by CouchDB is constructed from a single isolated snapshot of the underlying database. I’d advocate for this being the default behavior of 4.0.
>>
>> I too am in favour of this. I apologise for not speaking up in the earlier thread, which I followed closely, but never found the time to respond to.
>>
>> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. While this does indeed mean a BC break, it teaches the right semantics for folks on 4.0 and onwards. For client libraries like our own nano, we can easily wrap this behaviour, so the resulting API is mostly compatible still, at least when used in streaming mode, less so when buffering a big _all_docs response).
>>
>>> If folks wanted to add an opt-in compatibility mode to support longer responses, I suppose that could be OK. I think we should discourage that access pattern in general, though, as it’s somewhat less friendly to various other parts of the stack than a pattern of shorter responses and a smart pagination API like the one we’re introducing. To wit, I don’t think we’d want to support that compatibility mode in IBM Cloud.
>>
>> Like Adam, I do not mind a compat mode, either through a different API endpoint, or even a config option. I think we will be fine in getting people on this path when we document this in our update guide for the 4.0 release. I don’t think this will lead to a Python 2/3 situation overall, because the 4.0+ features are compelling enough for relatively small changes required, and CouchDB 3.x in its then latest form will continue to be a fine database for years to come, for folks who can’t upgrade as easily. So yes, I anticipate we’ll live in a two-versions world a little longer than we did during 1.x to 2.x, but the reasons to leave 1.x behind were a little more severe than the improvements of 4.x over 3.x (while still significant, of course).
>>
>> Best
>> Jan
>> —
>>
>>> Adam
>>>
>>>> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson wrote:
>>>>
>>>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>>>>
>>>> I don't accept Mike Rhodes argument at all but I should explain why I don't;
>>>>
>>>> In CouchDB 1.x, a response was generated from a single .couch file. There was always a window between the start of the request as the client sees it and CouchDB acquiring a snapshot of the relevant database. I don't think that gap is meaningful and does not refute our statements of the time that CouchDB responses are from a snapshot (specifically, that no change to the database made _during_ the response will be visible in _this_ response). In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists of multiple shards, each of which, once opened, remain snapshotted for the duration of that response. The difference between 1.x and 2.x/3.x is that the window is potentially larger (though the requests are issued in parallel). The response, however much it returned, was impervious to changes in other requests once it has begun.
>>>>
>>>> I don't think _all_docs, _view or a non-continuous _changes response should allow changes made in other requests to appear midway through them and I want to hear the opinions of folks that have watched over CouchDB from its earliest days on this specific point (If I must name names, at least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating from this semantic, I will go with the majority.
>>>>
>>>> If we were to agree to preserve the 'single snapshot' behaviour, what would the behaviour be if we can't honour it because of the FoundationDB transaction limits?
>>>>
>>>> I see a few options.
>>>>
>>>> 1) We could end the response uncleanly, mid-response. CouchDB does this when it has no alternative, and it is ugly, but it is usually handled well by clients. They are at least not usually convinced they got a complete response if they are using a competent HTTP client.
>>>>
>>>> 2) We could disavow the streaming API, as you've suggested, attempt to gather the full response.
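The direction Jan endorses above (option 3, a mandatory bounded "limit") combined with bookmark-style pagination could be driven from a client like this. This is a hypothetical sketch: `get_page` stands in for one HTTP request (and therefore one FDB transaction/snapshot), and the `rows`/`bookmark` field names are illustrative, not the finalized pagination API.

```python
def fetch_all(get_page, limit=200):
    """Each get_page call is one bounded request, served from one FDB
    snapshot; the client follows bookmarks until none is returned. A
    small mandatory limit keeps every request well under the FDB
    transaction limits, making txn_too_old vanishingly unlikely, at
    the cost of the overall result spanning several snapshots."""
    rows, bookmark = [], None
    while True:
        page = get_page(limit=limit, bookmark=bookmark)
        rows.extend(page["rows"])
        bookmark = page.get("bookmark")
        if not bookmark:
            return rows

# A fake paginated endpoint over the keys 0..4, for demonstration.
DATA = list(range(5))

def fake_endpoint(limit, bookmark):
    start = bookmark or 0
    nxt = start + limit
    return {"rows": DATA[start:nxt],
            "bookmark": nxt if nxt < len(DATA) else None}

assert fetch_all(fake_endpoint, limit=2) == [0, 1, 2, 3, 4]
```

Client libraries such as nano could hide this loop entirely, which is the wrapping Jan mentions.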
Re: [DISCUSS] couchdb 4.0 transactional semantics
Thanks Jan

I would prefer not to have the configuration switch, instead remove what we don’t want. As you said there’ll be a 3 / 4 split for a while (and not just for this reason).

--
Robert Samuel Newson
rnew...@apache.org

On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
>
>> On 14. Jul 2020, at 18:00, Adam Kocoloski wrote:
>>
>> I think there’s tremendous value in being able to tell our users that each response served by CouchDB is constructed from a single isolated snapshot of the underlying database. I’d advocate for this being the default behavior of 4.0.
>
> I too am in favour of this. I apologise for not speaking up in the earlier thread, which I followed closely, but never found the time to respond to.
>
> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. While this does indeed mean a BC break, it teaches the right semantics for folks on 4.0 and onwards. For client libraries like our own nano, we can easily wrap this behaviour, so the resulting API is mostly compatible still, at least when used in streaming mode, less so when buffering a big _all_docs response).
>
>> If folks wanted to add an opt-in compatibility mode to support longer responses, I suppose that could be OK. I think we should discourage that access pattern in general, though, as it’s somewhat less friendly to various other parts of the stack than a pattern of shorter responses and a smart pagination API like the one we’re introducing. To wit, I don’t think we’d want to support that compatibility mode in IBM Cloud.
>
> Like Adam, I do not mind a compat mode, either through a different API endpoint, or even a config option. I think we will be fine in getting people on this path when we document this in our update guide for the 4.0 release. I don’t think this will lead to a Python 2/3 situation overall, because the 4.0+ features are compelling enough for relatively small changes required, and CouchDB 3.x in its then latest form will continue to be a fine database for years to come, for folks who can’t upgrade as easily. So yes, I anticipate we’ll live in a two-versions world a little longer than we did during 1.x to 2.x, but the reasons to leave 1.x behind were a little more severe than the improvements of 4.x over 3.x (while still significant, of course).
>
> Best
> Jan
> —
>
>> Adam
>>
>>> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson wrote:
>>>
>>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>>>
>>> I don't accept Mike Rhodes argument at all but I should explain why I don't;
>>>
>>> In CouchDB 1.x, a response was generated from a single .couch file. There was always a window between the start of the request as the client sees it and CouchDB acquiring a snapshot of the relevant database. I don't think that gap is meaningful and does not refute our statements of the time that CouchDB responses are from a snapshot (specifically, that no change to the database made _during_ the response will be visible in _this_ response). In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists of multiple shards, each of which, once opened, remain snapshotted for the duration of that response. The difference between 1.x and 2.x/3.x is that the window is potentially larger (though the requests are issued in parallel). The response, however much it returned, was impervious to changes in other requests once it has begun.
>>>
>>> I don't think _all_docs, _view or a non-continuous _changes response should allow changes made in other requests to appear midway through them and I want to hear the opinions of folks that have watched over CouchDB from its earliest days on this specific point (If I must name names, at least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating from this semantic, I will go with the majority.
>>>
>>> If we were to agree to preserve the 'single snapshot' behaviour, what would the behaviour be if we can't honour it because of the FoundationDB transaction limits?
>>>
>>> I see a few options.
>>>
>>> 1) We could end the response uncleanly, mid-response. CouchDB does this when it has no alternative, and it is ugly, but it is usually handled well by clients. They are at least not usually convinced they got a complete response if they are using a competent HTTP client.
>>>
>>> 2) We could disavow the streaming API, as you've suggested, attempt to gather the full response. If we do this within the FDB bounds, return a 200 code and the response body. A 400 and an error body if we don't.
>>>
>>> 3) We could make the "limit" parameter mandatory and with an upper bound, in combination with 1 or 2, such that a valid request is very likely to be served within the limits.
Re: [DISCUSS] couchdb 4.0 transactional semantics
> On 14. Jul 2020, at 18:00, Adam Kocoloski wrote:
>
> I think there’s tremendous value in being able to tell our users that each response served by CouchDB is constructed from a single isolated snapshot of the underlying database. I’d advocate for this being the default behavior of 4.0.

I too am in favour of this. I apologise for not speaking up in the earlier thread, which I followed closely, but never found the time to respond to.

From rnewson’s options, I’d suggest 3, the mandatory limit parameter. While this does indeed mean a BC break, it teaches the right semantics for folks on 4.0 and onwards. For client libraries like our own nano, we can easily wrap this behaviour, so the resulting API is mostly compatible still (at least when used in streaming mode, less so when buffering a big _all_docs response).

> If folks wanted to add an opt-in compatibility mode to support longer responses, I suppose that could be OK. I think we should discourage that access pattern in general, though, as it’s somewhat less friendly to various other parts of the stack than a pattern of shorter responses and a smart pagination API like the one we’re introducing. To wit, I don’t think we’d want to support that compatibility mode in IBM Cloud.

Like Adam, I do not mind a compat mode, either through a different API endpoint, or even a config option. I think we will be fine in getting people on this path when we document this in our update guide for the 4.0 release. I don’t think this will lead to a Python 2/3 situation overall, because the 4.0+ features are compelling enough for the relatively small changes required, and CouchDB 3.x in its then latest form will continue to be a fine database for years to come, for folks who can’t upgrade as easily. So yes, I anticipate we’ll live in a two-versions world a little longer than we did during 1.x to 2.x, but the reasons to leave 1.x behind were a little more severe than the improvements of 4.x over 3.x (while still significant, of course).

Best
Jan
—

> Adam
>
>> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson wrote:
>>
>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>>
>> I don't accept Mike Rhodes argument at all but I should explain why I don't;
>>
>> In CouchDB 1.x, a response was generated from a single .couch file. There was always a window between the start of the request as the client sees it and CouchDB acquiring a snapshot of the relevant database. I don't think that gap is meaningful and does not refute our statements of the time that CouchDB responses are from a snapshot (specifically, that no change to the database made _during_ the response will be visible in _this_ response). In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists of multiple shards, each of which, once opened, remain snapshotted for the duration of that response. The difference between 1.x and 2.x/3.x is that the window is potentially larger (though the requests are issued in parallel). The response, however much it returned, was impervious to changes in other requests once it has begun.
>>
>> I don't think _all_docs, _view or a non-continuous _changes response should allow changes made in other requests to appear midway through them and I want to hear the opinions of folks that have watched over CouchDB from its earliest days on this specific point (If I must name names, at least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating from this semantic, I will go with the majority.
>>
>> If we were to agree to preserve the 'single snapshot' behaviour, what would the behaviour be if we can't honour it because of the FoundationDB transaction limits?
>>
>> I see a few options.
>>
>> 1) We could end the response uncleanly, mid-response. CouchDB does this when it has no alternative, and it is ugly, but it is usually handled well by clients. They are at least not usually convinced they got a complete response if they are using a competent HTTP client.
>>
>> 2) We could disavow the streaming API, as you've suggested, attempt to gather the full response. If we do this within the FDB bounds, return a 200 code and the response body. A 400 and an error body if we don't.
>>
>> 3) We could make the "limit" parameter mandatory and with an upper bound, in combination with 1 or 2, such that a valid request is very likely to be served within the limits.
>>
>> I'd like to hear more voices on which way we want to break the unachievable semantic of old where you could read _all_docs on a billion document database over, uptime gods willing, a snapshot of the database.
>>
>> B.
>>
>>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc wrote:
>>>
>>> Thanks for bringing the topic up for the discussion!
>>>
>>> For background, this topic was discussed on the mailing list starting in February,
Re: [DISCUSS] couchdb 4.0 transactional semantics
Technically, we could certainly terminate a response cleanly when the underlying FoundationDB transaction expires and offer a bookmark to resume the response using a new transaction in a subsequent request. Some of us have reservations about that approach because an application that did not know to look for the “txn_too_long” attribute would quietly proceed with an incomplete, corrupted dataset. Brutally terminating the response reduces the likelihood of that occurring to ~zero. It’s true that we can’t absolutely guarantee that the database will never time out, but setting a reasonable limit of ~2000 rows in a response should make it quite unlikely. I’d expect those responses to be delivered in 50ms or less, which is 100x faster than the 5 second transaction expiry. For cases where you’re not concerned about snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers, but I’m not sure we’ve used it anywhere in CouchDB yet. Adam > On Jul 14, 2020, at 2:06 PM, San Sato wrote: > > Interesting. > > 1. end the response ("uncleanly") - does this mean the HTTP response > wouldn't be valid JSON? I guess the HTTP response code can't be expected > to reflect a non-normal result. Maybe in a trailing attribute in json, can > the response indicate that it's truncated for the reason of txn_too_long, > to distinguish it from completed responses with less-than-a-page-size (e.g. > limit=20k, 18k records sent, no more records present)? > > Can bookmark/etc still be included at that point to resume in > closest-key-order? Even though it's a streaming, not paginated, response, > it would match pre-v4 semantics of pagination over multiple http requests, > right? > > 2. Sending a 400 error seems like a good way to waste requests. 
I imagine > there's no constant limit= that can avoid the issue, so people will have to > do things that are sensitive to the presence of the limit one way or > other. I'd way rather get a partial response with a flag indicating I > should resume from - but maybe that's the "rewrite the app" scenario > Nick described designing to avoid. > > 4. request-level isolation=(TRUE|false) could be a way to express default > a preference for 1, but allow opt-in for streaming the fresher rows. I'd > want to be able to know what kind of boundaries are used for switching to a > newer txn snapshot - obviously there's a practical outer limit from FDB > but is it a performance hit to switch with some greater frequency like > 1000-rows, an FDB index-page-size if there's such a thing, every 250ms or > similar? > > > > On Tue, Jul 14, 2020 at 7:18 AM Robert Samuel Newson > wrote: > >> Thanks Nick, very helpful, and it vindicates me opening this thread. >> >> I don't accept Mike Rhodes argument at all but I should explain why I >> don't; >> >> In CouchDB 1.x, a response was generated from a single .couch file. There >> was always a window between the start of the request as the client sees it >> and CouchDB acquiring a snapshot of the relevant database. I don't think >> that gap is meaningful and does not refute our statements of the time that >> CouchDB responses are from a snapshot (specifically, that no change to the >> database made _during_ the response will be visible in _this_ response). In >> CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists >> of multiple shards, each of which, once opened, remain snapshotted for the >> duration of that response. The difference between 1.x and 2.x/3.x is that >> the window is potentially larger (though the requests are issued in >> parallel). The response, however much it returned, was impervious to >> changes in other requests once it has begun. 
>> >> I don't think _all_docs, _view or a non-continuous _changes response >> should allow changes made in other requests to appear midway through them >> and I want to hear the opinions of folks that have watched over CouchDB >> from its earliest days on this specific point (If I must name names, at >> least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating >> from this semantic, I will go with the majority. >> >> If we were to agree to preserve the 'single snapshot' behaviour, what >> would the behaviour be if we can't honour it because of the FoundationDB >> transaction limits? >> >> I see a few options. >> >> 1) We could end the response uncleanly, mid-response. CouchDB does this >> when it has no alternative, and it is ugly, but it is usually handled well >> by clients. They are at least not usually convinced they got a complete >> response if they are using a competent HTTP client. >> >> 2) We could disavow the streaming API, as you've suggested, attempt to >> gather the full response. If we do this within the FDB bounds, return a 200 >> code and the
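The transaction-renewal pattern Adam describes in the message above (request a fresh FDB transaction before the old one expires and swap over mid-stream) can be sketched as follows. This is an illustrative Python simulation, not CouchDB's actual Erlang implementation: `FakeTxn` and its tick-based expiry are stand-ins for a real erlfdb transaction and its ~5 second lifetime, and the function names are assumptions.

```python
TXN_LIMIT_TICKS = 5  # stands in for FDB's ~5 second transaction limit


class FakeTxn:
    """Stand-in for an FDB read transaction; expires after a fixed
    number of 'ticks' rather than wall-clock seconds."""

    def __init__(self):
        self.age = 0

    def tick(self):
        self.age += 1

    def near_expiry(self, margin=1):
        # True shortly *before* expiry, leaving room to swap over.
        return self.age >= TXN_LIMIT_TICKS - margin


def stream_rows(rows):
    """Yield every row, swapping to a fresh transaction shortly before
    the current one would expire. This forfeits snapshot isolation, so
    it is only appropriate when the caller has opted out of it (e.g. a
    long _changes feed)."""
    txn = FakeTxn()
    for row in rows:
        if txn.near_expiry():
            # In real code the replacement transaction would have been
            # requested asynchronously before this point, hiding the
            # setup latency.
            txn = FakeTxn()
        yield row
        txn.tick()
```

The point of the sketch is that the stream never stalls or truncates: rows keep flowing across transaction boundaries, at the cost of the response no longer coming from a single snapshot.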
Re: [DISCUSS] couchdb 4.0 transactional semantics
Interesting. 1. end the response ("uncleanly") - does this mean the HTTP response wouldn't be valid JSON? I guess the HTTP response code can't be expected to reflect a non-normal result. Maybe, in a trailing JSON attribute, the response could indicate that it's truncated for the reason of txn_too_long, to distinguish it from completed responses with less-than-a-page-size (e.g. limit=20k, 18k records sent, no more records present)? Can bookmark/etc. still be included at that point to resume in closest-key-order? Even though it's a streaming, not paginated, response, it would match pre-v4 semantics of pagination over multiple http requests, right? 2. Sending a 400 error seems like a good way to waste requests. I imagine there's no constant limit= that can avoid the issue, so people will have to do things that are sensitive to the presence of the limit one way or the other. I'd much rather get a partial response with a flag indicating where I should resume from - but maybe that's the "rewrite the app" scenario Nick described designing to avoid. 4. request-level isolation=(TRUE|false) could be a way to express a default preference for 1, but allow opt-in for streaming the fresher rows. I'd want to be able to know what kind of boundaries are used for switching to a newer txn snapshot - obviously there's a practical outer limit from FDB, but is it a performance hit to switch with some greater frequency, like every 1000 rows, an FDB index-page-size if there's such a thing, every 250ms or similar? On Tue, Jul 14, 2020 at 7:18 AM Robert Samuel Newson wrote: > Thanks Nick, very helpful, and it vindicates me opening this thread. > > I don't accept Mike Rhodes argument at all but I should explain why I > don't; > > In CouchDB 1.x, a response was generated from a single .couch file. There > was always a window between the start of the request as the client sees it > and CouchDB acquiring a snapshot of the relevant database. 
I don't think > that gap is meaningful and does not refute our statements of the time that > CouchDB responses are from a snapshot (specifically, that no change to the > database made _during_ the response will be visible in _this_ response). In > CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists > of multiple shards, each of which, once opened, remain snapshotted for the > duration of that response. The difference between 1.x and 2.x/3.x is that > the window is potentially larger (though the requests are issued in > parallel). The response, however much it returned, was impervious to > changes in other requests once it has begun. > > I don't think _all_docs, _view or a non-continuous _changes response > should allow changes made in other requests to appear midway through them > and I want to hear the opinions of folks that have watched over CouchDB > from its earliest days on this specific point (If I must name names, at > least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating > from this semantic, I will go with the majority. > > If we were to agree to preserve the 'single snapshot' behaviour, what > would the behaviour be if we can't honour it because of the FoundationDB > transaction limits? > > I see a few options. > > 1) We could end the response uncleanly, mid-response. CouchDB does this > when it has no alternative, and it is ugly, but it is usually handled well > by clients. They are at least not usually convinced they got a complete > response if they are using a competent HTTP client. > > 2) We could disavow the streaming API, as you've suggested, attempt to > gather the full response. If we do this within the FDB bounds, return a 200 > code and the response body. A 400 and an error body if we don't. > > 3) We could make the "limit" parameter mandatory and with an upper bound, > in combination with 1 or 2, such that a valid request is very likely to be > served within the limits. 
> > I'd like to hear more voices on which way we want to break the > unachievable semantic of old where you could read _all_docs on a billion > document database over, uptime gods willing, a snapshot of the database. > > B. > > > On 13 Jul 2020, at 21:15, Nick Vatamaniuc wrote: > > > > Thanks for bringing the topic up for the discussion! > > > > For background, this topic was discussed on the mailing list starting > > in February, 2019 > > > https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E > > > > The primary reason for restart_tx option is to provide compatibility > > for _changes feeds to allow older replicators to handle 4.0 sources. > > It starts a new transaction after 5 seconds or so (a current FDB > > limitation, might go up in the future) and transparently continues to > > stream data where it left off. Ex, streaming [a,b,c,d], times out > > after b, then it will continue with c, d etc. Currently this is also > > used for other streaming APIs as an alternative to returning mangled > >
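Newson's option 2 above (buffer the full response inside one transaction and return either a complete 200 or a 400) can be sketched like this. A hedged Python simulation: `TransactionTooOld` stands in for FDB's transaction_too_old error, and the tuple-of-status-and-body shape is illustrative, not CouchDB's actual HTTP layer.

```python
class TransactionTooOld(Exception):
    """Stand-in for FDB's transaction_too_old error."""


def gather_response(row_iter):
    """Option 2: accumulate every row before sending anything. The
    client either gets the complete row set with a 200, or a 400 error
    body -- never a silently truncated response."""
    rows = []
    try:
        for row in row_iter:
            rows.append(row)
    except TransactionTooOld:
        return 400, {"error": "transaction_too_old",
                     "reason": "retry with a smaller limit"}
    return 200, {"rows": rows}


def expiring_iter(keys, fail_after):
    """Simulated range read whose transaction expires mid-scan."""
    for i, key in enumerate(keys):
        if i == fail_after:
            raise TransactionTooOld()
        yield key
```

Because nothing is emitted until the scan completes, the failure mode is an unambiguous error rather than a 200 followed by a partial body.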
Re: [DISCUSS] couchdb 4.0 transactional semantics
I think there’s tremendous value in being able to tell our users that each response served by CouchDB is constructed from a single isolated snapshot of the underlying database. I’d advocate for this being the default behavior of 4.0. If folks wanted to add an opt-in compatibility mode to support longer responses, I suppose that could be OK. I think we should discourage that access pattern in general, though, as it’s somewhat less friendly to various other parts of the stack than a pattern of shorter responses and a smart pagination API like the one we’re introducing. To wit, I don’t think we’d want to support that compatibility mode in IBM Cloud. Adam > On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson wrote: > > Thanks Nick, very helpful, and it vindicates me opening this thread. > > I don't accept Mike Rhodes argument at all but I should explain why I don't; > > In CouchDB 1.x, a response was generated from a single .couch file. There was > always a window between the start of the request as the client sees it and > CouchDB acquiring a snapshot of the relevant database. I don't think that gap > is meaningful and does not refute our statements of the time that CouchDB > responses are from a snapshot (specifically, that no change to the database > made _during_ the response will be visible in _this_ response). In CouchDB > 2.x (and continuing in 3.x), a CouchDB database typically consists of > multiple shards, each of which, once opened, remain snapshotted for the > duration of that response. The difference between 1.x and 2.x/3.x is that the > window is potentially larger (though the requests are issued in parallel). > The response, however much it returned, was impervious to changes in other > requests once it has begun. 
> > I don't think _all_docs, _view or a non-continuous _changes response should > allow changes made in other requests to appear midway through them and I want > to hear the opinions of folks that have watched over CouchDB from its > earliest days on this specific point (If I must name names, at least Adam K, > Paul D, Jan L, Joan T). If there's a majority for deviating from this > semantic, I will go with the majority. > > If we were to agree to preserve the 'single snapshot' behaviour, what would > the behaviour be if we can't honour it because of the FoundationDB > transaction limits? > > I see a few options. > > 1) We could end the response uncleanly, mid-response. CouchDB does this when > it has no alternative, and it is ugly, but it is usually handled well by > clients. They are at least not usually convinced they got a complete response > if they are using a competent HTTP client. > > 2) We could disavow the streaming API, as you've suggested, attempt to gather > the full response. If we do this within the FDB bounds, return a 200 code and > the response body. A 400 and an error body if we don't. > > 3) We could make the "limit" parameter mandatory and with an upper bound, in > combination with 1 or 2, such that a valid request is very likely to be > served within the limits. > > I'd like to hear more voices on which way we want to break the unachievable > semantic of old where you could read _all_docs on a billion document database > over, uptime gods willing, a snapshot of the database. > > B. > >> On 13 Jul 2020, at 21:15, Nick Vatamaniuc wrote: >> >> Thanks for bringing the topic up for the discussion! 
>> >> For background, this topic was discussed on the mailing list starting >> in February, 2019 >> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E >> >> The primary reason for restart_tx option is to provide compatibility >> for _changes feeds to allow older replicators to handle 4.0 sources. >> It starts a new transaction after 5 seconds or so (a current FDB >> limitation, might go up in the future) and transparently continues to >> stream data where it left off. Ex, streaming [a,b,c,d], times out >> after b, then it will continue with c, d etc. Currently this is also >> used for other streaming APIs as an alternative to returning mangled >> JSON after emitting a 200 response and streaming some of the rows. >> However it is not used for paginated responses, the new APIs developed >> by Ilya. So users have an option to get the guaranteed snapshot >> behavior option as well. >> >> And for completeness, if we decide to remove the option, we should >> specify what happens if we remove it and get a transaction_too_old >> exception. Currently the behavior would be to restart the transaction, >> resend all the headers and all the rows again down the socket, which I >> don't think anyone wants, but is what we'd get if we just make >> {restart_tx, false} >> >>> I understand that automatically resetting the FDB txn during a response is >>> an attempt to work around that and maintain "compatibility" with CouchDB < >>> 4 semantics. I think it fails to
Re: [DISCUSS] couchdb 4.0 transactional semantics
Thanks Nick, very helpful, and it vindicates my opening this thread. I don't accept Mike Rhodes's argument at all, but I should explain why I don't: In CouchDB 1.x, a response was generated from a single .couch file. There was always a window between the start of the request as the client sees it and CouchDB acquiring a snapshot of the relevant database. I don't think that gap is meaningful, and it does not refute our statements of the time that CouchDB responses are from a snapshot (specifically, that no change to the database made _during_ the response will be visible in _this_ response). In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists of multiple shards, each of which, once opened, remains snapshotted for the duration of that response. The difference between 1.x and 2.x/3.x is that the window is potentially larger (though the requests are issued in parallel). The response, however much it returned, was impervious to changes made by other requests once it had begun. I don't think _all_docs, _view or a non-continuous _changes response should allow changes made in other requests to appear midway through them, and I want to hear the opinions of folks who have watched over CouchDB from its earliest days on this specific point (If I must name names, at least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating from this semantic, I will go with the majority. If we were to agree to preserve the 'single snapshot' behaviour, what would the behaviour be if we can't honour it because of the FoundationDB transaction limits? I see a few options. 1) We could end the response uncleanly, mid-response. CouchDB does this when it has no alternative, and it is ugly, but it is usually handled well by clients. They are at least not usually convinced they got a complete response if they are using a competent HTTP client. 2) We could disavow the streaming API, as you've suggested, and attempt to gather the full response. 
If we do this within the FDB bounds, return a 200 code and the response body. A 400 and an error body if we don't. 3) We could make the "limit" parameter mandatory and with an upper bound, in combination with 1 or 2, such that a valid request is very likely to be served within the limits. I'd like to hear more voices on which way we want to break the unachievable semantic of old where you could read _all_docs on a billion document database over, uptime gods willing, a snapshot of the database. B. > On 13 Jul 2020, at 21:15, Nick Vatamaniuc wrote: > > Thanks for bringing the topic up for the discussion! > > For background, this topic was discussed on the mailing list starting > in February, 2019 > https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E > > The primary reason for restart_tx option is to provide compatibility > for _changes feeds to allow older replicators to handle 4.0 sources. > It starts a new transaction after 5 seconds or so (a current FDB > limitation, might go up in the future) and transparently continues to > stream data where it left off. Ex, streaming [a,b,c,d], times out > after b, then it will continue with c, d etc. Currently this is also > used for other streaming APIs as an alternative to returning mangled > JSON after emitting a 200 response and streaming some of the rows. > However it is not used for paginated responses, the new APIs developed > by Ilya. So users have an option to get the guaranteed snapshot > behavior option as well. > > And for completeness, if we decide to remove the option, we should > specify what happens if we remove it and get a transaction_too_old > exception. 
Currently the behavior would be to restart the transaction, > resend all the headers and all the rows again down the socket, which I > don't think anyone wants, but is what we'd get if we just make > {restart_tx, false} > >> I understand that automatically resetting the FDB txn during a response is >> an attempt to work around that and maintain "compatibility" with CouchDB < 4 >> semantics. I think it fails to do so and is very misleading. > > It is a trade-off in order to keep the same API shape as before. Sure, > streaming all the docs with _all_docs or _changes feeds is not a great > pattern but many applications are implemented that way already. > Letting them migrate to 4.0 without having to rewrite the application > with the caveat that they might see a document updated in the > _all_docs stream after the request has already started, is a nicer > choice, I think, than forcing them to rewrite their application, which > could lead to a python 2/3 scenario. > > Due to having multiple shards (Q>1), as discussed in the original > mailing thread by Mike > (https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E), > we don't provide a strict read-only snapshot guarantee in 2.x and 3.x >
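Nick's description of restart_tx above (streaming [a,b,c,d], timing out after b, then transparently continuing with c, d) can be sketched like this. An illustrative Python simulation under assumed names (`FakeSource`, `changes_with_restart_tx`); the real logic is the Erlang `restart_fold` in fabric2_fdb referenced elsewhere in this thread.

```python
class TransactionTooOld(Exception):
    """Stand-in for FDB's transaction_too_old error."""


class FakeSource:
    """A changes feed over a list of keys whose first transaction
    expires after two rows (hypothetical failure injection)."""

    def __init__(self, keys):
        self.keys = keys
        self.failed = False

    def scan(self, start=0):
        for i, key in enumerate(self.keys[start:], start):
            if not self.failed and i - start >= 2:
                self.failed = True
                raise TransactionTooOld()
            yield i, key


def changes_with_restart_tx(src):
    """Mimic restart_tx: on expiry, begin a new transaction and resume
    from just past the last row already sent. Rows already emitted
    stand; the client never sees the restart."""
    out, start = [], 0
    while True:
        try:
            for i, key in src.scan(start):
                out.append(key)
                start = i + 1  # checkpoint after every emitted row
            return out
        except TransactionTooOld:
            continue  # restart transparently from the checkpoint
```

The trade-off the thread debates is visible here: the stream completes without duplication or truncation, but rows emitted after the restart come from a newer snapshot than rows emitted before it.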
Re: [DISCUSS] couchdb 4.0 transactional semantics
Thanks for bringing the topic up for discussion! For background, this topic was discussed on the mailing list starting in February 2019: https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E The primary reason for the restart_tx option is to provide compatibility for _changes feeds, to allow older replicators to handle 4.0 sources. It starts a new transaction after 5 seconds or so (a current FDB limitation; it might go up in the future) and transparently continues to stream data where it left off. For example, when streaming [a,b,c,d] and the transaction times out after b, it will continue with c, d, etc. Currently this is also used for other streaming APIs as an alternative to returning mangled JSON after emitting a 200 response and streaming some of the rows. However, it is not used for paginated responses (the new APIs developed by Ilya), so users have a way to get the guaranteed snapshot behavior as well. And for completeness, if we decide to remove the option, we should specify what happens when we remove it and get a transaction_too_old exception. Currently the behavior would be to restart the transaction and resend all the headers and all the rows again down the socket, which I don't think anyone wants, but that is what we'd get if we just set {restart_tx, false} > I understand that automatically resetting the FDB txn during a response is > an attempt to work around that and maintain "compatibility" with CouchDB < 4 > semantics. I think it fails to do so and is very misleading. It is a trade-off in order to keep the same API shape as before. Sure, streaming all the docs with _all_docs or _changes feeds is not a great pattern, but many applications are implemented that way already. 
Letting them migrate to 4.0 without having to rewrite the application, with the caveat that they might see a document updated in the _all_docs stream after the request has already started, is a nicer choice, I think, than forcing them to rewrite their application, which could lead to a Python 2/3 scenario. Due to having multiple shards (Q>1), as discussed in the original mailing thread by Mike (https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E), we don't provide a strict read-only snapshot guarantee in 2.x and 3.x anyway, so users already have to handle scenarios where a document appears in the stream that wasn't there at the start of the request. Granted, that is a much smaller corner case, but I wonder how many users bother to handle it... Currently users do have the option of using the new paginated API, which disables the restart_tx behavior (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/chttpd/src/chttpd_db.erl#L947), though I am not sure what happens when a transaction_too_old exception is thrown there (emit a bookmark?). So based on the compatibility consideration, I'd vote to keep the restart_tx option (configurable, perhaps, if we figure out what to do when it is disabled) in order to allow users to migrate their applications to 4.0. At least informally, we promised users strong API compatibility when we released 3.0 with an eye towards 4.0 (https://blog.couchdb.org/2020/02/26/the-road-to-couchdb-3-0/). I'd think not emitting all the data in a _changes or _all_docs response would break that compatibility more than using multiple transactions does. As for what happens when a transaction_too_old is thrown, I could see an option passed in, something like single_snapshot=true, and then using Adam's suggestion to accumulate all the rows in memory and, if we hit the end of the transaction, return a 400 error. 
We won't emit anything out while rows are accumulated, so users don't get partial data, it will be every row requested or a 400 error (so no chance of perceived data loss). Users may retry if they think it was a temporary hiccup or may use a small limit number. Cheers, -Nick On Mon, Jul 13, 2020 at 2:05 PM Robert Samuel Newson wrote: > > Hi All, > > I'm concerned to see the restart_fold function in fabric2_fdb > (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828) > in the 4.0 development branch. > > The upshot of doing this is that a CouchDB response could be taken across > multiple snapshots of the database, which is not the behaviour of CouchDB 1 > through 3. > > I don't think this is ok (with the obvious and established exception of a > continuous changes feed, where new snapshots are continuously visible at the > end of the response). > > FoundationDB imposes certain limits on transactions, the most notable being > the 5 second maximum duration. I understand that automatically resetting the > FDB txn during a response is an attempt to work around that and maintain > "compatibility" with CouchDB < 4 semantics. I think it fails to do so and is > very misleading. > > Discuss. > > B. >
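The bookmark-style paginated API mentioned in Nick's message (each page served from its own short transaction, with a bookmark to resume from) might look roughly like this. A minimal Python sketch under assumed names (`paginate`, `read_all`); CouchDB's real pagination API uses opaque bookmark strings, not the integer offsets used here for clarity.

```python
def paginate(keys, limit=2, bookmark=None):
    """Serve one page from what would be its own short transaction,
    comfortably under the ~5 s limit, plus a bookmark for the next
    page (None when the result set is exhausted)."""
    start = 0 if bookmark is None else bookmark
    page = keys[start:start + limit]
    next_bookmark = start + limit if start + limit < len(keys) else None
    return {"rows": page, "bookmark": next_bookmark}


def read_all(keys, limit=2):
    """A client walking pages until the bookmark runs out. Each page
    is a consistent snapshot; the whole walk is not."""
    rows, bm = [], None
    while True:
        resp = paginate(keys, limit, bm)
        rows += resp["rows"]
        bm = resp["bookmark"]
        if bm is None:
            return rows
```

This is the access pattern Adam advocates over long streaming responses: every individual request fits within FDB's transaction limits, and the snapshot boundary between pages is explicit rather than hidden inside one response.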
[DISCUSS] couchdb 4.0 transactional semantics
Hi All, I'm concerned to see the restart_fold function in fabric2_fdb (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828) in the 4.0 development branch. The upshot of doing this is that a CouchDB response could be taken across multiple snapshots of the database, which is not the behaviour of CouchDB 1 through 3. I don't think this is ok (with the obvious and established exception of a continuous changes feed, where new snapshots are continuously visible at the end of the response). FoundationDB imposes certain limits on transactions, the most notable being the 5 second maximum duration. I understand that automatically resetting the FDB txn during a response is an attempt to work around that and maintain "compatibility" with CouchDB < 4 semantics. I think it fails to do so and is very misleading. Discuss. B.