Re: [DISCUSS] couchdb 4.0 transactional semantics

2021-01-07 Thread Robert Newson
Apologies for resurrecting this thread after so long.

I’ve looked over the thread again today and it seems there is general consensus 
on the desired semantics. I will start a vote thread.

B.

> On 24 Jul 2020, at 18:27, Nick Vatamaniuc  wrote:
> 
> Great discussion everyone!
> 
> For normal replications, I think it might be nice to make an exception
> and allow server-side pagination for compatibility at first, with a
> new option to explicitly enable strict snapshot behavior. Then, in a
> later release, make it the default to match _all_docs and _view reads.
> In other words, for a short while we'd support bi-directional
> replications between 4.x and 1/2/3.x on any replicator and document
> that fact; then, after a while, we'd switch that capability off and
> users would have to run replications on a 4.x replicator only, or on
> specially updated 3.x replicators.
> 
>> I'd rather support this scenario than have to support explaining why the 
>> "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, 
>> is returning results "ahead" of the time at which the one-shot replication 
>> was started.
> 
> Ah, that won't happen in the current fdb prototype branch
> implementation. What might happen is there would be changes present in
> the changes feed that happened _after_ the request has started. That
> won't be any different than if a node where replication runs restarts,
> or there is a network glitch. The changes feed would proceed from the
> last checkpoint and see changes that happened after the initial
> starting sequence and apply them in order (document "a" was deleted,
> then it was updated again then deleted again, every change will be
> applied incrementally to the target, etc).
> 
> We'd have to document the fact that a single snapshot replication from
> 4.x -> 1/2/3.x is impossible anyway (unless we do the trick of comparing
> the update sequence before and after to confirm the db was not updated
> in the meantime, or a new FDB storage engine allows it). The question
> then becomes whether we allow the pagination to happen on the client or
> the server. For normal replication I think it would be nice to allow it
> to happen on the server for a bit, for maximum initial replication
> interoperability.
> 
>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>> streaming an entire _changes feed), there is a small performance benefit to 
>> requesting a new FDB transaction asynchronously before the old one actually 
>> times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
>> layers but I’m not sure we’ve used it anywhere in CouchDB yet.
> 
> Good point, Adam. We could optimize that part, yeah. Fetch a GRV after
> 4.9 seconds or so and keep it ready to go, for example. So far we have
> reacted to the transaction_too_old exception rather than starting a
> timer, so that we can use the maximum time a tx is alive and save a few
> seconds or milliseconds. That required some tricks, such as handling the
> exception bubbling up from either the range read itself or from the
> user's callback (say, if user code in the callback fetched a doc body
> and that blew up with a transaction_too_old exception). As an
> interesting aside, in quick experiments I noticed we were able to stream
> about 100-150k rows from a single tx snapshot, which I thought wasn't
> too bad.
> 
> Speaking of replication, I am trying to see what the replicator might
> look like in 4.x in the https://github.com/apache/couchdb/pull/3015
> (prototype/fdb-replicator branch). It's very much a wip and hot mess
> currently. Will issue an RFC once I have a better handle on the
> general shape of it. So far it's based on couch_jobs, with a global
> queue, and it looks like it might be smaller overall, as it leverages
> the scheduling capabilities already present in couch_jobs; once started,
> an individual replication job's process hierarchy is largely the same as
> before.
> 
> Cheers,
> -Nick
> 
> 
> 
> 
> 
> On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát
>  wrote:
>> 
>> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
>>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>> 
>>> Do you have an example?
>>> 
>>> Best
>>> Jan
>>> —
>> 
>> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
>> methods (like [2] out of spec [3]) per path per verb.
>> Java's type safety and the way methods are currently generated don't
>> really provide an easy way to retrieve multiple kinds of responses, so
>> having them separate would help a lot there.
>> 
>> 
>> Donat
>> 
>> PS. I'm getting self-conscious about discussing this in this thread.
>> Should I open a new one?
>> 
>> 
>> [1] https://openapi-generator.tech/
>> [2] 
>> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
>> [3] 
>> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-24 Thread Nick Vatamaniuc
Great discussion everyone!

For normal replications, I think it might be nice to make an exception
and allow server-side pagination for compatibility at first, with a
new option to explicitly enable strict snapshot behavior. Then, in a
later release, make it the default to match _all_docs and _view reads.
In other words, for a short while we'd support bi-directional
replications between 4.x and 1/2/3.x on any replicator and document
that fact; then, after a while, we'd switch that capability off and
users would have to run replications on a 4.x replicator only, or on
specially updated 3.x replicators.

> I'd rather support this scenario than have to support explaining why the "one 
> shot" replication back to an old 1.x, when initiated by a 1.x cluster, is 
> returning results "ahead" of the time at which the one-shot replication was 
> started.

Ah, that won't happen in the current fdb prototype branch
implementation. What might happen is there would be changes present in
the changes feed that happened _after_ the request has started. That
won't be any different than if a node where replication runs restarts,
or there is a network glitch. The changes feed would proceed from the
last checkpoint and see changes that happened after the initial
starting sequence and apply them in order (document "a" was deleted,
then it was updated again then deleted again, every change will be
applied incrementally to the target, etc).

We'd have to document the fact that a single snapshot replication from
4.x -> 1/2/3.x is impossible anyway (unless we do the trick of comparing
the update sequence before and after to confirm the db was not updated
in the meantime, or a new FDB storage engine allows it). The question
then becomes whether we allow the pagination to happen on the client or
the server. For normal replication I think it would be nice to allow it
to happen on the server for a bit, for maximum initial replication
interoperability.
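
To make the update-sequence trick concrete, here's a rough Python sketch
(using requests; host, db and auth are placeholders, and it only detects a
violated snapshot rather than preventing one):

    import requests

    def changes_if_unchanged(url, db):
        # Sketch of the "compare the update sequence" idea: read update_seq
        # before and after streaming _changes, and only trust the result as
        # a single snapshot if the db was not updated in the meantime.
        before = requests.get(f"{url}/{db}").json()["update_seq"]
        changes = requests.get(f"{url}/{db}/_changes", params={"since": 0}).json()
        after = requests.get(f"{url}/{db}").json()["update_seq"]
        if before != after:
            raise RuntimeError("db was updated mid-read; not a single snapshot")
        return changes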

> For cases where you’re not concerned about the snapshot isolation (e.g. 
> streaming an entire _changes feed), there is a small performance benefit to 
> requesting a new FDB transaction asynchronously before the old one actually 
> times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
> layers but I’m not sure we’ve used it anywhere in CouchDB yet.

Good point, Adam. We could optimize that part, yeah. Fetch a GRV after
4.9 seconds or so and keep it ready to go, for example. So far we have
reacted to the transaction_too_old exception rather than starting a
timer, so that we can use the maximum time a tx is alive and save a few
seconds or milliseconds. That required some tricks, such as handling the
exception bubbling up from either the range read itself or from the
user's callback (say, if user code in the callback fetched a doc body
and that blew up with a transaction_too_old exception). As an
interesting aside, in quick experiments I noticed we were able to stream
about 100-150k rows from a single tx snapshot, which I thought wasn't
too bad.
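
As a rough illustration of the idea (in Python with the FoundationDB
bindings rather than our Erlang code; the 4.9s budget and the 1000-row
batch size are just placeholder numbers):

    import time
    import fdb

    fdb.api_version(630)
    db = fdb.open()

    def stream_range(begin, end, tx_budget=4.9):
        # Swap to a fresh transaction shortly before the 5-second limit
        # instead of waiting for transaction_too_old to be raised. The new
        # transaction is a different snapshot, so rows read after the swap
        # may reflect writes that happened after the request started.
        tr = db.create_transaction()
        tr.get_read_version().wait()
        started = time.monotonic()
        next_key = begin
        while True:
            if time.monotonic() - started > tx_budget:
                tr = db.create_transaction()
                tr.get_read_version().wait()   # prefetch the GRV up front
                started = time.monotonic()
            batch = list(tr.get_range(next_key, end, limit=1000))
            if not batch:
                return
            for kv in batch:
                yield kv
            next_key = batch[-1].key + b'\x00'  # resume strictly after the last key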

Speaking of replication, I am trying to see what the replicator might
look like in 4.x in the https://github.com/apache/couchdb/pull/3015
(prototype/fdb-replicator branch). It's very much a wip and hot mess
currently. Will issue an RFC once I have a better handle on the
general shape of it. So far it's based on couch_jobs, with a global
queue, and it looks like it might be smaller overall, as it leverages
the scheduling capabilities already present in couch_jobs; once started,
an individual replication job's process hierarchy is largely the same as
before.

Cheers,
-Nick





On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát
 wrote:
>
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
> > I’m not sure why a URL parameter vs. a path makes a big difference?
> >
> > Do you have an example?
> >
> > Best
> > Jan
> > —
>
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
> methods (like [2] out of spec [3]) per path per verb.
> Java's type safety and the way methods are currently generated don't
> really provide an easy way to retrieve multiple kinds of responses, so
> having them separate would help a lot there.
>
>
> Donat
>
> PS. I'm getting self-conscious about discussing this in this thread.
> Should I open a new one?
>
>
> [1] https://openapi-generator.tech/
> [2] 
> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
> [3] 
> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208


Re: Supporting API client generation tools (Was: [DISCUSS] couchdb 4.0 transactional semantics)

2020-07-23 Thread Richard Ellis
For the specific case of _changes I think proper handling of the Accept 
header would make a lot of sense as it is the HTTP way of changing the 
content-type. As such it is usually much better supported by API tooling 
than format switching on query parameters.

The eventsource stream is half-way there in that it returns a 
`content-type: text/event-stream`, but you have to specify 
`feed=eventsource` - it isn't enabled by passing the `Accept: 
text/event-stream` header (and you don't get a 406 Not Acceptable if you 
try; you just get a normal feed response in application/json).

The continuous feed is a little more complicated: it isn't valid as 
`application/json`, but using `text/plain` as a switch is problematic 
because I think one could reasonably expect to accept other feed types 
like normal or longpoll feeds with `Accept: text/plain` as an alternative 
to `application/json`. The output does conform to the http://jsonlines.org/ 
and https://github.com/ndjson/ndjson-spec formats, and they have proposed 
various mime types (`application/x-jsonlines`, `application/x-ndjson`, etc.), 
but given they are not standardized I don't know how much advantage there 
is over using something like `application/x-couch-continuous-json` or 
similar. The mime type `application/json-seq` is backed by 
https://tools.ietf.org/html/rfc7464 but would involve adding a record 
separator character to each line, which may come with a host of problems.
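
For what it's worth, consuming the continuous feed as JSON Lines is already
straightforward; a small Python sketch (host, db and the heartbeat value are
placeholders):

    import json
    import requests

    resp = requests.get(
        "http://localhost:5984/mydb/_changes",
        params={"feed": "continuous", "heartbeat": 10000},
        stream=True,
    )
    for line in resp.iter_lines():
        if not line:                 # heartbeats arrive as blank lines
            continue
        change = json.loads(line)
        print(change.get("seq"), change.get("id"))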

Anyway, I think with some consideration of a suitable mime-type for 
continuous it would potentially be possible to use Accept/Content-Type to 
correctly switch between the different feed formats and improve the API 
without adding new endpoints or necessarily breaking anything. The feed 
parameter could be left operating as it currently does, but deprecated.

From a more general perspective I think that most of the issues I've come 
across when working with Couch and OpenAPI are related to places where 
Couch switches type in a schema.

An example related to _changes would be the `heartbeat` parameter which 
can either be a number or a boolean. While it is technically possible to 
apply multiple types in OpenAPI from a generation perspective this 
inevitably leads to complications in strongly typed languages. I think it 
would often be possible to resolve that type of problem in a non-breaking 
way by continuing to allow both types in the existing parameter, but 
formally declaring parameters with single types (e.g. heartbeat=number, 
heartbeatOn=boolean) thus effectively deprecating the use of multiple 
types within a parameter.
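
A tiny Python sketch of the kind of shim generated clients end up needing
today for that dual-typed parameter (the 60000 ms default is the documented
server default, but treat the number as illustrative):

    def normalize_heartbeat(value):
        # heartbeat may be a boolean (use the default period) or a number of
        # milliseconds; booleans must be checked first since bool is an int
        # subtype in Python.
        if isinstance(value, bool):
            return 60000 if value else None
        if isinstance(value, int):
            return value
        raise TypeError("heartbeat must be a boolean or milliseconds")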

Another example, this time unrelated to _changes, would be field sorts in 
index definitions, which can be either a string "field" for default 
ascending or an object like {"field": "asc"}. When listing indexes, IIRC 
they are returned in the form they were supplied in, but if we could agree, 
for example, to continue to accept both forms when defining indexes but 
always return the "expanded" object form, it would facilitate 
deserialization in generated tooling. I think that would be considered a 
breaking change because under some circumstances the response schema 
changes. I guess the risk is related to how many people rely on being able 
to parse field sorts from index listings in the same format that they 
passed them in, especially since, as soon as there is a descending field, 
at least some entries would be in object form. I expect not many people rely 
on that, but surprise me! Of course there are ways to resolve this without 
breaking anything, e.g. adding a header to toggle the behaviour or 
something, but I guess there is a complexity trade-off to be made against 
the impact.
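
For illustration, the "expanded" normalization is trivial on either side of
the API; a Python sketch:

    def expand_sort(sort_spec):
        # "field" becomes {"field": "asc"}; objects pass through unchanged.
        return [s if isinstance(s, dict) else {s: "asc"} for s in sort_spec]

    # expand_sort(["name", {"age": "desc"}]) == [{"name": "asc"}, {"age": "desc"}]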

Rich



From:   Jan Lehnardt 
To: dev@couchdb.apache.org
Date:   22/07/2020 17:23
Subject:[EXTERNAL] Supporting API client generation tools (Was: 
[DISCUSS] couchdb 4.0 transactional semantics)





> On 22. Jul 2020, at 14:48, Bessenyei Balázs Donát  
wrote:
> 
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
>> I’m not sure why a URL parameter vs. a path makes a big difference?
>> 
>> Do you have an example?
>> 
>> Best
>> Jan
>> —
> 
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
> methods (like [2] out of spec [3]) per path per verb.
> Java's type safety and the way methods are currently generated don't
> really provide an easy way to retrieve multiple kinds of responses, so
> having them separate would help a lot there.

My argument would be that API generation tools that try to abstract over
HTTP but aren’t able to really abstract over HTTP aren’t our place
to fix ;P

But I wouldn’t be averse to adding endpoints that make this easier for
these tools. Although I’m sceptical they can deal with our continuous
modes anyway. Adding endpoints is not a BC break, and I would not
support removing the original versions. We should identify all places
that would be problematic before deciding either way.

Supporting API client generation tools (Was: [DISCUSS] couchdb 4.0 transactional semantics)

2020-07-22 Thread Jan Lehnardt



> On 22. Jul 2020, at 14:48, Bessenyei Balázs Donát  wrote:
> 
> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
>> I’m not sure why a URL parameter vs. a path makes a big difference?
>> 
>> Do you have an example?
>> 
>> Best
>> Jan
>> —
> 
> Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
> methods (like [2] out of spec [3]) per path per verb.
> Java's type safety and the way methods are currently generated don't
> really provide an easy way to retrieve multiple kinds of responses, so
> having them separate would help a lot there.

My argument would be that API generation tools that try to abstract over
HTTP but aren’t able to really abstract over HTTP aren’t our place
to fix ;P

But I wouldn’t be averse to adding endpoints that make this easier for
these tools. Although I’m sceptical they can deal with our continuous
modes anyway. Adding endpoints is not a BC break, and I would not
support removing the original versions. We should identify all places
that would be problematic before deciding either way.

I know a few Cloudant folks have looked at this previously.

I also don’t feel too strongly about this, but I’m happy to have a
discussion on this.

> Donat
> 
> PS. I'm getting self-conscious about discussing this in this thread.
> Should I open a new one?

Done.

Best
Jan
—
> 
> 
> [1] https://openapi-generator.tech/
> [2] 
> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
> [3] 
> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-22 Thread Bessenyei Balázs Donát
On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
> I’m not sure why a URL parameter vs. a path makes a big difference?
>
> Do you have an example?
>
> Best
> Jan
> —

Oh, sure! OpenAPI Generator [1] et al., for example, generate Java
methods (like [2] out of spec [3]) per path per verb.
Java's type safety and the way methods are currently generated don't
really provide an easy way to retrieve multiple kinds of responses, so
having them separate would help a lot there.


Donat

PS. I'm getting self-conscious about discussing this in this thread.
Should I open a new one?


[1] https://openapi-generator.tech/
[2] 
https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
[3] 
https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/api/openapi.yaml#L208


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-21 Thread Jan Lehnardt



> On 21. Jul 2020, at 18:29, Bessenyei Balázs Donát  wrote:
> 
> On Tue, 21 Jul 2020 at 17:42, Jan Lehnardt  wrote:
>> We rather don’t like to break things just because we can :)
>> 
>> Do you have anything specific in mind?
>> 
>> Best
>> Jan
>> —
>> 
> 
> I'm not suggesting that breaking changes should be introduced just for
> the fun of it :)
> Anyway, an example could be the changes feed [1]: it returns JSON,
> line-by-line JSON or EventSource responses (for `normal`, `continuous`
> and `eventsource` modes, respectively).
> This makes integration and tooling around it difficult. One potential
> fix to that could be separating the feed into different URLs (such as
> `_changes`, `_changes/_continuous` and `_changes/_eventsource`).
> 
> Let me know what you think.

I’m not sure why a URL parameter vs. a path makes a big difference?

Do you have an example?

Best
Jan
—

> 
> 
> Donat
> 
> [1] https://docs.couchdb.org/en/stable/api/database/changes.html



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-21 Thread Bessenyei Balázs Donát
On Tue, 21 Jul 2020 at 17:42, Jan Lehnardt  wrote:
> We rather don’t like to break things just because we can :)
>
> Do you have anything specific in mind?
>
> Best
> Jan
> —
>

I'm not suggesting that breaking changes should be introduced just for
the fun of it :)
Anyway, an example could be the changes feed [1]: it returns JSON,
line-by-line JSON or EventSource responses (for `normal`, `continuous`
and `eventsource` modes, respectively).
This makes integration and tooling around it difficult. One potential
fix to that could be separating the feed into different URLs (such as
`_changes`, `_changes/_continuous` and `_changes/_eventsource`).
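
To illustrate why that's awkward for tooling, all three response shapes
currently hang off the same path and are selected by a query parameter
(Python sketch; host and db are placeholders):

    import requests

    BASE = "http://localhost:5984/mydb/_changes"

    normal = requests.get(BASE, params={"feed": "normal"}).json()    # one JSON object
    cont = requests.get(BASE, params={"feed": "continuous"},
                        stream=True)                                 # one JSON object per line
    events = requests.get(BASE, params={"feed": "eventsource"},
                          stream=True)                               # text/event-stream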

Let me know what you think.


Donat

[1] https://docs.couchdb.org/en/stable/api/database/changes.html


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-21 Thread Jan Lehnardt



> On 21. Jul 2020, at 17:24, Bessenyei Balázs Donát  wrote:
> 
> I think being able to leverage FoundationDB's serializability is an
> awesome idea! +1 (non-binding) on all 4 points.
> I also support the idea of changing the API in backwards-incompatible
> ways if that makes things more convenient / streamlined. I wonder,
> does this mean other, backwards-incompatible changes are also welcome
> in the next major? (Given that replicator-compatibility (from later on
> in this thread) is expected.)

We rather don’t like to break things just because we can :)

Do you have anything specific in mind?

Best
Jan
—

> 
> 
> Thank you,
> 
> Donat
> 
> 
> On Thu, 16 Jul 2020 at 18:26, Paul Davis  wrote:
>> 
>> From what I'm reading it sounds like we have general consensus on a few 
>> things:
>> 
>> 1. A single CouchDB API call should map to a single FDB transaction
>> 2. We absolutely do not want to return a valid JSON response to any
>> streaming API that hit a transaction boundary (because data
>> loss/corruption)
>> 3. We're willing to change the API requirements so that 2 is not an issue.
>> 4. None of this applies to continuous changes since that API call was
>> never a single snapshot.
>> 
>> If everyone generally agrees with that summarization, my suggestion
>> would be that we just revisit the new pagination APIs and make them
>> the only behavior rather than having them be opt-in. I believe those
>> APIs already address all the concerns in this thread and the only
>> reason we kept the older versions with `restart_tx` was to maintain
>> API backwards compatibility at the expense of a slight change to
>> semantics of snapshots. However, if there's a consensus that the
>> semantics are more important than allowing a blanket `GET
>> /db/_all_docs` I think it'd make the most sense to just embrace the
>> pagination APIs that already exist and were written to cover these
>> issues.
>> 
>> The only thing I'm not 100% on is how to deal with non-continuous
>> replications. I.e., the older single shot replication. Do we go back
>> with patches to older replicators to allow 4.0 compatibility? Just
>> declare that you have to mediate a replication on the newer of the two
>> CouchDB deployments? Sniff the replicator's UserAgent and behave
>> differently on 4.x for just that special case?
>> 
>> Paul
>> 
>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:
>>> 
>>> Sorry, I also missed that you quoted this specific bit about eagerly 
>>> requesting a new snapshot. Currently the code will just react to the 
>>> transaction expiring, then wait till it acquires a new snapshot if 
>>> “restart_tx” is set (which can take a couple of milliseconds on a 
>>> FoundationDB cluster that is deployed across multiple AZs in a cloud 
>>> Region) and then proceed.
>>> 
>>> Adam
>>> 
 On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
 
 Right now the code has an internal “restart_tx” flag that is used to 
 automatically request a new snapshot if the original one expires and 
 continue streaming the response. It can be used for all manner of 
 multi-row responses, not just _changes.
 
 As this is a pretty big change to the isolation guarantees provided by the 
 database Bob volunteered to elevate the issue to the mailing list for a 
 deeper discussion.
 
 Cheers, Adam
 
> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
> 
> I'm having trouble following the thread...
> 
> On 14/07/2020 14:56, Adam Kocoloski wrote:
>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>> streaming an entire _changes feed), there is a small performance benefit 
>> to requesting a new FDB transaction asynchronously before the old one 
>> actually times out and swapping over to it. That’s a pattern I’ve seen 
>> in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB 
>> yet.
> 
> How does _changes work right now in the proposed 4.0 code?
> 
> -Joan
 
>>> 



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-21 Thread Bessenyei Balázs Donát
I think being able to leverage FoundationDB's serializability is an
awesome idea! +1 (non-binding) on all 4 points.
I also support the idea of changing the API in backwards-incompatible
ways if that makes things more convenient / streamlined. I wonder,
does this mean other, backwards-incompatible changes are also welcome
in the next major? (Given that replicator-compatibility (from later on
in this thread) is expected.)


Thank you,

Donat


On Thu, 16 Jul 2020 at 18:26, Paul Davis  wrote:
>
> From what I'm reading it sounds like we have general consensus on a few 
> things:
>
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any
> streaming API that hit a transaction boundary (because data
> loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was
> never a single snapshot.
>
> If everyone generally agrees with that summarization, my suggestion
> would be that we just revisit the new pagination APIs and make them
> the only behavior rather than having them be opt-in. I believe those
> APIs already address all the concerns in this thread and the only
> reason we kept the older versions with `restart_tx` was to maintain
> API backwards compatibility at the expense of a slight change to
> semantics of snapshots. However, if there's a consensus that the
> semantics are more important than allowing a blanket `GET
> /db/_all_docs` I think it'd make the most sense to just embrace the
> pagination APIs that already exist and were written to cover these
> issues.
>
> The only thing I'm not 100% on is how to deal with non-continuous
> replications. I.e., the older single shot replication. Do we go back
> with patches to older replicators to allow 4.0 compatibility? Just
> declare that you have to mediate a replication on the newer of the two
> CouchDB deployments? Sniff the replicator's UserAgent and behave
> differently on 4.x for just that special case?
>
> Paul
>
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:
> >
> > Sorry, I also missed that you quoted this specific bit about eagerly 
> > requesting a new snapshot. Currently the code will just react to the 
> > transaction expiring, then wait till it acquires a new snapshot if 
> > “restart_tx” is set (which can take a couple of milliseconds on a 
> > FoundationDB cluster that is deployed across multiple AZs in a cloud 
> > Region) and then proceed.
> >
> > Adam
> >
> > > On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
> > >
> > > Right now the code has an internal “restart_tx” flag that is used to 
> > > automatically request a new snapshot if the original one expires and 
> > > continue streaming the response. It can be used for all manner of 
> > > multi-row responses, not just _changes.
> > >
> > > As this is a pretty big change to the isolation guarantees provided by 
> > > the database Bob volunteered to elevate the issue to the mailing list for 
> > > a deeper discussion.
> > >
> > > Cheers, Adam
> > >
> > >> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
> > >>
> > >> I'm having trouble following the thread...
> > >>
> > >> On 14/07/2020 14:56, Adam Kocoloski wrote:
> > >>> For cases where you’re not concerned about the snapshot isolation (e.g. 
> > >>> streaming an entire _changes feed), there is a small performance 
> > >>> benefit to requesting a new FDB transaction asynchronously before the 
> > >>> old one actually times out and swapping over to it. That’s a pattern 
> > >>> I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere 
> > >>> in CouchDB yet.
> > >>
> > >> How does _changes work right now in the proposed 4.0 code?
> > >>
> > >> -Joan
> > >
> >


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-17 Thread Robert Samuel Newson
No, I would not. I was thinking only of the previous major release, so a 3.x.y 
that adds bidirectional replication compatibility with 4.0.0.

B.

> On 16 Jul 2020, at 21:50, Joan Touzet  wrote:
> 
> 
> 
> On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:
>> Agreed on all 4 points. On the final point, it's worth noting that a 
>> continuous changes feed was two-phase, the first is indeed over a snapshot 
>> of the db as of the start of the _changes request, the second phase is an 
>> endless series of subsequent snapshots. the 4.0 behaviour won't exactly 
>> match that but it's definitely in the same spirit.
>> Agreed also on requiring pagination (I've not reviewed the proposed 
>> pagination api in sufficient detail to +1 it yet). Would we start the 
>> response as rows are retrieved, though? That's my preference, with an 
>> unclean termination if we hit txn_too_old, and an upper bound on the "limit" 
>> parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.
>> On compatibility, there's precedent for a minor release of old branches just 
>> to add replicator compatibility. for example, the replicator could call 
>> _changes again if it received a complete _changes response (i.e, one that 
>> ended with a } that completes the json object) that did not include a 
>> "last_seq" row. The 4.0 replicator would always do this.
> 
> I wouldn't really want to release a new 1.x, would you? Augh.
> 
> If we're going to change how replication works, wouldn't it better to simply 
> say "there is no guaranteed one-shot replication back from 4.x to 1.x?" Or, 
> intentionally break backward compatibility so one-shot replication to 
> un-upgraded old Couches refuses to work at all? This would prevent the 
> confusion by making it clear - you can't do things this way anymore.
> 
> We could do a point release of 3.x, sure.
> 
> -Joan
> 
>> B.
>>> On 16 Jul 2020, at 17:25, Paul Davis  wrote:
>>> 
>>> From what I'm reading it sounds like we have general consensus on a few 
>>> things:
>>> 
>>> 1. A single CouchDB API call should map to a single FDB transaction
>>> 2. We absolutely do not want to return a valid JSON response to any
>>> streaming API that hit a transaction boundary (because data
>>> loss/corruption)
>>> 3. We're willing to change the API requirements so that 2 is not an issue.
>>> 4. None of this applies to continuous changes since that API call was
>>> never a single snapshot.
>>> 
>>> If everyone generally agrees with that summarization, my suggestion
>>> would be that we just revisit the new pagination APIs and make them
>>> the only behavior rather than having them be opt-in. I believe those
>>> APIs already address all the concerns in this thread and the only
>>> reason we kept the older versions with `restart_tx` was to maintain
>>> API backwards compatibility at the expense of a slight change to
>>> semantics of snapshots. However, if there's a consensus that the
>>> semantics are more important than allowing a blanket `GET
>>> /db/_all_docs` I think it'd make the most sense to just embrace the
>>> pagination APIs that already exist and were written to cover these
>>> issues.
>>> 
>>> The only thing I'm not 100% on is how to deal with non-continuous
>>> replications. I.e., the older single shot replication. Do we go back
>>> with patches to older replicators to allow 4.0 compatibility? Just
>>> declare that you have to mediate a replication on the newer of the two
>>> CouchDB deployments? Sniff the replicator's UserAgent and behave
>>> differently on 4.x for just that special case?
>>> 
>>> Paul
>>> 
>>> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:
 
 Sorry, I also missed that you quoted this specific bit about eagerly 
 requesting a new snapshot. Currently the code will just react to the 
 transaction expiring, then wait till it acquires a new snapshot if 
 “restart_tx” is set (which can take a couple of milliseconds on a 
 FoundationDB cluster that is deployed across multiple AZs in a cloud 
 Region) and then proceed.
 
 Adam
 
> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
> 
> Right now the code has an internal “restart_tx” flag that is used to 
> automatically request a new snapshot if the original one expires and 
> continue streaming the response. It can be used for all manner of 
> multi-row responses, not just _changes.
> 
> As this is a pretty big change to the isolation guarantees provided by 
> the database Bob volunteered to elevate the issue to the mailing list for 
> a deeper discussion.
> 
> Cheers, Adam
> 
>> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
>> 
>> I'm having trouble following the thread...
>> 
>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>>> streaming an entire _changes feed), there is a small performance 
>>> benefit to 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-16 Thread Joan Touzet




On 2020-07-16 4:50 p.m., Joan Touzet wrote:



On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:


Agreed on all 4 points. On the final point, it's worth noting that a 
continuous changes feed was two-phase, the first is indeed over a 
snapshot of the db as of the start of the _changes request, the second 
phase is an endless series of subsequent snapshots. the 4.0 behaviour 
won't exactly match that but it's definitely in the same spirit.


Agreed also on requiring pagination (I've not reviewed the proposed 
pagination api in sufficient detail to +1 it yet). Would we start the 
response as rows are retrieved, though? That's my preference, with an 
unclean termination if we hit txn_too_old, and an upper bound on the 
"limit" parameter or equivalent chosen such that txn_too_old is 
vanishingly unlikely.


On compatibility, there's precedent for a minor release of old 
branches just to add replicator compatibility. for example, the 
replicator could call _changes again if it received a complete 
_changes response (i.e, one that ended with a } that completes the 
json object) that did not include a "last_seq" row. The 4.0 replicator 
would always do this.


I wouldn't really want to release a new 1.x, would you? Augh.

If we're going to change how replication works, wouldn't it better to 
simply say "there is no guaranteed one-shot replication back from 4.x to 
1.x?" Or, intentionally break backward compatibility so one-shot 
replication to un-upgraded old Couches refuses to work at all? This 
would prevent the confusion by making it clear - you can't do things 
this way anymore.


Sorry, meant to say we publish that the workaround is you need either a 
"push" replication from 4.x -> 1.x, or must use a hypothetically patched 
3.x+ replicator as a "third party" to replicate successfully from 4.x -> 
non-patched older CouchDBs.


I'd rather support this scenario than have to support explaining why the 
"one shot" replication back to an old 1.x, when initiated by a 1.x 
cluster, is returning results "ahead" of the time at which the one-shot 
replication was started.




We could do a point release of 3.x, sure.

-Joan



B.

On 16 Jul 2020, at 17:25, Paul Davis  
wrote:


 From what I'm reading it sounds like we have general consensus on a 
few things:


1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any
streaming API that hit a transaction boundary (because data
loss/corruption)
3. We're willing to change the API requirements so that 2 is not an 
issue.

4. None of this applies to continuous changes since that API call was
never a single snapshot.

If everyone generally agrees with that summarization, my suggestion
would be that we just revisit the new pagination APIs and make them
the only behavior rather than having them be opt-in. I believe those
APIs already address all the concerns in this thread and the only
reason we kept the older versions with `restart_tx` was to maintain
API backwards compatibility at the expense of a slight change to
semantics of snapshots. However, if there's a consensus that the
semantics are more important than allowing a blanket `GET
/db/_all_docs` I think it'd make the most sense to just embrace the
pagination APIs that already exist and were written to cover these
issues.

The only thing I'm not 100% on is how to deal with non-continuous
replications. I.e., the older single shot replication. Do we go back
with patches to older replicators to allow 4.0 compatibility? Just
declare that you have to mediate a replication on the newer of the two
CouchDB deployments? Sniff the replicator's UserAgent and behave
differently on 4.x for just that special case?

Paul

On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  
wrote:


Sorry, I also missed that you quoted this specific bit about eagerly 
requesting a new snapshot. Currently the code will just react to the 
transaction expiring, then wait till it acquires a new snapshot if 
“restart_tx” is set (which can take a couple of milliseconds on a 
FoundationDB cluster that is deployed across multiple AZs in a cloud 
Region) and then proceed.


Adam

On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  
wrote:


Right now the code has an internal “restart_tx” flag that is used 
to automatically request a new snapshot if the original one expires 
and continue streaming the response. It can be used for all manner 
of multi-row responses, not just _changes.


As this is a pretty big change to the isolation guarantees provided 
by the database Bob volunteered to elevate the issue to the mailing 
list for a deeper discussion.


Cheers, Adam


On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:

I'm having trouble following the thread...

On 14/07/2020 14:56, Adam Kocoloski wrote:
For cases where you’re not concerned about the snapshot isolation 
(e.g. streaming an entire _changes feed), there is a small 
performance benefit to requesting a new FDB transaction 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-16 Thread Joan Touzet




On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:


Agreed on all 4 points. On the final point, it's worth noting that a continuous 
changes feed was two-phase, the first is indeed over a snapshot of the db as of 
the start of the _changes request, the second phase is an endless series of 
subsequent snapshots. the 4.0 behaviour won't exactly match that but it's 
definitely in the same spirit.

Agreed also on requiring pagination (I've not reviewed the proposed pagination api in 
sufficient detail to +1 it yet). Would we start the response as rows are retrieved, 
though? That's my preference, with an unclean termination if we hit txn_too_old, and an 
upper bound on the "limit" parameter or equivalent chosen such that txn_too_old 
is vanishingly unlikely.

On compatibility, there's precedent for a minor release of old branches just to add 
replicator compatibility. for example, the replicator could call _changes again if it 
received a complete _changes response (i.e, one that ended with a } that completes the 
json object) that did not include a "last_seq" row. The 4.0 replicator would 
always do this.


I wouldn't really want to release a new 1.x, would you? Augh.

If we're going to change how replication works, wouldn't it better to 
simply say "there is no guaranteed one-shot replication back from 4.x to 
1.x?" Or, intentionally break backward compatibility so one-shot 
replication to un-upgraded old Couches refuses to work at all? This 
would prevent the confusion by making it clear - you can't do things 
this way anymore.


We could do a point release of 3.x, sure.

-Joan



B.


On 16 Jul 2020, at 17:25, Paul Davis  wrote:

 From what I'm reading it sounds like we have general consensus on a few things:

1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any
streaming API that hit a transaction boundary (because data
loss/corruption)
3. We're willing to change the API requirements so that 2 is not an issue.
4. None of this applies to continuous changes since that API call was
never a single snapshot.

If everyone generally agrees with that summarization, my suggestion
would be that we just revisit the new pagination APIs and make them
the only behavior rather than having them be opt-in. I believe those
APIs already address all the concerns in this thread and the only
reason we kept the older versions with `restart_tx` was to maintain
API backwards compatibility at the expense of a slight change to
semantics of snapshots. However, if there's a consensus that the
semantics are more important than allowing a blanket `GET
/db/_all_docs` I think it'd make the most sense to just embrace the
pagination APIs that already exist and were written to cover these
issues.

The only thing I'm not 100% on is how to deal with non-continuous
replications. I.e., the older single shot replication. Do we go back
with patches to older replicators to allow 4.0 compatibility? Just
declare that you have to mediate a replication on the newer of the two
CouchDB deployments? Sniff the replicator's UserAgent and behave
differently on 4.x for just that special case?

Paul

On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:


Sorry, I also missed that you quoted this specific bit about eagerly requesting 
a new snapshot. Currently the code will just react to the transaction expiring, 
then wait till it acquires a new snapshot if “restart_tx” is set (which can 
take a couple of milliseconds on a FoundationDB cluster that is deployed across 
multiple AZs in a cloud Region) and then proceed.

Adam


On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:

Right now the code has an internal “restart_tx” flag that is used to 
automatically request a new snapshot if the original one expires and continue 
streaming the response. It can be used for all manner of multi-row responses, 
not just _changes.

As this is a pretty big change to the isolation guarantees provided by the 
database Bob volunteered to elevate the issue to the mailing list for a deeper 
discussion.

Cheers, Adam


On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:

I'm having trouble following the thread...

On 14/07/2020 14:56, Adam Kocoloski wrote:

For cases where you’re not concerned about the snapshot isolation (e.g. 
streaming an entire _changes feed), there is a small performance benefit to 
requesting a new FDB transaction asynchronously before the old one actually 
times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
layers but I’m not sure we’ve used it anywhere in CouchDB yet.


How does _changes work right now in the proposed 4.0 code?

-Joan








Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-16 Thread Robert Samuel Newson


Agreed on all 4 points. On the final point, it's worth noting that a continuous 
changes feed was two-phase: the first phase is indeed over a snapshot of the db as 
of the start of the _changes request, and the second phase is an endless series of 
subsequent snapshots. The 4.0 behaviour won't exactly match that, but it's 
definitely in the same spirit.

Agreed also on requiring pagination (I've not reviewed the proposed pagination 
api in sufficient detail to +1 it yet). Would we start the response as rows are 
retrieved, though? That's my preference, with an unclean termination if we hit 
txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen 
such that txn_too_old is vanishingly unlikely.
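
As a sketch of what a bounded "limit" looks like from a client's perspective
(this uses the classic startkey/skip idiom rather than the proposed bookmark
API; host, db and page size are placeholders):

    import json
    import requests

    def all_docs_pages(url, db, page_size=1000):
        # Each request stays small enough to fit comfortably in one FDB
        # transaction; the client stitches the pages back together.
        params = {"limit": page_size}
        while True:
            rows = requests.get(f"{url}/{db}/_all_docs", params=params).json()["rows"]
            if not rows:
                return
            yield from rows
            params = {"limit": page_size,
                      "startkey": json.dumps(rows[-1]["key"]),
                      "skip": 1}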

On compatibility, there's precedent for a minor release of old branches just to 
add replicator compatibility. For example, the replicator could call _changes 
again if it received a complete _changes response (i.e., one that ended with a } 
that completes the JSON object) that did not include a "last_seq" row. The 4.0 
replicator would always do this.
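
Roughly, the replicator-side loop would be (Python sketch; today's responses
always carry "last_seq", so the missing-key case is the hypothetical signal
described above):

    import requests

    def changes_since(url, db, since="0", batch=1000):
        while True:
            body = requests.get(f"{url}/{db}/_changes",
                                params={"since": since, "limit": batch}).json()
            for row in body.get("results", []):
                yield row
                since = row["seq"]
            if "last_seq" in body:
                return   # complete snapshot reached; otherwise call again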

B.

> On 16 Jul 2020, at 17:25, Paul Davis  wrote:
> 
> From what I'm reading it sounds like we have general consensus on a few 
> things:
> 
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any
> streaming API that hit a transaction boundary (because data
> loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was
> never a single snapshot.
> 
> If everyone generally agrees with that summarization, my suggestion
> would be that we just revisit the new pagination APIs and make them
> the only behavior rather than having them be opt-in. I believe those
> APIs already address all the concerns in this thread and the only
> reason we kept the older versions with `restart_tx` was to maintain
> API backwards compatibility at the expense of a slight change to
> semantics of snapshots. However, if there's a consensus that the
> semantics are more important than allowing a blanket `GET
> /db/_all_docs` I think it'd make the most sense to just embrace the
> pagination APIs that already exist and were written to cover these
> issues.
> 
> The only thing I'm not 100% on is how to deal with non-continuous
> replications. I.e., the older single shot replication. Do we go back
> with patches to older replicators to allow 4.0 compatibility? Just
> declare that you have to mediate a replication on the newer of the two
> CouchDB deployments? Sniff the replicator's UserAgent and behave
> differently on 4.x for just that special case?
> 
> Paul
> 
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:
>> 
>> Sorry, I also missed that you quoted this specific bit about eagerly 
>> requesting a new snapshot. Currently the code will just react to the 
>> transaction expiring, then wait till it acquires a new snapshot if 
>> “restart_tx” is set (which can take a couple of milliseconds on a 
>> FoundationDB cluster that is deployed across multiple AZs in a cloud Region) 
>> and then proceed.
>> 
>> Adam
>> 
>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
>>> 
>>> Right now the code has an internal “restart_tx” flag that is used to 
>>> automatically request a new snapshot if the original one expires and 
>>> continue streaming the response. It can be used for all manner of multi-row 
>>> responses, not just _changes.
>>> 
>>> As this is a pretty big change to the isolation guarantees provided by the 
>>> database Bob volunteered to elevate the issue to the mailing list for a 
>>> deeper discussion.
>>> 
>>> Cheers, Adam
>>> 
 On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
 
 I'm having trouble following the thread...
 
 On 14/07/2020 14:56, Adam Kocoloski wrote:
> For cases where you’re not concerned about the snapshot isolation (e.g. 
> streaming an entire _changes feed), there is a small performance benefit 
> to requesting a new FDB transaction asynchronously before the old one 
> actually times out and swapping over to it. That’s a pattern I’ve seen in 
> other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
 
 How does _changes work right now in the proposed 4.0 code?
 
 -Joan
>>> 
>> 



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-16 Thread Paul Davis
From what I'm reading it sounds like we have general consensus on a few things:

1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any
streaming API that hit a transaction boundary (because data
loss/corruption)
3. We're willing to change the API requirements so that 2 is not an issue.
4. None of this applies to continuous changes since that API call was
never a single snapshot.

If everyone generally agrees with that summarization, my suggestion
would be that we just revisit the new pagination APIs and make them
the only behavior rather than having them be opt-in. I believe those
APIs already address all the concerns in this thread and the only
reason we kept the older versions with `restart_tx` was to maintain
API backwards compatibility at the expense of a slight change to
semantics of snapshots. However, if there's a consensus that the
semantics are more important than allowing a blanket `GET
/db/_all_docs` I think it'd make the most sense to just embrace the
pagination APIs that already exist and were written to cover these
issues.

The only thing I'm not 100% on is how to deal with non-continuous
replications. I.e., the older single shot replication. Do we go back
with patches to older replicators to allow 4.0 compatibility? Just
declare that you have to mediate a replication on the newer of the two
CouchDB deployments? Sniff the replicator's UserAgent and behave
differently on 4.x for just that special case?
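
Purely as an illustration of the UserAgent option (the header format and the
version parsing here are assumptions, not what the replicator actually sends):

    def wants_legacy_streaming(user_agent):
        # Assume older replicators identify themselves as
        # "CouchDB-Replicator/<major>.<minor>.<patch>"; anything below 4
        # would get the old server-side restart_tx-style behaviour.
        prefix = "CouchDB-Replicator/"
        if not user_agent or not user_agent.startswith(prefix):
            return False
        major = user_agent[len(prefix):].split(".", 1)[0]
        return major.isdigit() and int(major) < 4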

Paul

On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski  wrote:
>
> Sorry, I also missed that you quoted this specific bit about eagerly 
> requesting a new snapshot. Currently the code will just react to the 
> transaction expiring, then wait till it acquires a new snapshot if 
> “restart_tx” is set (which can take a couple of milliseconds on a 
> FoundationDB cluster that is deployed across multiple AZs in a cloud Region) 
> and then proceed.
>
> Adam
>
> > On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
> >
> > Right now the code has an internal “restart_tx” flag that is used to 
> > automatically request a new snapshot if the original one expires and 
> > continue streaming the response. It can be used for all manner of multi-row 
> > responses, not just _changes.
> >
> > As this is a pretty big change to the isolation guarantees provided by the 
> > database Bob volunteered to elevate the issue to the mailing list for a 
> > deeper discussion.
> >
> > Cheers, Adam
> >
> >> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
> >>
> >> I'm having trouble following the thread...
> >>
> >> On 14/07/2020 14:56, Adam Kocoloski wrote:
> >>> For cases where you’re not concerned about the snapshot isolation (e.g. 
> >>> streaming an entire _changes feed), there is a small performance benefit 
> >>> to requesting a new FDB transaction asynchronously before the old one 
> >>> actually times out and swapping over to it. That’s a pattern I’ve seen in 
> >>> other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
> >>
> >> How does _changes work right now in the proposed 4.0 code?
> >>
> >> -Joan
> >
>


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Adam Kocoloski
Sorry, I also missed that you quoted this specific bit about eagerly requesting 
a new snapshot. Currently the code will just react to the transaction expiring, 
then wait till it acquires a new snapshot if “restart_tx” is set (which can 
take a couple of milliseconds on a FoundationDB cluster that is deployed across 
multiple AZs in a cloud Region) and then proceed.

Adam

> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski  wrote:
> 
> Right now the code has an internal “restart_tx” flag that is used to 
> automatically request a new snapshot if the original one expires and continue 
> streaming the response. It can be used for all manner of multi-row responses, 
> not just _changes.
> 
> As this is a pretty big change to the isolation guarantees provided by the 
> database Bob volunteered to elevate the issue to the mailing list for a 
> deeper discussion.
> 
> Cheers, Adam
> 
>> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
>> 
>> I'm having trouble following the thread...
>> 
>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>>> streaming an entire _changes feed), there is a small performance benefit to 
>>> requesting a new FDB transaction asynchronously before the old one actually 
>>> times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
>>> layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>> 
>> How does _changes work right now in the proposed 4.0 code?
>> 
>> -Joan
> 



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Adam Kocoloski
Right now the code has an internal “restart_tx” flag that is used to 
automatically request a new snapshot if the original one expires and continue 
streaming the response. It can be used for all manner of multi-row responses, 
not just _changes.
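
In rough Python terms (against the FoundationDB bindings rather than the
actual Erlang code), the behaviour amounts to something like:

    import fdb

    fdb.api_version(630)
    db = fdb.open()

    def restartable_range(begin, end, restart_tx=True):
        # If the snapshot expires mid-stream (error 1007, transaction_too_old),
        # acquire a new one and resume after the last key already emitted.
        # The new transaction is a different snapshot, which is exactly the
        # isolation change under discussion.
        next_key = begin
        while True:
            tr = db.create_transaction()
            try:
                for kv in tr.get_range(next_key, end):
                    next_key = kv.key + b'\x00'
                    yield kv
                return
            except fdb.FDBError as e:
                if restart_tx and e.code == 1007:
                    continue
                raise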

As this is a pretty big change to the isolation guarantees provided by the 
database Bob volunteered to elevate the issue to the mailing list for a deeper 
discussion.

Cheers, Adam

> On Jul 15, 2020, at 11:38 AM, Joan Touzet  wrote:
> 
> I'm having trouble following the thread...
> 
> On 14/07/2020 14:56, Adam Kocoloski wrote:
>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>> streaming an entire _changes feed), there is a small performance benefit to 
>> requesting a new FDB transaction asynchronously before the old one actually 
>> times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
>> layers but I’m not sure we’ve used it anywhere in CouchDB yet.
> 
> How does _changes work right now in the proposed 4.0 code?
> 
> -Joan



Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Joan Touzet

I'm having trouble following the thread...

On 14/07/2020 14:56, Adam Kocoloski wrote:

For cases where you’re not concerned about the snapshot isolation (e.g. 
streaming an entire _changes feed), there is a small performance benefit to 
requesting a new FDB transaction asynchronously before the old one actually 
times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
layers but I’m not sure we’ve used it anywhere in CouchDB yet.


How does _changes work right now in the proposed 4.0 code?

-Joan


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Jan Lehnardt



> On 15. Jul 2020, at 16:12, Robert Newson  wrote:
> 
> 
> Thanks Jan
> 
> I would prefer not to have the configuration switch, instead remove what we 
> don’t want. As you said there’ll be a 3 / 4 split for a while (and not just 
> for this reason). 

I’d support an effort for folks to ease into 4.x, as long as it is not the 
default behaviour. I haven’t thought about this enough to have a definite 
opinion about what that looks like.

Best
Jan
—
> -- 
>  Robert Samuel Newson
>  rnew...@apache.org
> 
> On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
>> 
>>> On 14. Jul 2020, at 18:00, Adam Kocoloski  wrote:
>>> 
>>> I think there’s tremendous value in being able to tell our users that each 
>>> response served by CouchDB is constructed from a single isolated snapshot 
>>> of the underlying database. I’d advocate for this being the default 
>>> behavior of 4.0.
>> 
>> I too am in favour of this. I apologise for not speaking up in the 
>> earlier thread, which I followed closely, but never found the time to 
>> respond to.
>> 
>> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. 
>> While this does indeed mean a BC break, it teaches the right semantics 
>> for folks on 4.0 and onwards. For client libraries like our own nano, 
>> we can easily wrap this behaviour, so the resulting API is mostly 
>> compatible still (at least when used in streaming mode, less so when 
>> buffering a big _all_docs response).
>> 
>>> If folks wanted to add an opt-in compatibility mode to support longer 
>>> responses, I suppose that could be OK. I think we should discourage that 
>>> access pattern in general, though, as it’s somewhat less friendly to 
>>> various other parts of the stack than a pattern of shorter responses and a 
>>> smart pagination API like the one we’re introducing. To wit, I don’t think 
>>> we’d want to support that compatibility mode in IBM Cloud.
>> 
>> Like Adam, I do not mind a compat mode, either through a different API 
>> endpoint, or even a config option. I think we will be fine in getting 
>> people on this path when we document this in our update guide for the 
>> 4.0 release. I don’t think this will lead to a Python 2/3 situation 
>> overall, because the 4.0+ features are compelling enough for relatively 
>> small changes required, and CouchDB 3.x in its then latest form will 
>> continue to be a fine database for years to come, for folks who can’t 
>> upgrade as easily. So yes, I anticipate we’ll live in a two-versions 
>> world a little longer than we did during 1.x to 2.x, but the reasons to 
>> leave 1.x behind were a little more severe than the improvements of 4.x 
>> over 3.x (while still significant, of course).
>> 
>> Best
>> Jan
>> —
>> 
>>> 
>>> Adam
>>> 
 On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson  
 wrote:
 
 Thanks Nick, very helpful, and it vindicates me opening this thread.
 
 I don't accept Mike Rhodes argument at all but I should explain why I 
 don't;
 
 In CouchDB 1.x, a response was generated from a single .couch file. There 
 was always a window between the start of the request as the client sees it 
 and CouchDB acquiring a snapshot of the relevant database. I don't think 
 that gap is meaningful, nor does it refute our statements of the time that 
 CouchDB responses are from a snapshot (specifically, that no change to the 
 database made _during_ the response will be visible in _this_ response). 
 In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically 
 consists of multiple shards, each of which, once opened, remains 
 snapshotted for the duration of that response. The difference between 1.x 
 and 2.x/3.x is that the window is potentially larger (though the requests 
 are issued in parallel). The response, however much it returned, was 
 impervious to changes in other requests once it had begun.
 
 I don't think _all_docs, _view or a non-continuous _changes response 
 should allow changes made in other requests to appear midway through them 
 and I want to hear the opinions of folks that have watched over CouchDB 
 from its earliest days on this specific point (If I must name names, at 
 least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating 
 from this semantic, I will go with the majority.
 
 If we were to agree to preserve the 'single snapshot' behaviour, what 
 would the behaviour be if we can't honour it because of the FoundationDB 
 transaction limits?
 
 I see a few options.
 
 1) We could end the response uncleanly, mid-response. CouchDB does this 
 when it has no alternative, and it is ugly, but it is usually handled well 
 by clients. They are at least not usually convinced they got a complete 
 response if they are using a competent HTTP client.
 
 2) We could disavow the streaming API, as you've suggested, attempt to 
 gather the full 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Robert Newson


Thanks Jan

I would prefer not to have the configuration switch, instead remove what we 
don’t want. As you said there’ll be a 3 / 4 split for a while (and not just for 
this reason). 
-- 
  Robert Samuel Newson
  rnew...@apache.org

On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
> 
> > On 14. Jul 2020, at 18:00, Adam Kocoloski  wrote:
> > 
> > I think there’s tremendous value in being able to tell our users that each 
> > response served by CouchDB is constructed from a single isolated snapshot 
> > of the underlying database. I’d advocate for this being the default 
> > behavior of 4.0.
> 
> I too am in favour of this. I apologise for not speaking up in the 
> earlier thread, which I followed closely, but never found the time to 
> respond to.
> 
> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. 
> While this does indeed mean a BC break, it teaches the right semantics 
> for folks on 4.0 and onwards. For client libraries like our own nano, 
> we can easily wrap this behaviour, so the resulting API is mostly 
> compatible still, at least when used in streaming mode, less so when 
> buffering a big _all_docs response.
> 
> > If folks wanted to add an opt-in compatibility mode to support longer 
> > responses, I suppose that could be OK. I think we should discourage that 
> > access pattern in general, though, as it’s somewhat less friendly to 
> > various other parts of the stack than a pattern of shorter responses and a 
> > smart pagination API like the one we’re introducing. To wit, I don’t think 
> > we’d want to support that compatibility mode in IBM Cloud.
> 
> Like Adam, I do not mind a compat mode, either through a different API 
> endpoint, or even a config option. I think we will be fine in getting 
> people on this path when we document this in our update guide for the 
> 4.0 release. I don’t think this will lead to a Python 2/3 situation 
> overall, because the 4.0+ features are compelling enough to justify the 
> relatively small changes required, and CouchDB 3.x in its then latest form will 
> continue to be a fine database for years to come, for folks who can’t 
> upgrade as easily. So yes, I anticipate we’ll live in a two-versions 
> world a little longer than we did during 1.x to 2.x, but the reasons to 
> leave 1.x behind were a little more severe than the improvements of 4.x 
> over 3.x (while still significant, of course).
> 
> Best
> Jan
> —
> 
> > 
> > Adam
> > 
> >> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson  
> >> wrote:
> >> 
> >> Thanks Nick, very helpful, and it vindicates me opening this thread.
> >> 
> >> I don't accept Mike Rhodes' argument at all, but I should explain why I 
> >> don't;
> >> 
> >> In CouchDB 1.x, a response was generated from a single .couch file. There 
> >> was always a window between the start of the request as the client sees it 
> >> and CouchDB acquiring a snapshot of the relevant database. I don't think 
> >> that gap is meaningful, nor does it refute our statements of the time that 
> >> CouchDB responses are from a snapshot (specifically, that no change to the 
> >> database made _during_ the response will be visible in _this_ response). 
> >> In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically 
> >> consists of multiple shards, each of which, once opened, remains 
> >> snapshotted for the duration of that response. The difference between 1.x 
> >> and 2.x/3.x is that the window is potentially larger (though the requests 
> >> are issued in parallel). The response, however much it returned, was 
> >> impervious to changes in other requests once it had begun.
> >> 
> >> I don't think _all_docs, _view or a non-continuous _changes response 
> >> should allow changes made in other requests to appear midway through them 
> >> and I want to hear the opinions of folks that have watched over CouchDB 
> >> from its earliest days on this specific point (If I must name names, at 
> >> least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating 
> >> from this semantic, I will go with the majority.
> >> 
> >> If we were to agree to preserve the 'single snapshot' behaviour, what 
> >> would the behaviour be if we can't honour it because of the FoundationDB 
> >> transaction limits?
> >> 
> >> I see a few options.
> >> 
> >> 1) We could end the response uncleanly, mid-response. CouchDB does this 
> >> when it has no alternative, and it is ugly, but it is usually handled well 
> >> by clients. They are at least not usually convinced they got a complete 
> >> response if they are using a competent HTTP client.
> >> 
> >> 2) We could disavow the streaming API, as you've suggested, attempt to 
> >> gather the full response. If we do this within the FDB bounds, return a 
> >> 200 code and the response body. A 400 and an error body if we don't.
> >> 
> >> 3) We could make the "limit" parameter mandatory and with an upper bound, 
> >> in combination with 1 or 2, such that a valid request is very likely to be 
> >> 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Jan Lehnardt


> On 14. Jul 2020, at 18:00, Adam Kocoloski  wrote:
> 
> I think there’s tremendous value in being able to tell our users that each 
> response served by CouchDB is constructed from a single isolated snapshot of 
> the underlying database. I’d advocate for this being the default behavior of 
> 4.0.

I too am in favour of this. I apologise for not speaking up in the earlier 
thread, which I followed closely, but never found the time to respond to.

From rnewson’s options, I’d suggest 3. the mandatory limit parameter. While 
this does indeed mean a BC break, it teaches the right semantics for folks on 
4.0 and onwards. For client libraries like our own nano, we can easily wrap 
this behaviour, so the resulting API is mostly compatible still, at least when 
used in streaming mode, less so when buffering a big _all_docs response.
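
As a rough illustration of what such a wrapper could look like (a minimal
Python sketch, not nano itself; it assumes the paginated endpoint takes a
limit parameter and returns an opaque bookmark field, as discussed elsewhere
in this thread, so the exact names are illustrative rather than settled API):

    import requests

    def iterate_all_docs(base_url, db, page_size=200, session=None):
        """Yield _all_docs rows while hiding the mandatory limit/bookmark
        pagination behind a single iterator. Each underlying request stays
        small enough to be served from one snapshot."""
        session = session or requests.Session()
        params = {"limit": page_size}
        while True:
            resp = session.get(f"{base_url}/{db}/_all_docs", params=params)
            resp.raise_for_status()
            body = resp.json()
            rows = body.get("rows", [])
            for row in rows:
                yield row
            bookmark = body.get("bookmark")
            # No bookmark (or an empty page) is taken to mean we are done.
            if not bookmark or not rows:
                break
            params = {"limit": page_size, "bookmark": bookmark}

A wrapper along these lines keeps the streaming ergonomics for callers while
each individual request stays well inside the transaction limits on the server.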

> If folks wanted to add an opt-in compatibility mode to support longer 
> responses, I suppose that could be OK. I think we should discourage that 
> access pattern in general, though, as it’s somewhat less friendly to various 
> other parts of the stack than a pattern of shorter responses and a smart 
> pagination API like the one we’re introducing. To wit, I don’t think we’d 
> want to support that compatibility mode in IBM Cloud.

Like Adam, I do not mind a compat mode, either through a different API 
endpoint, or even a config option. I think we will be fine in getting people on 
this path when we document this in our update guide for the 4.0 release. I 
don’t think this will lead to a Python 2/3 situation overall, because the 4.0+ 
features are compelling enough to justify the relatively small changes required, and 
CouchDB 3.x in its then latest form will continue to be a fine database for 
years to come, for folks who can’t upgrade as easily. So yes, I anticipate 
we’ll live in a two-versions world a little longer than we did during 1.x to 
2.x, but the reasons to leave 1.x behind were a little more severe than the 
improvements of 4.x over 3.x (while still significant, of course).

Best
Jan
—

> 
> Adam
> 
>> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson  
>> wrote:
>> 
>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>> 
>> I don't accept Mike Rhodes' argument at all, but I should explain why I don't;
>> 
>> In CouchDB 1.x, a response was generated from a single .couch file. There 
>> was always a window between the start of the request as the client sees it 
>> and CouchDB acquiring a snapshot of the relevant database. I don't think 
>> that gap is meaningful, nor does it refute our statements of the time that 
>> CouchDB responses are from a snapshot (specifically, that no change to the 
>> database made _during_ the response will be visible in _this_ response). In 
>> CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists 
>> of multiple shards, each of which, once opened, remains snapshotted for the 
>> duration of that response. The difference between 1.x and 2.x/3.x is that 
>> the window is potentially larger (though the requests are issued in 
>> parallel). The response, however much it returned, was impervious to changes 
>> in other requests once it had begun.
>> 
>> I don't think _all_docs, _view or a non-continuous _changes response should 
>> allow changes made in other requests to appear midway through them and I 
>> want to hear the opinions of folks that have watched over CouchDB from its 
>> earliest days on this specific point (If I must name names, at least Adam K, 
>> Paul D, Jan L, Joan T). If there's a majority for deviating from this 
>> semantic, I will go with the majority.
>> 
>> If we were to agree to preserve the 'single snapshot' behaviour, what would 
>> the behaviour be if we can't honour it because of the FoundationDB 
>> transaction limits?
>> 
>> I see a few options.
>> 
>> 1) We could end the response uncleanly, mid-response. CouchDB does this when 
>> it has no alternative, and it is ugly, but it is usually handled well by 
>> clients. They are at least not usually convinced they got a complete 
>> response if they are using a competent HTTP client.
>> 
>> 2) We could disavow the streaming API, as you've suggested, attempt to 
>> gather the full response. If we do this within the FDB bounds, return a 200 
>> code and the response body. A 400 and an error body if we don't.
>> 
>> 3) We could make the "limit" parameter mandatory and with an upper bound, in 
>> combination with 1 or 2, such that a valid request is very likely to be 
>> served within the limits.
>> 
>> I'd like to hear more voices on which way we want to break the unachievable 
>> semantic of old where you could read _all_docs on a billion document 
>> database over, uptime gods willing, a snapshot of the database.
>> 
>> B.
>> 
>>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc  wrote:
>>> 
>>> Thanks for bringing the topic up for the discussion!
>>> 
>>> For background, this topic was discussed on the mailing list starting
>>> in February, 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-14 Thread Adam Kocoloski
Technically, we could certainly terminate a response cleanly when the 
underlying FoundationDB transaction expires and offer a bookmark to resume the 
response using a new transaction in a subsequent request. Some of us have 
reservations about that approach because an application that did not know to 
look for the “txn_too_long” attribute would quietly proceed with an incomplete, 
corrupted dataset. Brutally terminating the response reduces the likelihood of 
that occurring to ~zero.
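
As a purely illustrative aside, this is why a brutally terminated response is
hard to mistake for a complete one on the client side; a competent HTTP client
surfaces a connection closed before the terminating chunk as an error rather
than returning a quietly shortened body (sketched in Python with the requests
library):

    import json
    import requests

    def fetch_all_docs(url):
        """Fetch an _all_docs response and fail loudly if the server closes
        the connection before the body is complete."""
        try:
            resp = requests.get(url, timeout=60)
            resp.raise_for_status()
            # A body cut off mid-stream is not valid JSON either, so a
            # truncated but cleanly closed response is also caught here.
            return json.loads(resp.content)
        except requests.exceptions.ChunkedEncodingError as err:
            raise RuntimeError("response was truncated: %s" % err) from err
        except json.JSONDecodeError as err:
            raise RuntimeError("response was not complete JSON: %s" % err) from err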

It’s true that we can’t absolutely guarantee that the database will never 
time out, but setting a reasonable limit of ~2000 rows in a response should make 
it quite unlikely. I’d expect those responses to be delivered in 50ms or less, 
which is 100x faster than the 5 second transaction expiry.

For cases where you’re not concerned about the snapshot isolation (e.g. 
streaming an entire _changes feed), there is a small performance benefit to 
requesting a new FDB transaction asynchronously before the old one actually 
times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
layers but I’m not sure we’ve used it anywhere in CouchDB yet.
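
Roughly, the pattern is the following (sketched with the FoundationDB Python
bindings purely for illustration; CouchDB's layer is Erlang, so the names and
structure here are not the actual implementation): shortly before the 5 second
window closes, create a replacement transaction, request its read version
ahead of time, and continue the scan from the last key seen.

    import time
    import fdb

    fdb.api_version(620)
    db = fdb.open()

    # Swap well inside FDB's 5-second transaction limit.
    SWAP_AFTER_SECONDS = 4.0

    def stream_range(begin, end):
        """Yield key/value pairs for [begin, end), moving to a fresh
        transaction before the current one expires. Snapshot isolation is
        NOT preserved across the swap, so this only suits responses (like a
        full _changes feed) where that is acceptable."""
        tr = db.create_transaction()
        tr.get_read_version()          # start fetching the read version now
        started = time.monotonic()
        cursor = begin
        while True:
            if time.monotonic() - started > SWAP_AFTER_SECONDS:
                tr = db.create_transaction()
                tr.get_read_version()  # again, request it ahead of need
                started = time.monotonic()
            kvs = list(tr.get_range(cursor, end, limit=1000))
            if not kvs:
                return
            for kv in kvs:
                yield kv.key, kv.value
            cursor = fdb.KeySelector.first_greater_than(kvs[-1].key)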

Adam

> On Jul 14, 2020, at 2:06 PM, San Sato  wrote:
> 
> Interesting.
> 
> 1. end the response  ("uncleanly") - does this mean the HTTP response
> wouldn't be valid JSON?  I guess the HTTP response code can't be expected
> to reflect a non-normal result.  Maybe in a trailing attribute in json, can
> the response indicate that it's truncated for the reason of txn_too_long,
> to distinguish it from completed responses with less-than-a-page-size (e.g.
> limit=20k, 18k records sent, no more records present)?
> 
> Can bookmark/etc still be included at that point to resume in
> closest-key-order?  Even though it's a streaming, not paginated, response,
> it would match pre-v4 semantics of pagination over multiple http requests,
> right?
> 
> 2. Sending a 400 error seems like a good way to waste requests.  I imagine
> there's no constant limit= that can avoid the issue, so people will have to
> do things that are sensitive to the presence of the limit one way or
> other.  I'd way rather get a partial response with a flag indicating I
> should resume from  - but maybe that's the "rewrite the app" scenario
> Nick described designing to avoid.
> 
> 4.  request-level isolation=(TRUE|false) could be a way to express a default
> preference for 1, but allow opt-in for streaming the fresher rows.  I'd
> want to be able to know what kind of boundaries are used for switching to a
> newer txn snapshot - obviously there's a practical outer limit from FDB
> but is it a performance hit to switch with some greater frequency like
> 1000-rows, an FDB index-page-size if there's such a thing, every 250ms or
> similar?
> 
> 
> 
> On Tue, Jul 14, 2020 at 7:18 AM Robert Samuel Newson 
> wrote:
> 
>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>> 
>> I don't accept Mike Rhodes' argument at all, but I should explain why I
>> don't;
>> 
>> In CouchDB 1.x, a response was generated from a single .couch file. There
>> was always a window between the start of the request as the client sees it
>> and CouchDB acquiring a snapshot of the relevant database. I don't think
>> that gap is meaningful, nor does it refute our statements of the time that
>> CouchDB responses are from a snapshot (specifically, that no change to the
>> database made _during_ the response will be visible in _this_ response). In
>> CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists
>> of multiple shards, each of which, once opened, remains snapshotted for the
>> duration of that response. The difference between 1.x and 2.x/3.x is that
>> the window is potentially larger (though the requests are issued in
>> parallel). The response, however much it returned, was impervious to
>> changes in other requests once it had begun.
>> 
>> I don't think _all_docs, _view or a non-continuous _changes response
>> should allow changes made in other requests to appear midway through them
>> and I want to hear the opinions of folks that have watched over CouchDB
>> from its earliest days on this specific point (If I must name names, at
>> least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating
>> from this semantic, I will go with the majority.
>> 
>> If we were to agree to preserve the 'single snapshot' behaviour, what
>> would the behaviour be if we can't honour it because of the FoundationDB
>> transaction limits?
>> 
>> I see a few options.
>> 
>> 1) We could end the response uncleanly, mid-response. CouchDB does this
>> when it has no alternative, and it is ugly, but it is usually handled well
>> by clients. They are at least not usually convinced they got a complete
>> response if they are using a competent HTTP client.
>> 
>> 2) We could disavow the streaming API, as you've suggested, attempt to
>> gather the full response. If we do this within the FDB bounds, return a 200
>> code and the 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-14 Thread San Sato
Interesting.

1. end the response  ("uncleanly") - does this mean the HTTP response
wouldn't be valid JSON?  I guess the HTTP response code can't be expected
to reflect a non-normal result.  Maybe in a trailing attribute in json, can
the response indicate that it's truncated for the reason of txn_too_long,
to distinguish it from completed responses with less-than-a-page-size (e.g.
limit=20k, 18k records sent, no more records present)?

Can bookmark/etc still be included at that point to resume in
closest-key-order?  Even though it's a streaming, not paginated, response,
it would match pre-v4 semantics of pagination over multiple http requests,
right?

2. Sending a 400 error seems like a good way to waste requests.  I imagine
there's no constant limit= that can avoid the issue, so people will have to
do things that are sensitive to the presence of the limit one way or
other.  I'd way rather get a partial response with a flag indicating I
should resume from  - but maybe that's the "rewrite the app" scenario
Nick described designing to avoid.

4.  request-level isolation=(TRUE|false) could be a way to express a default
preference for 1, but allow opt-in for streaming the fresher rows.  I'd
want to be able to know what kind of boundaries are used for switching to a
newer txn snapshot - obviously there's a practical outer limit from FDB
but is it a performance hit to switch with some greater frequency like
1000-rows, an FDB index-page-size if there's such a thing, every 250ms or
similar?



On Tue, Jul 14, 2020 at 7:18 AM Robert Samuel Newson 
wrote:

> Thanks Nick, very helpful, and it vindicates me opening this thread.
>
> I don't accept Mike Rhodes' argument at all, but I should explain why I
> don't;
>
> In CouchDB 1.x, a response was generated from a single .couch file. There
> was always a window between the start of the request as the client sees it
> and CouchDB acquiring a snapshot of the relevant database. I don't think
> that gap is meaningful, nor does it refute our statements of the time that
> CouchDB responses are from a snapshot (specifically, that no change to the
> database made _during_ the response will be visible in _this_ response). In
> CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists
> of multiple shards, each of which, once opened, remains snapshotted for the
> duration of that response. The difference between 1.x and 2.x/3.x is that
> the window is potentially larger (though the requests are issued in
> parallel). The response, however much it returned, was impervious to
> changes in other requests once it had begun.
>
> I don't think _all_docs, _view or a non-continuous _changes response
> should allow changes made in other requests to appear midway through them
> and I want to hear the opinions of folks that have watched over CouchDB
> from its earliest days on this specific point (If I must name names, at
> least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating
> from this semantic, I will go with the majority.
>
> If we were to agree to preserve the 'single snapshot' behaviour, what
> would the behaviour be if we can't honour it because of the FoundationDB
> transaction limits?
>
> I see a few options.
>
> 1) We could end the response uncleanly, mid-response. CouchDB does this
> when it has no alternative, and it is ugly, but it is usually handled well
> by clients. They are at least not usually convinced they got a complete
> response if they are using a competent HTTP client.
>
> 2) We could disavow the streaming API, as you've suggested, attempt to
> gather the full response. If we do this within the FDB bounds, return a 200
> code and the response body. A 400 and an error body if we don't.
>
> 3) We could make the "limit" parameter mandatory and with an upper bound,
> in combination with 1 or 2, such that a valid request is very likely to be
> served within the limits.
>
> I'd like to hear more voices on which way we want to break the
> unachievable semantic of old where you could read _all_docs on a billion
> document database over, uptime gods willing, a snapshot of the database.
>
> B.
>
> > On 13 Jul 2020, at 21:15, Nick Vatamaniuc  wrote:
> >
> > Thanks for bringing the topic up for the discussion!
> >
> > For background, this topic was discussed on the mailing list starting
> > in February, 2019
> >
> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E
> >
> > The primary reason for restart_tx option is to provide compatibility
> > for _changes feeds to allow older replicators to handle 4.0 sources.
> > It starts a new transaction after 5 seconds or so (a current FDB
> > limitation, might go up in the future) and transparently continues to
> > stream data where it left off. Ex, streaming [a,b,c,d], times out
> > after b, then it will continue with c, d etc. Currently this is also
> > used for other streaming APIs as an alternative to returning mangled
> > 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-14 Thread Adam Kocoloski
I think there’s tremendous value in being able to tell our users that each 
response served by CouchDB is constructed from a single isolated snapshot of 
the underlying database. I’d advocate for this being the default behavior of 
4.0.

If folks wanted to add an opt-in compatibility mode to support longer 
responses, I suppose that could be OK. I think we should discourage that access 
pattern in general, though, as it’s somewhat less friendly to various other 
parts of the stack than a pattern of shorter responses and a smart pagination 
API like the one we’re introducing. To wit, I don’t think we’d want to support 
that compatibility mode in IBM Cloud.

Adam

> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson  wrote:
> 
> Thanks Nick, very helpful, and it vindicates me opening this thread.
> 
> I don't accept Mike Rhodes' argument at all, but I should explain why I don't;
> 
> In CouchDB 1.x, a response was generated from a single .couch file. There was 
> always a window between the start of the request as the client sees it and 
> CouchDB acquiring a snapshot of the relevant database. I don't think that gap 
> is meaningful, nor does it refute our statements of the time that CouchDB 
> responses are from a snapshot (specifically, that no change to the database 
> made _during_ the response will be visible in _this_ response). In CouchDB 
> 2.x (and continuing in 3.x), a CouchDB database typically consists of 
> multiple shards, each of which, once opened, remains snapshotted for the 
> duration of that response. The difference between 1.x and 2.x/3.x is that the 
> window is potentially larger (though the requests are issued in parallel). 
> The response, however much it returned, was impervious to changes in other 
> requests once it had begun.
> 
> I don't think _all_docs, _view or a non-continuous _changes response should 
> allow changes made in other requests to appear midway through them and I want 
> to hear the opinions of folks that have watched over CouchDB from its 
> earliest days on this specific point (If I must name names, at least Adam K, 
> Paul D, Jan L, Joan T). If there's a majority for deviating from this 
> semantic, I will go with the majority.
> 
> If we were to agree to preserve the 'single snapshot' behaviour, what would 
> the behaviour be if we can't honour it because of the FoundationDB 
> transaction limits?
> 
> I see a few options.
> 
> 1) We could end the response uncleanly, mid-response. CouchDB does this when 
> it has no alternative, and it is ugly, but it is usually handled well by 
> clients. They are at least not usually convinced they got a complete response 
> if they are using a competent HTTP client.
> 
> 2) We could disavow the streaming API, as you've suggested, attempt to gather 
> the full response. If we do this within the FDB bounds, return a 200 code and 
> the response body. A 400 and an error body if we don't.
> 
> 3) We could make the "limit" parameter mandatory and with an upper bound, in 
> combination with 1 or 2, such that a valid request is very likely to be 
> served within the limits.
> 
> I'd like to hear more voices on which way we want to break the unachievable 
> semantic of old where you could read _all_docs on a billion document database 
> over, uptime gods willing, a snapshot of the database.
> 
> B.
> 
>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc  wrote:
>> 
>> Thanks for bringing the topic up for the discussion!
>> 
>> For background, this topic was discussed on the mailing list starting
>> in February, 2019
>> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E
>> 
>> The primary reason for restart_tx option is to provide compatibility
>> for _changes feeds to allow older replicators to handle 4.0 sources.
>> It starts a new transaction after 5 seconds or so (a current FDB
>> limitation, might go up in the future) and transparently continues to
>> stream data where it left off. Ex, streaming [a,b,c,d], times out
>> after b, then it will continue with c, d etc. Currently this is also
>> used for other streaming APIs as an alternative to returning mangled
>> JSON after emitting a 200 response and streaming some of the rows.
>> However it is not used for paginated responses, the new APIs developed
>> by Ilya. So users have an option to get the guaranteed snapshot
>> behavior as well.
>> 
>> And for completeness, if we decide to remove the option, we should
>> specify what happens if we remove it and get a transaction_too_old
>> exception. Currently the behavior would be to restart the transaction,
>> resend all the headers and all the rows again down the socket, which I
>> don't think anyone wants, but is what we'd get if we just make
>> {restart_tx, false}
>> 
>>> I understand that automatically resetting the FDB txn during a response is 
>>> an attempt to work around that and maintain "compatibility" with CouchDB < 
>>> 4 semantics. I think it fails to 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-14 Thread Robert Samuel Newson
Thanks Nick, very helpful, and it vindicates me opening this thread.

I don't accept Mike Rhodes' argument at all, but I should explain why I don't;

In CouchDB 1.x, a response was generated from a single .couch file. There was 
always a window between the start of the request as the client sees it and 
CouchDB acquiring a snapshot of the relevant database. I don't think that gap 
is meaningful, nor does it refute our statements of the time that CouchDB 
responses are from a snapshot (specifically, that no change to the database 
made _during_ the response will be visible in _this_ response). In CouchDB 2.x 
(and continuing in 3.x), a CouchDB database typically consists of multiple 
shards, each of which, once opened, remains snapshotted for the duration of 
that response. The difference between 1.x and 2.x/3.x is that the window is 
potentially larger (though the requests are issued in parallel). The response, 
however much it returned, was impervious to changes in other requests once it 
had begun.

I don't think _all_docs, _view or a non-continuous _changes response should 
allow changes made in other requests to appear midway through them and I want 
to hear the opinions of folks that have watched over CouchDB from its earliest 
days on this specific point (If I must name names, at least Adam K, Paul D, Jan 
L, Joan T). If there's a majority for deviating from this semantic, I will go 
with the majority.

If we were to agree to preserve the 'single snapshot' behaviour, what would the 
behaviour be if we can't honour it because of the FoundationDB transaction 
limits?

I see a few options.

1) We could end the response uncleanly, mid-response. CouchDB does this when it 
has no alternative, and it is ugly, but it is usually handled well by clients. 
They are at least not usually convinced they got a complete response if they 
are using a competent HTTP client.

2) We could disavow the streaming API, as you've suggested, attempt to gather 
the full response. If we do this within the FDB bounds, return a 200 code and 
the response body. A 400 and an error body if we don't.

3) We could make the "limit" parameter mandatory and with an upper bound, in 
combination with 1 or 2, such that a valid request is very likely to be served 
within the limits.
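
Concretely, option 3 amounts to something like the following at the HTTP layer
(an illustrative Python sketch, not actual chttpd code; the 2000 row ceiling
echoes the estimate mentioned elsewhere in this thread and is not a decided
value):

    MAX_LIMIT = 2000  # illustrative ceiling, sized to fit one FDB transaction

    class BadRequest(Exception):
        """Mapped to an HTTP 400 response by the web layer."""

    def validate_limit(params):
        """Enforce a mandatory, bounded limit query parameter (option 3)."""
        if "limit" not in params:
            raise BadRequest("limit is required on this endpoint")
        try:
            limit = int(params["limit"])
        except ValueError:
            raise BadRequest("limit must be an integer")
        if not 0 < limit <= MAX_LIMIT:
            raise BadRequest("limit must be between 1 and %d" % MAX_LIMIT)
        return limit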

I'd like to hear more voices on which way we want to break the unachievable 
semantic of old where you could read _all_docs on a billion document database 
over, uptime gods willing, a snapshot of the database.

B.

> On 13 Jul 2020, at 21:15, Nick Vatamaniuc  wrote:
> 
> Thanks for bringing the topic up for the discussion!
> 
> For background, this topic was discussed on the mailing list starting
> in February, 2019
> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E
> 
> The primary reason for restart_tx option is to provide compatibility
> for _changes feeds to allow older replicators to handle 4.0 sources.
> It starts a new transaction after 5 seconds or so (a current FDB
> limitation, might go up in the future) and transparently continues to
> stream data where it left off. Ex, streaming [a,b,c,d], times out
> after b, then it will continue with c, d etc. Currently this is also
> used for other streaming APIs as an alternative to returning mangled
> JSON after emitting a 200 response and streaming some of the rows.
> However it is not used for paginated responses, the new APIs developed
> by Ilya. So users have an option to get the guaranteed snapshot
> behavior as well.
> 
> And for completeness, if we decide to remove the option, we should
> specify what happens if we remove it and get a transaction_too_old
> exception. Currently the behavior would be to restart the transaction,
> resend all the headers and all the rows again down the socket, which I
> don't think anyone wants, but is what we'd get if we just make
> {restart_tx, false}
> 
>> I understand that automatically resetting the FDB txn during a response is 
>> an attempt to work around that and maintain "compatibility" with CouchDB < 4 
>> semantics. I think it fails to do so and is very misleading.
> 
> It is a trade-off in order to keep the same API shape as before. Sure,
> streaming all the docs with _all_docs or _changes feeds is not a great
> pattern but many applications are implemented that way already.
> Letting them migrate to 4.0 without having to rewrite the application,
> with the caveat that they might see a document updated in the
> _all_docs stream after the request has already started, is a nicer
> choice, I think, than forcing them to rewrite their application, which
> could lead to a python 2/3 scenario.
> 
> Due to having multiple shards (Q>1), as discussed in the original
> mailing thread by Mike
> (https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E),
> we don't provide a strict read-only snapshot guarantee in 2.x and 3.x
> 

Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-13 Thread Nick Vatamaniuc
Thanks for bringing the topic up for the discussion!

For background, this topic was discussed on the mailing list starting
in February, 2019
https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E

The primary reason for restart_tx option is to provide compatibility
for _changes feeds to allow older replicators to handle 4.0 sources.
It starts a new transaction after 5 seconds or so (a current FDB
limitation, might go up in the future) and transparently continues to
stream data where it left off. Ex, streaming [a,b,c,d], times out
after b, then it will continue with c, d etc. Currently this is also
used for other streaming APIs as an alternative to returning mangled
JSON after emitting a 200 response and streaming some of the rows.
However it is not used for paginated responses, the new APIs developed
by Ilya. So users have an option to get the guaranteed snapshot
behavior as well.
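
For readers following along, the control flow being described is roughly this
(a Python sketch against the FoundationDB bindings, for illustration only; the
real code is the Erlang fold/restart logic in fabric2_fdb referenced in the
message quoted below):

    import fdb

    fdb.api_version(620)
    db = fdb.open()

    TRANSACTION_TOO_OLD = 1007  # FDB error code for an expired transaction

    def fold_range_with_restart(begin, end, emit, restart_tx=True):
        """Stream a key range; if the transaction expires mid-read, open a
        new one and resume just after the last emitted key. With
        restart_tx=False the expiry error propagates to the caller."""
        cursor = begin
        while True:
            tr = db.create_transaction()
            try:
                for kv in tr.get_range(cursor, end):
                    emit(kv.key, kv.value)
                    # Remember how far we got so a restart can resume here.
                    cursor = fdb.KeySelector.first_greater_than(kv.key)
                return
            except fdb.FDBError as err:
                if restart_tx and err.code == TRANSACTION_TOO_OLD:
                    continue  # new transaction, new snapshot, resume at cursor
                raise

The caveat, as the rest of this message notes, is inherent to the restart:
each new transaction reads from a newer snapshot, so rows emitted after a
restart can reflect writes made after the response began.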

And for completeness, if we decide to remove the option, we should
specify what happens if we remove it and get a transaction_too_old
exception. Currently the behavior would be to restart the transaction,
resend all the headers and all the rows again down the socket, which I
don't think anyone wants, but is what we'd get if we just make
{restart_tx, false}

>  I understand that automatically resetting the FDB txn during a response is 
> an attempt to work around that and maintain "compatibility" with CouchDB < 4 
> semantics. I think it fails to do so and is very misleading.

It is a trade-off in order to keep the same API shape as before. Sure,
streaming all the docs with _all_docs or _changes feeds is not a great
pattern but many applications are implemented that way already.
Letting them migrate to 4.0 without having to rewrite the application,
with the caveat that they might see a document updated in the
_all_docs stream after the request has already started, is a nicer
choice, I think, than forcing them to rewrite their application, which
could lead to a python 2/3 scenario.

Due to having multiple shards (Q>1), as discussed in the original
mailing thread by Mike
(https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E),
we don't provide a strict read-only snapshot guarantee in 2.x and 3.x
anyway, so users would have to handle scenarios where a document might
appear in the stream that wasn't there at the start of the request
already. Granted, that is a much smaller corner case, but I wonder how
many users care to handle that...

Currently users do have an option of using the new paginated API which
disables restart_tx behavior
https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/chttpd/src/chttpd_db.erl#L947,
though I am not sure what happens when a transaction_too_old exception
is thrown then (emit a bookmark?)

So based on the compatibility consideration, I'd vote to keep the
restart_tx option (configurable perhaps if we figure out what to do
when it is disabled) in order to allow users to migrate their
application to 4.0. At least informally we promised users to keep
strong API compatibility when we released 3.0 with an eye towards 4.0
(https://blog.couchdb.org/2020/02/26/the-road-to-couchdb-3-0/). I'd
think not emitting all the data in a _changes or _all_docs response
would break that compatibility more than using multiple transactions.

As for what happens when a transaction_too_old is thrown, I could see
an option being passed in, something like single_snapshot=true, and then
we could use Adam's suggestion to accumulate all the rows in memory and,
if we hit the end of the transaction, return a 400 error. We won't emit
anything while the rows are accumulated, so users don't get partial
data: they get either every row requested or a 400 error (so no chance of
perceived data loss). Users may retry if they think it was a temporary
hiccup, or may retry with a smaller limit.
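
In the same illustrative Python style as the sketch above (this is the
hypothetical single_snapshot handling being proposed, not actual CouchDB
code):

    import fdb

    fdb.api_version(620)
    db = fdb.open()

    TRANSACTION_TOO_OLD = 1007

    class SnapshotTooLarge(Exception):
        """Mapped by the HTTP layer to a 400 response with an error body."""

    def read_range_single_snapshot(begin, end, limit):
        """Accumulate up to `limit` rows from one transaction. Nothing is
        sent to the client until the read completes, so the caller gets
        either every requested row from a single snapshot or an error,
        never partial data."""
        tr = db.create_transaction()
        rows = []
        try:
            for kv in tr.get_range(begin, end, limit=limit):
                rows.append((kv.key, kv.value))
        except fdb.FDBError as err:
            if err.code == TRANSACTION_TOO_OLD:
                raise SnapshotTooLarge(
                    "could not read the requested rows in one transaction; "
                    "retry, or retry with a smaller limit") from err
            raise
        return rows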

Cheers,
-Nick

On Mon, Jul 13, 2020 at 2:05 PM Robert Samuel Newson  wrote:
>
> Hi All,
>
> I'm concerned to see the restart_fold function in fabric2_fdb 
> (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828)
>  in the 4.0 development branch.
>
> The upshot of doing this is that a CouchDB response could be taken across 
> multiple snapshots of the database, which is not the behaviour of CouchDB 1 
> through 3.
>
> I don't think this is ok (with the obvious and established exception of a 
> continuous changes feed, where new snapshots are continuously visible at the 
> end of the response).
>
> FoundationDB imposes certain limits on transactions, the most notable being 
> the 5 second maximum duration. I understand that automatically resetting the 
> FDB txn during a response is an attempt to work around that and maintain 
> "compatibility" with CouchDB < 4 semantics. I think it fails to do so and is 
> very misleading.
>
> Discuss.
>
> B.
>


[DISCUSS] couchdb 4.0 transactional semantics

2020-07-13 Thread Robert Samuel Newson
Hi All,

I'm concerned to see the restart_fold function in fabric2_fdb 
(https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828)
 in the 4.0 development branch.

The upshot of doing this is that a CouchDB response could be taken across 
multiple snapshots of the database, which is not the behaviour of CouchDB 1 
through 3.

I don't think this is ok (with the obvious and established exception of a 
continuous changes feed, where new snapshots are continuously visible at the 
end of the response).

FoundationDB imposes certain limits on transactions, the most notable being the 
5 second maximum duration. I understand that automatically resetting the FDB 
txn during a response is an attempt to work around that and maintain 
"compatibility" with CouchDB < 4 semantics. I think it fails to do so and is 
very misleading.

Discuss.

B.