I think a key difference from "cursors" as I've seen them elsewhere is that 
ours will point at an ever-changing database; you couldn't seamlessly cursor 
through a large data set one "page" at a time.

Bookmarks began in search (raises guilty hand) in order to address a 
Lucene-specific issue (that high values of "skip" are incredibly inefficient, 
using lots of RAM). That is not true for CouchDB's own indexes, which can be 
navigated perfectly with startkey/endkey/startkey_docid/endkey_docid, etc.
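
For example, the usual key-based pattern (just a sketch, with made-up ids) is to 
ask for one row more than the page size and use the extra row as the start of 
the next request:

  GET /{db}/_all_docs?limit=101
  GET /{db}/_all_docs?startkey="doc-0101"&limit=101    (doc-0101 was row 101)

For a view, where keys can repeat, startkey_docid disambiguates:

  GET /{db}/_design/{ddoc}/_view/{view}?startkey=<last_key>&startkey_docid=<last_id>&limit=101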

I guess I'm not helping much with these observations, but I wouldn't like to see 
CouchDB gain an additional and ugly method of doing something already possible.

B.

> On 23 Apr 2020, at 19:02, Joan Touzet <woh...@apache.org> wrote:
> 
> I realise this is bikeshedding, but I guess that's kind of the point... 
> Everything below is my opinion, not "fact."
> 
> It's unfortunate we need a new endpoint for all of this. In a vacuum I might 
> have just suggested we use the semantics we already have, perhaps with ?from= 
> instead of ?since= .
> 
> "page" only works if the size of a page is well known, either by server 
> preference or directly in the URL. If I ask for:
> 
>  GET /{db}/_all_docs?limit=20&page=3
> 
> I know that I'm always going to get documents 41 through 60 in the default 
> collation order.
> 
> There's a *fantastic* summary of examples from popular REST APIs here:
> 
> https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> 
> We are *pretty close* to what a cursor means in those other examples, except 
> for the fact that our cursor can go stale/invalid after a short time.
> 
> Bob, could you be a bit more detailed in your explanation of how our definition 
> isn't close to these? Or did you mean SQL CURSOR (which is something entirely 
> different)? If so, I'm fine with this being a REST API cursor - something 
> clearly distinct.
> 
> I come back to wanting to preserve the existing endpoint syntax and naming, 
> without new endpoints, but specifying this new FDB token via ?cursor= and 
> this being the trigger for the new behaviour. At some point, we simply stop 
> accepting ?since= tokens. This seems in line with other popular REST APIs.
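> 
> Illustratively (the token value below is made up, not a real format):
> 
>  GET /{db}/_all_docs?limit=20
>  GET /{db}/_all_docs?limit=20&cursor=<opaque-token-from-previous-response>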
> 
> -Joan "still sick and not sleeping right" Touzet
> 
> 
> On 2020-04-23 12:30, Robert Newson wrote:
>> "cursor" has an established meaning in other databases, and ours would not be very 
>> close to them. I don’t think it’s a good idea.
>> B.
>>> On 23 Apr 2020, at 11:50, Ilya Khlopotov <iil...@apache.org> wrote:
>>> 
>>> 
>>>> 
>>>> The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
>>> 
>>>> On 2020/04/23 08:54:36, Garren Smith <gar...@apache.org> wrote:
>>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
>>>> rubbish with naming. The best I could come up with is replacing page with
>>>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>>>> All the fields in the bookmark make sense except timestamp. Why would it
>>>> matter if the timestamp is old? What happens if a node's time is an hour
>>>> behind another node?
>>>> 
>>>> 
>>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov <iil...@apache.org> wrote:
>>>>> 
>>>>> - page is to provide some notion of progress for the user
>>>>> - timestamp - I was thinking that we should drop requests if a user
>>>>> tries to pass a bookmark created an hour ago.
>>>>> 
>>>>> On 2020/04/22 21:58:40, Robert Samuel Newson <rnew...@apache.org> wrote:
>>>>>> "page" and "page number" are odd to me as these don't exist as concepts,
>>>>> I'd rather not invent them. I note there's no mention of page size, which
>>>>> makes "page number" very vague.
>>>>>> 
>>>>>> What is "timestamp" in the bookmark and what effect does it have when
>>>>> the bookmark is passed back in?
>>>>>> 
>>>>>> I guess, why does the bookmark include so much extraneous data? Items
>>>>> that are not needed to find the fdb key to begin the next response from.
>>>>>> 
>>>>>> 
>>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov <iil...@apache.org> wrote:
>>>>>>> 
>>>>>>> Hello everyone,
>>>>>>> 
>>>>>>> Based on the discussions on the thread I would like to propose a
>>>>> number of first steps:
>>>>>>> 1) introduce new endpoints
>>>>>>> - {db}/_all_docs/page
>>>>>>> - {db}/_all_docs/queries/page
>>>>>>> - _all_dbs/page
>>>>>>> - _dbs_info/page
>>>>>>> - {db}/_design/{ddoc}/_view/{view}/page
>>>>>>> - {db}/_design/{ddoc}/_view/{view}/queries/page
>>>>>>> - {db}/_find/page
>>>>>>> 
>>>>>>> These new endpoints would act as follows:
>>>>>>> - don't use delayed responses
>>>>>>> - return an object with the following structure
>>>>>>> ```
>>>>>>> {
>>>>>>>    "total": Total,
>>>>>>>    "bookmark": base64 encoded opaque value,
>>>>>>>    "completed": true | false,
>>>>>>>    "update_seq": when available,
>>>>>>>    "page": current page number,
>>>>>>>    "items": [
>>>>>>>    ]
>>>>>>> }
>>>>>>> ```
>>>>>>> - the bookmark would include following data (base64 or protobuff???):
>>>>>>> ```
>>>>>>> - direction
>>>>>>> - page
>>>>>>> - descending
>>>>>>> - endkey
>>>>>>> - endkey_docid
>>>>>>> - inclusive_end
>>>>>>> - startkey
>>>>>>> - startkey_docid
>>>>>>> - last_key
>>>>>>> - update_seq
>>>>>>> - timestamp
>>>>>>> ```
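>>>>>>>
>>>>>>> Purely as an illustration (every value below is made up), the decoded
>>>>>>> bookmark could be a JSON blob along these lines before it is base64
>>>>>>> encoded:
>>>>>>> ```
>>>>>>> {
>>>>>>>   "direction": "fwd",
>>>>>>>   "page": 3,
>>>>>>>   "descending": false,
>>>>>>>   "endkey": "zzz",
>>>>>>>   "endkey_docid": "zzz",
>>>>>>>   "inclusive_end": true,
>>>>>>>   "startkey": "aaa",
>>>>>>>   "startkey_docid": "aaa",
>>>>>>>   "last_key": "mmm",
>>>>>>>   "update_seq": "<update_seq>",
>>>>>>>   "timestamp": 1587668401
>>>>>>> }
>>>>>>> ```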
>>>>>>> 
>>>>>>> 2) Implement per-endpoint configurable max limits
>>>>>>> ```
>>>>>>> _all_docs = 5000
>>>>>>> _all_docs/queries = 5000
>>>>>>> _all_dbs = 5000
>>>>>>> _dbs_info = 5000
>>>>>>> _view = 2500
>>>>>>> _view/queries = 2500
>>>>>>> _find = 2500
>>>>>>> ```
>>>>>>> 
>>>>>>> Later (after a few years) CouchDB would deprecate and remove the old
>>>>> endpoints.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> iilyak
>>>>>>> 
>>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc <vatam...@apache.org> wrote:
>>>>>>>> Hello everyone,
>>>>>>>> 
>>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for
>>>>> CouchDB 4.x
>>>>>>>> 
>>>>>>>> By "streaming APIs" I mean APIs which stream data in row as it gets
>>>>>>>> read from the database. These are the endpoints I was thinking of:
>>>>>>>> 
>>>>>>>> _all_docs, _all_dbs, _dbs_info  and query results
>>>>>>>> 
>>>>>>>> I want to focus on what happens when FoundationDB transactions
>>>>>>>> time out after 5 seconds. Currently, all those APIs except _changes[1]
>>>>>>>> feeds will crash or freeze. The reason is that the
>>>>>>>> transaction_too_old error at the end of 5 seconds is retry-able by
>>>>>>>> default, so the request handlers run again and end up shoving the
>>>>>>>> whole request down the socket again, headers and all, which is
>>>>>>>> obviously broken and not what we want.
>>>>>>>> 
>>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll
>>>>>>>> present some behaviors but feel free to add more. Some ideas might
>>>>>>>> have been discounted in the IRC discussion already, but I'll present
>>>>>>>> them anyway in case it sparks further conversation:
>>>>>>>> 
>>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue
>>>>>>>> streaming the data from the next key after last emitted in the
>>>>>>>> previous transaction. Document the API behavior change: it may
>>>>>>>> present a view of the data that is not a point-in-time[4] snapshot of the
>>>>>>>> DB.
>>>>>>>> 
>>>>>>>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
>>>>>>>> don't have to change to continue using these CouchDB 4.0 endpoints
>>>>>>>> - This is the easiest to implement since it would re-use the
>>>>>>>> implementation for _changes feed (an extra option passed to the fold
>>>>>>>> function).
>>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4]
>>>>>>>> snapshot view of the data.
>>>>>>>> 
>>>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true`
>>>>>>>> param which indicates they are aware the stream may end early and so
>>>>>>>> would have to paginate from the last emitted key with a skip=1. This
>>>>>>>> will keep the request bodies the same as current CouchDB. However, if
>>>>>>>> the users got all the data in one request, they will end up wasting
>>>>>>>> another request to see if there is more data available. If they didn't
>>>>>>>> get any data, they might have too large of a skip value (see [2]), so they
>>>>>>>> would have to guess different values for start/end keys. Or impose a max
>>>>>>>> limit for the `skip` parameter.
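>>>>>>>>
>>>>>>>> As a sketch of the resume step (key and limit values made up): the
>>>>>>>> stream ends early with "doc-0999" as the last emitted id, so the next
>>>>>>>> request continues from it, skipping the row already seen:
>>>>>>>>
>>>>>>>>   GET /{db}/_all_docs?transaction=true&limit=5000
>>>>>>>>   GET /{db}/_all_docs?transaction=true&startkey="doc-0999"&skip=1&limit=5000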
>>>>>>>> 
>>>>>>>> C) End the stream and add a final metadata row like a "transaction":
>>>>>>>> "timeout" at the end. That will let the user know to keep paginating
>>>>>>>> from the last key onward. This won't work for `_all_dbs` and
>>>>>>>> `_dbs_info`[3]. Maybe let those two endpoints behave like _changes
>>>>>>>> feeds and only use this for views and _all_docs? If we like this
>>>>>>>> choice, let's think about what happens for those, as I couldn't come up with
>>>>>>>> anything decent there.
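>>>>>>>>
>>>>>>>> For example (shape only, the field name isn't settled), the rows array
>>>>>>>> could end with something like:
>>>>>>>>
>>>>>>>>   "rows": [ ..., {"id": "doc-0999", ...}, {"transaction": "timeout"} ]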
>>>>>>>> 
>>>>>>>> D) Same as C but to solve the issue with skips[2], emit a bookmark
>>>>>>>> "key" of where the iteration stopped and the current "skip" and
>>>>>>>> "limit" params, which would keep decreasing. Then user would pass
>>>>>>>> those in "start_key=..." in the next request along with the limit and
>>>>>>>> skip params. So something like "continuation":{"skip":599, "limit":5,
>>>>>>>> "key":"..."}. This has the same issue with array results for
>>>>>>>> `_all_dbs` and `_dbs_info`[3].
>>>>>>>> 
>>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values
>>>>>>>> there such that response time is likely to fit in one transaction.
>>>>>>>> This could be tricky as different runtime environments will have
>>>>>>>> different characteristics. Also, if the timeout happens there isn't
>>>>>>>> a nice way to send an HTTP error since we already sent the 200
>>>>>>>> response. The downside is that this might break how some users use the
>>>>>>>> API, if, say, they are using large skips and limits already. Perhaps here
>>>>>>>> we do both B and D, such that if users want transactional behavior,
>>>>>>>> they specify that `transaction=true` param and only then we enforce
>>>>>>>> low limit and skip maximums.
>>>>>>>> 
>>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time
>>>>>>>> snapshot view doesn't necessarily need to be tied to transaction
>>>>>>>> boundaries. We could check the update sequence of the database at the
>>>>>>>> start of the next transaction and if it hasn't changed we can continue
>>>>>>>> emitting a consistent view. This can apply to C and D and would just
>>>>>>>> determine when the stream ends. If there are no writes happening to
>>>>>>>> the db, this could potentially stream all the data just like option A
>>>>>>>> would do. Not entirely sure if this would work for views.
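>>>>>>>>
>>>>>>>> Roughly, as pseudocode (the helper names here are made up, not a real
>>>>>>>> API):
>>>>>>>>
>>>>>>>>   seq0 = get_update_seq(db)            # remember the seq at stream start
>>>>>>>>   last_key, done = None, False
>>>>>>>>   while not done:
>>>>>>>>       rows, last_key, done = read_one_txn(db, after=last_key)
>>>>>>>>       emit(rows)
>>>>>>>>       if not done and get_update_seq(db) != seq0:
>>>>>>>>           break   # db changed; stop rather than break the snapshot view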
>>>>>>>> 
>>>>>>>> So what do we think? I can see different combinations of options here,
>>>>>>>> maybe even different for each API point. For example `_all_dbs`,
>>>>>>>> `_dbs_info` are always A, and `_all_docs` and views default to A but
>>>>>>>> have parameters to do F, etc.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> -Nick
>>>>>>>> 
>>>>>>>> Some footnotes:
>>>>>>>> 
>>>>>>>> [1] The _changes feed is the only one that works currently. It behaves as
>>>>>>>> per RFC
>>>>> https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns
>>>>> .
>>>>>>>> That is, we continue streaming the data by resetting the transaction
>>>>>>>> object and restarting from the last emitted key (db sequence in this
>>>>>>>> case). However, because the transaction restarts if a document is
>>>>>>>> updated while the streaming takes place, it may appear in the _changes
>>>>>>>> feed twice. That's a behavior difference from CouchDB < 4.0 and we'd
>>>>>>>> have to document it, since previously we presented a point-in-time
>>>>>>>> snapshot of the database from when we started streaming.
>>>>>>>> 
>>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't
>>>>>>>> currently support efficient offsets for key selectors
>>>>>>>> (
>>>>> https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging
>>>>> )
>>>>>>>> we implemented skip by iterating over the data. This means that a skip
>>>>>>>> of say 100000 could keep timing out the transaction without yielding
>>>>>>>> any data.
>>>>>>>> 
>>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an
>>>>>>>> obvious place to insert a last metadata row.
>>>>>>>> 
>>>>>>>> [4] For example, an application might have a constraint that documents "a" and "z"
>>>>>>>> cannot both be in the database at the same time. But when iterating
>>>>>>>> it's possible that "a" was there at the start. Then by the end, "a"
>>>>>>>> was removed and "z" added, so both "a" and "z" would appear in the
>>>>>>>> emitted stream. Note that FoundationDB has APIs which exhibit the same
>>>>>>>> "relaxed" constrains:
>>>>>>>> 
>>>>> https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
