This is exactly what I did in my experiment.

> I don't think a whole new API is required here

Some existing endpoints return lists rather than objects, which means we have no place to return a bookmark.
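To make that point concrete, here is a minimal sketch (plain Python, not CouchDB code) of the shape problem: a bare JSON array, as `_all_dbs` returns today, has nowhere to hang a bookmark field, whereas the enveloped object proposed for the `*/page` endpoints further down the thread does. The values are made up for illustration.

```python
import json

# Today _all_dbs returns a bare array; a list has no slot for "bookmark"
# without changing the response shape.
legacy_all_dbs = json.loads('["db-a", "db-b", "db-c"]')

# The proposed paged envelope wraps the rows in an object, so pagination
# metadata has an obvious home. Field names follow the proposal quoted below.
paged_all_dbs = json.loads("""
{
  "total": 3,
  "bookmark": "opaque-token-goes-here",
  "completed": true,
  "items": ["db-a", "db-b", "db-c"]
}
""")

print(paged_all_dbs["bookmark"], paged_all_dbs["items"])
```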
On 2020/04/23 21:33:49, Glynn Bird <glynn.b...@gmail.com> wrote:
> I don't think a whole new API is required here, but I would like to see some sort of "bookmark" facility for _all_docs and views, as pagination with the current API is awkward.
>
> I would imagine it working as follows:
>
> // first request
> curl $URL/mydb/_all_docs?startkey="aardvark"&endkey="moose"&limit=10
>
> ^ the user has specified the range of values that they want, and limit defines the "page size", if you will.
>
> If the response to the first request contained a bookmark, the user's second and subsequent requests could look like:
>
> curl $URL/mydb/_all_docs?bookmark=qfqwfqwfqfqw // "get me page 2 of the result set"
> curl $URL/mydb/_all_docs?bookmark=iihwhfwfwhwh // "get me page 3 of the result set"
>
> So the bookmark need only encode:
>
> - limit (the page size)
> - startkey - that is, the previous page's endkey "+1", e.g. dog\u0000
> - endkey - the query's upper bound, as defined in the first request
>
> Obviously there are other parameters to think through (start_key, end_key, start_key_docid, descending etc.), but this small iteration of the API doesn't add any more clutter to the request path while making it substantially easier for the end user or client libraries to iterate through a result set.
>
> On Thu, 23 Apr 2020 at 22:14, Paul Davis <paul.joseph.da...@gmail.com> wrote:
>
> > I'd agree that my initial reaction to cursor was that it's not a great fit, but there does seem to be an existing name used in the greater REST world for this sort of pagination, so I'm not concerned about using that terminology.
> >
> > I'm generally on board with allowing and setting some sane default limits on pages. We probably should have done that quite a while ago after moving to native clustering, and now that we have FDB limits I think it makes even more sense to have an API that does not lend itself to crazy errors when people are just trying to poke at an API.
> >
> > I think we're all on board that one of the goals is to make sure that clients don't accidentally misinterpret a response. That is, we're trying to be quite diligent that a user doesn't get 1000 rows and not realize there are another 10 that were beyond the limit. The bookmark approach with hard caps seems like a generally fine approach to me. The current approach uses extra URL path segments to try and avoid this confusion. I wonder if we should consider starting to properly version our API using one of the many schemes that are used. Having read through a few articles I don't have a very clear favorite though.
> >
> > As to this particular proposal I do see a couple of issues:
> >
> > `total` - We can do this in most cases fairly easily, though it's a bit odd for continuous changes.
> >
> > `complete` - I'm not sure whether this is entirely possible given the API that FDB presents us. Specifically, when we set a range and we get back exactly $num_rows in the response, if the data set ended at exactly that page I don't think the `more` flag from FDB would tell us that. So we'd have a clunky UX there where we say not complete but the next page is empty. That's also not to mention that, depending on whether we're looking at snapshots and so on, there's no way for us to know between stateless requests whether there were more rows added to the end.
> >
> > `page` - This one is just hard/impossible to calculate. FDB doesn't provide us with offsets or even efficient "about how many rows in this range?" type queries, so providing that would be both inaccurate and fairly difficult/expensive to calculate. In some cases I think we could have something maybe close that didn't suck too badly, but it'd also fall down for changes due to the way that updates reorder them.
> >
> > `update_seq` - I'm just not sure when this would be useful or what it would refer to. Maybe a version stamp of the last change for that request? If we had a future API that asked for snapshot access, then maybe? But if we did do something there with versionstamps or read versions I'd expect that to come with the rest of the API.
> >
> > For the bookmark fields:
> >
> > `direction` vs `descending` seems like a field duplication to me.
> >
> > `page` - This would seem to suggest we could skip to a certain location in the results numerically, which we are not able to do with the FDB API.
> >
> > `last_key` vs `start_key` seems like a field duplication. We don't need to know where things started, I don't think. Just where to start from and where to end.
> >
> > `update_seq` - same as earlier. Not entirely sure of the intent there.
> >
> > `timestamp` - Expiring bookmarks based on time does not seem like a good idea, both because of clock skew and because this would functionally just be a convenience API that users could already implement for themselves.
> >
> > Another thing might also be to provide our bookmark as a full link, which seems to be fairly standard REST practice these days. Something that clients don't have to do any logic with, so that we're free to change the implementation.
> >
> > And lastly, I don't think we should be neglecting the _changes API as part of this discussion. I realize that we'll need to support the older streaming semantics if we want to maintain replication compatibility (which I think we'll all agree is a Good Thing), but it also feels a bit wrong to ignore it as part of this work if we're going to be modernizing our APIs. Though if we do pick up a good versioning scheme then we could theoretically make those changes easily enough. Plus, who doesn't want to rewrite chttpd to be a whole lot less... chttpd-y?
> >
> > On Thu, Apr 23, 2020 at 1:43 PM Robert Samuel Newson <rnew...@apache.org> wrote:
> >
> > > I think it's a key difference from "cursor" as I've seen it elsewhere that ours will point at an ever-changing database; you couldn't seamlessly cursor through a large data set, one "page" at a time.
> > >
> > > Bookmarks began in search (raises guilty hand) in order to address a Lucene-specific issue (that high values of "skip" are incredibly inefficient, using lots of RAM). That is not true for CouchDB's own indexes, which can be navigated perfectly well with startkey/endkey/startkey_docid/endkey_docid, etc.
> > >
> > > I guess I'm not helping much with these observations, but I wouldn't like to see CouchDB gain an additional and ugly method of doing something already possible.
> > >
> > > B.
> > >
> > > > On 23 Apr 2020, at 19:02, Joan Touzet <woh...@apache.org> wrote:
> > > >
> > > > I realise this is bikeshedding, but I guess that's kind of the point... Everything below is my opinion, not "fact."
> > > >
> > > > It's unfortunate we need a new endpoint for all of this. In a vacuum I might have just suggested we use the semantics we already have, perhaps with ?from= instead of ?since= .
> > > >
> > > > "page" only works if the size of a page is well known, either by server preference or directly in the URL. If I ask for:
> > > >
> > > > GET /{db}/_all_docs?limit=20&page=3
> > > >
> > > > I know that I'm always going to get documents 41 through 60 in the default collation order.
> > > >
> > > > There's a *fantastic* summary of examples from popular REST APIs here:
> > > >
> > > > https://medium.com/@ignaciochiazzo/paginating-requests-in-apis-d4883d4c1c4c
> > > >
> > > > We are *pretty close* to what a cursor means in those other examples, except for the fact that our cursor can go stale/invalid after a short time.
> > > >
> > > > Bob, could you be a bit more detailed in your explanation of how our definition isn't close to these? Or did you mean SQL CURSOR (which is something entirely different)? If so, I'm fine with this being a REST API cursor - something clearly distinct.
> > > >
> > > > I come back to wanting to preserve the existing endpoint syntax and naming, without new endpoints, but specifying this new FDB token via ?cursor= and this being the trigger for the new behaviour. At some point, we simply stop accepting ?since= tokens. This seems in line with other popular REST APIs.
> > > >
> > > > -Joan "still sick and not sleeping right" Touzet
> > > >
> > > > On 2020-04-23 12:30, Robert Newson wrote:
> > > >> cursor has established meaning in other databases and ours would not be very close to them. I don't think it's a good idea.
> > > >>
> > > >> B.
> > > >>
> > > >>> On 23 Apr 2020, at 11:50, Ilya Khlopotov <iil...@apache.org> wrote:
> > > >>>
> > > >>>> The best I could come up with is replacing page with cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> > > >>>
> > > >>> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> > > >>>
> > > >>>> On 2020/04/23 08:54:36, Garren Smith <gar...@apache.org> wrote:
> > > >>>> I agree with Bob that page doesn't make sense as an endpoint. I'm also rubbish with naming. The best I could come up with is replacing page with cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> > > >>>>
> > > >>>> All the fields in the bookmark make sense except timestamp. Why would it matter if the timestamp is old? What happens if a node's time is an hour behind another node's?
> > > >>>>
> > > >>>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov <iil...@apache.org> wrote:
> > > >>>>>
> > > >>>>> - page is to provide some notion of progress for the user
> > > >>>>> - timestamp - I was thinking that we should drop requests if a user tried to pass a bookmark created an hour ago.
> > > >>>>>
> > > >>>>> On 2020/04/22 21:58:40, Robert Samuel Newson <rnew...@apache.org> wrote:
> > > >>>>>> "page" and "page number" are odd to me as these don't exist as concepts; I'd rather not invent them. I note there's no mention of page size, which makes "page number" very vague.
> > > >>>>>>
> > > >>>>>> What is "timestamp" in the bookmark and what effect does it have when the bookmark is passed back in?
> > > >>>>>>
> > > >>>>>> I guess, why does the bookmark include so much extraneous data? Items that are not needed to find the fdb key to begin the next response from.
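As an aside, here is a minimal sketch of what an opaque bookmark could look like if it carried only the state needed to resume a range read: the key to continue from, the fixed upper bound, and the page size/direction. The encoding (base64-wrapped JSON) and the field names are assumptions for illustration; the thread deliberately leaves the real format opaque ("base64 or protobuf???").

```python
import base64
import json

def encode_bookmark(next_startkey, endkey, limit, descending=False):
    """Pack the resume state into an opaque, URL-safe token."""
    state = {
        "startkey": next_startkey,   # previous page's last key "+1"
        "endkey": endkey,            # upper bound from the first request
        "limit": limit,
        "descending": descending,
    }
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_bookmark(token):
    """Recover the resume state from a bookmark token."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))

# Example: resume after "dog" when paging aardvark..moose, 10 rows per page.
token = encode_bookmark("dog\u0000", "moose", 10)
print(token)
print(decode_bookmark(token))
```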
> > > >>>>>>> On 22 Apr 2020, at 21:18, Ilya Khlopotov <iil...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>> Hello everyone,
> > > >>>>>>>
> > > >>>>>>> Based on the discussions on the thread I would like to propose a number of first steps:
> > > >>>>>>>
> > > >>>>>>> 1) introduce new endpoints
> > > >>>>>>> - {db}/_all_docs/page
> > > >>>>>>> - {db}/_all_docs/queries/page
> > > >>>>>>> - _all_dbs/page
> > > >>>>>>> - _dbs_info/page
> > > >>>>>>> - {db}/_design/{ddoc}/_view/{view}/page
> > > >>>>>>> - {db}/_design/{ddoc}/_view/{view}/queries/page
> > > >>>>>>> - {db}/_find/page
> > > >>>>>>>
> > > >>>>>>> These new endpoints would act as follows:
> > > >>>>>>> - don't use delayed responses
> > > >>>>>>> - return an object with the following structure
> > > >>>>>>> ```
> > > >>>>>>> {
> > > >>>>>>>   "total": Total,
> > > >>>>>>>   "bookmark": base64 encoded opaque value,
> > > >>>>>>>   "completed": true | false,
> > > >>>>>>>   "update_seq": when available,
> > > >>>>>>>   "page": current page number,
> > > >>>>>>>   "items": [
> > > >>>>>>>   ]
> > > >>>>>>> }
> > > >>>>>>> ```
> > > >>>>>>> - the bookmark would include the following data (base64 or protobuf???):
> > > >>>>>>>   - direction
> > > >>>>>>>   - page
> > > >>>>>>>   - descending
> > > >>>>>>>   - endkey
> > > >>>>>>>   - endkey_docid
> > > >>>>>>>   - inclusive_end
> > > >>>>>>>   - startkey
> > > >>>>>>>   - startkey_docid
> > > >>>>>>>   - last_key
> > > >>>>>>>   - update_seq
> > > >>>>>>>   - timestamp
> > > >>>>>>>
> > > >>>>>>> 2) Implement per-endpoint configurable max limits
> > > >>>>>>> ```
> > > >>>>>>> _all_docs = 5000
> > > >>>>>>> _all_docs/queries = 5000
> > > >>>>>>> _all_dbs = 5000
> > > >>>>>>> _dbs_info = 5000
> > > >>>>>>> _view = 2500
> > > >>>>>>> _view/queries = 2500
> > > >>>>>>> _find = 2500
> > > >>>>>>> ```
> > > >>>>>>>
> > > >>>>>>> Later (after a few years) CouchDB would deprecate and remove the old endpoints.
> > > >>>>>>>
> > > >>>>>>> Best regards,
> > > >>>>>>> iilyak
> > > >>>>>>>
> > > >>>>>>> On 2020/02/19 22:39:45, Nick Vatamaniuc <vatam...@apache.org> wrote:
> > > >>>>>>>> Hello everyone,
> > > >>>>>>>>
> > > >>>>>>>> I'd like to discuss the shape and behavior of streaming APIs for CouchDB 4.x.
> > > >>>>>>>>
> > > >>>>>>>> By "streaming APIs" I mean APIs which stream data row by row as it gets read from the database. These are the endpoints I was thinking of:
> > > >>>>>>>>
> > > >>>>>>>> _all_docs, _all_dbs, _dbs_info and query results
> > > >>>>>>>>
> > > >>>>>>>> I want to focus on what happens when FoundationDB transactions time out after 5 seconds. Currently, all those APIs except _changes[1] feeds will crash or freeze. The reason is that the transaction_too_old error at the end of 5 seconds is retry-able by default, so the request handlers run again and end up shoving the whole request down the socket again, headers and all, which is obviously broken and not what we want.
> > > >>>>>>>>
> > > >>>>>>>> There are a few alternatives discussed in the couchdb-dev channel. I'll present some behaviors but feel free to add more. Some ideas might have been discounted in the IRC discussion already, but I'll present them anyway in case it sparks further conversation:
> > > >>>>>>>>
> > > >>>>>>>> A) Do what _changes[1] feeds do. Start a new transaction and continue streaming the data from the next key after the last one emitted in the previous transaction. Document the API behavior change: it may present a view of the data that is not a point-in-time[4] snapshot of the DB.
> > > >>>>>>>>
> > > >>>>>>>> - Keeps the API shape the same as CouchDB < 4.0. Client libraries don't have to change to continue using these CouchDB 4.0 endpoints.
> > > >>>>>>>> - This is the easiest to implement since it would re-use the implementation for the _changes feed (an extra option passed to the fold function).
> > > >>>>>>>> - Breaks API behavior if users relied on having a point-in-time[4] snapshot view of the data.
> > > >>>>>>>>
> > > >>>>>>>> B) Simply end the stream. Let the users pass a `?transaction=true` param which indicates they are aware the stream may end early and so would have to paginate from the last emitted key with a skip=1. This will keep the request bodies the same as current CouchDB. However, if the users got all the data in one request, they will end up wasting another request to see if there is more data available. If they didn't get any data, they might have too large a skip value (see [2]) and so would have to guess different values for start/end keys. Or we impose a max limit for the `skip` parameter.
> > > >>>>>>>>
> > > >>>>>>>> C) End the stream and add a final metadata row like a "transaction": "timeout" at the end. That will let the user know to keep paginating from the last key onward. This won't work for `_all_dbs` and `_dbs_info`[3]. Maybe let those two endpoints behave like _changes feeds and only use this for views and _all_docs? If we like this choice, let's think about what happens for those, as I couldn't come up with anything decent there.
> > > >>>>>>>>
> > > >>>>>>>> D) Same as C but, to solve the issue with skips[2], emit a bookmark "key" of where the iteration stopped and the current "skip" and "limit" params, which would keep decreasing. Then the user would pass those in "start_key=..." in the next request along with the limit and skip params. So something like "continuation":{"skip":599, "limit":5, "key":"..."}. This has the same issue with array results for `_all_dbs` and `_dbs_info`[3].
> > > >>>>>>>>
> > > >>>>>>>> E) Enforce low `limit` and `skip` parameters. Enforce maximum values there such that the response time is likely to fit in one transaction. This could be tricky as different runtime environments will have different characteristics. Also, if the timeout happens there isn't a nice way to send an HTTP error since we already sent the 200 response. The downside is that this might break how some users use the API, if, say, they are using large skips and limits already. Perhaps here we do both B and D, such that if users want transactional behavior, they specify that `transaction=true` param and only then we enforce low limit and skip maximums.
> > > >>>>>>>>
> > > >>>>>>>> F) At least for `_all_docs` it seems providing a point-in-time snapshot view doesn't necessarily need to be tied to transaction boundaries. We could check the update sequence of the database at the start of the next transaction and, if it hasn't changed, we can continue emitting a consistent view. This can apply to C and D and would just determine when the stream ends. If there are no writes happening to the db, this could potentially stream all the data just like option A would do. Not entirely sure if this would work for views.
> > > >>>>>>>>
> > > >>>>>>>> So what do we think? I can see different combinations of options here, maybe even different ones for each endpoint. For example `_all_dbs` and `_dbs_info` are always A, and `_all_docs` and views default to A but have parameters to do F, etc.
> > > >>>>>>>>
> > > >>>>>>>> Cheers,
> > > >>>>>>>> -Nick
> > > >>>>>>>>
> > > >>>>>>>> Some footnotes:
> > > >>>>>>>>
> > > >>>>>>>> [1] _changes feeds are the only ones that work currently. They behave as per the RFC https://github.com/apache/couchdb-documentation/blob/master/rfcs/003-fdb-seq-index.md#access-patterns . That is, we continue streaming the data by resetting the transaction object and restarting from the last emitted key (db sequence in this case). However, because the transaction restarts, a document updated while the streaming takes place may appear in the _changes feed twice. That's a behavior difference from CouchDB < 4.0 and we'd have to document it, since previously we presented a point-in-time snapshot of the database from when we started streaming.
> > > >>>>>>>>
> > > >>>>>>>> [2] Our streaming APIs have both skips and limits. Since FDB doesn't currently support efficient offsets for key selectors ( https://apple.github.io/foundationdb/known-limitations.html#dont-use-key-selectors-for-paging ) we implemented skip by iterating over the data. This means that a skip of, say, 100000 could keep timing out the transaction without yielding any data.
> > > >>>>>>>>
> > > >>>>>>>> [3] _all_dbs and _dbs_info return a JSON array so they don't have an obvious place to insert a last metadata row.
> > > >>>>>>>>
> > > >>>>>>>> [4] For example, they have a constraint that documents "a" and "z" cannot both be in the database at the same time. But when iterating, it's possible that "a" was there at the start; then, by the end, "a" was removed and "z" added, so both "a" and "z" would appear in the emitted stream. Note that FoundationDB has APIs which exhibit the same "relaxed" constraints: https://apple.github.io/foundationdb/api-python.html#module-fdb.locality
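For reference, here is a rough sketch of how a client library might page through one of the proposed endpoints, assuming the enveloped response shape from Ilya's proposal above ({db}/_all_docs/page with "bookmark", "completed" and "items" fields). The URL, query parameter and field names are assumptions taken from the proposal, not a shipped CouchDB API.

```python
import json
import urllib.parse
import urllib.request

def iterate_all_docs(base_url, db, limit=1000):
    """Yield rows page by page, echoing the server's bookmark back in."""
    bookmark = None
    while True:
        params = {"limit": str(limit)}
        if bookmark:
            params["bookmark"] = bookmark
        url = f"{base_url}/{db}/_all_docs/page?{urllib.parse.urlencode(params)}"
        with urllib.request.urlopen(url) as resp:
            body = json.load(resp)
        yield from body["items"]
        bookmark = body.get("bookmark")
        # Stop when the server says the result set is exhausted.
        if body.get("completed") or not bookmark:
            return

# Example usage:
# for row in iterate_all_docs("http://127.0.0.1:5984", "mydb"):
#     print(row)
```

The loop treats the bookmark as opaque and simply passes it back, which is what leaves the server free to change the bookmark encoding later without breaking clients.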