I'm a little behind on the discussion emails, but +1 to option 1 for include_docs=true and option 3 for include_docs=false.

On Mon, Mar 25, 2019 at 12:26 PM Jan Lehnardt <j...@apache.org> wrote:

> +1 on what Bob said.
>
> > On 21. Mar 2019, at 20:57, Robert Newson <rnew...@apache.org> wrote:
> >
> > Hi,
> >
> > Thanks for pushing forward, and I owe feedback on other threads
> > you've started.
> >
> > Rather feebly, I'm just agreeing with you. option 3 for
> > include_docs=false and option 1 for include_docs=true sounds ideal.
> > both flavours are very common so it makes sense to build a solution
> > for each. At a pinch we can just do option 3 + async doc lookups in
> > a first release and then circle back, but the RFC should propose 1
> > and 3 as our design intention.
> >
> > --
> > Robert Samuel Newson
> > rnew...@apache.org
> >
> > On Thu, 21 Mar 2019, at 19:50, Adam Kocoloski wrote:
> >> Hi all, me again. This one will be shorter :) As I see it we have three
> >> different options for serving the _all_docs endpoint from FDB:
> >>
> >> ## Option 1: Read the document data, discard the bodies
> >>
> >> We likely will have the documents stored in docid order already; we
> >> could do range reads and discard everything but the ID and _rev by
> >> default. This can be a very efficient implementation of
> >> include_docs=true (though one needs to be careful about skipping the
> >> conflict bodies), but pretty wasteful otherwise.
> >>
> >> ## Option 2: Read the “revisions” subspace
> >>
> >> We also have an entry for every document in ID order in the “revisions”
> >> subspace. The disadvantage of this approach is that every deleted edit
> >> branch shows up there, too, and some databases will have lots of
> >> deleted documents. We may need to build skiplists to know how to scan
> >> efficiently. This subspace is also doing a lot of heavy lifting for us
> >> already, and if we wanted to toy with alternative revision history
> >> representations in the future it could get complicated
> >>
> >> ## Option 3: Add specific entries to support _all_docs
> >>
> >> We can also write an extra KV containing the ID and winning _rev in a
> >> special subspace just to support this endpoint. It would be a blind
> >> write because we’re already coordinating concurrent transactions
> >> through reads on the “revisions” subspace. This would be conceptually
> >> quite clean and simple, and the fastest implementation for constructing
> >> the default response.
> >>
> >> ===
> >>
> >> My sense is Option 2 is a non-starter but I include it for completeness
> >> in case anyone else thought of the same. I think Option 3 is a
> >> reasonable space / efficiency / simplicity tradeoff, and it might also
> >> be worth testing out Option 1 as an optimized implementation for
> >> include_docs=true.
> >>
> >> Thoughts? I imagine we can move quickly to an RFC for at least having
> >> the extra KVs for Option 3, and in that design also acknowledge the
> >> option for scanning the docs space directly to support include_docs.
> >>
> >> Adam
> >
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
>
>
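
For anyone skimming the thread, here is a rough sketch of how the two favoured layouts (Option 1 and Option 3) could fit together. It is a toy in-memory model in plain Python, not the FoundationDB API and not a proposed implementation. The `OrderedKV` class, the `"docs"` and `"all_docs"` subspace names, the key shapes, and the `winner` flag stored next to each body are all assumptions made up for illustration; none of them come from the thread or from any RFC.

```python
from bisect import bisect_left, insort


class OrderedKV:
    """Toy ordered key-value store (hypothetical stand-in, not the FDB API)."""

    def __init__(self):
        self._keys = []   # tuple keys, kept sorted
        self._data = {}

    def set(self, key, value):
        if key not in self._data:
            insort(self._keys, key)
        self._data[key] = value

    def get_range(self, prefix):
        """Yield (key, value) for every tuple key starting with `prefix`, in key order."""
        start = bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            yield key, self._data[key]


def write_doc(kv, docid, rev, body, winner):
    # Assumed layout: one entry per (docid, rev) in a "docs" subspace.
    # The "winner" flag is a simplification; this toy never clears it on
    # older revisions when a new winner is written.
    kv.set(("docs", docid, rev), {"winner": winner, "body": body})
    if winner:
        # Option 3: extra KV holding only the ID and winning _rev.
        kv.set(("all_docs", docid), rev)


def all_docs(kv, include_docs=False):
    if include_docs:
        # Option 1: range-read the docs subspace, skipping conflict bodies.
        return [
            {"id": docid, "rev": rev, "doc": value["body"]}
            for (_, docid, rev), value in kv.get_range(("docs",))
            if value["winner"]
        ]
    # Option 3: range-read the small dedicated subspace; no bodies touched.
    return [
        {"id": docid, "rev": rev}
        for (_, docid), rev in kv.get_range(("all_docs",))
    ]


kv = OrderedKV()
write_doc(kv, "b", "1-aaa", {"n": 2}, winner=True)
write_doc(kv, "a", "2-bbb", {"n": 1}, winner=True)
write_doc(kv, "a", "2-aaa", {"n": 0}, winner=False)  # conflicting edit branch
print(all_docs(kv))
print(all_docs(kv, include_docs=True))
```

The point of the sketch is only that the default response never reads document bodies, while the include_docs=true path does a single ordered scan and has to filter out the non-winning revisions as it goes.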