Thank you for that; those are the issues that are turning over in my mind as well.

A FoundationDB base would have improved CouchDB in a number of very helpful ways, 
beyond the scalability benefits themselves (though you are right that running 
FoundationDB in production is not a trivial undertaking). Namely, we would 
return to CouchDB 1.x era consistency within a cluster. That is, there’d be no 
such thing as a “rewind” and no chance of introducing a conflict within a 
cluster (though you still can _between_ clusters). Additionally, the 
representation of a document’s revision tree in FDB is superior to CouchDB 
3.x’s, as each edit branch is stored separately. In a lot of cases this means 
documents can be read and updated without loading up the entire revision tree.
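
To make the per-branch idea concrete, here is a minimal Python sketch. It is not the actual couchdb-fdb key layout; the `BranchStore` class, its method names, and the `(doc_id, leaf_rev)` key shape are all assumptions made for illustration. The point it tries to show is that an update extends a single entry and never loads the other branches.

```python
from typing import Dict, List, Tuple

# Hypothetical key layout (an assumption, not the real couchdb-fdb schema):
# (doc_id, leaf_rev) -> list of revision ids from the root to that leaf.
Key = Tuple[str, str]


class BranchStore:
    """Toy k/v store holding one entry per edit branch of a document."""

    def __init__(self) -> None:
        self.kv: Dict[Key, List[str]] = {}

    def create(self, doc_id: str, rev: str) -> None:
        """Start a document with a single one-revision branch."""
        self.kv[(doc_id, rev)] = [rev]

    def update(self, doc_id: str, parent_rev: str, new_rev: str) -> None:
        """Extend the branch whose leaf is parent_rev.

        Only that branch's entry is read and rewritten; other branches
        of the same document are never touched.
        """
        path = self.kv.pop((doc_id, parent_rev), None)
        if path is None:
            raise KeyError(f"{parent_rev} is not a leaf of {doc_id}")
        self.kv[(doc_id, new_rev)] = path + [new_rev]

    def add_branch(self, doc_id: str, path: List[str]) -> None:
        """Store a branch that arrives with its full history, e.g. a
        conflicting edit replicated in from another cluster."""
        self.kv[(doc_id, path[-1])] = list(path)

    def leaves(self, doc_id: str) -> List[str]:
        """All leaf revisions; in an ordered k/v store this would be a
        range read over the doc_id prefix rather than a full scan."""
        return sorted(rev for (d, rev) in self.kv if d == doc_id)


store = BranchStore()
store.create("doc", "1-a")
store.update("doc", "1-a", "2-b")
store.add_branch("doc", ["1-a", "2-c"])   # conflict from another cluster
store.update("doc", "2-b", "3-d")         # touches only the 2-b branch
```

After these calls the store holds two entries for "doc", one per branch, and the last update never read the "2-c" branch.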

If we’re not pursuing the couchdb-fdb work further, we should consider 
implementing some or all of the above again. All of those changes are possible 
but some are considerably harder than others, and we might have to pick our 
battles.

Some time ago I made a wishlist of breaking changes for CouchDB 3.x that 
address a number of the gotchas our users frequently encounter. Other wishlists 
exist (I know you have one yourself), but I present mine below:

1) True delete by default: Remove branches and documents entirely after 
deletion and once all replications, indexes and other observers have 
checkpointed past the sequence. This also removes the ability to store 
information in a deleted document.
2) Flatten the revtree: Store each edit branch as a separate k/v entry (we do 
this on fdb main, so backport the concept)
3) Optional automatic deletion of losing revs after a period of significant 
wall-clock time
4) Make the N replicas more consistent. Two possible levels: a) updates to the 
same doc are applied in the same order; b) all updates are applied in the same 
order.
5) True conflict resolution: Allow a revision to have multiple parents, so that 
the revision tree narrows.

B.

> On 12 Mar 2022, at 09:26, Jan Lehnardt <j...@apache.org> wrote:
> 
> Thanks Bob for passing this along.
> 
> I’m looking forward to renewed interest in the 3.x codebase :)
> 
> For our 4.x plans, we’ll have to discuss here what we want to do with it and 
> I’m looking at everyone for input here. Even if you’ve never spoken up on 
> this list before, I’d like to hear from you.
> 
> * * *
> 
> First off, as a project, CouchDB is not obliged to follow IBM’s lead and 
> abandon the FDB-CouchDB effort. At the same time, it is not obliged to take 
> what they leave behind and finish it.
> 
> I know for some the 4.x release is highly anticipated and we as a project 
> hoped to make a generational jump for our underlying storage and distribution 
> technologies. During initial discussions about FDB-Couch and during its 
> development, we anticipated certain developments on the FDB side (especially 
> allowing longer transactions for consistent _changes responses with their new 
> Redwood storage engine). It is my understanding that these developments have 
> not materialised in the way we would have liked. The consequence is that 
> certain API guarantees that 3.x CouchDB gives (consistent full-database 
> snapshots in _changes) are not possible to build with native FDB features. 
> I can’t speak to the very specifics of this, and I hope we can dig into all 
> this together in this thread, but my takeaway from this is that *if* we 
> continue with FDB-Couch, I think we will have to reevaluate its compatibility 
> story, as we had hoped to make it mainly a seamless (but better) API upgrade 
> from 3.x.
> 
> We also learned that operating a FDB cluster is a significant effort that 
> somewhat goes against CouchDB’s mostly “just works” nature. We had asked the 
> IBM team to share their operational FDB learnings with the CouchDB project, 
> so we can build up community knowledge around this, but this has not 
> materialised either.
> 
> I’m personally still excited about the opportunities we have with FDB-Couch, 
> but as a project, we might have to come up with a more realistic positioning 
> of FDB-CouchDB. Less a “new and improved drop-in replacement” and maybe more 
> a “if you exceed the scale/capacity of 3.x CouchDB, you can upgrade to 
> FDB-CouchDB at the expense of a few API differences and higher operational 
> cost”. This might be worth a trade-off for large users of CouchDB and thus it 
> might be worth having both of these codebases live alongside each other.
> 
> However, that comes with a number of consequences:
> 
> - The 3.x/4.x naming doesn’t quite work if these are meant to continue 
> alongside each other.
> 
> - Maybe FDB-Couch gets its own separate project name and versioning, with a 
> clear delineation between them.
> 
> - We would have to maintain two projects complete with release management, 
> vulnerability management, the lot. At the moment, CouchDB has just about 
> enough folks contributing to move forward at a reasonable pace. Doubling that 
> effort might be tricky. While we had an influx of contributors recently, this 
> would probably need more dedicated planning and outreach.
> 
> - New API features would have to be implemented twice, if we want to keep a 
> majority API overlap. This is not a fun proposition for folks who add 
> features, which is hard enough, but now they have to do it twice, onto two 
> different subsystems. Some features (say multi-doc-transactions) would only 
> be possible in one of the projects (FDB-Couch). What would our policy be for 
> deliberate API feature divergence?
> 
> - probably more that elude me at the moment.
> 
> While there are non-trivial points among these, they are not impossible tasks 
> *if* we find enough and the right folks to carry the work forward.
> 
> * * *
> 
> For myself, I still see a lot of potential in the 3.x codebase and I’m 
> looking forward to renewed roadmap discussions there. I know I have a long 
> list of things I’d like to see added.
> 
> From my professional observation, the thing that our (Neighbourhoodie) 
> customers tend to run into the most is the scaling limits of the 
> database-per-user pattern. We have a proposal for per-doc-authentication that 
> helps mitigate a subset of those use-cases, which would be a great help 
> overall. I have worked on a draft PR of this over the years, but it mostly 
> stalled out during the pandemic. I’m planning to restart work on this 
> shortly. If anyone wants to contribute with time and/or money, please do get 
> in touch.
> 
> The other major issue with 3.x as reported by IBM is _changes feed rewinds 
> when nodes are rotated in and out of clusters. We already fixed a number of 
> changes rewind bugs relatively recently. I don’t know if we got them all now, 
> or if there are theoretical limits to how far we can take this given our 
> consistency model, but it’d be worth spending some time on at least getting 
> rid of all rewind-to-zero cases.
> 
> * * *
> 
> I’m also looking forward to all your input on the discussion here. I’m sure 
> this will explode into a lot of detailed discussions quickly, so maybe as a 
> guide to come back to when we get closer to having to make a decision, here are 
> three ways forward that I see:
> 
> 1. Follow IBM in abandoning FDB-Couch, refocus all effort on Erlang-Couch 
> (3.x).
> 
> 2. Take FDB-Couch development over fully, come up with a story for how 
> FDB-Couch and Erlang-Couch can coexist and when users should choose which one.
> 
> 3. Hand over the FDB-Couch codebase to an independent team that then can do 
> what they like with it (if this materialises from this discussion).
> 
> * * *
> 
> Best
> Jan
> —
> 
> 
>> On 10. Mar 2022, at 17:24, Robert Newson <rnew...@apache.org> wrote:
>> 
>> Hi,
>> 
>> For those that are following closely, and particularly those that build or 
>> use CouchDB from our git repo, you'll be aware that CouchDB embarked on an 
>> attempt to build a next-generation version of CouchDB using the FoundationDB 
>> database engine as its new base.
>> 
>> The principal sponsors of this work, the Cloudant team at IBM, have informed 
>> us that, unfortunately, they will not be continuing to fund the development 
>> of this version and are refocusing their efforts on CouchDB 3.x.
>> 
>> Cloudant developers will continue to contribute as they always have done and 
>> the CouchDB PMC thanks them for their efforts.
>> 
>> As the Project Management Committee for the CouchDB project, we are now 
>> asking the developer community how we’d like to proceed in light of this new 
>> information.
>> 
>> Regards,
>> Robert Newson
>> Apache CouchDB PMC
>> 
> 
