Re: Prototype CouchDB Layer for FoundationDB

Adam Kocoloski Thu, 28 Mar 2019 08:58:43 -0700

Hi Paul, good stuff.

I agree with you about the FDB Subspaces feature. I’ve been thinking that our 
layer code should maintain its own mapping of the various “subspaces” to 
single-byte prefixes within the directory. I haven’t yet captured that in the 
RFCs, but e.g. we should be using ?REVISIONS instead of <<"revisions">> as the 
prefix for the KVs representing the revision tree.
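
A rough sketch of the idea, in Python for brevity. The prefix values, the
directory prefix, and the helper names are all made up for illustration, and
plain byte concatenation stands in for Tuple Layer packing; in the Erlang
layer these would be macros such as ?REVISIONS.

```python
from enum import IntEnum


class Subspace(IntEnum):
    # Hypothetical assignments; the real values would be fixed in the RFCs.
    DB_CONFIG = 1
    CHANGES = 2
    REVISIONS = 3


def subspace_key(dir_prefix: bytes, subspace: Subspace, suffix: bytes) -> bytes:
    """Build a key using a single-byte subspace prefix."""
    return dir_prefix + bytes([subspace]) + suffix


def named_key(dir_prefix: bytes, name: bytes, suffix: bytes) -> bytes:
    """Build the same key using a spelled-out subspace name."""
    return dir_prefix + name + suffix


dir_prefix = b"\x15\x01"  # stand-in for the prefix of the CouchDB directory
compact = subspace_key(dir_prefix, Subspace.REVISIONS, b"doc1")
verbose = named_key(dir_prefix, b"revisions", b"doc1")
```

Every key in the revision tree carries the prefix, so the bytes saved by the
single-byte form are saved once per KV.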


I also agree with the use of a top-level Directory to enable multiple 
applications to use the same FoundationDB cluster.

Adam

> On Mar 27, 2019, at 1:46 PM, Nick Vatamaniuc <[email protected]> wrote:
> 
> Looking over the code, it seems very simple and clean. Without knowing much
> of the internals or following the discussion too closely I think I was able
> to read and understand most of it.
> 
> I like the split between db and fdb layers. Hopefully it means if we start from
> this we can do some parallel work implementing on top of db layer and below
> it at the same time.
> 
> The use of maps is nice and seems to simplify things quite a bit.
> 
> Don't have much to add about metadata and other issues, will let others who
> know more chime in. It seems a bit similar to how we had the
> instance_start_time at one point or how we add the suffix to db shards.
> 
> Great work!
> -Nick
> 
> On Wed, Mar 27, 2019 at 12:53 PM Paul Davis <[email protected]>
> wrote:
> 
>> Hey everyone!
>> 
>> I've gotten enough of a FoundationDB layer prototype implemented [1]
>> to start sharing publicly. This is emphatically nowhere near useful
>> to non-CouchDB-developers. The motivation for this work was to try and
>> get enough of a basic prototype written so that we can all start
>> fleshing out the various RFCs with actual implementations to compare
>> and contrast and so on.
>> 
>> To be clear, I've made a lot of intentionally "bad" choices while
>> writing this to both limit the scope of what I was trying to
>> accomplish and also to make super clear that I don't expect any of
>> this code to be "final" in any way whatsoever. This work is purely so
>> that everyone has an initial code base that can be "turned on" so to
>> speak. To that end, here's a non-exhaustive list of some of the
>> silliness I've done:
>> 
>>  1. All document bodies must fit inside a single value
>>  2. All requests must fit within the single fdb transaction limits
>>  3. I'm using binary_to_term for things like the revision tree
>>  4. The revision tree has to fit in a single value
>>  5. There are basically 0 supported query string parameters at this point
>>  6. Nothing outside super basic db/doc ops is implemented (i.e., no views)
>> 
>> However, what it does do is start! And it talks to FoundationDB! So at
>> least that bit seems to be reliable (only tested on OS X via
>> FoundationDB binary installers so super duper caveat on that
>> "reliable").
>> 
>> There's a small test script [2] that shows what it's currently capable
>> of. A quick glance at that should give a pretty good idea of how
>> little is actually implemented in this prototype. There's also a list
>> of notes I've been keeping as I've been hacking on this that also
>> tries to gather a bunch of questions that'll need to be answered [3]
>> as we continue to work on this.
>> 
>> To that end, I have learned a couple lessons from working with
>> FoundationDB from this work that I'd like to share. First is that
>> while we can cache a bunch of stuff, we have to be able to ensure that
>> the cache is invalidated properly when various bits of metadata
>> change. There's a feature on FoundationDB master [4] for this specific
>> issue. I've faked the same behavior using an arbitrary key, but I think
>> the `fabric2_fdb:is_current/1` function is a good implementation of
>> this done correctly.
>> 
>> Secondly, I spent a lot of time trying to figure out how to use
>> FoundationDB's Directory and Subspace layers inside the CouchDB layer.
>> After barking up that tree for a long time I've basically decided that
>> the best answer is probably "don't". I do open a single directory at
>> the root, but that's merely in order to play nice with any other
>> layers that use the directory layer. Inside the "CouchDB directory"
>> it's all strictly direct Tuple Layer code.
>> 
>> The Subspace Layer seems to be basically useless in Erlang. First, it's
>> a very thin wrapper over the Tuple Layer that basically just holds
>> onto a prefix that's prepended onto the tuple layer operations. In
>> other languages the Subspace Layer has a lot of syntactic sugar that
>> makes it useful. Erlang doesn't support any of that so it ends up
>> being more of a burden to use it than to just use the Tuple
>> Layer directly. Dropping the use of directories and subspaces has
>> greatly simplified the implementation thus far.
>> 
>> In terms of code layout, nearly all of the new implementation is in
>> `src/fabric/src/fabric2*` modules. There are also a few changes to
>> chttpd obviously to call the new code as well as commenting out parts
>> of features so I didn't have to follow all the various call stacks
>> updating huge swathes of semi-unrelated code.
>> 
>> I'd be super interested to hear feedback and see people start running
>> with this in whatever direction catches their fancy. Hopefully this
>> proves useful for people to start writing implementations of the
>> various RFCs so we can make progress on those fronts.
>> 
>> [1] https://github.com/apache/couchdb/compare/prototype/fdb-layer
>> [2] https://github.com/apache/couchdb/blob/prototype/fdb-layer/fdb-test.py
>> [3] https://github.com/apache/couchdb/blob/prototype/fdb-layer/FDB_NOTES.md
>> [4] https://forums.foundationdb.org/t/a-new-tool-for-managing-layer-metadata/1191
>>
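
A minimal sketch of the cache-invalidation idea from Paul's message above,
in Python. A dict stands in for FoundationDB, and the key name and helper
functions are hypothetical; this does not reproduce
`fabric2_fdb:is_current/1`, it only illustrates checking a cached handle
against a version key that is bumped whenever metadata changes.

```python
# Illustrative only: a dict plays the role of the key-value store.
METADATA_VERSION_KEY = b"meta/version"  # hypothetical key name

store = {METADATA_VERSION_KEY: 1, b"db/name": "mydb"}


def open_db(kv):
    """Build a cached handle stamped with the version it was read at."""
    return {"name": kv[b"db/name"], "version": kv[METADATA_VERSION_KEY]}


def handle_is_current(kv, handle):
    """True if no metadata change has happened since the handle was built."""
    return kv[METADATA_VERSION_KEY] == handle["version"]


def ensure_current(kv, handle):
    """Reuse the cached handle if still valid, otherwise rebuild it."""
    return handle if handle_is_current(kv, handle) else open_db(kv)


def rename_db(kv, new_name):
    """Any metadata change must also bump the version key."""
    kv[b"db/name"] = new_name
    kv[METADATA_VERSION_KEY] += 1


cached = open_db(store)
rename_db(store, "renamed")
stale = not handle_is_current(store, cached)
refreshed = ensure_current(store, cached)
```

In a real layer the version read and the metadata change would each happen
inside a transaction, so a stale handle is detected before any write that
depends on it commits.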
