Hi everyone. I'm new here and just discovered the ongoing proposal for CouchDB to rely on FDB.
With my team, we were considering providing an HTTP API over FDB in the form of the CouchDB API definition, so I'm very pleased to see there is already an ongoing effort for this (even if it is still a proposal). I've tried to catch up on all the good discussions about how you could make this work by mapping to the K/V model, so sorry if I missed a point.

I'm curious how you plan to handle multi-tenancy while ensuring good scalability and avoiding hotspotting. I've read an idea from Mickael that uses a CryptoHash to map the model this way:

    {bucket_id}/{cryptohash} : value

We currently use this CryptoHash mechanism to manage data in a multi-tenant context applied to time series. Here is a simple diagram that summarizes it:

    {raw_data} -> ingress component -> {hashed_metadata+data} -> HBase -> {crypted_metadata} -> HBase -> {crypted_metadata} -> Directory service Query -> egress component -> HBase

raw_data is in the metric{tags} format, in the Prometheus/OpenTSDB/Warp10 style. The hashed metadata is a pair of 64- or 128-bit hashes: hash(metric) + hash(tags). The default is 64 bits, but that can lead to collisions in the keyspace above 1B unique series, where 128-bit hashes are safer. The egress component queries the Directory service to get the list of series to read from the store. During authentication, a custom "application" label is added to the labels that end up in the data model and is then hashed along with them, which avoids conflicts between users. Hashed metadata is suffixed with a timestamp because that is convenient for time series data.

What makes it very useful is:

- it can still use scans per series (metric + tags)
- it avoids hotspotting the cluster and ensures a very good distribution among nodes
- it provides authentication through a directory service that acts as an indirection
- keys stay a consistent length even when metrics or tags are very long

I think this kind of model could apply perfectly well to FDB for documents, given that the namespace (NS) would be a user application/bucket/... :

    hash ( {NS} + {...} + {DOC_ID} ) / fields / ...

The drawback is that it may require a bit more storage for keys, but the hashing could be adjusted to the use case. Moreover, managing rights at the document level would require additional fields or a few extra bytes, along with a directory index (which could live in memory inside CouchDB, outside it relying on something like Elastic, or directly inside FDB).

I realize that just using FDB as a backend is a considerable amount of work, and that pushing multi-tenancy adds even more, maybe inside CouchDB itself. For example, tokens could embed rights and bucket IDs, which CouchDB would use to authorize requests and build the underlying data model for storage with scalability and optimizations in mind.

Also, has anyone considered reaching out to the FDB guys to try to align the CouchDB document representation with the Document Layer ( https://foundationdb.github.io/fdb-document-layer/data-modeling.html )? This would also make CouchDB compatible with the MongoDB API.

I don't know where the discussions stand, but maybe we could help :)
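
PS: a rough sketch of the two key layouts described above, in Python, just to make the idea concrete. Everything specific here is an assumption for illustration: the use of BLAKE2b as the hash, the function names (`series_key`, `doc_key`), the canonical `k=v` tag encoding, and the length-prefixing of the hashed parts. The elided `{...}` part of the document key is left out, so only the namespace and document ID are hashed. It is not what our system or CouchDB actually does.

```python
import hashlib
import struct


def _h(data: bytes, bits: int) -> bytes:
    """Fixed-size digest of `data`. BLAKE2b is an assumption; any good hash works."""
    return hashlib.blake2b(data, digest_size=bits // 8).digest()


def series_key(app: str, metric: str, tags: dict, ts_ms: int, bits: int = 64) -> bytes:
    """hash(metric) + hash(tags) + timestamp, with the tenant folded into the tags."""
    # The "application" label from authentication is added before hashing,
    # so two tenants with identical metric{tags} get different keys.
    labels = dict(tags, application=app)
    canon = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    # Big-endian timestamp suffix: keys of one series sort by time,
    # so a range scan over the hashed prefix reads the series in order.
    return _h(metric.encode(), bits) + _h(canon.encode(), bits) + struct.pack(">Q", ts_ms)


def doc_key(namespace: str, doc_id: str, field: str, bits: int = 128) -> bytes:
    """hash(NS + DOC_ID) / field, hypothetical document variant of the same idea."""
    # Length-prefix each part so ("ab", "c") and ("a", "bc") do not collide.
    parts = b"".join(
        struct.pack(">I", len(p)) + p for p in (namespace.encode(), doc_id.encode())
    )
    return _h(parts, bits) + b"/" + field.encode()


if __name__ == "__main__":
    k = series_key("tenant-a", "cpu.load", {"host": "web1"}, 1_700_000_000_000)
    print(len(k), k.hex())
    print(doc_key("tenant-a", "doc-42", "title").hex())
```

All fields of one document share the same fixed-length hashed prefix, so they can still be read with a single prefix range scan, while different documents and tenants are spread across the keyspace.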