Chiming in to agree with option 2, and if that's too small for you, you should be able to use either a cloud backend or a local file system approach.
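To make option 2 concrete, the chunk layout Bob describes further down could be sketched like this. This is plain Python with an in-memory dict standing in for a real FoundationDB cluster; the function names are made up for illustration, and the 16 KB chunk size is the value proposed in the thread:

```python
# Sketch of option 2's chunked attachment layout. A dict of tuple keys
# stands in for FDB (the tuples mimic what the tuple layer would encode).
CHUNK_SIZE = 16 * 1024        # proposed chunk size, matching CouchDB today
TXN_LIMIT = 10 * 1024 * 1024  # FDB's 10 MB transaction size limit

store = {}  # stand-in for FDB: maps tuple keys to byte values

def write_attachment(db, doc_id, att_name, data):
    # Reject attachments that cannot fit in a single transaction.
    # (Keys count toward the 10 MB too, so the real ceiling is lower.)
    if len(data) >= TXN_LIMIT:
        raise ValueError("attachment too large for a single FDB txn")
    for offset in range(0, len(data), CHUNK_SIZE):
        key = (db, doc_id, att_name, offset // CHUNK_SIZE)  # {db, doc, att, 0..N}
        store[key] = data[offset:offset + CHUNK_SIZE]

def read_attachment(db, doc_id, att_name):
    # Equivalent of an FDB range get over the (db, doc_id, att_name, *)
    # prefix: collect the chunks in counter order and concatenate.
    keys = sorted(k for k in store if k[:3] == (db, doc_id, att_name))
    return b"".join(store[k] for k in keys)
```

A ranged read (for the HTTP Range header case) would just seek to chunk `start // CHUNK_SIZE` and slice the first and last chunks, rather than scanning the whole prefix.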
The cloud backend should be abstracted to the point where you could build support for something other than S3 - for instance, B2[1] would be lovely - but S3 is obviously the API to go after first. (Lots of SMBs I know are picking B2 because of how cheap it is.)

NFS/CIFS/iSCSI mounts (provided by your favourite SAN) or something like AWS's EFS would make sense for the local file system approach.

-Joan

[1]: https://www.backblaze.com/b2/cloud-storage.html

----- Original Message -----
> From: "Adam Kocoloski" <kocol...@apache.org>
> To: dev@couchdb.apache.org
> Sent: Thursday, February 28, 2019 6:41:15 AM
> Subject: Re: [DISCUSS] Attachment support in CouchDB with FDB
>
> I would like to see a basic “native” attachment provider with the
> limitations described in 2), as well as an “object store” provider
> targeting the S3 API. I think the consistency considerations are
> tractable if you’re comfortable with the possibility that
> attachments could be orphaned in the object store in the case of a
> failed transaction.
>
> I had not considered the “just write them on the file system”
> provider, but that’s probably partly my cloud-native blinders. I
> think the main question there is redundancy; I would argue against
> trying to do any sort of replication across local disks. Users who
> happen to have an NFS-style mount point accessible to all the
> CouchDB nodes could use this option reliably, though.
>
> We should calculate a safe maximum attachment size for the native
> provider - as I understand things, the FDB transaction size includes
> both keys and values, so our effective attachment size limit will be
> smaller.
>
> Adam
>
> > On Feb 28, 2019, at 6:21 AM, Robert Newson <rnew...@apache.org>
> > wrote:
> >
> > Hi,
> >
> > Yes, I agree we should have a framework like that. Folks should be
> > able to choose S3 or COS (IBM), etc.
> >
> > I am personally on the hook for the implementation for CouchDB and
> > for IBM Cloudant and expect them to be different, so the
> > framework, IMO, is a given.
> >
> > B.
> >
> >> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
> >>
> >> Thanks for getting this started, Bob!
> >>
> >> In fear of derailing this right off the bat, is there a potential
> >> 4) approach where on the CouchDB side there is a way to specify
> >> “attachment backends”, one of which could be 2), but others could
> >> be “node local file storage”*, others could be S3-API compatible,
> >> etc.?
> >>
> >> * a bunch of heavy handwaving about how to ensure consistency and
> >> fault tolerance here.
> >>
> >> * * *
> >>
> >> My hypothetical 4) could also be a later addition, and we’ll do
> >> one of 1-3 first.
> >>
> >> * * *
> >>
> >> Of 1-3, I think 2 is the most pragmatic in terms of keeping
> >> desirable functionality while limiting it so it can be useful in
> >> practice.
> >>
> >> I feel strongly about not dropping attachment support. While not
> >> ideal in all cases, it is an extremely useful and reasonably
> >> popular feature.
> >>
> >> Best
> >> Jan
> >> --
> >>
> >>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org>
> >>> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We've not yet discussed attachments in terms of the FoundationDB
> >>> work, so here's where we do that.
> >>>
> >>> Today, CouchDB allows you to store large binary values as a
> >>> series of much smaller chunks. These "attachments" cannot be
> >>> indexed; they can only be sent and received (you can fetch the
> >>> whole thing or arbitrary subsets of it).
> >>>
> >>> On the FDB side, we have a few constraints: a transaction cannot
> >>> be larger than 10 MB and cannot take more than 5 seconds.
> >>>
> >>> Given that, there are a few paths to attachment support going
> >>> forward:
> >>>
> >>> 1) Drop native attachment support.
> >>>
> >>> I suspect this is not going to be a popular approach, but it's
> >>> worth hearing a range of views. Instead of direct attachment
> >>> support, a user could store the URL to the large binary content
> >>> and could simply fetch that URL directly.
> >>>
> >>> 2) Write attachments into FDB, but with limits.
> >>>
> >>> The next simplest option is to write the attachments into FDB as
> >>> a series of key/value entries, where the key is {database_name,
> >>> doc_id, attachment_name, 0..N} and the value is a short byte
> >>> array (say, 16 KB to match current behaviour). The 0..N is just a
> >>> counter such that we can do an FDB range get / iterator to
> >>> retrieve the attachment. An embellishment would restore the HTTP
> >>> Range header options, if we still want that (disclaimer: I
> >>> implemented the Range support many years ago; I'm happy to drop
> >>> it if no one really cares for it in 2019).
> >>>
> >>> This would be subject to the 10 MB and 5 s limits, which is less
> >>> than you _can_ do today with attachments but not, in my opinion,
> >>> less than people actually do (with some notable outliers like
> >>> npm in the past).
> >>>
> >>> 3) Full functionality.
> >>>
> >>> This would be the same as today: attachments of arbitrary size
> >>> (up to the disk capacity of the FDB cluster). It would require
> >>> some extra cleverness to work across multiple transactions and
> >>> in such a way that an aborted upload doesn't leave partially
> >>> uploaded data in FDB forever. I have not sat down and designed
> >>> this yet, hence I would very much like to hear from the
> >>> community as to which of these paths is sufficient.
> >>>
> >>> --
> >>> Robert Samuel Newson
> >>> rnew...@apache.org
> >>
> >> --
> >> Professional Support for Apache CouchDB:
> >> https://neighbourhood.ie/couchdb-support/
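On Adam's point above that keys count toward the transaction size: a rough estimate of the effective attachment ceiling for option 2 could look like the sketch below. Every constant here is an assumption for illustration - a real calculation would measure the actual tuple-layer-encoded key sizes rather than guess an overhead:

```python
# Back-of-envelope estimate of a "safe maximum attachment size" for the
# native provider, given that FDB's 10 MB transaction limit covers both
# keys and values. KEY_OVERHEAD is an assumed per-chunk key size, not a
# measured one.
TXN_LIMIT = 10 * 1024 * 1024   # FDB: max 10 MB per transaction
CHUNK_VALUE = 16 * 1024        # proposed 16 KB value per chunk
KEY_OVERHEAD = 100             # assumed encoded size of one chunk key

def max_attachment_bytes(safety_margin=0.9):
    # Each chunk costs key + value against the limit; keep a margin for
    # the document write happening in the same transaction.
    per_chunk = CHUNK_VALUE + KEY_OVERHEAD
    chunks = int(TXN_LIMIT * safety_margin) // per_chunk
    return chunks * CHUNK_VALUE
```

With these assumed numbers the ceiling comes out somewhat under 10 MB, which matches Adam's expectation that the effective limit is smaller than the raw transaction limit.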