Chiming in to agree with option 2, and if that's too small for you, you should be able to use either a cloud backend or a local file system approach.
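To make option 2 concrete, the chunk layout Bob describes further down could be sketched like this. This is plain Python with an in-memory dict standing in for a real FoundationDB cluster; the function names are made up for illustration, and the 16 KB chunk size is the value proposed in the thread:

```python
# Sketch of option 2's chunked attachment layout. A dict of tuple keys
# stands in for FDB (the tuples mimic what the tuple layer would encode).
CHUNK_SIZE = 16 * 1024        # proposed chunk size, matching CouchDB today
TXN_LIMIT = 10 * 1024 * 1024  # FDB's 10 MB transaction size limit

store = {}  # stand-in for FDB: maps tuple keys to byte values

def write_attachment(db, doc_id, att_name, data):
    # Reject attachments that cannot fit in a single transaction.
    # (Keys count toward the 10 MB too, so the real ceiling is lower.)
    if len(data) >= TXN_LIMIT:
        raise ValueError("attachment too large for a single FDB txn")
    for offset in range(0, len(data), CHUNK_SIZE):
        key = (db, doc_id, att_name, offset // CHUNK_SIZE)  # {db, doc, att, 0..N}
        store[key] = data[offset:offset + CHUNK_SIZE]

def read_attachment(db, doc_id, att_name):
    # Equivalent of an FDB range get over the (db, doc_id, att_name, *)
    # prefix: collect the chunks in counter order and concatenate.
    keys = sorted(k for k in store if k[:3] == (db, doc_id, att_name))
    return b"".join(store[k] for k in keys)
```

A ranged read (for the HTTP Range header case) would just seek to chunk `start // CHUNK_SIZE` and slice the first and last chunks, rather than scanning the whole prefix.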
The cloud backend should be abstracted to the point where you could build support for something other than S3 - for instance, B2[1] would be lovely - but S3 is obviously the API to go after first. (Lots of SMBs I know are picking B2 because of how cheap it is.)

NFS/CIFS/iSCSI mounts (provided by your favourite SAN) or something like AWS's EFS would make sense for the local file system approach.

-Joan

[1]: https://www.backblaze.com/b2/cloud-storage.html

----- Original Message -----
> From: "Adam Kocoloski" <kocol...@apache.org>
> To: dev@couchdb.apache.org
> Sent: Thursday, February 28, 2019 6:41:15 AM
> Subject: Re: [DISCUSS] Attachment support in CouchDB with FDB
>
> I would like to see a basic “native” attachment provider with the
> limitations described in 2), as well as an “object store” provider
> targeting the S3 API. I think the consistency considerations are
> tractable if you’re comfortable with the possibility that
> attachments could be orphaned in the object store in the case of a
> failed transaction.
>
> I had not considered the “just write them on the file system”
> provider, but that’s probably partly my cloud-native blinders. I
> think the main question there is redundancy; I would argue against
> trying to do any sort of replication across local disks. Users who
> happen to have an NFS-style mount point accessible to all the
> CouchDB nodes could use this option reliably, though.
>
> We should calculate a safe maximum attachment size for the native
> provider - as I understand things, the FDB transaction size includes
> both keys and values, so our effective attachment size limit will be
> smaller.
>
> Adam
>
> > On Feb 28, 2019, at 6:21 AM, Robert Newson <rnew...@apache.org>
> > wrote:
> >
> > Hi,
> >
> > Yes, I agree we should have a framework like that. Folks should be
> > able to choose S3 or COS (IBM), etc.
> >
> > I am personally on the hook for the implementation for CouchDB and
> > for IBM Cloudant and expect them to be different, so the
> > framework, IMO, is a given.
> >
> > B.
> >
> >> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
> >>
> >> Thanks for getting this started, Bob!
> >>
> >> In fear of derailing this right off the bat, is there a potential
> >> 4) approach where on the CouchDB side there is a way to specify
> >> “attachment backends”, one of which could be 2), but others could
> >> be “node local file storage”*, others could be S3-API compatible,
> >> etc.?
> >>
> >> * a bunch of heavy handwaving about how to ensure consistency and
> >> fault tolerance here.
> >>
> >> * * *
> >>
> >> My hypothetical 4) could also be a later addition, and we’ll do
> >> one of 1-3 first.
> >>
> >> * * *
> >>
> >> Of 1-3, I think 2 is the most pragmatic in terms of keeping
> >> desirable functionality while limiting it so it can be useful in
> >> practice.
> >>
> >> I feel strongly about not dropping attachment support. While not
> >> ideal in all cases, it is an extremely useful and reasonably
> >> popular feature.
> >>
> >> Best
> >> Jan
> >> --
> >>
> >>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org>
> >>> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We've not yet discussed attachments in terms of the FoundationDB
> >>> work, so here's where we do that.
> >>>
> >>> Today, CouchDB allows you to store large binary values as a
> >>> series of much smaller chunks. These "attachments" cannot be
> >>> indexed; they can only be sent and received (you can fetch the
> >>> whole thing or arbitrary subsets of it).
> >>>
> >>> On the FDB side, we have a few constraints: a transaction cannot
> >>> be larger than 10 MB and cannot take more than 5 seconds.
> >>>
> >>> Given that, there are a few paths to attachment support going
> >>> forward:
> >>>
> >>> 1) Drop native attachment support.
> >>>
> >>> I suspect this is not going to be a popular approach, but it's
> >>> worth hearing a range of views. Instead of direct attachment
> >>> support, a user could store the URL to the large binary content
> >>> and could simply fetch that URL directly.
> >>>
> >>> 2) Write attachments into FDB, but with limits.
> >>>
> >>> The next simplest option is to write the attachments into FDB as
> >>> a series of key/value entries, where the key is {database_name,
> >>> doc_id, attachment_name, 0..N} and the value is a short byte
> >>> array (say, 16 KB to match current behaviour). The 0..N is just a
> >>> counter such that we can do an FDB range get / iterator to
> >>> retrieve the attachment. An embellishment would restore the HTTP
> >>> Range header options, if we still want that (disclaimer: I
> >>> implemented the Range support many years ago; I'm happy to drop
> >>> it if no one really cares for it in 2019).
> >>>
> >>> This would be subject to the 10 MB and 5 s limits, which is less
> >>> than you _can_ do today with attachments but not, in my opinion,
> >>> less than people actually do (with some notable outliers like
> >>> npm in the past).
> >>>
> >>> 3) Full functionality.
> >>>
> >>> This would be the same as today: attachments of arbitrary size
> >>> (up to the disk capacity of the FDB cluster). It would require
> >>> some extra cleverness to work across multiple transactions and
> >>> in such a way that an aborted upload doesn't leave partially
> >>> uploaded data in FDB forever. I have not sat down and designed
> >>> this yet, hence I would very much like to hear from the
> >>> community as to which of these paths is sufficient.
> >>>
> >>> --
> >>> Robert Samuel Newson
> >>> rnew...@apache.org
> >>
> >> --
> >> Professional Support for Apache CouchDB:
> >> https://neighbourhood.ie/couchdb-support/
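On Adam's point above that keys count toward the transaction size: a rough estimate of the effective attachment ceiling for option 2 could look like the sketch below. Every constant here is an assumption for illustration - a real calculation would measure the actual tuple-layer-encoded key sizes rather than guess an overhead:

```python
# Back-of-envelope estimate of a "safe maximum attachment size" for the
# native provider, given that FDB's 10 MB transaction limit covers both
# keys and values. KEY_OVERHEAD is an assumed per-chunk key size, not a
# measured one.
TXN_LIMIT = 10 * 1024 * 1024   # FDB: max 10 MB per transaction
CHUNK_VALUE = 16 * 1024        # proposed 16 KB value per chunk
KEY_OVERHEAD = 100             # assumed encoded size of one chunk key

def max_attachment_bytes(safety_margin=0.9):
    # Each chunk costs key + value against the limit; keep a margin for
    # the document write happening in the same transaction.
    per_chunk = CHUNK_VALUE + KEY_OVERHEAD
    chunks = int(TXN_LIMIT * safety_margin) // per_chunk
    return chunks * CHUNK_VALUE
```

With these assumed numbers the ceiling comes out somewhat under 10 MB, which matches Adam's expectation that the effective limit is smaller than the raw transaction limit.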