Thanks to you both, and I agree. Adam's "I would like to see a basic “native” attachment provider with the limitations described in 2), as well as an “object store” provider targeting the S3 API." is my position/preference too.
--
Robert Samuel Newson
rnew...@apache.org

On Thu, 28 Feb 2019, at 11:41, Adam Kocoloski wrote:
> I would like to see a basic “native” attachment provider with the
> limitations described in 2), as well as an “object store” provider
> targeting the S3 API. I think the consistency considerations are
> tractable if you’re comfortable with the possibility that attachments
> could be orphaned in the object store in the case of a failed
> transaction.
>
> I had not considered the “just write them on the file system” provider
> but that’s probably partly my cloud-native blinders. I think the main
> question there is redundancy; I would argue against trying to do any
> sort of replication across local disks. Users who happen to have an
> NFS-style mount point accessible to all the CouchDB nodes could use
> this option reliably, though.
>
> We should calculate a safe maximum attachment size for the native
> provider. As I understand things, the FDB transaction size includes
> both keys and values, so our effective attachment size limit will be
> smaller.
>
> Adam
>
> > On Feb 28, 2019, at 6:21 AM, Robert Newson <rnew...@apache.org> wrote:
> >
> > Hi,
> >
> > Yes, I agree we should have a framework like that. Folks should be able to
> > choose S3 or COS (IBM), etc.
> >
> > I am personally on the hook for the implementation for CouchDB and for IBM
> > Cloudant and expect them to be different, so the framework, IMO, is a
> > given.
> >
> > B.
> >
> >> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
> >>
> >> Thanks for getting this started, Bob!
> >>
> >> At the risk of derailing this right off the bat, is there a potential 4)
> >> approach where on the CouchDB side there is a way to specify “attachment
> >> backends”, one of which could be 2), but others could be “node local file
> >> storage”*, others could be S3-API compatible, etc?
> >>
> >> *a bunch of heavy handwaving about how to ensure consistency and fault
> >> tolerance here.
> >>
> >> * * *
> >>
> >> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3
> >> first.
> >>
> >> * * *
> >>
> >> From 1-3, I think 2 is most pragmatic in terms of keeping desirable
> >> functionality, while limiting it so it can be useful in practice.
> >>
> >> I feel strongly about not dropping attachment support. While not ideal in
> >> all cases, it is an extremely useful and reasonably popular feature.
> >>
> >> Best
> >> Jan
> >> --
> >>
> >>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org> wrote:
> >>>
> >>> Hi All,
> >>>
> >>> We've not yet discussed attachments in terms of the foundationdb work so
> >>> here's where we do that.
> >>>
> >>> Today, CouchDB allows you to store large binary values, stored as a
> >>> series of much smaller chunks. These "attachments" cannot be indexed,
> >>> they can only be sent and received (you can fetch the whole thing or you
> >>> can fetch arbitrary subsets of them).
> >>>
> >>> On the FDB side, we have a few constraints. A transaction cannot be more
> >>> than 10MB and cannot take more than 5 seconds.
> >>>
> >>> Given that, there are a few paths to attachment support going forward;
> >>>
> >>> 1) Drop native attachment support.
> >>>
> >>> I suspect this is not going to be a popular approach but it's worth
> >>> hearing a range of views. Instead of direct attachment support, a user
> >>> could store the URL to the large binary content and could simply fetch
> >>> that URL directly.
> >>>
> >>> 2) Write attachments into FDB but with limits.
> >>>
> >>> The next simplest is to write the attachments into FDB as a series of
> >>> key/value entries, where the key is {database_name, doc_id,
> >>> attachment_name, 0..N} and the value is a short byte array (say, 16K to
> >>> match current). The 0..N is just a counter such that we can do an fdb
> >>> range get / iterator to retrieve the attachment.
> >>> An embellishment would restore the HTTP Range header options, if we
> >>> still wanted that (disclaimer: I implemented the Range thing many years
> >>> ago, I'm happy to drop support if no one really cares for it in 2019).
> >>>
> >>> This would be subject to the 10MB and 5s limits, which is less than you
> >>> _can_ do today with attachments but not, in my opinion, any less than
> >>> people actually do (with some notable outliers like npm in the past).
> >>>
> >>> 3) Full functionality
> >>>
> >>> This would be the same as today. Attachments of arbitrary size (up to the
> >>> disk capacity of the fdb cluster). It would require some extra cleverness
> >>> to work over multiple transactions and in such a way that an aborted
> >>> upload doesn't leave partially uploaded data in fdb forever. I have not
> >>> sat down and designed this yet, hence I would very much like to hear from
> >>> the community as to which of these paths is sufficient.
> >>>
> >>> --
> >>> Robert Samuel Newson
> >>> rnew...@apache.org
> >>
> >> --
> >> Professional Support for Apache CouchDB:
> >> https://neighbourhood.ie/couchdb-support/
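
For anyone who wants to play with the numbers in option 2), here is a rough Python sketch of the chunked key/value layout and of Adam's point that keys count towards the transaction size. It does not use the FDB client at all. The function names, the per-key overhead estimate in approx_key_size, and reading "10MB" as 10,000,000 bytes are assumptions made for illustration, not anything agreed in this thread and not FDB's real tuple encoding.

```python
CHUNK_SIZE = 16 * 1024        # "say, 16K to match current"
TXN_LIMIT = 10 * 1000 * 1000  # the thread's 10MB limit, assumed to mean 10,000,000 bytes


def chunk_attachment(db_name, doc_id, att_name, data, chunk_size=CHUNK_SIZE):
    """Yield ((db_name, doc_id, att_name, n), chunk) pairs for n = 0..N."""
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        yield (db_name, doc_id, att_name, n), data[offset:offset + chunk_size]


def approx_key_size(key):
    """Rough size of one encoded key in bytes.

    Assumption: each string costs its UTF-8 length plus 2 bytes of framing
    and the counter costs 9 bytes. A real encoding will differ.
    """
    db_name, doc_id, att_name, _n = key
    strings = (db_name, doc_id, att_name)
    return sum(len(s.encode("utf-8")) + 2 for s in strings) + 9


def txn_size(pairs):
    """Total bytes written: keys and values both count."""
    return sum(approx_key_size(k) + len(v) for k, v in pairs)


def fits_in_one_txn(db_name, doc_id, att_name, data):
    """True if all chunks, keys included, stay within TXN_LIMIT."""
    pairs = chunk_attachment(db_name, doc_id, att_name, data)
    return txn_size(pairs) <= TXN_LIMIT


def reassemble(pairs):
    """Rebuild the attachment by reading chunks in counter order,
    which is what a range read over the 0..N suffix would return."""
    ordered = sorted(pairs, key=lambda kv: kv[0][3])
    return b"".join(chunk for _key, chunk in ordered)
```

Under these assumptions an attachment of exactly 10,000,000 bytes does not fit, because its 611 keys add overhead on top of the values. That is the reason a safe maximum for the native provider has to sit somewhat below the raw transaction limit.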