Thanks to you both, and I agree. 

Adam's "I would like to see a basic “native” attachment provider with the 
limitations described in 2), as well as an “object store” provider targeting 
the S3 API." is my position/preference too. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Thu, 28 Feb 2019, at 11:41, Adam Kocoloski wrote:
> I would like to see a basic “native” attachment provider with the 
> limitations described in 2), as well as an “object store” provider 
> targeting the S3 API. I think the consistency considerations are 
> tractable if you’re comfortable with the possibility that attachments 
> could be orphaned in the object store in the case of a failed 
> transaction.
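A pluggable provider boundary like the one described here might look roughly like the sketch below. This is purely illustrative: the thread proposes the idea but no API, every name is invented, and a real implementation would be in Erlang rather than Python.

```python
# Hypothetical sketch of a pluggable attachment-provider interface.
# All names here are invented for illustration.
from abc import ABC, abstractmethod
import uuid

class AttachmentProvider(ABC):
    @abstractmethod
    def put(self, db: str, doc_id: str, name: str, data: bytes) -> str:
        """Store the bytes; return an opaque handle kept in the doc."""

    @abstractmethod
    def get(self, handle: str) -> bytes:
        """Fetch the bytes for a previously returned handle."""

class InMemoryProvider(AttachmentProvider):
    """Stand-in for a "native" (FDB) or "object store" (S3) provider."""
    def __init__(self):
        self._blobs = {}

    def put(self, db, doc_id, name, data):
        handle = uuid.uuid4().hex   # e.g. an S3 object key in a real provider
        self._blobs[handle] = bytes(data)
        return handle

    def get(self, handle):
        return self._blobs[handle]
```

An "object store" provider would implement put as an S3 PUT, which is where the orphaned-blob trade-off above comes in: the blob write can succeed even if the enclosing document transaction later fails.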
> 
> I had not considered the “just write them on the file system” provider 
> but that’s probably partly my cloud-native blinders. I think the main 
> question there is redundancy; I would argue against trying to do any 
> sort of replication across local disks. Users who happen to have an 
> NFS-style mount point accessible to all the CouchDB nodes could use 
> this option reliably, though.
> 
> We should calculate a safe maximum attachment size for the native 
> provider — as I understand things the FDB transaction size includes 
> both keys and values, so our effective attachment size limit will be 
> smaller.
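For concreteness, a back-of-envelope version of that calculation might run as follows. The 16K chunk size and the per-chunk key overhead are assumptions for illustration, not measured numbers.

```python
# Back-of-envelope estimate of a safe "native" attachment limit.
# Assumptions (not from the thread): 16K value chunks, roughly 100
# bytes of key material per chunk, and headroom for the document write
# sharing the transaction. Per the note above, the 10MB FDB
# transaction limit counts keys and values together.
TXN_LIMIT = 10 * 1024 * 1024   # FDB transaction size limit, bytes
CHUNK_VALUE = 16 * 1024        # attachment bytes per key/value pair
KEY_OVERHEAD = 100             # assumed key bytes per chunk
SAFETY_MARGIN = 0.8            # headroom for the doc update itself

per_chunk = CHUNK_VALUE + KEY_OVERHEAD
max_chunks = int(TXN_LIMIT * SAFETY_MARGIN) // per_chunk
max_attachment = max_chunks * CHUNK_VALUE

print(max_chunks, max_attachment)   # roughly 8MB under these assumptions
```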
> 
> Adam
> 
> > On Feb 28, 2019, at 6:21 AM, Robert Newson <rnew...@apache.org> wrote:
> > 
> > Hi,
> > 
> > Yes, I agree we should have a framework like that. Folks should be able to 
> > choose S3 or COS (IBM), etc. 
> > 
> > I am personally on the hook for the implementation for CouchDB and for IBM 
> > Cloudant and expect them to be different, so the framework, IMO, is a 
> > given. 
> > 
> > B. 
> > 
> >> On 28 Feb 2019, at 10:33, Jan Lehnardt <j...@apache.org> wrote:
> >> 
> >> Thanks for getting this started, Bob!
> >> 
> >> In fear of derailing this right off the bat, is there a potential 4) 
> >> approach where on the CouchDB side there is a way to specify “attachment 
> >> backends”, one of which could be 2), but others could be “node local file 
> >> storage”*, others could be S3-API compatible, etc?
> >> 
> >> *a bunch of heavy handwaving about how to ensure consistency and fault 
> >> tolerance here.
> >> 
> >> * * *
> >> 
> >> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3 
> >> first.
> >> 
> >> 
> >> * * *
> >> 
> >> From 1-3, I think 2 is most pragmatic in terms of keeping desirable 
> >> functionality, while limiting it so it can be useful in practice.
> >> 
> >> I feel strongly about not dropping attachment support. While not ideal in 
> >> all cases, it is an extremely useful and reasonably popular feature.
> >> 
> >> Best
> >> Jan
> >> —
> >> 
> >>> On 28. Feb 2019, at 11:22, Robert Newson <rnew...@apache.org> wrote:
> >>> 
> >>> Hi All,
> >>> 
> >>> We've not yet discussed attachments in terms of the foundationdb work so 
> >>> here's where we do that.
> >>> 
> > > Today, CouchDB allows you to store large binary values, held internally 
> > > as a series of much smaller chunks. These "attachments" cannot be 
> > > indexed; they can only be sent and received (you can fetch the whole 
> > > thing or arbitrary byte ranges of it).
> >>> 
> >>> On the FDB side, we have a few constraints. A transaction cannot be more 
> >>> than 10MB and cannot take more than 5 seconds.
> >>> 
> > > Given that, there are a few paths to attachment support going forward:
> >>> 
> >>> 1) Drop native attachment support. 
> >>> 
> >>> I suspect this is not going to be a popular approach but it's worth 
> >>> hearing a range of views. Instead of direct attachment support, a user 
> >>> could store the URL to the large binary content and could simply fetch 
> >>> that URL directly.
> >>> 
> >>> 2) Write attachments into FDB but with limits.
> >>> 
> > > The next simplest is to write the attachments into FDB as a series of 
> > > key/value entries, where the key is {database_name, doc_id, 
> > > attachment_name, 0..N} and the value is a short byte array (say, 16K, 
> > > matching the current chunk size). The 0..N is just a counter so that we 
> > > can do an fdb range get / iterator to retrieve the attachment. An 
> > > embellishment would restore the HTTP Range header options, if we still 
> > > want that (disclaimer: I implemented the Range support many years ago; 
> > > I'm happy to drop it if no one really cares for it in 2019).
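A toy model of that layout, with a plain dict standing in for FDB and tuples standing in for packed keys. The 16K chunk size follows the proposal above; everything else is illustrative.

```python
# Toy model of option 2's layout: keys are (database_name, doc_id,
# attachment_name, 0..N) tuples and values are <=16K byte slices. A
# plain dict stands in for FDB; sorted() stands in for a range read.
CHUNK_SIZE = 16 * 1024

def write_attachment(kv, db, doc_id, name, data):
    for i in range(0, len(data), CHUNK_SIZE):
        kv[(db, doc_id, name, i // CHUNK_SIZE)] = data[i:i + CHUNK_SIZE]

def read_attachment(kv, db, doc_id, name):
    # In FDB this would be a single range get over the key prefix;
    # the trailing counter keeps the chunks in order.
    prefix = (db, doc_id, name)
    return b"".join(v for k, v in sorted(kv.items()) if k[:3] == prefix)

store = {}
blob = b"x" * 40_000                      # spans three 16K chunks
write_attachment(store, "mydb", "doc1", "photo.jpg", blob)
assert read_attachment(store, "mydb", "doc1", "photo.jpg") == blob
```

Range header support would fall out fairly naturally here: a requested byte range maps to a sub-range of chunk counters plus a slice of the first and last chunk.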
> >>> 
> > > This would be subject to the 10MB and 5s limits, which is less than you 
> > > _can_ do today with attachments but not, in my opinion, any less than 
> > > people actually do (with some notable outliers like npm in the past).
> >>> 
> >>> 3) Full functionality
> >>> 
> >>> This would be the same as today. Attachments of arbitrary size (up to the 
> >>> disk capacity of the fdb cluster). It would require some extra cleverness 
> > > to work over multiple transactions and in such a way that an aborted 
> >>> upload doesn't leave partially uploaded data in fdb forever. I have not 
> >>> sat down and designed this yet, hence I would very much like to hear from 
> >>> the community as to which of these paths are sufficient.
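One hedged sketch of what that extra cleverness could look like: stage chunks under a random upload id across many transactions, then publish a small pointer record last. None of this is designed in the thread, and all names are invented.

```python
# Hypothetical sketch of option 3: stage chunks under a random upload
# id across many transactions, then publish a small pointer record in
# one final, tiny transaction. An aborted upload leaves only
# unreferenced staging keys, which a background sweep can delete by
# age; readers never observe a partial attachment.
import uuid

CHUNK_SIZE = 16 * 1024

def staged_upload(kv, pointers, db, doc_id, name, data):
    upload_id = uuid.uuid4().hex
    n = 0
    # In FDB, each chunk (or small batch of chunks) would be its own
    # transaction, keeping every commit under the 10MB / 5s limits.
    for i in range(0, len(data), CHUNK_SIZE):
        kv[("staging", upload_id, n)] = data[i:i + CHUNK_SIZE]
        n += 1
    # Final transaction: make the attachment visible atomically.
    pointers[(db, doc_id, name)] = (upload_id, n)

def staged_read(kv, pointers, db, doc_id, name):
    upload_id, n = pointers[(db, doc_id, name)]
    return b"".join(kv[("staging", upload_id, i)] for i in range(n))
```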
> >>> 
> >>> -- 
> >>> Robert Samuel Newson
> >>> rnew...@apache.org
> >> 
> >> -- 
> >> Professional Support for Apache CouchDB:
> >> https://neighbourhood.ie/couchdb-support/
> > 
> 
>
