Alex,

That's basically the right approach, but I'll restate it in my own words with some background.
The way a document with attachments is replicated is as follows:

1) All the bytes of the attachments are written; the offsets of the start of each chunk of each attachment are remembered in memory.
2) The document is transferred and the chunk offsets are recorded atomically with the document write.

A crash before the end of step 1 forces a full restart for that document. We certainly cannot show a document through the HTTP interface in a partially replicated state (all attachments and the updated document body must appear atomically at the target). Instead, we could update the database (not the document) with the offsets and the _id/_rev they belong to, to allow resumption. We'd need to clean that up automatically, though, something like the way we remember the last purge in the db header.

As you say, we could then use Range headers to fetch the parts we're missing from the source.

B.

On 27 November 2013 12:26, Alexander Shorin <kxe...@gmail.com> wrote:
> On Wed, Nov 27, 2013 at 3:59 PM, Robert Newson <rnew...@apache.org> wrote:
>> Particularly, we could make
>> attachment replication resumable. Currently, if we replicate 99.9% of
>> a large attachment, lose our connection, and resume, we'll start over
>> from byte 0. This is why, elsewhere, there's a suggestion of 'one
>> attachment per document'. That is a horrible and artificial constraint
>> just to work around replicator deficiencies. We should encourage sane
>> design (related attachments together in the same document) and fix the
>> bugs that prevent heavy users from following it.
>
> I think the key issue there is in missing some semi-persistent buffer
> on the other side that could be used as temporary buffer for already
> received data. In this case replicator may use Range header to send
> only missed attachment chunks to Target (since doc and other bit are
> already there in the buffer). When every bit had been sent
> successfully, doc and his attachments moves from this buffer to the
> target database (or been deleted after some timeout). But this isn't a
> good solution, right?
>
> --
> ,,,^..^,,,
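
To make the resumption idea above concrete, here is a minimal Python sketch, not CouchDB code. It assumes a checkpoint kept at the database level and keyed by _id/_rev and attachment name, Range-style fetches for only the missing bytes, an atomic publish step, and automatic cleanup of the checkpoint afterwards. Every name in it (FakeSource, Target, replicate_attachment, commit, the chunk size) is hypothetical and exists only for illustration; the real replicator and file format work differently.

```python
# Hypothetical sketch: none of these names come from CouchDB itself.
class ConnectionLost(Exception):
    """Raised by the fake source to simulate a dropped connection."""


class FakeSource:
    """Stands in for the source database; serves byte ranges of one attachment."""

    def __init__(self, data, fail_after=None):
        self.data = data
        self.fail_after = fail_after  # range requests served before failing
        self.requests = []            # Range header values actually served

    def get_range(self, start, end):
        # 'end' is inclusive, as in an HTTP "Range: bytes=start-end" header.
        if self.fail_after is not None and len(self.requests) >= self.fail_after:
            raise ConnectionLost()
        self.requests.append(f"bytes={start}-{end}")
        return self.data[start:end + 1]


class Target:
    """Stands in for the target database plus a database-level checkpoint."""

    def __init__(self):
        self.staged = {}       # (doc_id, rev, name) -> bytes written so far
        self.checkpoints = {}  # (doc_id, rev, name) -> offset of next missing byte
        self.visible = {}      # (doc_id, rev) -> {name: bytes}, what readers see

    def replicate_attachment(self, source, doc_id, rev, name, length, chunk=4):
        key = (doc_id, rev, name)
        buf = self.staged.setdefault(key, bytearray())
        offset = self.checkpoints.get(key, 0)  # resume where we stopped
        while offset < length:
            end = min(offset + chunk, length) - 1
            buf += source.get_range(offset, end)
            offset = end + 1
            self.checkpoints[key] = offset  # remember progress for resumption

    def commit(self, doc_id, rev, names):
        # Publish the document and all its attachments in one step, then
        # drop the checkpoints so they do not accumulate.
        self.visible[(doc_id, rev)] = {
            n: bytes(self.staged.pop((doc_id, rev, n))) for n in names
        }
        for n in names:
            self.checkpoints.pop((doc_id, rev, n), None)
```

In this sketch, a transfer interrupted partway leaves nothing in `visible`, and a second call to `replicate_attachment` with a healthy source requests only the bytes from the recorded offset onward. Whether the checkpoint belongs in the db header or somewhere else is exactly the open design question from the thread; the dictionary here takes no position on that.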