[jira] Commented: (COUCHDB-639) Make replication profit of attachment compression and improve push replication for large attachments

Chris Anderson (JIRA) Fri, 19 Feb 2010 07:38:51 -0800

    [ 
https://issues.apache.org/jira/browse/COUCHDB-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835779#action_12835779
 ]


Chris Anderson commented on COUCHDB-639:
----------------------------------------

This patch applies cleanly and the tests pass. I'm also +1 on the feature, 
and I wouldn't mind committing it before 0.11 is tarballed: the code changes 
are large enough that backporting fixes to 0.11 could be a pain later on.

However, I'm not 100% sure about _bulk_docs_rep.

I'm concerned about having a separate endpoint designed for replication. It 
gives people the wrong idea, namely that replication is special. Replication 
is just another HTTP client.

I'm also concerned about the implementation: does it copy only new 
attachments, or does it copy all attachments? I'd like it if Adam or someone 
else familiar with the replicator could review this patch (and apply it if 
you think it is right).



> Make replication profit of attachment compression and improve push 
> replication for large attachments
> ----------------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-639
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-639
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 0.11
>         Environment: trunk
>            Reporter: Filipe Manana
>         Attachments: rep-att-comp-and-multipart-trunk.patch
>
>
> At the moment, replication uncompresses compressed attachments and then 
> compresses them again, which wastes CPU time.
> Push replication is also unreliable for very large attachments (500 MB or 
> more, for example). Currently it sends the attachments inlined in their 
> JSON doc. Not only does this require too much RAM, it also wastes CPU time 
> on base64-encoding the attachment (and on decompressing it first if the 
> attachment is compressed).
> The following patch (rep-att-comp-and-multipart-trunk*.patch) addresses both 
> issues. Docs containing attachments are now streamed to the remote target DB 
> using the multipart doc streaming feature provided by couch_doc.erl, and 
> compressed attachments are no longer uncompressed and re-compressed during 
> replication.
> JavaScript tests included.
> Previously, replicating a DB containing 2 docs with attachments of 100 MB 
> and 500 MB caused the Erlang VM to consume nearly 1.2 GB of RAM on my 
> system. With the patch applied, it uses about 130 MB of RAM.
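
To make the contrast in the description concrete, here is a rough Python
sketch (not the patch's Erlang code) of the two approaches: inlining an
attachment as base64 inside the JSON doc versus streaming it as a part of a
multipart/related body. The helper names, the boundary string, and the exact
part layout are assumptions made for illustration; only the general idea of
marking attachments with "follows": true and sending the raw bytes afterwards
is meant to mirror the multipart doc streaming mentioned above.

```python
import base64
import io
import json


def base64_len(n: int) -> int:
    """Size in bytes of n raw bytes once base64-encoded (with padding)."""
    return 4 * ((n + 2) // 3)


def multipart_chunks(doc, attachments, boundary, chunk_size=64 * 1024):
    """Yield a multipart/related body piece by piece.

    `attachments` maps name -> (content_type, length, binary file object).
    Hypothetical helper: the part layout is an assumption, not the patch's.
    """
    doc = dict(doc)
    # The JSON part only announces the attachments; the bytes follow later.
    doc["_attachments"] = {
        name: {"follows": True, "content_type": ctype, "length": length}
        for name, (ctype, length, _) in attachments.items()
    }
    sep = b"--" + boundary.encode("ascii")
    yield sep + b"\r\nContent-Type: application/json\r\n\r\n"
    yield json.dumps(doc).encode("utf-8")
    for name, (ctype, length, fileobj) in attachments.items():
        yield b"\r\n" + sep + b"\r\n\r\n"
        # Read and forward the attachment in bounded chunks, so memory use
        # depends on chunk_size rather than on the attachment size.
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
    yield b"\r\n" + sep + b"--"


if __name__ == "__main__":
    data = b"x" * 1000  # stand-in for a large attachment
    inline = base64.b64encode(data)  # what the inlined approach must build
    atts = {"big.bin": ("application/octet-stream", len(data), io.BytesIO(data))}
    streamed = sum(len(c) for c in multipart_chunks({"_id": "doc1"}, atts, "abc123"))
    print("inline base64 bytes:", len(inline), "predicted:", base64_len(len(data)))
    print("multipart body bytes:", streamed)
```

The inlined form has to hold the whole base64 string (about 4/3 of the raw
size) in memory before the JSON can be sent, while the generator above never
holds more than one chunk of attachment data at a time. An attachment that is
already compressed could be passed through the same loop untouched, which is
the other saving the description refers to.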

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
