[ https://issues.apache.org/jira/browse/COUCHDB-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543211#comment-15543211 ]

ASF GitHub Bot commented on COUCHDB-3168:
-----------------------------------------

GitHub user nickva opened a pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/49

    Fix replicator handling of max_document_size when posting to _bulk_docs

    Currently the `max_document_size` setting is a misnomer: it actually
    configures the maximum request body size. For single-document requests this
    is a good enough approximation, but `_bulk_docs` updates could fail the
    total request size check even if every individual document stays below the
    limit.
    
    Before this fix, a `_bulk_docs` request made during replication would
    crash, eventually leading to an infinite cycle of crashes and restarts
    (with potentially large state being dumped to the logs), without the
    replication job making progress.
    
    The fix is to do a binary split on the batch until either all documents
    fit under the max_document_size limit, or some documents fail to replicate.
    
    If documents fail to replicate, they bump the `doc_write_failures` count.
    Effectively, `max_document_size` acts as an implicit replication filter in
    this case.
    
    Jira: COUCHDB-3168
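
    The binary-split strategy described above can be sketched roughly as
    follows. This is a minimal Python sketch, not the actual Erlang replicator
    code; `post_bulk_docs`, `doc_size`, and `MAX_DOC_SIZE` are hypothetical
    stand-ins for the real HTTP client and the target's configured limit:

```python
import json

MAX_DOC_SIZE = 1_000_000  # assumed stand-in for the target's max_document_size


def doc_size(doc):
    """Approximate the encoded size of a document, in bytes."""
    return len(json.dumps(doc))


def post_bulk_docs(docs):
    """Stand-in for POSTing to _bulk_docs; rejects over-large request bodies."""
    if sum(doc_size(d) for d in docs) > MAX_DOC_SIZE:
        raise ValueError("413 Request Entity Too Large")


def replicate_batch(docs):
    """Binary-split a rejected batch until every chunk fits.

    A single document that is individually over the limit is skipped and
    counted as a write failure, mirroring the doc_write_failures behavior
    described in the PR.
    """
    doc_write_failures = 0
    stack = [docs]
    while stack:
        batch = stack.pop()
        if not batch:
            continue
        try:
            post_bulk_docs(batch)
        except ValueError:
            if len(batch) == 1:
                # Lone document exceeds the limit: treat it as filtered out.
                doc_write_failures += 1
            else:
                # Split the batch in half and retry each half.
                mid = len(batch) // 2
                stack.append(batch[:mid])
                stack.append(batch[mid:])
    return doc_write_failures
```

    With this approach, batches containing only documents under the limit
    eventually succeed after splitting, while over-sized documents surface as
    `doc_write_failures` instead of crashing the job.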

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3168

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #49
    
----
commit a9cd0b191524428ece0ebd0a1e18c88bb2afcbaa
Author: Nick Vatamaniuc <vatam...@apache.org>
Date:   2016-10-03T19:30:23Z

    Fix replicator handling of max_document_size when posting to _bulk_docs
    
    Currently the `max_document_size` setting is a misnomer: it actually
    configures the maximum request body size. For single-document requests this
    is a good enough approximation, but `_bulk_docs` updates could fail the
    total request size check even if every individual document stays below the
    limit.
    
    Before this fix, a `_bulk_docs` request made during replication would
    crash, eventually leading to an infinite cycle of crashes and restarts
    (with potentially large state being dumped to the logs), without the
    replication job making progress.
    
    The fix is to do a binary split on the batch until either all documents
    fit under the max_document_size limit, or some documents fail to replicate.
    
    If documents fail to replicate, they bump the `doc_write_failures` count.
    Effectively, `max_document_size` acts as an implicit replication filter in
    this case.
    
    Jira: COUCHDB-3168

----


> Replicator doesn't handle well writing documents to a target db which has a 
> small max_document_size
> ---------------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-3168
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-3168
>             Project: CouchDB
>          Issue Type: Bug
>            Reporter: Nick Vatamaniuc
>
> If a target db has set a smaller document max size, replication crashes.
> It might make sense for the replication to not crash and instead treat 
> document size as an implicit replication filter then display doc write 
> failures in the stats / task info / completion record of normal replications.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
