[
https://issues.apache.org/jira/browse/COUCHDB-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067263#comment-13067263
]
Filipe Manana commented on COUCHDB-1226:
----------------------------------------
Which Erlang/OTP version are you running?
Also, have you tried compacting the databases before replication to see if it
helps? (This might be related to COUCHDB-968.)
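For anyone who wants to try the compaction suggestion, here is a minimal sketch in Python. It assumes a server at http://localhost:5984 and uses the database name from the report below; the helper names are made up for illustration. Compaction is triggered with a POST to /{db}/_compact and normally needs admin credentials, which are left out here.

```python
import json
import urllib.parse
import urllib.request


def compact_request(server, db_name):
    """Build (but do not send) a POST /{db}/_compact request.

    The database name is fully URL-encoded, so the slashes in a name like
    vdsdata/d12/2007/1210882 become %2F.
    """
    encoded = urllib.parse.quote(db_name, safe="")
    url = "%s/%s/_compact" % (server.rstrip("/"), encoded)
    return urllib.request.Request(
        url,
        data=b"",  # empty body; CouchDB only requires the JSON content type
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def compact(server, db_name):
    """Send the compaction request and return the decoded JSON response."""
    with urllib.request.urlopen(compact_request(server, db_name)) as resp:
        return json.load(resp)


# Example call (assumed local server, no authentication configured):
# compact("http://localhost:5984", "vdsdata/d12/2007/1210882")
```

Compaction runs in the background, so the call returns before the work is done; it is probably best to wait for it to finish before starting replication.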
> Replication causes CouchDB to crash. I *suspect* a memory leak of some kind
> ----------------------------------------------------------------------------
>
> Key: COUCHDB-1226
> URL: https://issues.apache.org/jira/browse/COUCHDB-1226
> Project: CouchDB
> Issue Type: Bug
> Components: Replication
> Affects Versions: 1.1
> Environment: Gentoo Linux, CouchDB built using standard ebuild.
> Rebuilt July 2011.
> Reporter: James Marca
> Attachments: topcouch.log
>
>
> When replicating databases (pull replication), CouchDB will silently crash.
> I suspect a memory leak is leading to the crash, because I watch the beam
> process slowly creep up in RAM usage, then the server dies.
> For the crashing server, the log at "debug" level doesn't seem very helpful.
> It says (with the server address manually scrubbed):
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 16:23:20 GMT] [info] [<0.10032.0>] starting new replication
> "431a3f5bae52a6b27da72e42dc7b9fe3+create_target" at <0.10054.0>
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 1
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #1
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 2
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #2
> [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 10
> [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #10
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #14
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 14
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 20
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #20
> [Mon, 18 Jul 2011 16:23:25 GMT] [debug] [<0.10054.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 16:23:36 GMT] [info] [<0.10054.0>] recording a checkpoint
> for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 20
> Then, when I restart CouchDB and the node.js program that sets up the
> replication jobs, the crashed replication job picks up where it left off and
> completes just fine. Again, I scrubbed my server addresses in this log
> snippet:
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3562.0>] 'POST' /_replicate {1,1}
> from "128.*.*.*"
> Headers: [{'Authorization',"Basic amFtZXM6bWdpY24wbWIzcg=="},
> {'Connection',"close"},
> {'Content-Type',"application/json"},
> {'Host',"***[pullserver]***.edu"},
> {'Transfer-Encoding',"chunked"}]
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3562.0>] OAuth Params: []
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3580.0>] found a replication log
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3580.0>] found a replication log
> for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 17:22:53 GMT] [info] [<0.3562.0>] starting new replication
> "431a3f5bae52a6b27da72e42dc7b9fe3+create_target" at <0.3580.0>
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 22
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #22
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 37
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #37
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 39
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #39
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 47
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #47
> [Mon, 18 Jul 2011 17:23:00 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 57
> [Mon, 18 Jul 2011 17:23:00 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #57
> [Mon, 18 Jul 2011 17:23:01 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:23:09 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 57
> [Mon, 18 Jul 2011 17:23:19 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #62
> [Mon, 18 Jul 2011 17:23:19 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 62
> [Mon, 18 Jul 2011 17:23:22 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #78
> [Mon, 18 Jul 2011 17:23:24 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:23:29 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 78
> [Mon, 18 Jul 2011 17:23:57 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #255
> [Mon, 18 Jul 2011 17:24:02 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:24:02 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 255
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #347
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:24:09 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 347
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: Finishing
> [Mon, 18 Jul 2011 17:24:09 GMT] [info] [<0.3562.0>] 128.*.*.* - - 'POST'
> /_replicate 200
> I let that replication program run and watched top: CouchDB's share of total
> RAM crept up to 70%, then it crashed.
> Again, the log on the crashing server isn't helpful (more or less the same as
> above).
> The replication program gets through about 8 to 12 databases before it
> crashes.
> Each database (when replicated to the target server) takes up on average
> around 700MB, un-compacted.
> The databases are all similar (annual data for detectors), with one doc per
> day's data. Each document is around 700K.
> If there is any more information (or more helpful information) I can provide,
> please let me know.
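To make the scenario easier to reproduce, the pull replication described above can be sketched roughly as follows. This is Python rather than the reporter's node.js program, which is not shown in the report; the server URLs and helper names are assumptions. The request body mirrors what the log implies: a remote source URL, a local target name, and create_target (seen in the replication id suffix "+create_target").

```python
import json
import urllib.parse
import urllib.request


def replicate_request(target_server, source_server, db_name):
    """Build (but do not send) a POST /_replicate request for a pull replication.

    The request goes to the target server, which pulls from the remote source.
    """
    source_url = "%s/%s/" % (
        source_server.rstrip("/"),
        urllib.parse.quote(db_name, safe=""),  # slashes become %2F
    )
    body = {"source": source_url, "target": db_name, "create_target": True}
    return urllib.request.Request(
        target_server.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def replicate_all(target_server, source_server, db_names):
    """Replicate the databases one at a time, waiting for each to finish."""
    for name in db_names:
        req = replicate_request(target_server, source_server, name)
        with urllib.request.urlopen(req) as resp:
            result = json.load(resp)
        print(name, result.get("ok"))


# Example call (hypothetical hosts):
# replicate_all("http://localhost:5984", "http://source.example:5984",
#               ["vdsdata/d12/2007/1210882"])
```

Running the jobs strictly one after another, as sketched here, would help show whether memory grows across successive replications or only while several run at once.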
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira