[ https://issues.apache.org/jira/browse/COUCHDB-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069301#comment-13069301 ]
Filipe Manana commented on COUCHDB-1226:
----------------------------------------

Thanks for testing and reporting James

> Replication causes CouchDB to crash. I *suspect* a memory leak of some kind
> ----------------------------------------------------------------------------
>
> Key: COUCHDB-1226
> URL: https://issues.apache.org/jira/browse/COUCHDB-1226
> Project: CouchDB
> Issue Type: Bug
> Components: Replication
> Affects Versions: 1.1
> Environment: Gentoo Linux, CouchDB built using standard ebuild.
> Rebuilt July 2011.
> Reporter: James Marca
> Attachments: topcouch.log
>
>
> When replicating databases (pull replication), CouchDB will silently crash.
> I suspect a memory leak is leading to the crash, because I watch the beam
> process slowly creep up in RAM usage, then the server dies.
> For the crashing server, the log on "debug" doesn't seem very helpful. It
> says (with manually scrubbed server address):
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10054.0>] didn't find a
> replication log for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 16:23:20 GMT] [info] [<0.10032.0>] starting new replication
> "431a3f5bae52a6b27da72e42dc7b9fe3+create_target" at <0.10054.0>
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 1
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #1
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 2
> [Mon, 18 Jul 2011 16:23:20 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #2
> [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 10
> [Mon, 18 Jul 2011 16:23:23 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #10
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #14
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 14
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.10070.0>] missing_revs updating
> committed seq to 20
> [Mon, 18 Jul 2011 16:23:24 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #20
> [Mon, 18 Jul 2011 16:23:25 GMT] [debug] [<0.10054.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 16:23:36 GMT] [info] [<0.10054.0>] recording a checkpoint
> for http://***.edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 20
> Then, when I restart CouchDB, and restart the node.js program that is setting
> up the replication jobs, the crashed replication job picks up where it left
> off and completes just fine. Again, I scrubbed my server addresses in this
> log snippet.:
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3562.0>] 'POST' /_replicate {1,1}
> from "128.*.*.*"
> Headers: [{'Authorization',"Basic amFtZXM6bWdpY24wbWIzcg=="},
> {'Connection',"close"},
> {'Content-Type',"application/json"},
> {'Host',"***[pullserver]***.edu"},
> {'Transfer-Encoding',"chunked"}]
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3562.0>] OAuth Params: []
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3580.0>] found a replication log
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/
> [Mon, 18 Jul 2011 17:22:53 GMT] [debug] [<0.3580.0>] found a replication log
> for vdsdata/d12/2007/1210882
> [Mon, 18 Jul 2011 17:22:53 GMT] [info] [<0.3562.0>] starting new replication
> "431a3f5bae52a6b27da72e42dc7b9fe3+create_target" at <0.3580.0>
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 22
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #22
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 37
> [Mon, 18 Jul 2011 17:22:56 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #37
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 39
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #39
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 47
> [Mon, 18 Jul 2011 17:22:58 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #47
> [Mon, 18 Jul 2011 17:23:00 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 57
> [Mon, 18 Jul 2011 17:23:00 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #57
> [Mon, 18 Jul 2011 17:23:01 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:23:09 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 57
> [Mon, 18 Jul 2011 17:23:19 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #62
> [Mon, 18 Jul 2011 17:23:19 GMT] [debug] [<0.3595.0>] missing_revs updating
> committed seq to 62
> [Mon, 18 Jul 2011 17:23:22 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #78
> [Mon, 18 Jul 2011 17:23:24 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:23:29 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 78
> [Mon, 18 Jul 2011 17:23:57 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #255
> [Mon, 18 Jul 2011 17:24:02 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:24:02 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 255
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: W Processed source update #347
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.3580.0>] target doesn't need a
> full commit
> [Mon, 18 Jul 2011 17:24:09 GMT] [info] [<0.3580.0>] recording a checkpoint
> for http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882 at source update_seq 347
> [Mon, 18 Jul 2011 17:24:09 GMT] [debug] [<0.83.0>] New task status for
> 431a3f: http://[sourceserver].edu:5984/vdsdata%2fd12%2f2007%2f1210882/ ->
> vdsdata/d12/2007/1210882: Finishing
> [Mon, 18 Jul 2011 17:24:09 GMT] [info] [<0.3562.0>] 128.*.*.* - - 'POST'
> /_replicate 200
> Letting that replication program run, and watching top, CouchDB's total share
> of RAM crept up to 70%, then it crashed.
> Again, the log on the crashing server isn't helpful (more or less the same as
> above)
> The replication program gets through about 8 to 12 databases before it
> crashes.
> Each database (when replicated to the target server) takes up on average
> around 700MB, un-compacted.
> The databases are all similar (annual data for detectors), with one doc per
> day's data. Each document is around 700K.
> If there is any more information (or more helpful information) I can provide,
> please let me know.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira