On Fri, Feb 27, 2009 at 8:57 AM, Adam Kocoloski <[email protected]> wrote:
> Hi Jeff, I can pick this one up, but not before Monday. We do have some
> replicating-attachment JIRA tickets open and active, but it looks like
> there's some new stuff in this report too. Feel free to file another one.
> Best,
>
> Adam

I'll review the current JIRA tickets first so that I don't file a duplicate, and I'll also work on building a reproducible test case for you. I hope a Python script is OK with you.

Regards,
Jeff

>
> Sent from my iPhone
>
> On Feb 27, 2009, at 9:13 AM, "Jeff Hinrichs - DM&T" <[email protected]>
> wrote:
>
>> Attempting to replicate a database with largish attachments (<= ~18MB
>> of attachments in a doc, fewer than 200 docs) from one machine to
>> another fails consistently and at the same point.
>>
>> Scenario:
>> Both servers are running from HEAD, which I've been tracking for some
>> time. This problem has been around as long as I've been using couch.
>>
>> Machine A holds the original database; Machine B is the server that is
>> doing a PULL replication.
>>
>> During the replication, Machine A starts showing the following
>> sporadically in the log:
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5902.3>] 'GET'
>> /delasco-invoices/INV00652429?revs=true&attachments=true&latest=true&open_revs=["425644723"]
>> {1,1}
>> Headers: [{'Host',"192.168.2.52:5984"}]
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [error] [<0.5901.3>] Uncaught error in
>> HTTP request: {exit,normal}
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] Stacktrace:
>> [{mochiweb_request,send,2},
>>  {couch_httpd,send_chunk,2},
>>  {couch_httpd_db,db_doc_req,3},
>>  {couch_httpd_db,do_db_req,2},
>>  {couch_httpd,handle_request,3},
>>  {mochiweb_http,headers,5},
>>  {proc_lib,init_p,5}]
>>
>> [Fri, 27 Feb 2009 14:02:48 GMT] [debug] [<0.5901.3>] HTTPd 500 error
>> response:
>> {"error":"error","reason":"normal"}
>>
>> As the replication continues, the frequency of these "Uncaught
>> error in HTTP request: {exit,normal}" errors increases until the error
>> is being repeated constantly.
>> Then Machine B stops sending requests: no more log output, no errors.
>> The last thing in Machine B's log file is:
>> [Fri, 27 Feb 2009 14:03:24 GMT] [info] [<0.20893.1>] retrying
>> couch_rep HTTP get request due to {error, req_timedout}:
>> [104,116,116,112,58,47,47,49,57,50,46,49,54,56,46,50,46,53,50,58,
>>  53,57,56,52,47,100,101,108,97,115,99,111,45,105,110,118,111,105,
>>  99,101,115,47,73,78,86,48,48,54,53,50,49,51,56,63,114,101,118,115,
>>  61,116,114,117,101,38,97,116,116,97,99,104,109,101,110,116,115,61,
>>  116,114,117,101,38,108,97,116,101,115,116,61,116,114,117,101,38,
>>  111,112,101,110,95,114,101,118,115,61,91,34,<<"3070455362">>,34,93]
>>
>> A request for status from the couchdb init.d script returns nothing,
>> and checking the processes returns:
>> (demo-couchdb)j...@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep cou
>> 29281 pts/2 S+ 0:00 grep cou
>> (demo-couchdb)j...@mars:~/projects/venvs/demo-couchdb/src$ ps ax|grep beam
>> 29305 pts/2 R+ 0:00 grep beam
>>
>> Couch has gone away completely on Machine B. In fact, its death is
>> so quick it can't even say why.
>>
>> Attempts to incrementally replicate after the first failure die at
>> exactly the same place.
>>
>> I can replicate this same database on the same machine from one
>> database to another without issue. I can dump and reload the database
>> with no problems.
>>
>> I have reported this earlier and no one seemed to have an answer. Is
>> there a specific issue in JIRA that addresses this problem? If not,
>> is what I have here enough to start one, and should I?
>>
>> Regards,
>>
>> Jeff Hinrichs
>
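For anyone reading the log excerpt, the integer list in Machine B's last log entry is an Erlang iolist (character codes mixed with a binary) rather than a printed string. A minimal Python sketch for turning it back into the URL that timed out is below; the helper name `decode_iolist` is made up for illustration, and it only handles the two element kinds that appear in this log (bare integers and `<<"...">>` binaries).

```python
import re

# The iolist exactly as it appears in Machine B's log (whitespace removed).
IOLIST = (
    '[104,116,116,112,58,47,47,49,57,50,46,49,54,56,46,50,46,53,50,58,'
    '53,57,56,52,47,100,101,108,97,115,99,111,45,105,110,118,111,105,'
    '99,101,115,47,73,78,86,48,48,54,53,50,49,51,56,63,114,101,118,115,'
    '61,116,114,117,101,38,97,116,116,97,99,104,109,101,110,116,115,61,'
    '116,114,117,101,38,108,97,116,101,115,116,61,116,114,117,101,38,'
    '111,112,101,110,95,114,101,118,115,61,91,34,<<"3070455362">>,34,93]'
)


def decode_iolist(text):
    """Decode a printed Erlang iolist of char codes and <<"binaries">>.

    Hypothetical helper: nested lists and escaped quotes are not handled.
    """
    parts = []
    # Each match is either a quoted binary (group 1) or an integer (group 2).
    for binary, code in re.findall(r'<<"([^"]*)">>|(\d+)', text):
        parts.append(binary if code == "" else chr(int(code)))
    return "".join(parts)


url = decode_iolist(IOLIST)
print(url)
# http://192.168.2.52:5984/delasco-invoices/INV00652138?revs=true&attachments=true&latest=true&open_revs=["3070455362"]
```

Decoded this way, the timed-out request is a fetch of document INV00652138 with `attachments=true`, which may be a useful starting point for a reproducible test case.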
