COUCHDB-901 COUCHDB-1429 and stray/zombie couchjs processes
Sorry for cross-posting, but I'm wondering if CouchDB devs have any opinion on this issue?

Cheers,
Cliff.

---------- Forwarded message ----------
From: Cliffano Subagio
Date: Tue, May 1, 2012 at 11:36 AM
Subject: COUCHDB-901 COUCHDB-1429 and stray/zombie couchjs processes
To: u...@couchdb.apache.org

Hi,

Searching through the mailing list, there are some threads [1] [2] related to stray/zombie couchjs processes with no obvious resolution yet.

Similar to those threads, I'm also getting stray/zombie couchjs processes when the server (couchdb 1.1.1) is under heavy load. couchjs starts timing out until the total (ps ax | grep couchjs | wc -l) is greater than os_process_limit, and couchdb then consistently logs timeout errors. At that point, the client code can no longer read/write any doc to couchdb until couchdb is restarted or the stray couchjs processes are killed.

Adam mentioned on the first thread [1] that there's a branch [3] which might fix the issue. The associated JIRA issue COUCHDB-901 [4] has since moved the target release from 1.1 to 1.2 to 1.3.

Is COUCHDB-901 strictly related to BigCouch or also applicable to CouchDB?
Could 'losing track of OS processes' possibly contribute to stray couchjs processes?
Is there any chance of including this fix in 1.3?

There is also COUCHDB-1429 [5] which describes a similar situation with zombie couchjs processes. I checked with Nate, who raised COUCHDB-1429, to confirm that this is not related to a reduce function, so we can rule out COUCHDB-1246 [6] as the culprit.

Any couchdb dev willing to have a look at COUCHDB-1429?

Thanks in advance.

[1] http://mail-archives.apache.org/mod_mbox/couchdb-dev/201104.mbox/%3cbanlktikto5r0d2nh-8p81l3piysexwo...@mail.gmail.com%3E
[2] http://mail-archives.apache.org/mod_mbox/couchdb-dev/201203.mbox/%3c62af289b-b26a-4abd-aba4-8b2cee56a...@calftrail.com%3E
[3] https://github.com/kocolosk/couchdb/tree/COUCHDB-901
[4] https://issues.apache.org/jira/browse/COUCHDB-901
[5] https://issues.apache.org/jira/browse/COUCHDB-1429
[6] https://issues.apache.org/jira/browse/COUCHDB-1246

Cheers,
Cliff.
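The check described in the message (count the couchjs processes and compare the total with os_process_limit) could be sketched roughly as below. This is only an illustration: the function names, the sample `ps ax` output and the paths in it are made up, and the limit is passed in rather than read from the CouchDB config. Note that a plain `ps ax | grep couchjs | wc -l` can also count the grep process itself, so the sketch skips lines containing "grep".

```javascript
// Count couchjs processes in `ps ax`-style text, ignoring the grep command
// that a `ps ax | grep couchjs` pipeline can add to its own listing.
function countCouchjs(psOutput) {
  return psOutput.split('\n').filter(function (line) {
    return line.indexOf('couchjs') !== -1 && line.indexOf('grep') === -1;
  }).length;
}

// True when more couchjs processes exist than the configured limit allows.
// osProcessLimit is supplied by the caller (assumption: read it from your
// own configuration; nothing here talks to CouchDB).
function overLimit(psOutput, osProcessLimit) {
  return countCouchjs(psOutput) > osProcessLimit;
}

// Made-up sample output, for illustration only.
var sample = [
  '  101 ?      Sl   0:05 /usr/bin/couchjs /usr/share/couchdb/server/main.js',
  '  102 ?      Sl   0:04 /usr/bin/couchjs /usr/share/couchdb/server/main.js',
  '  103 pts/0  S+   0:00 grep couchjs',
  '  104 ?      Ss   0:10 beam.smp'
].join('\n');

console.log(countCouchjs(sample), overLimit(sample, 1));
```

Running a check like this periodically would at least show how close the server is to the limit before the timeout errors start.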
[jira] [Commented] (COUCHDB-1491) view cleanup can kill the viewserver handling process
    [ https://issues.apache.org/jira/browse/COUCHDB-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289322#comment-13289322 ]

Robert Newson commented on COUCHDB-1491:
----------------------------------------

Reproduced on master (and ronny reports it for 1.2.0); the issue is pretty clear from the log.

Deleting the ddoc causes the view group process to exit, but get_index/1 reads a public ets table directly, which still contains the old pid. It seems we try to avoid the overhead of the gen_server call if we get back a valid pid;

    case ets:lookup(?BY_SIG, {DbName, Sig}) of
        [{_, Pid}] when is_pid(Pid) ->
            {ok, Pid};
        _ ->
            Args = {Module, IdxState, DbName, Sig},
            gen_server:call(?MODULE, {get_index, Args}, infinity)
    end

but the pid in this case is probably exiting after the is_pid call but before the subsequent link call.

Relevant log output;

[info] [<0.133.0>] 127.0.0.1 - - DELETE /test_viewserver_fail/_design/test?rev=1-490d7535c94dda36a8ff75a78faecd6d 200
[info] [<0.179.0>] Closing index for db: test_viewserver_fail idx: _design/test sig: "3c74ba64baa80fe929c51adb42c511ad" reason: normal
[info] [<0.133.0>] 127.0.0.1 - - POST /test_viewserver_fail/_view_cleanup 202
[info] [<0.134.0>] 127.0.0.1 - - PUT /test_viewserver_fail/_design/test 201
[error] [<0.134.0>] Uncaught error in HTTP request: {exit,
                                 {noproc,
                                  {gen_server,call,
                                   [<0.179.0>,
                                    {get_state,23},
                                    infinity]}}}
[info] [<0.134.0>] Stacktrace: [{gen_server,call,3},
                     {couch_mrview_util,get_view,4},
                     {couch_mrview,query_view,6},
                     {couch_httpd,etag_maybe,2},
                     {couch_mrview_http,design_doc_view,5},
                     {couch_httpd_db,do_db_req,2},
                     {couch_httpd,handle_request_int,5},
                     {mochiweb_http,headers,5}]

> view cleanup can kill the viewserver handling process
> -----------------------------------------------------
>
>                 Key: COUCHDB-1491
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1491
>             Project: CouchDB
>          Issue Type: Bug
>          Components: View Server Support
>            Reporter: Ronny Pfannschmidt
>         Attachments: testcase.py
>
>
> basic steps to create the issue on an empty db
> 1. save a few docs
> 2. create a ddoc, trigger view update
> 3. delete the ddoc
> 4. view cleanup
> 6. push ddoc again
> 7. view update ->
> {"error":"noproc","reason":"{gen_server,call,[<0.485.0>,{get_state,23},infinity]}"}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
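For readers who want to retrace the steps in the issue, the request sequence could be laid out roughly as follows. This is a sketch, not the attached testcase.py (which is not shown here). The database and design document names are taken from the log output above; the view name `by_id`, the function name and the `<rev>` placeholder are assumptions. The function only builds the list of requests and sends nothing.

```javascript
// Build the HTTP requests matching the issue's numbered steps.
// `rev` must be the current revision of the design document at delete time.
function reproPlan(db, ddoc, view, rev) {
  var ddocPath = '/' + db + '/_design/' + ddoc;
  var viewPath = ddocPath + '/_view/' + view;
  return [
    { step: 1, method: 'POST', path: '/' + db },                  // save a doc (repeat a few times)
    { step: 2, method: 'PUT', path: ddocPath },                   // create the ddoc
    { step: 2, method: 'GET', path: viewPath },                   // trigger view update
    { step: 3, method: 'DELETE', path: ddocPath + '?rev=' + rev },// delete the ddoc
    { step: 4, method: 'POST', path: '/' + db + '/_view_cleanup' },
    { step: 6, method: 'PUT', path: ddocPath },                   // push ddoc again
    { step: 7, method: 'GET', path: viewPath }                    // view update, noproc expected here
  ];
}

// 'by_id' and '<rev>' are placeholders, not values from the report.
var plan = reproPlan('test_viewserver_fail', 'test', 'by_id', '<rev>');
plan.forEach(function (r) {
  console.log(r.step + '. ' + r.method + ' ' + r.path);
});
```

The step numbers follow the issue's own numbering (which has no step 5), so the printed plan lines up with the report.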
[jira] [Updated] (COUCHDB-1490) Problems with views on large documents JSONs
     [ https://issues.apache.org/jira/browse/COUCHDB-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco updated COUCHDB-1490:
-------------------------------

    Environment: Mac Os x 10.6.8, intel Architecture (x86_64), 8Gb of Ram, Erlang R15B01 (erts-5.9.1)  (was: Mac Os x 10.6.8, intel Architecture (x86), 8Gb of Ram)

> Problems with views on large documents JSONs
> --------------------------------------------
>
>                 Key: COUCHDB-1490
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1490
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 1.2
>         Environment: Mac Os x 10.6.8, intel Architecture (x86_64), 8Gb of Ram, Erlang R15B01 (erts-5.9.1)
>            Reporter: Francesco
>
> Hi,
> I run a couchdb server (v1.2.0) on a mac (intel architecture, 8gb of ram, os x version 10.6.8) installed with brew.
> The server itself is used as storage for big jsons (example:
> https://raw.github.com/cvdlab-bio/webpdb/develop/docs/jsons/2LGB-pretty-print.json
> and
> https://raw.github.com/cvdlab-bio/webpdb/develop/docs/jsons/2CRK-pretty-print.json
> ) for a tiny uni project.
> When we load more than 3 of these jsons, all the map functions (which we created to retrieve documents besides a simple get by id) stop working.
> A typical map is:
> function(doc){if(doc.TITLE.title.match('.*INSULIN.*') !== null) emit(doc.ID, doc);}
> but even a
> function(doc){emit(doc.ID, doc.ID)}
> ceases to work,
> while when there are just 3 or 2 jsons in the database they work just fine. I tried increasing the stack for couchjs (1gb now, going over 1gb doesn't work it seems), increasing limits for files (4096), and increasing the timeout for processes, but in the end I don't get any results, only an (Error: os_process_error {exit_status,0}) from the db.
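As a side note on the map function quoted in the report: it emits the whole document as the view value, which copies each large JSON into the view index, and it would throw on any document without `TITLE.title`. A more defensive variant might look like the sketch below. This is not a confirmed fix for the os_process_error in the report (the simpler map in the report fails too). The `emit` stub and the sample documents are made up so the snippet runs outside CouchDB. With `null` as the value, the documents can be fetched at query time with `include_docs=true` instead.

```javascript
// Local stand-in for the view server's emit(), so the sketch runs under Node.
var emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Same filter as the map in the report, but guarded against missing fields
// and emitting null instead of the (large) document itself.
function map(doc) {
  var title = doc && doc.TITLE && doc.TITLE.title;
  if (typeof title === 'string' && title.indexOf('INSULIN') !== -1) {
    emit(doc.ID, null);
  }
}

// Made-up sample documents, for illustration only.
map({ ID: 'doc-a', TITLE: { title: 'SAMPLE INSULIN ENTRY' } });
map({ ID: 'doc-b', TITLE: { title: 'SOMETHING ELSE' } });
map({ ID: 'doc-c' }); // no TITLE: skipped instead of throwing

console.log(JSON.stringify(emitted));
```

The substring test matches the same documents as `match('.*INSULIN.*')`, since that unanchored pattern succeeds whenever "INSULIN" appears anywhere in the title.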