[ https://issues.apache.org/jira/browse/COUCHDB-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277498#comment-14277498 ]

ASF GitHub Bot commented on COUCHDB-2425:
-----------------------------------------

GitHub user mikewallace1979 opened a pull request:

    https://github.com/apache/couchdb-couch/pull/31

    Add a configurable timeout for get_proc calls

    Previously the gen_server calls to couch_proc_manager/get_proc
    used a timeout of infinity. There are multiple places in the
    couch_proc_manager code path where that process can die without
    replying. With an infinity timeout the couch_query_server process
    would then hang around forever.
    
    This commit makes the gen_server call to get_proc use a configurable
    timeout, set by the couchdb/get_proc_timeout config variable. It
    defaults to 60000ms.
    
    Closes:
    
      COUCHDB-2425
      COUCHDB-2426
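
Reading couchdb/get_proc_timeout as a section/key pair, an override of the
default might look like the fragment below. This is a sketch only: the
placement in local.ini and the 30000 value are assumptions for illustration,
not something the pull request states.

```ini
; Assumed location: local.ini. Value is in milliseconds; the commit message
; gives 60000 as the default.
[couchdb]
get_proc_timeout = 30000
```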

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mikewallace1979/couchdb-couch 2425-fix-dangling-couch_query_server-procs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch/pull/31.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #31
    
----
commit c6f3aafd32fe9b3ddfa93634d0a27fb51017db93
Author: Mike Wallace <mikewall...@apache.org>
Date:   2015-01-14T18:39:44Z

    Add a configurable timeout for get_proc calls
    
    Previously the gen_server calls to couch_proc_manager/get_proc
    used a timeout of infinity. There are multiple places in the
    couch_proc_manager code path where that process can die without
    replying. With an infinity timeout the couch_query_server process
    would then hang around forever.
    
    This commit makes the gen_server call to get_proc use a configurable
    timeout, set by the couchdb/get_proc_timeout config variable. It
    defaults to 60000ms.
    
    Closes:
    
      COUCHDB-2425
      COUCHDB-2426

----


> Exceptions in couch_proc_manager:new_proc/1 leave dangling couch_query_servers
> ------------------------------------------------------------------------------
>
>                 Key: COUCHDB-2425
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2425
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>          Components: JavaScript View Server
>    Affects Versions: 2.0.0
>            Reporter: Mike Wallace
>             Fix For: 2.0.0
>
>
> If an exception is thrown in the try block of either
> couch_proc_manager:new_proc/1 clause [1] [2], a spawn_error is returned. It
> is handled by a specific handle_info clause [3] that does not reply to the
> calling process. When the caller is couch_query_servers:get_os_process/1 [4],
> which calls with an infinite timeout, that process hangs until either the
> node reboots or someone intervenes.
> The user-facing symptom is an entry in _active_tasks that makes no progress 
> and never goes away.
> The easiest way to reproduce this is to create a new view and then patch the
> code to force an exception in the appropriate place. For example, assuming a
> live dev/run instance:
>  1. Create DB, add a ddoc and a doc:
> {code}
> $ curl -X PUT http://localhost:15984/kitteh
> {"ok":true}
> $ curl -X POST http://localhost:15984/kitteh -d '{"_id":"_design/view", "views": {"test": {"map": "function(doc) { emit(doc.id, 1); }"}}}' -H 'Content-Type: application/json'
> {"ok":true,"id":"_design/view","rev":"1-ef10f980522d4c8e691e3c26d4c3fac5"}
> $ curl -X PUT http://localhost:15984/kitteh/ohai -d '{}'
> {"ok":true,"id":"ohai","rev":"1-967a00dff5e02add41819138abb3284d"}
> {code}
>  2. Apply https://gist.github.com/mikewallace1979/79f823da25f6a78e2725 then 
> re-run make and dev/run.
>  3. Attempt to query the view:
> {code}
> $ curl -X GET http://localhost:15984/kitteh/_design/view/_view/test
> {"error":"timeout","reason":"The request could not be processed in a reasonable amount of time."}
> {code}
>  4. Observe index build tasks which never go away:
> {code}
> $ curl -X GET http://localhost:15986/_active_tasks
> [{"pid":"<0.2403.0>","changes_done":0,"database":"shards/40000000-5fffffff/kitteh.1414779043","design_document":"_design/view","progress":0,"started_on":1414779439,"total_changes":1,"type":"indexer","updated_on":1414779439},{"pid":"<0.2444.0>","changes_done":0,"database":"shards/00000000-1fffffff/kitteh.1414779043","design_document":"_design/view","progress":0,"started_on":1414779439,"total_changes":1,"type":"indexer","updated_on":1414779439}]
> {code}
> [1] https://github.com/apache/couchdb-couch/blob/master/src/couch_query_servers.erl#L354
> [2] https://github.com/apache/couchdb-couch/blob/master/src/couch_proc_manager.erl#L368-L384
> [3] https://github.com/apache/couchdb-couch/blob/master/src/couch_proc_manager.erl#L239-L245
> [4] https://github.com/apache/couchdb-couch/blob/master/src/couch_query_servers.erl#L354
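
Step 4 of the reproduction can be checked from a script rather than by eye.
The sketch below counts the indexer tasks in an _active_tasks response that
report zero progress. The function name stalled_indexers is hypothetical, and
the grep-based matching assumes the compact, flat JSON layout shown in the
issue rather than parsing JSON properly. A single sample cannot tell a hung
task from one that has only just started, so it is worth sampling more than
once.

```shell
# Hypothetical helper: count indexer tasks that report zero progress.
# Reads an _active_tasks JSON response on stdin. Splitting on '}' puts each
# task object on its own line, which only works for flat task objects like
# the ones quoted in the issue.
stalled_indexers() {
    tr '}' '\n' \
        | grep '"type":"indexer"' \
        | grep -Ec '"progress":0(,|$)' || true   # grep -c exits 1 on a zero count
}
```

Piping the node-local endpoint from step 4 into it, as in
curl -s http://localhost:15986/_active_tasks | stalled_indexers, would print 2
for the response quoted in the issue.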



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
