Hi,

We have a three-node CouchDB 2.x cluster (one node on 2.1.1, two on 2.1.2) that returns an inconsistent number of rows when we hit the _all_docs endpoint on any database that holds documents in addition to its two design documents (_design/logDocs and _design/permissions).
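
For reference, a minimal sketch of the kind of request that shows the problem (the host, database name, and credentials below are placeholders, not our real values):

# Minimal sketch of the _all_docs request that comes back inconsistent for us;
# host, database name, and credentials are placeholders.
import requests

resp = requests.get(
    "http://localhost:5984/some_db/_all_docs",
    auth=("admin", "password"),
)
body = resp.json()
# On our cluster, the row count here varies from call to call.
print(resp.status_code, body.get("total_rows"), len(body.get("rows", [])))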
Following are some error messages from the log file:

[error] 2018-08-16T18:16:15.150627Z [email protected] <0.16547.1789> -------- Error opening view group `permissions` from database `shards/55555554-71c71c6f/shadow_4495.1514409098`: {'EXIT',{{{badmatch,{error,emfile}},[{couch_file,init,1,[{file,"src/couch_file.erl"},{line,398}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]},{gen_server,call,[couch_index_server,{get_index,{couch_mrview_index,{mrst,<<62,130,60,42,67,131,172,12,24,212,229,116,19,90,91,8>>,nil,undefined,<<"shards/55555554-71c71c6f/shadow_4495.1514409098">>,<<"_design/permissions">>,<<"javascript">>,[],false,false,{[]},[],nil,nil,0,0,undefined,undefined,undefined,undefined,undefined,nil},<<"shards/55555554-71c71c6f/shadow_4495.1514409098">>,<<62,130,60,42,67,131,172,12,24,212,229,116,19,90,91,8>>}},infinity]}}}

[error] 2018-08-16T18:15:00.068553Z [email protected] <0.26642.4837> -------- CRASH REPORT Process couch_index_server (<0.26642.4837>) with 1 neighbors exited with reason: no match of right hand value {error,emfile} at couch_file:init/1(line:398) <= gen_server:init_it/6(line:304) <= proc_lib:init_p_do_apply/3(line:239) at gen_server:terminate/6(line:744) <= proc_lib:init_p_do_apply/3(line:239); initial_call: {couch_index_server,init,['Argument__1']}, ancestors: [couch_secondary_services,couch_sup,<0.204.0>], messages: [], links: [<0.4278.4837>,<0.19436.4831>], dictionary: [], trap_exit: true, status: running, heap_size: 6772, stack_size: 27, reductions: 1050564

Please note that we recently replaced two nodes in the cluster. The nodes that were replaced were named like couchdb@<ip-address>, whereas the nodes that replaced them are named like couchdb@<fqdn>. This is the high-level process we followed for each replacement:

- Stop the CouchDB process on the outgoing node
- Add the incoming node to the cluster
- Update the cluster metadata for all databases, replacing the outgoing node with the incoming one in both the by_node and by_range dictionaries (roughly as in the sketch below)
- Wait until all shards have arrived at the new node
- Delete the old node from the cluster
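
The per-database metadata update looked roughly like this sketch, run against the node-local port of one cluster node (host, credentials, and node names are placeholders, not our real values):

# Rough sketch of how we rewrote each database's shard map; host, credentials,
# and node names below are placeholders.
import requests

NODE_LOCAL = "http://localhost:5986"   # node-local ("backdoor") port on one of the nodes
AUTH = ("admin", "password")           # placeholder credentials
OLD_NODE = "couchdb@10.0.0.1"          # outgoing node, named by IP address
NEW_NODE = "couchdb@db1.example.com"   # incoming node, named by FQDN

def move_db_to_new_node(dbname):
    # Each database's shard map is a document in the node-local _dbs database.
    url = f"{NODE_LOCAL}/_dbs/{dbname}"
    doc = requests.get(url, auth=AUTH).json()

    # by_node maps a node name to the list of shard ranges it hosts.
    if OLD_NODE in doc.get("by_node", {}):
        doc["by_node"][NEW_NODE] = doc["by_node"].pop(OLD_NODE)

    # by_range maps a shard range to the list of nodes holding a copy of it.
    for rng, nodes in doc.get("by_range", {}).items():
        doc["by_range"][rng] = [NEW_NODE if n == OLD_NODE else n for n in nodes]

    # Write the updated shard map back (the fetched doc already carries _rev).
    requests.put(url, json=doc, auth=AUTH).raise_for_status()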

Please help.

Thanks,
Arif