Hi,

There's a bug in 3.1.0 that affects you: the default 5-second gen_server timeout is used for some requests if ioq bypass is enabled. Please check whether your config has an [ioq.bypass] section and, if it does, try again without the bypasses for a while.
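If it helps, here is a rough sketch of one way to check an ini file for that section. It is only an illustration: the helper name find_ioq_bypass and the sample config contents are made up for the example, and your settings may be spread across default.ini, local.ini and the local.d/ directory, so you would need to run it against each file.

```python
import configparser


def find_ioq_bypass(ini_text):
    """Return the options under [ioq.bypass] in ini_text, or {} if absent.

    Hypothetical helper for illustration; not part of CouchDB.
    """
    # interpolation=None so '%' characters in values are left alone;
    # strict=False tolerates a section appearing more than once.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if parser.has_section("ioq.bypass"):
        return dict(parser.items("ioq.bypass"))
    return {}


# Sample contents (assumed values, not taken from the reporter's cluster).
sample = """
[couchdb]
uuid = example

[ioq.bypass]
os_process = true
read = true
"""

print(find_ioq_bypass(sample))
```

An empty result for every config file would suggest that no bypasses are set in the files themselves.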
If you could explain your migration process in more detail, perhaps we can find other explanations. Note that such migrations are better done online using replication; moving the files around is a bit more challenging.

B.

> On 7 Jul 2022, at 08:28, Luca Morandini <luca.morandi...@gmail.com> wrote:
>
> Dear All,
>
> I moved some CouchDB 3.1.0 databases to a new 4-node cluster via
> copying the shard files.
>
> The operation worked for 5 out of 6 databases; the biggest database
> (about 200GB, 12 shards, 2 replicas) did not come online on the new
> cluster.
>
> I suspect high disk latency, but... could someone shed some light on this?
>
> The relevant logs are:
>
> [info] 2022-07-06T04:30:44.697901Z couchdb@10.0.0.80
> <0.228.0> -------- db
> shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
> {timeout,{gen_server,call,[<0.26790.5>,find_header]}}
> [error] 2022-07-06T04:30:44.698269Z couchdb@10.0.0.80
> <0.26789.5> -------- CRASH REPORT Process
> (<0.26789.5>) with 2 neighbors exited with reason:
> {timeout,{gen_server,call,[<0.26790.5>,find_header]}} at
> gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
> <= couch_bt_engine:init/2(line:157) <=
> couch_db_engine:init/3(line:775) <=
> couch_db_updater:init/1(line:43) <=
> proc_lib:init_p_do_apply/3(line:247); initial_call:
> {couch_db_updater,init,['Argument__1']}, ancestors:
> [<0.26784.5>], message_queue_len: 0, messages: [], links:
> [<0.26784.5>,<0.26790.5>], dictionary:
> [{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
> trap_exit: false, status: running, heap_size: 610, stack_size: 27,
> reductions: 250
> [error] 2022-07-06T04:56:10.077664Z couchdb@10.0.0.80
> <0.6591.6> -------- CRASH REPORT Process
> (<0.6591.6>) with 2 neighbors exited with reason:
> {timeout,{gen_server,call,[<0.6593.6>,find_header]}} at
> gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
> <= couch_bt_engine:init/2(line:157) <=
> couch_db_engine:init/3(line:775) <=
> couch_db_updater:init/1(line:43) <=
> proc_lib:init_p_do_apply/3(line:247); initial_call:
> {couch_db_updater,init,['Argument__1']}, ancestors:
> [<0.6584.6>], message_queue_len: 0, messages: [], links:
> [<0.6584.6>,<0.6593.6>], dictionary:
> [{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
> trap_exit: false, status: running, heap_size: 610, stack_size: 27,
> reductions: 250
> [info] 2022-07-06T04:56:10.077711Z couchdb@10.0.0.80
> <0.228.0> -------- db
> shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
> {timeout,{gen_server,call,[<0.6593.6>,find_header]}}
> [info] 2022-07-07T06:44:13.863950Z couchdb@10.0.0.80
> <0.228.0> -------- db
> shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
> {timeout,{gen_server,call,[<0.9139.29>,find_header]}}
> [error] 2022-07-07T06:44:13.864516Z couchdb@10.0.0.80
> <0.9152.29> -------- CRASH REPORT Process
> (<0.9152.29>) with 2 neighbors exited with reason:
> {timeout,{gen_server,call,[<0.9139.29>,find_header]}} at
> gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
> <= couch_bt_engine:init/2(line:157) <=
> couch_db_engine:init/3(line:775) <=
> couch_db_updater:init/1(line:43) <=
> proc_lib:init_p_do_apply/3(line:247); initial_call:
> {couch_db_updater,init,['Argument__1']}, ancestors:
> [<0.9136.29>], message_queue_len: 0, messages: [], links:
> [<0.9136.29>,<0.9139.29>], dictionary:
> [{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
> trap_exit: false, status: running, heap_size: 610, stack_size: 27,
> reductions: 250
>
> Cheers,
>
> Luca Morandini