Raviraj,

Please run 'riak-debug'.  It is in the bin directory along with 'riak' and 
'riak-admin'.

riak-debug will produce a file with a name similar to 
/home/user/r...@10.0.0.15-riak-debug.tar.gz 

Please email that file to me directly, or post it to Dropbox or a similar 
service and send me a link.  Do not send that file to the entire mailing list.

I will review the file and suggest next steps.

Matthew

> On Feb 22, 2016, at 5:13 AM, Raviraj Vaishampayan <rvaishampa...@vmware.com> 
> wrote:
> 
> Hi,
> 
> We have been using riak to gather our test data and analyze results after 
> the tests complete.
> Recently we have observed riak crashes in the riak console logs.
> As a result, our tests fail to record data to riak and bail out :-(
> 
> The crash logs are as follows:
> 2016-02-19 16:25:26.255 [error] <0.2160.0> gen_fsm <0.2160.0> in state active 
> terminated with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.260 [error] <0.2160.0> CRASH REPORT Process <0.2160.0> 
> with 2 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.260 [error] <0.172.0> Supervisor riak_core_vnode_sup had 
> child undefined started with {riak_core_vnode,start_link,undefined} at 
> <0.2160.0> exit with reason no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in context child_terminated
> 2016-02-19 16:25:26.261 [error] <0.4319.0> gen_fsm <0.4319.0> in state ready 
> terminated with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.275 [error] <0.4319.0> CRASH REPORT Process <0.4319.0> 
> with 10 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.278 [error] <0.4320.0> Supervisor 
> {<0.4320.0>,poolboy_sup} had child riak_core_vnode_worker started with 
> riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[268322566228720457638957762256505085639956365312,...]},...])
>  at undefined exit with reason no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in context shutdown_error
> 2016-02-19 16:25:26.278 [error] <0.4320.0> gen_server <0.4320.0> terminated 
> with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195
> 2016-02-19 16:25:26.278 [error] <0.4320.0> CRASH REPORT Process <0.4320.0> 
> with 0 neighbours exited with reason: no function clause matching 
> riak_kv_vnode:handle_info({#Ref<0.0.482.161540>,{ok,<0.11042.842>}}, 
> {state,268322566228720457638957762256505085639956365312,riak_kv_eleveldb_backend,true,{state,<<>>,...},...})
>  line 1195 in gen_server:terminate/6 line 744
> 2016-02-19 16:25:26.806 [error] <0.2157.0> gen_fsm <0.2157.0> in state active 
> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.808 [error] <0.2157.0> CRASH REPORT Process <0.2157.0> 
> with 2 neighbours exited with reason: 
> {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 600
> 2016-02-19 16:25:26.809 [error] <0.5450.0> gen_fsm <0.5450.0> in state ready 
> terminated with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.809 [error] <0.172.0> Supervisor riak_core_vnode_sup had 
> child undefined started with {riak_core_vnode,start_link,undefined} at 
> <0.2157.0> exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} in 
> context child_terminated
> 2016-02-19 16:25:26.809 [error] <0.5450.0> CRASH REPORT Process <0.5450.0> 
> with 10 neighbours exited with reason: 
> {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_fsm:terminate/7 line 622
> 2016-02-19 16:25:26.809 [error] <0.5451.0> Supervisor 
> {<0.5451.0>,poolboy_sup} had child riak_core_vnode_worker started with 
> riak_core_vnode_worker:start_link([{worker_module,riak_core_vnode_worker},{worker_args,[211232658520482062396626323478525280184646500352,...]},...])
>  at undefined exit with reason {timeout,{gen_server,call,[<0.5141.0>,stop]}} 
> in context shutdown_error
> 2016-02-19 16:25:26.809 [error] <0.5451.0> gen_server <0.5451.0> terminated 
> with reason: {timeout,{gen_server,call,[<0.5141.0>,stop]}}
> 2016-02-19 16:25:26.809 [error] <0.5451.0> CRASH REPORT Process <0.5451.0> 
> with 0 neighbours exited with reason: 
> {timeout,{gen_server,call,[<0.5141.0>,stop]}} in gen_server:terminate/6 line 
> 744
> 
> Our setup is as follows:
> We have a riak cluster with 10 nodes; the configuration of each node is:
> RAM: 48GB
> Disk:
>          80GB (/)
>          504GB (separate riak partition)
> Riak Version: 2.1.3-1 (2.1.3)
> Data in riak: After observing crash, total data in riak partition was ~50GB
> 
> Riak config is as follows:
> riak.conf
> [Attached with this email]
> 
> advanced.config:
> [
>  {riak_kv, [{add_paths, ["/usr/local/lib/scale_riak/ebin"]}]},
>  {webmachine, [{backlog, 511}, {nodelay, true}]},
>  {yokozuna, [{solr_request_timeout, 120000}]}
> ].
> 
> We have observed this a few times now; after each crash, latency increases 
> and our application starts timing out.
> We would really like to understand what might be causing this crash. If it 
> is due to missing config on our nodes, we would like to fix it.
> 
> Thanks for your help in advance :-)
> 
> Regards,
> Raviraj
> <riak.conf>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
