Very strange. Are you seeing the
16:40:50.037 [notice] JS call failed: All VMs are busy. message in your console.log files on any of the nodes in your cluster? Jon On Mon, Apr 2, 2012 at 11:48 AM, Matthew Tovbin <[email protected]> wrote: > Jon, > > Thanks for the reply. I tried increasing the 'map_js_vm_count' and > 'reduce_js_vm_count' > on the cluster to the values you suggested 24/18 and even larger ones 32/24 > and 128/128, but with no success - I still get the same error in the > console log of the node. > > What are we missing?! > > -Matthew > > > > On Fri, Mar 30, 2012 at 16:19, Jon Meredith <[email protected]> wrote: > >> Hi list, >> >> I spend some time looking at this issue today. I suspect it is due to >> all the Javascript VMs being busy inside Riak due to many parallel >> MapReduce jobs. >> >> If you're seeing this issue please check for this message in the >> console.log file >> >> 16:40:50.037 [notice] JS call failed: All VMs are busy. >> >> I've filed https://github.com/basho/riak_pipe/issues/44 to track >> improvements to error reporting. >> >> If you are seeing the log message, the number of VMs allocated for map >> and reduce tasks can be configured in the app.config - defaults are 8 for >> map and 6 for reduce, increasing them will use more memory but should >> reduce the incidence of the error message. >> >> {riak_kv, [ >> ... >> {map_js_vm_count, 24 }, >> {reduce_js_vm_count, 18 }, >> ...] >> >> If you were affected by it and changing this does not resolve your issue, >> I'll keep digging. >> >> Cheers, Jon. >> >> On Thu, Mar 29, 2012 at 10:29 AM, Matthew Tovbin <[email protected]>wrote: >> >>> Guys, >>> >>> Any updates on the issue?! >>> >>> -Matthew >>> >>> >>> >>> On Tue, Mar 13, 2012 at 18:29, Matthew Tovbin <[email protected]>wrote: >>> >>>> Here is a log from one of the servers: >>>> >>>> ==> /mnt/dataraid/log/crash.log <== >>>> 2012-03-13 18:24:44 =CRASH REPORT==== >>>> crasher: >>>> initial call: riak_pipe_builder:init/1 >>>> pid: <0.20949.24> >>>> registered_name: [] >>>> exception exit: {sink_died,shutdown} >>>> in function gen_fsm:terminate/7 >>>> in call from proc_lib:init_p_do_apply/3 >>>> ancestors: [riak_pipe_builder_sup,riak_pipe_sup,<0.172.0>] >>>> messages: [{'DOWN',#Ref<0.0.5.38339>,process,<0.20950.24>,shutdown}] >>>> links: [<0.175.0>] >>>> dictionary: >>>> [{eunit,[{module,riak_pipe_builder},{ref,#Ref<0.0.5.38318>},{spec,[{fitting_spec,listkeys,riak_kv_pipe_listkeys,undefined,{chash,key_of},1,64}]},{options,[{sink,{fitting,<0.20947.24>,#Ref<0.0.5.38318>,{riak_kv_pipe_get,bkey_chash},{riak_kv_pipe_get,bkey_nval}}},{trace,{set,1,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[error],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}},{log,{sink,{fitting,<0.20060.24>,#Ref<0.0.5.38318>,sink,undefined}}}]},{fittings,[{{fitting,<0.20950.24>,#Ref<0.0.5.38318>,{chash,key_of},1},#Ref<0.0.5.38339>}]}]}] >>>> trap_exit: false >>>> status: running >>>> heap_size: 987 >>>> stack_size: 24 >>>> reductions: 251 >>>> neighbours: >>>> 2012-03-13 18:24:44 =SUPERVISOR REPORT==== >>>> Supervisor: {local,riak_pipe_builder_sup} >>>> Context: child_terminated >>>> Reason: {sink_died,shutdown} >>>> Offender: >>>> [{pid,<0.20949.24>},{name,undefined},{mfargs,{riak_pipe_builder,start_link,undefined}},{restart_type,temporary},{shutdown,brutal_kill},{child_type,worker}] >>>> >>>> >>>> ==> /mnt/dataraid/log/console.log <== >>>> 2012-03-13 18:24:44.645 [error] <0.20949.24> CRASH REPORT Process >>>> <0.20949.24> with 0 neighbours crashed with reason: {sink_died,shutdown} >>>> 2012-03-13 18:24:44.652 [error] <0.175.0> Supervisor >>>> riak_pipe_builder_sup had child undefined started with >>>> {riak_pipe_builder,start_link,undefined} at <0.20949.24> exit with reason >>>> {sink_died,shutdown} in context child_terminated >>>> 2012-03-13 18:24:56.420 [info] >>>> <0.71.0>@riak_core_sysmon_handler:handle_event:85 monitor long_gc >>>> <0.32603.24> >>>> [{initial_call,{mochiweb_acceptor,init,3}},{almost_current_function,{gen,do_call,4}},{message_queue_len,0}] >>>> [{timeout,120},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,40},{old_heap_size,0},{heap_size,1955}] >>>> >>>> >>>> ==> /mnt/dataraid/log/erlang.log.1 <== >>>> 18:24:44.634 [error] gen_fsm <0.20949.24> in state >>>> wait_pipeline_shutdown terminated with reason: {sink_died,shutdown} >>>> (riak@1)1> 18:24:44.641 [error] Pipe worker startup failed:fitting was >>>> gone before startup >>>> (riak@1)1> >>>> 18:24:44.645 [error] CRASH REPORT Process <0.20949.24> with 0 >>>> neighbours crashed with reason: {sink_died,shutdown} >>>> (riak@1)1> 18:24:44.652 [error] Supervisor riak_pipe_builder_sup had >>>> child undefined started with {riak_pipe_builder,start_link,undefined} at >>>> <0.20949.24> exit with reason {sink_died,shutdown} in context >>>> child_terminated >>>> (riak@1)1> 18:24:56.420 [info] monitor long_gc <0.32603.24> >>>> [{initial_call,{mochiweb_acceptor,init,3}},{almost_current_function,{gen,do_call,4}},{message_queue_len,0}] >>>> [{timeout,120},{old_heap_block_size,0},{heap_block_size,4181},{mbuf_size,0},{stack_size,40},{old_heap_size,0},{heap_size,1955}] >>>> (riak@1)1> >>>> >>>> >>>> >>>> ==> /mnt/dataraid/log/error.log <== >>>> 2012-03-13 18:24:44.645 [error] <0.20949.24> CRASH REPORT Process >>>> <0.20949.24> with 0 neighbours crashed with reason: {sink_died,shutdown} >>>> 2012-03-13 18:24:44.652 [error] <0.175.0> Supervisor >>>> riak_pipe_builder_sup had child undefined started with >>>> {riak_pipe_builder,start_link,undefined} at <0.20949.24> exit with reason >>>> {sink_died,shutdown} in context child_terminated >>>> >>>> >>>> >>>> >>>> >>>> >>>> -Matthew >>>> >>>> >>>> >>>> On Tue, Mar 13, 2012 at 18:17, Matthew Tovbin <[email protected]>wrote: >>>> >>>>> Hi, >>>>> >>>>> I got the same problem today on 1.1.0. >>>>> >>>>> As suggested, I updated all the nodes to 1.1.1. The error remains the >>>>> same: >>>>> >>>>> { stack: [Getter/Setter], >>>>> arguments: undefined, >>>>> type: undefined, >>>>> message: 'HTTP error 500: >>>>> {"phase":0,"error":"[preflist_exhausted]","input":"{ok,{r_object ....."}}} >>>>> >>>>> >>>>> -Matthew >>>>> >>>>> >>>>> >>>>> On Tue, Mar 13, 2012 at 06:48, Jon Meredith <[email protected]>wrote: >>>>> >>>>>> Hi Arne, >>>>>> >>>>>> Could you try with 1.1.1 please. We've fixed a number of bugs with >>>>>> how MapReduce handles errors that could explain some of what you are >>>>>> seeing. >>>>>> >>>>>> Jon >>>>>> >>>>>> >>>>>> On Tue, Mar 13, 2012 at 12:55 AM, Arne Claassen < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> I'm running 1.1.0 built from source on Centos 6. I set up two nodes >>>>>>> on localhost following the instructions on the basho site for testing >>>>>>> and >>>>>>> everything was going fine until i needed to fetch all items in a bucket. >>>>>>> >>>>>>> For this i was using node.js with riak-js and it's db.getAll >>>>>>> call,which from the debug logs looks like it uses /mapred >>>>>>> >>>>>>> The bucket only has about 1200 items in it, each a json payload of >>>>>>> about 3KB. Initially, it'd keep getting inconsistent results, sometimes >>>>>>> the >>>>>>> data would come back, sometimes data would be undefined. So i kept >>>>>>> running >>>>>>> it, trying to see if there was a pattern, adding more debugging code, >>>>>>> etc. >>>>>>> But still it would be about a request every couple of seconds. Then it >>>>>>> just >>>>>>> stopped working altogether and now i always get: >>>>>>> >>>>>>> { [Error: HTTP error 500: >>>>>>> {"phase":0,"error":"[preflist_exhausted]","input":"{ok,{r_object,<<\"tweets_15368536\">>,<<\"178528155714330620\">>,[{r_content,{dict,6,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[[<<\"Links\">>]],[],[],[],[],[],[],[],[[<<\"content-type\">>,97,112,112,108,105,99,97,116,105,111,110,47,106,115,111,110],[<<\"X-Riak-VTag\">>,49,76,115,54,70,75,74,71,119,65,111,70,88,73,117,84,98,88,79,69,52,116]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1331,487523,218968}]],[],[[<<\"X-Riak-Meta\">>]]}}},<<\"{\"coordinates\":null,\"create...\">>}],...},...}","type":"forward_preflist","stack":"[]"}] >>>>>>> statusCode: 500 } >>>>>>> >>>>>>> in the console. I haven't tried restarting riak, just because i'd >>>>>>> rather find out what's happening than it going away and coming back >>>>>>> later >>>>>>> when i don't expect it. >>>>>>> >>>>>>> Any suggestions on how to troubleshoot this? I can still access >>>>>>> items individually and i can even list the keys via ?keys=true, it's >>>>>>> just >>>>>>> he map reduce that seems to be busted. >>>>>>> >>>>>>> cheers, >>>>>>> arne >>>>>>> _______________________________________________ >>>>>>> riak-users mailing list >>>>>>> [email protected] >>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Jon Meredith >>>>>> Platform Engineering Manager >>>>>> Basho Technologies, Inc. >>>>>> [email protected] >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> riak-users mailing list >>>>>> [email protected] >>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>>>>> >>>>>> >>>>> >>>> >>> >> >> >> -- >> Jon Meredith >> Platform Engineering Manager >> Basho Technologies, Inc. >> [email protected] >> >> > -- Jon Meredith Platform Engineering Manager Basho Technologies, Inc. [email protected]
_______________________________________________ riak-users mailing list [email protected] http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
