Re: Debugging mapreduce
Hi,

On 25 September 2013 03:44, Toby Corkindale wrote:

> Have you tried executing your javascript outside of Riak?
> ie. paste the function into the Chrome debugger, then call it with a
> Riak-like data structure.

The problem with this approach is that I need to make some assumptions about what the data passed as input to my function looks like.

> Also, consider wrapping the code in your function with an eval so you can
> catch errors that occur. (Then either ejslog them or return them as results
> of the map phase)

With ejsLog() also not working for me, I am finding it hard to inspect what riak is passing into my function so that I can debug it elsewhere (like a js repl).

--
Ciao

Charl

"I will either find a way, or make one." -- Hannibal

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
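For what it's worth, the input shape doesn't have to be guessed blind: Riak hands a JavaScript map function an object with `bucket`, `key`, and a `values` array of `{metadata, data}` entries, where `data` is the raw stored string. The mock below follows that shape so the function can be exercised in a plain js repl; every concrete value in it is invented, so treat it as a test harness, not real data:

```javascript
// Hand-built stand-in for the object Riak passes as the first argument
// to a JavaScript map function. The field names follow Riak's map/reduce
// input shape; the tweet payload itself is made up for this sketch.
var mockValue = {
  bucket: "tweets",
  key: "37456",
  values: [
    {
      metadata: { "content-type": "application/json" },
      // `data` is always a string, exactly as stored:
      data: JSON.stringify([{ id: 37456, created_at: "2013-08-01T12:00:00Z" }])
    }
  ]
};

// The map function from the original query, unchanged:
function mapOldTweets(value, keyData, arg) {
  var t = JSON.parse(value.values[0].data)[0];
  if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id];
  else return [];
}

console.log(JSON.stringify(mapOldTweets(mockValue)));
```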
Re: Debugging mapreduce
On 25/09/13 11:20, Charl Matthee wrote:

> Hi, I am trying to run the following mapreduce query across my cluster:
>
> # curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}'
> {"lineno":466,"message":"SyntaxError: syntax error","source":"()"}

Have you tried executing your javascript outside of Riak? ie. paste the function into the Chrome debugger, then call it with a Riak-like data structure.

Also, consider wrapping the code in your function with an eval so you can catch errors that occur. (Then either ejslog them or return them as results of the map phase)
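The wrap-and-catch idea can be sketched like this; nothing here is Riak-specific beyond the function signature, and returning the error text as the phase result means it comes back in the HTTP response instead of vanishing into an opaque 500 from /mapred. The `["error", key, message]` tuple format is just an illustrative convention, not anything Riak prescribes:

```javascript
// Wrap the real map logic so any runtime error becomes the result of
// the map phase (and therefore shows up in the query response) rather
// than an unexplained internal server error.
function mapWithErrors(value, keyData, arg) {
  try {
    var t = JSON.parse(value.values[0].data)[0];
    if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id];
    return [];
  } catch (e) {
    // Tag the key so a failing object can be located afterwards.
    return [["error", value.key, String(e)]];
  }
}
```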
Debugging mapreduce
Hi,

I am trying to run the following mapreduce query across my cluster:

# curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}'
{"lineno":466,"message":"SyntaxError: syntax error","source":"()"}

The riak logs only have the following to report:

==> /var/log/riak/crash.log <==
2013-09-24 05:42:51 =ERROR REPORT webmachine error: path="/mapred" "Internal Server Error"

==> /var/log/riak/console.log <==
2013-09-24 05:42:51.272 [error] <0.20367.1441> Webmachine error at path "/mapred" : "Internal Server Error"

==> /var/log/riak/error.log <==
2013-09-24 05:42:51.272 [error] <0.20367.1441> Webmachine error at path "/mapred" : "Internal Server Error"

Is there any way to get some more info on this to debug it further?

I have tried using ejsLog() (from http://docs.basho.com/riak/1.3.2/references/appendices/MapReduce-Implementation/#Debugging-Javascript-Map-Reduce-Phases) to inspect the data in the function body but that simply gives me:

# curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; ejsLog('/tmp/map_reduce.log', JSON.stringify(t)); if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}'
{"lineno":1,"message":"SyntaxError: invalid flag after regular expression","source":"JSON.stringify(function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; ejsLog(/tmp/map_reduce.log, JSON.stringify(t)); if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}({\"bucket\":\"tweets\",\"key\":\"37456"}

I have also tried checking for already-deleted documents in case that was what was tripping things up, but adding a check for the X-Riak-Deleted header also results in an error:

# curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {if (value.values[0].metadata['X-Riak-Deleted'] == 'true') return []; t = JSON.parse(value.values[0].data)[0]; if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}'
{"lineno":1,"message":"ReferenceError: X is not defined","source":"unknown"}

--
Ciao

Charl

"I will either find a way, or make one." -- Hannibal
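Both of these error messages look consistent with a shell-quoting problem rather than a Riak one: the whole -d payload is wrapped in single quotes, so the single quotes around /tmp/map_reduce.log and X-Riak-Deleted terminate the shell argument early and never reach Riak, which then parses /tmp/... as a regular expression ("invalid flag after regular expression") and X as a bare identifier ("X is not defined"). If that guess is right, the function needs to arrive at Riak's JS VM looking like the sketch below, with the inner strings double-quoted (and each inner double quote escaped as \" inside the JSON "source" field). ejsLog only exists inside Riak's JS environment, so stub it when testing elsewhere:

```javascript
// The function as Riak should receive it: the log path and the metadata
// key are double-quoted strings. The original errors came from inner
// single quotes closing the shell's single-quoted curl argument early.
function mapLiveOldTweets(value, keyData, arg) {
  if (value.values[0].metadata["X-Riak-Deleted"] == "true") return [];
  var t = JSON.parse(value.values[0].data)[0];
  ejsLog("/tmp/map_reduce.log", JSON.stringify(t)); // Riak-only builtin
  if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id];
  return [];
}
```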
Re: Node not leaving cluster
Looks like Dmitry's + your suggestion did the trick. I upgraded the rest of the nodes in-place using the method you suggested and the hanging "leave" finally handed off its data.

Thank you!

Dave

On Sep 24, 2013, at 5:14 PM, Brian Sparrow wrote:

> Ahh, gotcha.
>
> Let us know how things go and if we can offer any more assistance.
>
> Thanks!
>
> --
> Brian Sparrow
> Developer Advocate
> Basho Technologies
>
> Sent with Sparrow
>
> On Tuesday, September 24, 2013 at 5:10 PM, David Greenstein wrote:
>
>> I'm actually joining new nodes that have the latest version to the cluster.
>> Once I join the nodes I have the old nodes leave. This has worked great in
>> the past. I'll use the recommended method next time :)
>>
>> Dave
>>
>> On Sep 24, 2013, at 5:05 PM, Brian Sparrow wrote:
>>
>>> David,
>>>
>>> The standard way to kick transfers is setting transfer_limit to 0 and then
>>> back up to 2 (default) or higher (up to 8 without turning other knobs). This
>>> can be done with `riak-admin transfer_limit 0` then `riak-admin
>>> transfer_limit 4`.
>>>
>>> With that said, may I ask why you are upgrading nodes by leaving them and
>>> then re-joining them back to the cluster? Unless you are changing backend
>>> properties this should not be necessary and simply taking nodes down,
>>> upgrading them, and restarting them is the standard way to do a rolling
>>> upgrade[1].
>>>
>>> Let us know how things go after kicking the transfers.
>>>
>>> Thanks!
>>>
>>> [1] http://docs.basho.com/riak/latest/ops/running/rolling-upgrades/
>>>
>>> --
>>> Brian Sparrow
>>> Developer Advocate
>>> Basho Technologies
>>>
>>> Sent with Sparrow
>>>
>>> On Tuesday, September 24, 2013 at 5:00 PM, David Greenstein wrote: I'm not receiving any errors actually. I'm far from an expert in riak logs, but the console.log messages seem to indicate the handoff is working… just extremely slowly.
Here's a snippet from the console.log of the node that is attempting to leave… 2013-09-24 20:59:43.991 [info] <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:685078892498860742907977265335757665463718379520 2013-09-24 20:59:43.994 [info] <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 685078892498860742907977265335757665463718379520 exited after processing 0 objects 2013-09-24 20:59:44.035 [info] <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:45671926166590716193865151022383844364247891968 2013-09-24 20:59:44.038 [info] <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 45671926166590716193865151022383844364247891968 exited after processing 0 objects 2013-09-24 20:59:44.068 [info] <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:1415829711164312202009819681693899175291684651008 2013-09-24 20:59:44.073 [info] <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 1415829711164312202009819681693899175291684651008 exited after processing 0 objects The percentages in member_status still have not changed though. Thank you again for any help!!! Dave On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk wrote: > Seems like a potential problem with handoff. We had similar problems > upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors > (something like <<"unknown_msg">> or similar). > > If that's the case, leave that node be, and do in-place upgrade for the > rest of the nodes, without making them leave the cluster. The third node > will probably leave after that, so you'll be able to re-join it. > > > On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein > wrote: >> >> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). 
The first two >> nodes that I replaced left the cluster without an issue and the new >> nodes joined without an issue. Now, the next node seams to be in a state >> where it won't leave the cluster. The status is leaving but it has been >> pending for several hours. Perhaps it is due to the pending ownership >> handoff from ring_status that also doesn't seem to be completing. Any >> insight or help on how to "kickstart" the leave would be greatly >> appreciated! >> >> Dave >> >> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status >> == Claimant >> === >> Claimant: 'riak@10.0.1.10' >> Status: up >> Ring Ready: true >> >> == Ownership Handoff >> ==
Re: Node not leaving cluster
Ahh, gotcha.

Let us know how things go and if we can offer any more assistance.

Thanks!

--
Brian Sparrow
Developer Advocate
Basho Technologies

Sent with Sparrow

On Tuesday, September 24, 2013 at 5:10 PM, David Greenstein wrote:

> I'm actually joining new nodes that have the latest version to the cluster.
> Once I join the nodes I have the old nodes leave. This has worked great in
> the past. I'll use the recommended method next time :)
>
> Dave
> On Sep 24, 2013, at 5:05 PM, Brian Sparrow wrote:
> > David,
> >
> > The standard way to kick transfers is setting transfer_limit to 0 and then
> > back up to 2 (default) or higher (up to 8 without turning other knobs). This
> > can be done with `riak-admin transfer_limit 0` then `riak-admin
> > transfer_limit 4`.
> >
> > With that said, may I ask why you are upgrading nodes by leaving them and
> > then re-joining them back to the cluster? Unless you are changing backend
> > properties this should not be necessary and simply taking nodes down,
> > upgrading them, and restarting them is the standard way to do a rolling
> > upgrade[1].
> >
> > Let us know how things go after kicking the transfers.
> >
> > Thanks!
> >
> > [1] http://docs.basho.com/riak/latest/ops/running/rolling-upgrades/
> >
> > --
> > Brian Sparrow
> > Developer Advocate
> > Basho Technologies
> >
> > Sent with Sparrow
> >
> > On Tuesday, September 24, 2013 at 5:00 PM, David Greenstein wrote:
> >
> > > I'm not receiving any errors actually. I'm far from an expert in riak
> > > logs, but the console.log messages seem to indicate the handoff is
> > > working… just extremely slowly.
Here's a snippet from the console.log of > > > the node that is attempting to leave… > > > > > > > > > 2013-09-24 20:59:43.991 [info] > > > <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving > > > handoff data for partition > > > riak_kv_vnode:685078892498860742907977265335757665463718379520 > > > 2013-09-24 20:59:43.994 [info] > > > <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for > > > partition 685078892498860742907977265335757665463718379520 exited after > > > processing 0 objects > > > 2013-09-24 20:59:44.035 [info] > > > <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving > > > handoff data for partition > > > riak_kv_vnode:45671926166590716193865151022383844364247891968 > > > 2013-09-24 20:59:44.038 [info] > > > <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for > > > partition 45671926166590716193865151022383844364247891968 exited after > > > processing 0 objects > > > 2013-09-24 20:59:44.068 [info] > > > <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving > > > handoff data for partition > > > riak_kv_vnode:1415829711164312202009819681693899175291684651008 > > > 2013-09-24 20:59:44.073 [info] > > > <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for > > > partition 1415829711164312202009819681693899175291684651008 exited after > > > processing 0 objects > > > > > > The percentages in member_status still have not changed though. > > > > > > Thank you again for any help!!! > > > > > > Dave > > > > > > On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk > > (mailto:demeshc...@gmail.com)> wrote: > > > > Seems like a potential problem with handoff. We had similar problems > > > > upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors > > > > (something like <<"unknown_msg">> or similar). > > > > > > > > If that's the case, leave that node be, and do in-place upgrade for the > > > > rest of the nodes, without making them leave the cluster. 
The third > > > > node will probably leave after that, so you'll be able to re-join it. > > > > > > > > > > > > On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein > > > > mailto:d...@collaborate.com)> wrote: > > > > > > > > > > I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two > > > > > nodes that I replaced left the cluster without an issue and the new > > > > > nodes joined without an issue. Now, the next node seams to be in a > > > > > state where it won't leave the cluster. The status is leaving but it > > > > > has been pending for several hours. Perhaps it is due to the pending > > > > > ownership handoff from ring_status that also doesn't seem to be > > > > > completing. Any insight or help on how to "kickstart" the leave would > > > > > be greatly appreciated! > > > > > > > > > > Dave > > > > > > > > > > [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status > > > > > == Claimant > > > > > === > > > > > Claimant: 'riak@10.0.1.10 (mailto:riak@10.0.1.10)' > > > > > Status: up > > > > > Ring Ready: true > > > > > > > > > > ==
Re: Node not leaving cluster
I'm actually joining new nodes that have the latest version to the cluster. Once I join the nodes I have the old nodes leave. This has worked great in the past. I'll use the recommended method next time :) Dave On Sep 24, 2013, at 5:05 PM, Brian Sparrow wrote: > David, > > The standard way to kick transfers is setting transfer_limit to 0 and then > back up to 2(default) or higher(up to 8 without turning other knobs). This > can be done with `riak-admin transfer_limit 0` then `riak-admin > transfer_limit 4`. > > With that said, may I ask why you are upgrading nodes by leaving them and > then re-joining them back to the cluster? Unless you are changing backend > properties this should not be necessary and simply taking nodes down, > upgrading them, and restarting them is the standard way to do a rolling > upgrade[1]. > > Let us know how things go after kicking the transfers. > > Thanks! > > [1] http://docs.basho.com/riak/latest/ops/running/rolling-upgrades/ > > -- > Brian Sparrow > Developer Advocate > Basho Technologies > > Sent with Sparrow > > On Tuesday, September 24, 2013 at 5:00 PM, David Greenstein wrote: > >> I'm not receiving any errors actually. I'm far from an expert in riak logs, >> but the console.log messages seem to indicate the handoff is working… just >> extremely slowly. 
Here's a snippet from the console.log of the node that is >> attempting to leave… >> >> >> 2013-09-24 20:59:43.991 [info] >> <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff >> data for partition >> riak_kv_vnode:685078892498860742907977265335757665463718379520 >> 2013-09-24 20:59:43.994 [info] >> <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for >> partition 685078892498860742907977265335757665463718379520 exited after >> processing 0 objects >> 2013-09-24 20:59:44.035 [info] >> <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff >> data for partition >> riak_kv_vnode:45671926166590716193865151022383844364247891968 >> 2013-09-24 20:59:44.038 [info] >> <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for >> partition 45671926166590716193865151022383844364247891968 exited after >> processing 0 objects >> 2013-09-24 20:59:44.068 [info] >> <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff >> data for partition >> riak_kv_vnode:1415829711164312202009819681693899175291684651008 >> 2013-09-24 20:59:44.073 [info] >> <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for >> partition 1415829711164312202009819681693899175291684651008 exited after >> processing 0 objects >> >> The percentages in member_status still have not changed though. >> >> Thank you again for any help!!! >> >> Dave >> >> On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk wrote: >> >>> Seems like a potential problem with handoff. We had similar problems >>> upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors (something >>> like <<"unknown_msg">> or similar). >>> >>> If that's the case, leave that node be, and do in-place upgrade for the >>> rest of the nodes, without making them leave the cluster. The third node >>> will probably leave after that, so you'll be able to re-join it. 
>>> >>> >>> On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein >>> wrote: I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two nodes that I replaced left the cluster without an issue and the new nodes joined without an issue. Now, the next node seams to be in a state where it won't leave the cluster. The status is leaving but it has been pending for several hours. Perhaps it is due to the pending ownership handoff from ring_status that also doesn't seem to be completing. Any insight or help on how to "kickstart" the leave would be greatly appreciated! Dave [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status == Claimant === Claimant: 'riak@10.0.1.10' Status: up Ring Ready: true == Ownership Handoff == Owner: riak@10.0.1.12 Next Owner: riak@10.0.1.16 Index: 456719261665907161938651510223838443642478919680 Waiting on: [riak_kv_vnode] Complete: [riak_pipe_vnode] --- == Unreachable Nodes == All nodes are up and reachable [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status = Membership == Status RingPendingNode
Re: Deleting data from bitcask backend
Hello Charl,

Everything looks as expected in the vnode status and the logs. Feel free to run `bitcask:merge("/PATH/TO/PARTITION")` for other partitions on the node(s) to reclaim space.

Let us know how things go.

Thanks,

--
Brian Sparrow
Developer Advocate
Basho Technologies

Sent with Sparrow

On Monday, September 16, 2013 at 4:34 PM, Charl Matthee wrote:

> Hi,
>
> On 16 September 2013 21:15, Alex Moore wrote:
>
> > I would recommend running "riak-admin vnode-status" during an off-peak time
> > like Evan mentioned, that way we can see what your dead bytes/fragmentation
> > levels look like. This will also let us know if you are hitting the
> > triggers or if they need to be adjusted more.
>
> I've attached the output from "riak-admin vnode-status".
>
> > One thing I did notice is that your dead_bytes_merge_trigger and
> > dead_bytes_threshold are both set to 64MB.
> > This means that a merge will be triggered when a bitcask file has more
> > than 64MB of dead objects in it (dead_bytes_merge_trigger), and only files
> > with more than 64 MB of dead objects will be merged (dead_bytes_threshold).
> > If you want more than that single file to be merged, you can reduce
> > dead_bytes_threshold further so it can include files that are nearing the
> > limit.
>
> Great, thanks for that. I'll update my nodes tomorrow and monitor the
> effects during the day.
>
> --
> Ciao
>
> Charl
>
> "I will either find a way, or make one." -- Hannibal
>
> Attachments:
> - riak-vnode-status
Re: Node not leaving cluster
I take that back, a 1.4.2 node handing off data to an older node during rebalancing is in fact reporting an unknown_msg error. I'll try your suggestion.

2013-09-24 21:02:35.337 [error] <0.15962.4>@riak_core_handoff_sender:start_fold:269 ownership_transfer transfer of riak_kv_vnode from 'riak@10.0.1.12' 456719261665907161938651510223838443642478919680 to 'riak@10.0.1.16' 456719261665907161938651510223838443642478919680 failed because of error:{case_clause,{ok,[255|<<"unknown_msg">>]}} [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,222}]}]

Dave

On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk wrote:

> Seems like a potential problem with handoff. We had similar problems
> upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors (something
> like <<"unknown_msg">> or similar).
>
> If that's the case, leave that node be, and do in-place upgrade for the rest
> of the nodes, without making them leave the cluster. The third node will
> probably leave after that, so you'll be able to re-join it.
>
> On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein wrote:
>
> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two nodes
> that I replaced left the cluster without an issue and the new nodes joined
> without an issue. Now, the next node seems to be in a state where it won't
> leave the cluster. The status is leaving but it has been pending for several
> hours. Perhaps it is due to the pending ownership handoff from ring_status
> that also doesn't seem to be completing. Any insight or help on how to
> "kickstart" the leave would be greatly appreciated!
>
> Dave
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
> == Claimant ===
> Claimant: 'riak@10.0.1.10'
> Status: up
> Ring Ready: true
>
> == Ownership Handoff ==
> Owner: riak@10.0.1.12
> Next Owner: riak@10.0.1.16
>
> Index: 456719261665907161938651510223838443642478919680
> Waiting on: [riak_kv_vnode]
> Complete: [riak_pipe_vnode]
>
> ---
>
> == Unreachable Nodes ==
> All nodes are up and reachable
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
> = Membership ==
> Status     Ring     Pending    Node
> ---
> leaving    14.1%    14.1%    'riak@10.0.1.15'
> valid      14.1%    14.1%    'riak@10.0.1.10'
> valid      14.1%    14.1%    'riak@10.0.1.11'
> valid      17.2%    15.6%    'riak@10.0.1.12'
> valid      12.5%    14.1%    'riak@10.0.1.16'
> valid      14.1%    14.1%    'riak@10.0.1.17'
> valid      14.1%    14.1%    'riak@10.0.1.18'
> ---
> Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>
> --
> Best regards,
> Dmitry Demeshchuk
Re: Node not leaving cluster
I'm not receiving any errors actually. I'm far from an expert in riak logs, but the console.log messages seem to indicate the handoff is working… just extremely slowly. Here's a snippet from the console.log of the node that is attempting to leave… 2013-09-24 20:59:43.991 [info] <0.7140.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:685078892498860742907977265335757665463718379520 2013-09-24 20:59:43.994 [info] <0.7140.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 685078892498860742907977265335757665463718379520 exited after processing 0 objects 2013-09-24 20:59:44.035 [info] <0.7142.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:45671926166590716193865151022383844364247891968 2013-09-24 20:59:44.038 [info] <0.7142.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 45671926166590716193865151022383844364247891968 exited after processing 0 objects 2013-09-24 20:59:44.068 [info] <0.7144.0>@riak_core_handoff_receiver:process_message:99 Receiving handoff data for partition riak_kv_vnode:1415829711164312202009819681693899175291684651008 2013-09-24 20:59:44.073 [info] <0.7144.0>@riak_core_handoff_receiver:handle_info:69 Handoff receiver for partition 1415829711164312202009819681693899175291684651008 exited after processing 0 objects The percentages in member_status still have not changed though. Thank you again for any help!!! Dave On Sep 24, 2013, at 3:42 PM, Dmitry Demeshchuk wrote: > Seems like a potential problem with handoff. We had similar problems > upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors (something > like <<"unknown_msg">> or similar). > > If that's the case, leave that node be, and do in-place upgrade for the rest > of the nodes, without making them leave the cluster. The third node will > probably leave after that, so you'll be able to re-join it. 
> On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein wrote:
>
> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two nodes
> that I replaced left the cluster without an issue and the new nodes joined
> without an issue. Now, the next node seems to be in a state where it won't
> leave the cluster. The status is leaving but it has been pending for several
> hours. Perhaps it is due to the pending ownership handoff from ring_status
> that also doesn't seem to be completing. Any insight or help on how to
> "kickstart" the leave would be greatly appreciated!
>
> Dave
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
> == Claimant ===
> Claimant: 'riak@10.0.1.10'
> Status: up
> Ring Ready: true
>
> == Ownership Handoff ==
> Owner: riak@10.0.1.12
> Next Owner: riak@10.0.1.16
>
> Index: 456719261665907161938651510223838443642478919680
> Waiting on: [riak_kv_vnode]
> Complete: [riak_pipe_vnode]
>
> ---
>
> == Unreachable Nodes ==
> All nodes are up and reachable
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
> = Membership ==
> Status     Ring     Pending    Node
> ---
> leaving    14.1%    14.1%    'riak@10.0.1.15'
> valid      14.1%    14.1%    'riak@10.0.1.10'
> valid      14.1%    14.1%    'riak@10.0.1.11'
> valid      17.2%    15.6%    'riak@10.0.1.12'
> valid      12.5%    14.1%    'riak@10.0.1.16'
> valid      14.1%    14.1%    'riak@10.0.1.17'
> valid      14.1%    14.1%    'riak@10.0.1.18'
> ---
> Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
>
> --
> Best regards,
> Dmitry Demeshchuk
Re: Node not leaving cluster
Seems like a potential problem with handoff. We had similar problems upgrading from 1.2.1 to 1.4.0. Check the logs for handoff errors (something like <<"unknown_msg">> or similar).

If that's the case, leave that node be, and do an in-place upgrade for the rest of the nodes, without making them leave the cluster. The third node will probably leave after that, so you'll be able to re-join it.

On Tue, Sep 24, 2013 at 12:33 PM, David Greenstein wrote:
>
> I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two
> nodes that I replaced left the cluster without an issue and the new nodes
> joined without an issue. Now, the next node seems to be in a state where it
> won't leave the cluster. The status is leaving but it has been pending for
> several hours. Perhaps it is due to the pending ownership handoff from
> ring_status that also doesn't seem to be completing. Any insight or help on
> how to "kickstart" the leave would be greatly appreciated!
>
> Dave
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
> == Claimant ===
> Claimant: 'riak@10.0.1.10'
> Status: up
> Ring Ready: true
>
> == Ownership Handoff ==
> Owner: riak@10.0.1.12
> Next Owner: riak@10.0.1.16
>
> Index: 456719261665907161938651510223838443642478919680
> Waiting on: [riak_kv_vnode]
> Complete: [riak_pipe_vnode]
>
> ---
>
> == Unreachable Nodes ==
> All nodes are up and reachable
>
> [user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
> = Membership ==
> Status     Ring     Pending    Node
> ---
> leaving    14.1%    14.1%    'riak@10.0.1.15'
> valid      14.1%    14.1%    'riak@10.0.1.10'
> valid      14.1%    14.1%    'riak@10.0.1.11'
> valid      17.2%    15.6%    'riak@10.0.1.12'
> valid      12.5%    14.1%    'riak@10.0.1.16'
> valid      14.1%    14.1%    'riak@10.0.1.17'
> valid      14.1%    14.1%    'riak@10.0.1.18'
> ---
> Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0

--
Best regards,
Dmitry Demeshchuk
Node not leaving cluster
I'm performing a rolling upgrade to 1.4.2 (from 1.3.1). The first two nodes that I replaced left the cluster without an issue and the new nodes joined without an issue. Now, the next node seems to be in a state where it won't leave the cluster. The status is leaving but it has been pending for several hours. Perhaps it is due to the pending ownership handoff from ring_status that also doesn't seem to be completing. Any insight or help on how to "kickstart" the leave would be greatly appreciated!

Dave

[user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin ring_status
== Claimant ===
Claimant: 'riak@10.0.1.10'
Status: up
Ring Ready: true

== Ownership Handoff ==
Owner: riak@10.0.1.12
Next Owner: riak@10.0.1.16

Index: 456719261665907161938651510223838443642478919680
Waiting on: [riak_kv_vnode]
Complete: [riak_pipe_vnode]

---

== Unreachable Nodes ==
All nodes are up and reachable

[user@ip-10-0-1-12 user]# /db/riak/bin/riak-admin member_status
= Membership ==
Status     Ring     Pending    Node
---
leaving    14.1%    14.1%    'riak@10.0.1.15'
valid      14.1%    14.1%    'riak@10.0.1.10'
valid      14.1%    14.1%    'riak@10.0.1.11'
valid      17.2%    15.6%    'riak@10.0.1.12'
valid      12.5%    14.1%    'riak@10.0.1.16'
valid      14.1%    14.1%    'riak@10.0.1.17'
valid      14.1%    14.1%    'riak@10.0.1.18'
---
Valid:6 / Leaving:1 / Exiting:0 / Joining:0 / Down:0
Re: Riak won't start when configured for Riak-CS
Brian,

Thanks. I will try that again. Perhaps there is a way for the package scripts or build scripts to detect this situation in the future?

thanks,
Darren

On 09/24/2013 09:42 AM, Brian Sparrow wrote:

Hey Darren,

As Dave said, the reason for a badfile error is that the beam was compiled with one version of the Erlang VM and is now being run in a different version. My suggestion would be to completely remove all RiakCS installations you have (from source and from package) and re-install from the .deb package. This has the correct version of Erlang packaged within it so you shouldn't have a problem.

Thanks!
Brian

--
Brian Sparrow
Developer Advocate
Basho Technologies

Sent with Sparrow

On Tuesday, September 24, 2013 at 5:14 AM, Darren Govoni wrote:

Hi! I did build Riak-CS from source, but there is no 'install' target in the make so I wasn't sure how that got done. There was no 'install' readme also. I then installed the .deb instead. I installed erlang from the ubuntu 13.04 repo.

On 09/23/2013 11:16 PM, Dave Parfitt wrote:

Hello - Sounds like you have an older version of Erlang trying to load code compiled with a newer version of Erlang. Cheers, Dave

On Sep 23, 2013, at 10:48 PM, Darren Govoni wrote:

Hi, I am following the directions for configuring Riak with Riak-CS.
http://docs.basho.com/riakcs/latest/cookbooks/configuration/Configuring-Riak/

When I follow the directions, riak console gives a problem when starting:

22:43:33.174 [error] Loading of /usr/lib/riak-cs/lib/riak_cs-1.4.1/ebin/riak_cs_kv_multi_backend.beam failed: badfile
22:43:33.174 [error] Failed to start riak_cs_kv_multi_backend Reason: {undef,[{riak_cs_kv_multi_backend,start,[0,[{async_folds,true},[{vnode_vclocks,true},{included_applications,[]},{allow_strfun,false},{reduce_js_vm_count,6},{storage_backend,riak_cs_kv_multi_backend},{legacy_keylisting,false},{pb_ip,"127.0.0.1"},{hook_js_vm_count,2},{listkeys_backpressure,true},{mapred_name,"mapred"},{stats_urlpath,"stats"},{legacy_stats,true},{js_thread_stack,16},{multi_backend,[{be_default,riak_kv_eleveldb_backend,[{max_open_files,50},{data_root,"/var/lib/riak/leveldb"}]},{be_blocks,riak_kv_bitcask_backend,[{data_root,"/var/lib/riak/bitcask"}]}]},{multi_backend_prefix_list,[{<<"0b:">>,be_blocks}]},{riak_kv_stat,true},{add_paths,["/usr/lib/riak-cs/lib/riak_cs-1.4.1/ebin"]},{http_url_encoding,on},{map_js_vm_count,8},{pb_port,8087},{multi_backend_default,be_default},{mapred_2i_pipe,true},{mapred_system,pipe},{js_max_vm_mem,8}]]]},{riak_kv_vnode,init,1},{riak_core_vnode,init,1},{gen_fsm,init_it,6},{proc_lib,init_p_do_apply,3}]}
22:43:33.250 [error] beam/beam_load.c(1365): Error loading module riak_cs_kv_multi_backend: use of opcode 153; this emulator supports only up to 152
22:43:33.330 [notice] "backend module failed to start."

I am on Ubuntu 13.04

thanks for any tips.
Re: Very high CPU usage by beam.smp
Jared, Could you elaborate on the "meet me in the middle" settings/scenarios? Let me explain: say the quorum is configured with low values, R=W=1 and N=3. Doesn't that add more work to the AAE background process? Could there be a way to trade some client performance, with middle-of-the-road settings generic enough for most scenarios, by having the client wait a bit longer (R=W=2 or similar combinations) so that AAE has to work a bit less? Just thoughts; I leave the deep explanation to the experts. HTH, Guido. On 24/09/13 13:41, Jared Morrow wrote: Interesting. Someone knows what this entropy thing is doing exactly? Can it be switched off by default maybe? Here is a description of AAE when it was released in Riak 1.3 http://basho.com/introducing-riak-1-3/ You can turn off AAE in your config files if you want it off. With Riak (and distributed systems in general) there are always tradeoffs for any feature. Riak more than others tends to pick 'safety' for the default case. AAE protects your data's integrity even when it has not been read for a long time. It is a very important feature, but we understand it does not fit everyone's needs, so we provide the option to turn it off. Hope that helps, Jared
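As a concrete starting point for Guido's R=W=2 idea: quorum defaults can be tried per bucket rather than cluster-wide, since they are ordinary bucket properties in Riak's HTTP API. A sketch of the JSON payload one could PUT to the bucket-properties resource (the bucket name "tweets" and the values are illustrative, not a recommendation):

```json
{"props": {"n_val": 3, "r": 2, "w": 2}}
```

This would be sent with something like `curl -XPUT http://<node>:8098/buckets/tweets/props -H "Content-Type: application/json"`, and individual requests can still override r/w with query parameters.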
Re: Very high CPU usage by beam.smp
> > Interesting. Someone knows what this entropy thing is doing exactly? Can > it be switched off by default maybe? > Here is a description of AAE when it was released in Riak 1.3 http://basho.com/introducing-riak-1-3/ You can turn off AAE in your config files if you want it off. With Riak (and distributed systems in general) there are always tradeoffs for any feature. Riak more than others tends to pick 'safety' for the default case. AAE protects your data's integrity even when it has not been read for a long time. It is a very important feature, but we understand it does not fit everyone's needs, so we provide the option to turn it off. Hope that helps, Jared
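For anyone who does want it off, AAE is controlled from the riak_kv section of app.config; a minimal sketch of the setting Jared refers to, in the tuple form documented for Riak 1.3 (shown here with AAE disabled):

```erlang
{riak_kv, [
    %% Active anti-entropy: {on, []} is the default; {off, []} disables it
    {anti_entropy, {off, []}}
]}
```

Note that app.config changes only take effect after the node is restarted.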
Re: Riak won't start when configured for Riak-CS
Hi! I did build Riak-CS from source, but there is no 'install' target in the make so I wasn't sure how that got done. There was no 'install' readme either. I then installed the .deb instead. I installed erlang from ubuntu 13.04 repo. On 09/23/2013 11:16 PM, Dave Parfitt wrote: Hello - Sounds like you have an older version of Erlang trying to load code compiled with a newer version of Erlang. Cheers, Dave On Sep 23, 2013, at 10:48 PM, Darren Govoni wrote: Hi, I am following the directions for configuring Riak with Riak-CS. http://docs.basho.com/riakcs/latest/cookbooks/configuration/Configuring-Riak/ When I follow the directions, riak console gives a problem when starting: 22:43:33.174 [error] Loading of /usr/lib/riak-cs/lib/riak_cs-1.4.1/ebin/riak_cs_kv_multi_backend.beam failed: badfile 22:43:33.174 [error] Failed to start riak_cs_kv_multi_backend Reason: {undef,[{riak_cs_kv_multi_backend,start,[0,[{async_folds,true},[{vnode_vclocks,true},{included_applications,[]},{allow_strfun,false},{reduce_js_vm_count,6},{storage_backend,riak_cs_kv_multi_backend},{legacy_keylisting,false},{pb_ip,"127.0.0.1"},{hook_js_vm_count,2},{listkeys_backpressure,true},{mapred_name,"mapred"},{stats_urlpath,"stats"},{legacy_stats,true},{js_thread_stack,16},{multi_backend,[{be_default,riak_kv_eleveldb_backend,[{max_open_files,50},{data_root,"/var/lib/riak/leveldb"}]},{be_blocks,riak_kv_bitcask_backend,[{data_root,"/var/lib/riak/bitcask"}]}]},{multi_backend_prefix_list,[{<<"0b:">>,be_blocks}]},{riak_kv_stat,true},{add_paths,["/usr/lib/riak-cs/lib/riak_cs-1.4.1/ebin"]},{http_url_encoding,on},{map_js_vm_count,8},{pb_port,8087},{multi_backend_default,be_default},{mapred_2i_pipe,true},{mapred_system,pipe},{js_max_vm_mem,8}]]]},{riak_kv_vnode,init,1},{riak_core_vnode,init,1},{gen_fsm,init_it,6},{proc_lib,init_p_do_apply,3}]} 22:43:33.250 [error] beam/beam_load.c(1365): Error loading module riak_cs_kv_multi_backend: use of opcode 153; this emulator supports only up to 152 22:43:33.330 [notice] 
"backend module failed to start." I am on Ubuntu 13.04. Thanks for any tips.
RE: Debugging mapreduce
You're using single quotes inside a JS program that is itself wrapped in single quotes for the shell. The shell consumes the inner quotes, so they never become part of the program, and the unquoted "/tmp/m..." part then looks to JS like a regular expression. Hence the error message. Original message From: Charl Matthee Date: To: riak-users Subject: Debugging mapreduce Hi, I am trying to run the following mapreduce query across my cluster: # curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}' {"lineno":466,"message":"SyntaxError: syntax error","source":"()"} The riak logs only have the following to report: ==> /var/log/riak/crash.log <== 2013-09-24 05:42:51 =ERROR REPORT webmachine error: path="/mapred" "Internal Server Error" ==> /var/log/riak/console.log <== 2013-09-24 05:42:51.272 [error] <0.20367.1441> Webmachine error at path "/mapred" : "Internal Server Error" ==> /var/log/riak/error.log <== 2013-09-24 05:42:51.272 [error] <0.20367.1441> Webmachine error at path "/mapred" : "Internal Server Error" Is there any way to get some more info on this to debug it further? 
I have tried using ejsLog() (from http://docs.basho.com/riak/1.3.2/references/appendices/MapReduce-Implementation/#Debugging-Javascript-Map-Reduce-Phases) to inspect the data in the function body but that simply gives me: # curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; ejsLog('/tmp/map_reduce.log', JSON.stringify(t)); if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}' {"lineno":1,"message":"SyntaxError: invalid flag after regular expression","source":"JSON.stringify(function(value, keyData, arg) {t = JSON.parse(value.values[0].data)[0]; ejsLog(/tmp/map_reduce.log, JSON.stringify(t)); if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}({\"bucket\":\"tweets\",\"key\":\"37456"} I have also tried checking for already deleted documents in case that was what was tripping things up, but adding a check for the X-Riak-Deleted header also results in an error: # curl -XPOST http://10.179.229.209:8098/mapred -H "Content-Type: application/json" -d '{"inputs":"tweets", "query":[{"map":{"language":"javascript", "source":"function(value, keyData, arg) {if (value.values[0].metadata['X-Riak-Deleted'] == 'true') return []; t = JSON.parse(value.values[0].data)[0]; if ((new Date - new Date(t.created_at)) / 1000 > 2592000) return [t.id]; else return []}", "keep":true}}]}' {"lineno":1,"message":"ReferenceError: X is not defined","source":"unknown"} -- Ciao Charl "I will either find a way, or make one." -- Hannibal
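Both of these errors (the ejsLog path turning into a bare regex, and 'X-Riak-Deleted' losing its quotes) are shell quoting problems, not Riak problems: the outer single quotes around the -d payload swallow the inner ones. A sketch of the usual '\'' idiom for splicing a literal single quote into a single-quoted string (the function body here is illustrative, not the original query):

```shell
# The outer single-quoted string ends at the first inner quote, so
# '/tmp/map_reduce.log' and 'X-Riak-Deleted' never reach the JS source quoted.
# '\'' closes the quoted string, appends a literal ', then reopens it:
SRC='function(v) { ejsLog('\''/tmp/map_reduce.log'\'', JSON.stringify(v)); return []; }'
echo "$SRC"
# prints: function(v) { ejsLog('/tmp/map_reduce.log', JSON.stringify(v)); return []; }
```

Alternatively, put the whole JSON query in a file and POST it with `curl ... -d @query.json`, which sidesteps the shell quoting entirely.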