Re: riak performance

Grant Schofield Fri, 17 Sep 2010 08:22:42 -0700

VMware is probably contributing some of the latency, but the overhead of 
using the JavaScript VMs is most likely where most of it is coming from. You 
can increase the number of JavaScript VMs you have, but you may be hitting a 
wall with CPU usage (especially because some nodes are under VMware). Erlang 
would be much faster but may still have some of the same problems with CPU 
usage.
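
As a rough way to separate the cost of the JavaScript map function itself from 
the cost of Riak and VMware, here is a minimal sketch that runs a map function 
of the same shape as the one quoted below over 100,000 synthetic entries, 
outside Riak. Everything in it is an assumption for illustration: the 
Riak.mapValuesJson stand-in only mimics the built-in, and makeMap, makeObjects 
and runMap are hypothetical helpers. Note that the sample entry quoted below 
has a "res" field while the quoted map function tests value.reservation, so 
the field name is a parameter here and both can be tried.

```javascript
// Stand-in for Riak's built-in Riak.mapValuesJson (assumption: each stored
// object exposes its siblings as values[i].data holding a JSON string).
const Riak = {
  mapValuesJson: (obj) => obj.values.map((v) => JSON.parse(v.data)),
};

// Build a map function in the same shape as the one in the quoted job.
// `field` and `wanted` are illustrative parameters, not part of the original.
function makeMap(field, wanted) {
  return function (values, keyData, arg) {
    const value = Riak.mapValuesJson(values)[0];
    return value[field] === wanted ? [value] : [];
  };
}

// Generate `n` synthetic objects resembling the sample entry in the thread.
function makeObjects(n) {
  const objects = [];
  for (let i = 0; i < n; i++) {
    const entry = {
      id: String(i),
      actionTime: "2007-05-11 17:08:55",
      action: "some action",
      res: String(i % 10000),
      user: "5",
      client: "2787",
    };
    objects.push({
      bucket: "actionbucket",
      key: String(i),
      values: [{ data: JSON.stringify(entry) }],
    });
  }
  return objects;
}

// Run the map function over every object, collecting the matches and the
// elapsed wall-clock time in milliseconds.
function runMap(mapFn, objects) {
  const start = Date.now();
  const results = [];
  for (const obj of objects) {
    results.push(...mapFn(obj, undefined, undefined));
  }
  return { results, elapsedMs: Date.now() - start };
}

const objects = makeObjects(100000);
const byRes = runMap(makeMap("res", "4084"), objects);
const byReservation = runMap(makeMap("reservation", "4084"), objects);
console.log(byRes.results.length, "matches on res in", byRes.elapsedMs, "ms");
console.log(byReservation.results.length, "matches on reservation");
```

If the local run finishes quickly, the time is more likely going into the 
JavaScript VM hand-off and the virtualized nodes than into the function body. 
This only measures the function in isolation and says nothing about how Riak 
schedules its JavaScript VMs.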


Grant Schofield
Developer Advocate
Basho Technologies, Inc. 

On Sep 17, 2010, at 7:02 AM, Nils Petersohn wrote:

> same setup with the tip did not speed up the map phase. i even generated a 
> 2 MB post request file:
> 
> {"inputs":[["actionbucket","10000"],["actionbucket","10001"],["actionbucket","10002"],["actionbucket","10003"],["actionbucket","10004"],["actionbucket","10005"],["actionbucket","10006"],["actionbucket","10007"],["actionbucket","10008"],["actionbucket","10009"],["actionbucket","10010"],["actionbucket","10011"],["actionbucket","10012"],["actionbucket","10013"],["actionbucket","10014"],["actionbucket","1001...
> 
> but this did not speed up the process either...
> 
> my computer setup is the following:
> Processor Name:       Intel Core 2 Duo
>  Processor Speed:     2.53 GHz
>  Number Of Processors:        1
>  Total Number Of Cores:       2
>  L2 Cache:    3 MB
>  Memory:      4 GB
> 
> i use win7 on an intel quadcore 6600 cpu with 4gb ram.
> the two vmware virtual machines have 2 cpus and ~1400mb ram configured and 
> run fedora 12.
> the firewalls are turned off completely
> 
> everything is connected over ethernet through a small switch with static ips.
> 
> all the nine riak instances have this config (only the ip and port are 
> changing):
> ------------------------------------------------------------------------------------------------------------------------------------------
> %% -*- tab-width: 4;erlang-indent-level: 4;indent-tabs-mode: nil -*-
> %% ex: ts=4 sw=4 et
> 
> %%
> %% etc/app.config
> %%
> {ring_state_dir,    "data/ring"}.
> {web_ip,            "192.168.0.100"}.
> {web_port,          8091}.
> {handoff_port,      8101}.
> {pb_ip,             "192.168.0.100"}.
> {pb_port,           8081}.
> {bitcask_data_root, "data/bitcask"}.
> {sasl_error_log,    "log/sasl-error.log"}.
> {sasl_log_dir,      "log/sasl"}.
> 
> %%
> %% etc/vm.args
> %%
> {node,         "[email protected]"}.
> 
> %%
> %% bin/riak
> %%
> {runner_script_dir,  "$(cd ${0%/*} && pwd)"}.
> {runner_base_dir,    "${RUNNER_SCRIPT_DIR%/*}"}.
> {runner_etc_dir,     "$RUNNER_BASE_DIR/etc"}.
> {runner_log_dir,     "$RUNNER_BASE_DIR/log"}.
> {pipe_dir,           "/tmp/$RUNNER_BASE_DIR/"}.
> {runner_user,        ""}.
> ------------------------------------------------------------------------------------------------------------------------------------------
> 
> is vmware the bottleneck? should i use erlang to do the mr job? 
> 
> best regards
> nils
> 
> 
> 
> On Sep 17, 2010, at 12:40 AM, Grant Schofield wrote:
> 
>> I think the slowness is coming from the older list keys implementation in 
>> 0.12.1. List keys has been changed in the tip version of Riak and is quite 
>> a bit faster now. In addition, there have been a lot of improvements to the 
>> JavaScript map reduce implementation that should help the speed of your 
>> query. For the time being you will need to run Riak tip to get access to 
>> these enhancements. 
>> 
>> 
>> 
>> On Sep 16, 2010, at 5:17 PM, Nils Petersohn wrote:
>> 
>>> ok, my ring seems ok now.
>>> what i did was to change the rel/vars/dev[1,2,3]_vars.config file.
>>> in there i was just replacing the ips...
>>> this reip thing did not really work out ...
>>> 
>>> here is my riak ring now:
>>> ([email protected])1> riak_core_ring_manager:get_my_ring().
>>> {ok,{chstate,'[email protected]',
>>>           [{'[email protected]',{65,63451889794}},
>>>            {'[email protected]',{13,63451889512}},
>>>            {'[email protected]',{104,63451889512}},
>>>            {'[email protected]',{49,63451889512}},
>>>            {'[email protected]',{32,63451889009}},
>>>            {'[email protected]',{94,63451889253}},
>>>            {'[email protected]',{9,63451889769}},
>>>            {'[email protected]',{97,63451889494}}],
>>>           {64,
>>>            [{0,'[email protected]'},
>>>             {22835963083295358096932575511191922182123945984,
>>>              '[email protected]'},
>>>             {45671926166590716193865151022383844364247891968,
>>>              '[email protected]'},
>>>             {68507889249886074290797726533575766546371837952,
>>>              '[email protected]'},
>>>             {91343852333181432387730302044767688728495783936,
>>>              '[email protected]'},
>>>             {114179815416476790484662877555959610910619729920,
>>>              '[email protected]'},
>>>             {137015778499772148581595453067151533092743675904,
>>>              '[email protected]'},
>>>             {159851741583067506678528028578343455274867621888,
>>>              '[email protected]'},
>>>             {182687704666362864775460604089535377456991567872,
>>>              '[email protected]'},
>>>             {205523667749658222872393179600727299639115513856,
>>>              '[email protected]'},
>>>             {228359630832953580969325755111919221821239459840,
>>>              '[email protected]'},
>>>             {251195593916248939066258330623111144003363405824,
>>>              '[email protected]'},
>>>             {274031556999544297163190906134303066185487351808,
>>>              '[email protected]'},
>>>             {296867520082839655260123481645494988367611297792,
>>>              '[email protected]'},
>>>             {319703483166135013357056057156686910549735243776,
>>>              '[email protected]'},
>>>             {342539446249430371453988632667878832731859189760,
>>>              '[email protected]'},
>>>             {365375409332725729550921208179070754913983135744,
>>>              '[email protected]'},
>>>             {388211372416021087647853783690262677096107081728,
>>>              '[email protected]'},
>>>             {411047335499316445744786359201454599278231027712,
>>>              '[email protected]'},
>>>             {433883298582611803841718934712646521460354973696,...},
>>>             {...}|...]},
>>>           {dict,0,16,16,8,80,48,
>>>                 {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
>>>                 {{[],[],[],[],[],[],[],[],[],[],[],[],...}}}}}
>>> ([email protected])2> 
>>> 
>>> i am using 0.12.1 on my mac and 0.12 on both vms. i now have a set of 
>>> 100,000 entries like this (just for testing):
>>> {"id":"42164", "actionTime":"2007-05-11 17:08:55", "action":"some action", 
>>> "res":"7024", "user":"5", "client":"2787"}
>>> 
>>> 
>>> and my mr job looks like this (just for testing):
>>> {"inputs":"actionbucket",
>>> "query":[
>>> {"map":{"language":"javascript", "source":
>>> "function(values, keyData, arg) {
>>>      
>>>     var value = Riak.mapValuesJson(values)[0];
>>>      if(value.reservation == '4084'){
>>>             return [value];
>>>     }
>>>     return [];
>>> }","keep":true}}
>>> ],"timeout": 900000
>>> }
>>> 
>>> 
>>> the beam instances are all showing on "top" now, and there is some traffic 
>>> going back and forth. (~200kb / s)
>>> 
>>> but this job takes like 1:30 min.
>>> 
>>> i know that this is not really comparable with a mysql query because you 
>>> can do more calculations in the mr job to produce much more special results 
>>> and the mr job has a ~linear "worktime"... but ~1:30 min is still pretty 
>>> bad .... 
>>> 
>>> is there any way to do much better ?
>>> 
>>> best regards
>>> nils
>>> 
>>> On Sep 16, 2010, at 7:08 PM, Grant Schofield wrote:
>>> 
>>>> 
>>>> On Sep 15, 2010, at 2:40 PM, Nils Petersohn wrote:
>>>> 
>>>>> hello,
>>>>> 
>>>>> i was setting up 9 riak instances:
>>>>> 
>>>>> three on my mac with the appropriate app config
>>>>> and six with two virtual machines on a different computer.
>>>>> 
>>>>> all 8 joined the [email protected]
>>>>> and the join request was sent.
>>>>> 
>>>>> after setting this up:
>>>>> i wanted to put data with the java client on [email protected] then i got 
>>>>> a timeout ?!?
>>>>> 
>>>> 
>>>> I am curious if you started this node and then changed its name in the 
>>>> config file? Errors like this can happen if you don't riak-admin reip the 
>>>> node. The ring file would also be wrong, and this could lead to some of 
>>>> the other errors you saw below.  One thing you may want to look at is the 
>>>> state of your ring from the Riak console using 
>>>> riak_core_ring_manager:get_my_ring(). That might show any problems with 
>>>> the ring. Feel free to send that along so we can take a look at it.
>>>> 
>>>>> when i put data on one of the other machines then only this machine was 
>>>>> using cpu time and none of the others ...
>>>>> if consistent hashing works as expected, then all the machines should 
>>>>> show up on "top"
>>>>> 
>>>>> when i did a mapreduce job then only this machine was using cpu time and 
>>>>> none of the others ...
>>>>> 
>>>>> i had "top" running on all of them.
>>>>> 
>>>>> -------------------------------------------------------
>>>>> the other problem is:
>>>>> 
>>>>> when i have half a million entries in one bucket with less than 100 chars 
>>>>> for each entry
>>>>> and i do a really simple mapreduce job, then it takes forever (15 minutes 
>>>>> ...)
>>>>> while sql takes 0.005 seconds....
>>>>> 
>>>>> i know that doing a mr on a complete bucket takes very long if i 
>>>>> don't specify keys in the bucket. but how should i know which keys to use 
>>>>> ...
>>>> 
>>>> What version of Riak are you using?  There has been a fair amount of 
>>>> improvement to the map reduce system as well as list keys. Are the map 
>>>> reduce jobs you are running javascript?
>>>> 
>>>>> ------------------------------------------------------
>>>>> 
>>>>> if i put stuff in one bucket and add a machine with the join request, how 
>>>>> can i rebalance the bucket???? so that the other machine is taking some 
>>>>> values too.
>>>> 
>>>> This happens automatically. When the new node joins the cluster you should 
>>>> see handoff messages in the erlang.log.X log file.   Rebalancing is 
>>>> handled by the cluster and shouldn't be done manually.
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> ------------------------------------------------------
>>>>> 
>>>>> i don't understand these issues/behaviors (timeout, 15min. etc., 
>>>>> rebalancing), maybe i set one of the three params incorrectly? 
>>>>> i left everything at the default settings.
>>>>> 
>>>>> thx in advance for any hints...
>>>>> 
>>>>> nils
>>>>> _______________________________________________
>>>>> riak-users mailing list
>>>>> [email protected]
>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>> 
>>> 
>>> Nils M. Petersohn
>>> xing.com/profile/Nils_Petersohn
>>> blog.srvme.de
>>> twitter.com/snackycracky
>>> facebook.com/nils.petersohn
>>> myspace.com/electrash
>>> 
>>> [email protected]
>>> 0049 (0)151 40 511 351
>>> skype: nilz_berlin
>>> 
>>> Ebertystr. 47
>>> 10249 Berlin
>>> 
>> 
> 
> 


