Just an update.

I ran the benchmark again, this time using the Memory backend:
https://dl.dropbox.com/u/308392/memory_summary.png

This was the result using Bitcask backend:
https://dl.dropbox.com/u/308392/bitcask_summary.png

The difference is not that big in my environment. I was expecting much
better results, and I don't know whether this is how it is supposed to behave.
Anyway, your results are still much better, even when I use the memory-only
backend (I get about 50% of yours).

Maybe this helps to understand what is happening.

On Sat, Nov 3, 2012 at 7:31 PM, Uruka Dark <urukad...@gmail.com> wrote:

> Jared,
>
> Again, thank you very much.
> You helped me a lot.
>
> I perfectly understand your point. I'm just starting to get to know Riak and I
> want to go much deeper. But before I keep going, I want to make sure that I'm
> starting off on the right foot :)
> I double- and triple-checked, and I still have no additional clues about what is
> happening.
>
> You've reached much better results than mine using your default settings,
> and, given my numbers, I'm still missing something. I would like at least
> to get closer to your results. If you think that I won't do any better
> than this with my default settings, please let me know.
>
> Anyway, this is my app.config:
>
> %% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*-
> %% ex: ft=erlang ts=4 sw=4 et
> [
>  %% Riak Client APIs config
>  {riak_api, [
>             %% pb_backlog is the maximum length to which the queue of pending
>             %% connections may grow. If set, it must be an integer >= 0.
>             %% By default the value is 5. If you anticipate a huge number of
>             %% connections being initialised *simultaneously*, set this number
>             %% higher.
>             %% {pb_backlog, 64},
>
>             %% pb_ip is the IP address that the Riak Protocol Buffers interface
>             %% will bind to.  If this is undefined, the interface will not run.
>             {pb_ip,   "10.1.1.221" },
>
>             %% pb_port is the TCP port that the Riak Protocol Buffers interface
>             %% will bind to
>             {pb_port, 8087 }
>             ]},
>
>  %% Riak Core config
>  {riak_core, [
>               %% Default location of ringstate
>               {ring_state_dir, "/var/lib/riak/ring"},
>
>               %% Default ring creation size.  Make sure it is a power of 2,
>               %% e.g. 16, 32, 64, 128, 256, 512 etc
>               %{ring_creation_size, 64},
>
>               %% http is a list of IP addresses and TCP ports that the Riak
>               %% HTTP interface will bind.
>               {http, [ {"10.1.1.221", 8098 } ]},
>
>               %% https is a list of IP addresses and TCP ports that the Riak
>               %% HTTPS interface will bind.
>               {https, [{ "10.1.1.221", 8069 }]},
>
>               %% Default cert and key locations for https can be overridden
>               %% with the ssl config variable, for example:
>               {ssl, [
>                      {certfile, "/etc/riak/server.crt"},
>                      {keyfile, "/etc/riak/server.key"}
>                     ]},
>
>               %% riak_handoff_port is the TCP port that Riak uses for
>               %% intra-cluster data handoff.
>               {handoff_port, 8099 },
>
>               %% To encrypt riak_core intra-cluster data handoff traffic,
>               %% uncomment the following line and edit its path to an
>               %% appropriate certfile and keyfile.  (This example uses a
>               %% single file with both items concatenated together.)
>               %{handoff_ssl_options, [{certfile, "/tmp/erlserver.pem"}]},
>
>               %% DTrace support
>               %% Do not enable 'dtrace_support' unless your Erlang/OTP
>               %% runtime is compiled to support DTrace.  DTrace is
>               %% available in R15B01 (supported by the Erlang/OTP
>               %% official source package) and in R14B04 via a custom
>               %% source repository & branch.
>               {dtrace_support, false},
>
>               %% Platform-specific installation paths (substituted by rebar)
>               {platform_bin_dir, "/usr/sbin"},
>               {platform_data_dir, "/var/lib/riak"},
>               {platform_etc_dir, "/etc/riak"},
>               {platform_lib_dir, "/usr/lib/riak/lib"},
>               {platform_log_dir, "/var/log/riak"}
>              ]},
>
>  %% Riak KV config
>  {riak_kv, [
>             %% Storage_backend specifies the Erlang module defining the storage
>             %% mechanism that will be used on this node.
>             %{storage_backend, riak_kv_memory_backend},
>             {storage_backend, riak_kv_bitcask_backend},
>             %{storage_backend, riak_kv_eleveldb_backend},
>
>             %% raw_name is the first part of all URLS used by the Riak raw HTTP
>             %% interface.  See riak_web.erl and raw_http_resource.erl for
>             %% details.
>             %{raw_name, "riak"},
>
>             %% mapred_name is URL used to submit map/reduce requests to Riak.
>             {mapred_name, "mapred"},
>
>             %% mapred_system indicates which version of the MapReduce
>             %% system should be used: 'pipe' means riak_pipe will
>             %% power MapReduce queries, while 'legacy' means that luke
>             %% will be used
>             {mapred_system, pipe},
>
>             %% mapred_2i_pipe indicates whether secondary-index
>             %% MapReduce inputs are queued in parallel via their own
>             %% pipe ('true'), or serially via a helper process
>             %% ('false' or undefined).  Set to 'false' or leave
>             %% undefined during a rolling upgrade from 1.0.
>             {mapred_2i_pipe, true},
>
>             %% directory used to store a transient queue for pending
>             %% map tasks
>             %% Only valid when mapred_system == legacy
>             %% {mapred_queue_dir, "/var/lib/riak/mr_queue" },
>
>             %% Each of the following entries control how many Javascript
>             %% virtual machines are available for executing map, reduce,
>             %% pre- and post-commit hook functions.
>             {map_js_vm_count, 8 },
>             {reduce_js_vm_count, 6 },
>             {hook_js_vm_count, 2 },
>
>             %% Number of items the mapper will fetch in one request.
>             %% Larger values can impact read/write performance for
>             %% non-MapReduce requests.
>             %% Only valid when mapred_system == legacy
>             %% {mapper_batch_size, 5},
>
>             %% js_max_vm_mem is the maximum amount of memory, in megabytes,
>             %% allocated to the Javascript VMs. If unset, the default is
>             %% 8MB.
>             {js_max_vm_mem, 8},
>
>             %% js_thread_stack is the maximum amount of thread stack, in megabytes,
>             %% allocated to the Javascript VMs. If unset, the default is 16MB.
>             %% NOTE: This is not the same as the C thread stack.
>             {js_thread_stack, 16},
>
>             %% Number of objects held in the MapReduce cache. These will be
>             %% ejected when the cache runs out of room or the bucket/key
>             %% pair for that entry changes
>             %% Only valid when mapred_system == legacy
>             %% {map_cache_size, 10000},
>
>             %% js_source_dir should point to a directory containing Javascript
>             %% source files which will be loaded by Riak when it initializes
>             %% Javascript VMs.
>             %{js_source_dir, "/tmp/js_source"},
>
>             %% http_url_encoding determines how Riak treats URL encoded
>             %% buckets, keys, and links over the REST API. When set to 'on'
>             %% Riak always decodes encoded values sent as URLs and Headers.
>             %% Otherwise, Riak defaults to compatibility mode where links
>             %% are decoded, but buckets and keys are not. The compatibility
>             %% mode will be removed in a future release.
>             {http_url_encoding, on},
>
>             %% Switch to vnode-based vclocks rather than client ids.  This
>             %% significantly reduces the number of vclock entries.
>             %% Only set true if *all* nodes in the cluster are upgraded to 1.0
>             {vnode_vclocks, true},
>
>             %% This option enables compatibility of bucket and key listing
>             %% with 0.14 and earlier versions. Once a rolling upgrade to
>             %% a version > 0.14 is completed for a cluster, this should be
>             %% set to false for improved performance for bucket and key
>             %% listing operations.
>             {legacy_keylisting, false},
>
>             %% This option toggles compatibility of keylisting with 1.0
>             %% and earlier versions.  Once a rolling upgrade to a version
>             %% > 1.0 is completed for a cluster, this should be set to
>             %% true for better control of memory usage during key listing
>             %% operations
>             {listkeys_backpressure, true}
>            ]},
>
>  %% Riak Search Config
>  {riak_search, [
>                 %% To enable Search functionality set this 'true'.
>                 {enabled, false}
>                ]},
>
>  %% Merge Index Config
>  {merge_index, [
>                 %% The root dir to store search merge_index data
>                 {data_root, "/var/lib/riak/merge_index"},
>
>                 %% Size, in bytes, of the in-memory buffer.  When this
>                 %% threshold has been reached the data is transformed
>                 %% into a segment file which resides on disk.
>                 {buffer_rollover_size, 1048576},
>
>                 %% Overtime the segment files need to be compacted.
>                 %% This is the maximum number of segments that will be
>                 %% compacted at once.  A lower value will lead to
>                 %% quicker but more frequent compactions.
>                 {max_compact_segments, 20}
>                ]},
>
>  %% Bitcask Config
>  {bitcask, [
>              {data_root, "/var/lib/riak/bitcask"}
>            ]},
>
>  %% eLevelDB Config
>  {eleveldb, [
>              {data_root, "/var/lib/riak/leveldb"},
>              {write_buffer_size_min, 31457280}, %% 30 MB in bytes
>              {write_buffer_size_max, 62914560} %% 60 MB in bytes
>             ]},
>
>  %% Lager Config
>  {lager, [
>             %% What handlers to install with what arguments
>             %% The defaults for the logfiles are to rotate the files when
>             %% they reach 10Mb or at midnight, whichever comes first, and keep
>             %% the last 5 rotations. See the lager README for a description of
>             %% the time rotation format:
>             %% https://github.com/basho/lager/blob/master/README.org
>             %%
>             %% If you wish to disable rotation, you can either set the size to 0
>             %% and the rotation time to "", or instead specify a 2-tuple that only
>             %% consists of {Logfile, Level}.
>             {handlers, [
>                 {lager_console_backend, info},
>                 {lager_file_backend, [
>                     {"/var/log/riak/error.log", error, 10485760, "$D0", 5},
>                     {"/var/log/riak/console.log", info, 10485760, "$D0", 5}
>                 ]}
>             ]},
>
>             %% Whether to write a crash log, and where.
>             %% Commented/omitted/undefined means no crash logger.
>             {crash_log, "/var/log/riak/crash.log"},
>
>             %% Maximum size in bytes of events in the crash log - defaults to 65536
>             {crash_log_msg_size, 65536},
>
>             %% Maximum size of the crash log in bytes, before it's rotated, set
>             %% to 0 to disable rotation - default is 0
>             {crash_log_size, 10485760},
>
>             %% What time to rotate the crash log - default is no time
>             %% rotation. See the lager README for a description of this format:
>             %% https://github.com/basho/lager/blob/master/README.org
>             {crash_log_date, "$D0"},
>
>             %% Number of rotated crash logs to keep, 0 means keep only the
>             %% current one - default is 0
>             {crash_log_count, 5},
>
>             %% Whether to redirect error_logger messages into lager - defaults to true
>             {error_logger_redirect, true}
>         ]},
>
>  %% riak_sysmon config
>  {riak_sysmon, [
>          %% To disable forwarding events of a particular type, use a
>          %% limit of 0.
>          {process_limit, 30},
>          {port_limit, 2},
>
>          %% Finding reasonable limits for a given workload is a matter
>          %% of experimentation.
>          {gc_ms_limit, 100},
>          {heap_word_limit, 40111000},
>
>          %% Configure the following items to 'false' to disable logging
>          %% of that event type.
>          {busy_port, true},
>          {busy_dist_port, true}
>         ]},
>
>  %% SASL config
>  {sasl, [
>          {sasl_error_logger, false}
>         ]},
>
>  %% riak_control config
>  {riak_control, [
>                 %% Set to false to disable the admin panel.
>                 {enabled, true},
>
>                 %% Authentication style used for access to the admin
>                 %% panel. Valid styles are 'userlist' <TODO>.
>                 {auth, none},
>
>                 %% If auth is set to 'userlist' then this is the
>                 %% list of usernames and passwords for access to the
>                 %% admin panel.
>                 {userlist, [{"user", "pass"}
>                            ]},
>
>                 %% The admin panel is broken up into multiple
>                 %% components, each of which is enabled or disabled
>                 %% by one of these settings.
>                 {admin, true}
>                 ]}
> ].
>
> ------
> I have two machines with those settings: 10.1.1.221 and 10.1.1.222. They
> are working together.
> Do you see any problem with that?
>
> Again, if you think I can't go any further with those default settings
> (without tuning the FS, etc.), please let me know.
>
> Thank you.
>
> On Sat, Nov 3, 2012 at 4:43 PM, Jared Morrow <ja...@basho.com> wrote:
>
>> Uruka,
>>
>> Now that you have somewhat reasonable numbers, it is probably time to
>> discuss what you are trying to get out of Riak.  We typically recommend 4
>> or 5 nodes minimum for a Riak install because that is the point where the
>> distribution becomes a performance benefit rather than a hindrance.  I know
>> you were just load testing, but I'd recommend considering a test with 4 or
>> 5 nodes, with default N values.  During the test, remove a node (power it
>> off, or 'riak stop' it).  Or, like someone else mentioned, start with a 3 or
>> 4 node cluster and add a node to see how the performance goes up and that no
>> further operations work is needed to rebalance the data around the cluster.
>> This is really where Riak shines over some alternative databases: the ease
>> of scaling and dealing with failures.  Single node performance, although fun
>> to try and tune to get the most out of it, isn't as interesting on a long
>> timeline when trying to scale the system.  Obviously single node
>> performance is still important, don't get me wrong.  Riak isn't always the
>> best choice, but when it comes to staying available and performing while
>> systems are failing, no other system has a better real-world story than Riak.
>>
>> If you still want to get your single node performance up, we have several
>> pages in our docs about tuning.  A good place to start is the
>> file system tuning page
>> http://docs.basho.com/riak/latest/cookbooks/File-System-Tuning/ .
>> Reading that and other pages in the Operations section might be helpful in
>> squeezing out those last bits of speed.
>>
>> I am glad to see your initial 60 writes/sec has gone up to 800
>> writes/sec, but we definitely can do better once you start utilizing our
>> strengths.
>>
>> Hope my rambling helped,
>> -Jared
>>
>> On Sat, Nov 3, 2012 at 4:55 AM, Uruka Dark <urukad...@gmail.com> wrote:
>>
>>> Jared,
>>>
>>> Thank you for your time and reply.
>>>
>>> I was impressed by your numbers and started to double-check my
>>> settings. I found a big problem here: my third machine (the one outside the
>>> cluster, generating the load) was not talking to Riak at gigabit speed; it was
>>> at 100 Mb/s. I changed the network cable and it's working fine now.
>>> I ran my Python script again and could already see better results: 252
>>> ops/sec (before the fix it was 175 ops/sec).
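
A rough back-of-the-envelope check of what the link speed alone allows may
help here. This is only a sketch under loud assumptions: 10 KB values (taken
from the basho_bench run quoted further down, not from the Python script,
whose value size is not stated in the thread), 1 KB = 1024 bytes, and zero
protocol overhead. The function names are made up for illustration.

```python
# Upper bound on puts/sec imposed by the client's network link alone.
# Assumptions (not from the thread): 10 KB values, 1 KB = 1024 bytes,
# decimal megabits, and no protocol overhead, so real limits are lower.

VALUE_BYTES = 10 * 1024


def max_ops_per_sec(link_mbit: float, value_bytes: int = VALUE_BYTES) -> float:
    """Most puts/sec a link of `link_mbit` Mbit/s could carry."""
    bytes_per_sec = link_mbit * 1_000_000 / 8
    return bytes_per_sec / value_bytes


def required_mbit(ops_per_sec: float, value_bytes: int = VALUE_BYTES) -> float:
    """Payload bandwidth in Mbit/s needed to sustain `ops_per_sec`."""
    return ops_per_sec * value_bytes * 8 / 1_000_000


for link in (100, 1000):
    print(f"{link} Mbit/s link: at most {max_ops_per_sec(link):.0f} puts/sec")

for rate in (175, 252, 2000):
    print(f"{rate} ops/sec needs about {required_mbit(rate):.1f} Mbit/s")
```

Under these assumptions, 2000 ops/sec of 10 KB values needs more payload
bandwidth than a 100 Mb/s link can carry, so a comparison against that figure
only makes sense once the load generator is on gigabit.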
>>>
>>> I also ran your benchmark .config, and these are my numbers:
>>> https://dl.dropbox.com/u/308392/summary.png
>>>
>>> As you can see, even so, I'm still far from your results, not even
>>> close, and now I'm using Bitcask.
>>> Anyway, my current position is much better than at the beginning. I'll
>>> double-check everything again, because now I have confirmation that there is
>>> something wrong.
>>>
>>> If you have any suggestions, please let me know.
>>> Once again, thank you.
>>>
>>> On Sat, Nov 3, 2012 at 3:08 AM, Jared Morrow <ja...@basho.com> wrote:
>>>
>>>> I forgot to mention that 2000 ops/sec was on bitcask, not memory.  I
>>>> didn't bother with the memory backend.
>>>>
>>>> -Jared
>>>>
>>>>
>>>> On Sat, Nov 3, 2012 at 12:05 AM, Jared Morrow <ja...@basho.com> wrote:
>>>>
>>>>> Uruka,
>>>>>
>>>>> So looking at your results, something is really wrong with your setup.
>>>>> I was surprised by your numbers, so I made two VMs, each with only 1 GB of
>>>>> RAM, on two different boxes, also on a 1 Gb switch.
>>>>>
>>>>> I ran a put of 100,000 keys at 10 KB in size.
>>>>>
>>>>> I didn't do any tuning at all on the VMs, and these were quick Ubuntu
>>>>> 10.04 VMs with 2 virtual CPUs and 1 GB of RAM.  I also didn't change any
>>>>> settings in Riak, except for the IP address and listening ports.
>>>>>
>>>>> Here is the summary of the results showing around 2000 ops/sec
>>>>> https://dl.dropbox.com/u/183971/summary.png
>>>>>
>>>>> So my main thought is that you weren't actually using N=1 for your
>>>>> puts and you were using the default N value of 3, meaning you were writing
>>>>> each key/value 3 times, and with 2 nodes this is doing a lot of writes to
>>>>> the same disk multiple times.
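
To put numbers on that point, here is a small sketch of the write volume for
the benchmark described above (100,000 keys at 10 KB). It assumes 1 KB = 1024
bytes and counts replica payload only; backend overhead and metadata are
ignored, and the helper names are invented for illustration.

```python
# Write volume for 100,000 puts of 10 KB values at different n_val settings.
# Assumptions (not from the thread): 1 KB = 1024 bytes; only replica payload
# is counted, with no Bitcask or metadata overhead.

KEYS = 100_000
VALUE_BYTES = 10 * 1024


def total_bytes_written(n_val: int) -> int:
    """Bytes the whole cluster has to persist for the full run."""
    return KEYS * VALUE_BYTES * n_val


def min_copies_on_busiest_node(n_val: int, nodes: int) -> int:
    """Pigeonhole lower bound: copies of one key that must share a machine."""
    return -(-n_val // nodes)  # ceiling division


for n_val in (1, 3):
    gib = total_bytes_written(n_val) / 2**30
    shared = min_copies_on_busiest_node(n_val, nodes=2)
    print(f"n_val={n_val}: {gib:.2f} GiB written in total, "
          f"at least {shared} replica(s) of each key on one of the 2 machines")
```

With n_val=3 on two machines, the cluster writes three times the data, and at
least one machine stores two replicas of every key.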
>>>>>
>>>>> To be sure you have N=1, you can use 'riak attach' on each node and
>>>>> enter the following command:
>>>>>
>>>>> riak_core_bucket:set_bucket(<<"pop1">>,[{n_val,1}]).
>>>>>
>>>>>
>>>>> That assumes your bucket name is "pop1", as in my case.  The name is
>>>>> completely arbitrary.
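
For anyone who prefers to check the setting from the load-generating machine,
the sketch below reads and sets n_val over HTTP instead of through 'riak
attach'. It is a sketch, not a tested recipe: it assumes the HTTP listener
from the app.config above (10.1.1.221:8098), the default raw_name "riak", and
that the bucket URL returns JSON shaped like {"props": {"n_val": ...}} and
accepts a PUT of the same shape. The function names are hypothetical.

```python
import json
import urllib.request

# Assumptions (verify against your Riak version): bucket properties live at
# http://<host>:<port>/<raw_name>/<bucket> and look like {"props": {"n_val": N}}.


def bucket_url(host: str, port: int, bucket: str, raw_name: str = "riak") -> str:
    """URL of the bucket resource on the raw HTTP interface."""
    return f"http://{host}:{port}/{raw_name}/{bucket}"


def parse_n_val(body: str) -> int:
    """Extract n_val from a bucket-properties JSON document."""
    return json.loads(body)["props"]["n_val"]


def build_set_n_val_request(host: str, port: int, bucket: str,
                            n_val: int) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets n_val on a bucket."""
    payload = json.dumps({"props": {"n_val": n_val}}).encode("utf-8")
    return urllib.request.Request(
        bucket_url(host, port, bucket),
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def fetch_n_val(host: str, port: int, bucket: str) -> int:
    """Read the current n_val of a bucket from a running node."""
    with urllib.request.urlopen(bucket_url(host, port, bucket), timeout=5) as resp:
        return parse_n_val(resp.read().decode("utf-8"))


# Example against a live node (not executed here):
#   urllib.request.urlopen(build_set_n_val_request("10.1.1.221", 8098, "pop1", 1))
#   print(fetch_n_val("10.1.1.221", 8098, "pop1"))
```

Building the request separately from sending it makes it easy to inspect what
would go over the wire before touching the cluster.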
>>>>>
>>>>> Sorry I'm late to this thread, I had to find some time to set up the
>>>>> test.
>>>>>
>>>>> For reference, I used https://github.com/basho/basho_bench for the
>>>>> benchmark, with the following .config file:
>>>>> https://gist.github.com/e630b63f4a025a0fb634
>>>>>
>>>>> Hope this helps,
>>>>> Jared
>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com