Hi Patrick,

Did you restart Riak after changing the configuration option in app.config?
Also, where in the config file did you add the option? Would it be possible
to see your app.config file? How many of the file descriptors listed by lsof
reference the "data/leveldb" directory?
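
For reference, max_open_files belongs inside the eleveldb section of
app.config; a minimal sketch (the data_root path and the value shown here are
just placeholders):

  {eleveldb, [
              {data_root, "data/leveldb"},
              {max_open_files, 20}
             ]},

A quick way to get that count is something along the lines of
lsof -u riak | grep data/leveldb | wc -l.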

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
[email protected]


On Mon, Sep 26, 2011 at 12:23 PM, Patrick Van Stee
<[email protected]> wrote:

> Good suggestion. Erlang only has 39 ports open. I initially tried
> increasing max_open_files to a huge number just to see what would happen,
> but there was no noticeable difference in performance. Also, lsof -u riak
> shows over 12000 fds open even after limiting max_open_files back to 20,
> and the number continues to grow until it hits the system limit, which then
> causes that "Accept failed error". I'm ok with increasing the max_open_files
> limit, but it doesn't really seem to improve performance, at least in my
> case. Also, even when I try to limit the number of file descriptors with
> max_open_files, riak still opens new ones until it crashes.
>
> On Sep 26, 2011, at 2:54 PM, Jon Meredith wrote:
>
> Hi Patrick,
>
> I suggested increasing ports as you had an emfile on a socket accept call.
> Erlang uses ports for things like network sockets and file handles when
> opened by *erlang* processes.  However, the leveldb library manages its own
> file handles, as it is a C++ library dynamically loaded by the emulator, so
> they don't count towards ports.
>
> Is it possible that Riak started getting more client load as request
> latency increased?  Lowering max_open_files will keep the number of
> process-level file handles down, but will cause more opening and closing of
> files when searching them.  If you have a nice modern OS with lots of file
> handles available, you may be able to raise max_open_files for better
> performance.
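>
> If you're not sure what open-file limit the node itself ended up with, one
> rough way to check from the riak console is to shell out, since the child
> shell inherits the beam's limit; just a sketch:
>
>   %% prints the open-file soft limit inherited from the beam process
>   os:cmd("ulimit -n").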
>
> If you want to check how many ports you are using, you can run this from
> the riak console:
>
>   ([email protected])7> length(erlang:ports()).
>   39
>
> Try increasing max_open_files, check how many file handles are in use with
> a tool like lsof, and check the number of ports you have opened.
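>
> As a rough cross-check (assuming lsof is installed and the default
> data/leveldb directory), you can count this beam process's leveldb
> descriptors from the same console:
>
>   %% OS-level fds of this beam process that reference the leveldb data dir
>   os:cmd("lsof -p " ++ os:getpid() ++ " | grep -c data/leveldb").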
>
> Cheers, Jon.
>
> On Mon, Sep 26, 2011 at 12:44 PM, Patrick Van Stee <[email protected]> wrote:
>
>> Thanks for the quick response, Jon. I bumped it from 4096 up to the max I
>> have set in /etc/riak/defaults, and writes actually slowed down a little
>> bit (~10 fewer writes per second). Shouldn't the max_open_files setting
>> keep the total number of fds pretty low? Maybe I'm misunderstanding what
>> that option is used for.
>>
>> Patrick
>>
>> On Sep 26, 2011, at 2:34 PM, Jon Meredith wrote:
>>
>> Hi Patrick,
>>
>> You may be running out of ports, which erlang uses for TCP sockets; try
>> increasing ERL_MAX_PORTS in etc/vm.args.
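>>
>> In vm.args that is typically set with an -env entry, something like this
>> (the number here is just an example):
>>
>>   -env ERL_MAX_PORTS 16384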
>>
>> Cheers, Jon
>> Basho Technologies.
>>
>> On Mon, Sep 26, 2011 at 12:17 PM, Patrick Van Stee <[email protected]> wrote:
>>
>>> We're running a small, 2-node riak cluster (on 2 m1.large boxes) using
>>> the LevelDB backend and have been trying to write ~250 keys a second to
>>> it. With a small dataset everything was running smoothly. However, after
>>> storing several hundred thousand keys, some performance issues started to
>>> show up.
>>>
>>> * We're running out of file descriptors, which is causing nodes to crash
>>> with the following error:
>>>
>>> 2011-09-24 00:23:52.097 [error] <0.110.0> CRASH REPORT Process [] with 0
>>> neighbours crashed with reason: {error,accept_failed}
>>> 2011-09-24 00:23:52.098 [error] <0.121.0> application: mochiweb, "Accept
>>> failed error", "{error,emfile}
>>>
>>> Setting the max_open_files limit in the app.config doesn't seem to help.
>>>
>>> * Writes have slowed down by an order of magnitude. I even set the n_val,
>>> w, and dw bucket properties to 1 without any noticeable difference. We
>>> also switched to using protocol buffers to make sure there wasn't any
>>> extra overhead from using HTTP.
>>>
>>> * Running map reduce jobs that use a range query on a secondary index
>>> started returning an error, {"error":"map_reduce_error"}, once our dataset
>>> increased in size. Feeding a list of keys works fine, but querying the index
>>> for keys seems to be timing out:
>>>
>>> 2011-09-26 16:37:57.192 [error] <0.136.0> Supervisor
>>> riak_pipe_fitting_sup had child undefined started with
>>> {riak_pipe_fitting,start_link,undefined} at <0.3497.0> exit with reason
>>> {timeout,{gen_server,call,[{riak_pipe_vnode_master,'[email protected]
>>> '},{return_vnode,{'riak_vnode_req_v1',502391187832497878132516661246222288006726811648,{raw,#Ref<0.0.1.88700>,<0.3500.0>},{cmd_enqueue,{fitting,<0.3499.0>,#Ref<0.0.1.88700>,#Fun<riak_kv_mrc_pipe.0.133305895>,#Fun<riak_kv_mrc_pipe.1.125635227>},{<<"ip_queries">>,<<"uaukXZn5rZQ0LrSED3pi-fE-JjU">>},infinity,[{502391187832497878132516661246222288006726811648,'
>>> [email protected]'}]}}}]}} in context child_terminated
>>>
>>> Is anyone familiar with these problems, or is there anything else I can
>>> try to improve performance when using LevelDB?
>>>
>>
>>
>>
>
>
>
>
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
