Your max_file_size (32 MB) is far smaller than your small_file_threshold
(2 GB), which means that at each merge *all* of the files will participate in
the merge. No wonder you need a lot of simultaneously open files... and long
merge times too, of course.
Try changing these parameters so that small_file_threshold is well below
max_file_size.
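
As a rough sketch of what I mean (untested against your cluster, so treat the
numbers as assumptions): keep your 32 MB max_file_size and drop
small_file_threshold to 10 MB, which is the Bitcask default as far as I recall.
Only files smaller than the threshold are then pulled into a merge for being
"small"; full 32 MB files are merged only when they cross the fragmentation or
dead-bytes thresholds.

```erlang
{be_blocks, riak_kv_bitcask_backend, [
    %% Unchanged from the posted config: data files roll over at 32 MB.
    {max_file_size, 16#2000000},

    %% Assumed value (10 MB), chosen to sit well below max_file_size so that
    %% full-sized data files no longer qualify as "small" on every merge.
    {small_file_threshold, 10485760},

    %% Remaining merge triggers and thresholds as in the posted config.
    {data_root, "/var/lib/riak/bitcask"},
    {log_needs_merge, true}
]}
```

With log_needs_merge still on, the needs_merge log lines should stop listing
every ~32 MB file with a small_file reason once the node is restarted with the
new setting.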



-------- Original message --------
From: Idan Shinberg <[email protected]>
Date:
To: riak-users <[email protected]>
Cc: Arik Katsav <[email protected]>,Assaf Fogel <[email protected]>
Subject: Riak Memory Bloat issues with RiakCS/BitCask


Hi all,

We have a ~300GB single-node Riak cluster.
It seemed to work fine (merging worked well) until an open-files/open-ports
limit was reached (since then, we've raised both to 64K).
That error caused a crash that left corrupted hint files. We deleted the hint
files (and their corresponding data files) to allow Riak a clean start (no
errors on startup).

However, merges have not really been working (they take forever to complete)
since then, causing:

 *   Huge bloat on disk (the data is around 150K objects of roughly 8MB each,
but the storage used by Riak has already more than quadrupled, to around 1.2 TB)
 *   Huge bloat in memory, which eventually kills Riak itself (OOM killer)

We're not doing anything complex, just using Riak and Riak CS to emulate S3
access (and only that), with roughly 15 client writes per minute.

Our merge settings (very low, but they worked correctly up until a few days
ago):

 {riak_kv, [
            %% Storage_backend specifies the Erlang module defining the storage
            %% mechanism that will be used on this node.
                {add_paths, ["/usr/lib64/riak-cs/lib/riak_cs-1.3.1/ebin"]},
                {storage_backend, riak_cs_kv_multi_backend},
                {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
                {multi_backend_default, be_default},
                {multi_backend, [
                    {be_default, riak_kv_eleveldb_backend, [
                        {max_open_files, 50},
                        {data_root, "/var/lib/riak/leveldb"}
                    ]},
                    {be_blocks, riak_kv_bitcask_backend, [

                        {max_file_size, 16#2000000}, %% 32MB

                        %% Trigger a merge if any of the following are true:
                        {frag_merge_trigger, 10}, %% fragmentation >= 10%
                        {dead_bytes_merge_trigger, 8388608}, %% dead bytes > 8 MB

                        %% Conditions that determine if a file will be examined during a merge:
                        {frag_threshold, 5}, %% fragmentation >= 5%
                        {dead_bytes_threshold, 2097152}, %% dead bytes > 2 MB
                        {small_file_threshold, 16#80000000}, %% file is < 2GB

                        {data_root, "/var/lib/riak/bitcask"},
                        {log_needs_merge, true}


                    ]}
                ]},

As you can see, log_needs_merge is set to true, and our logs do fill up with
needs_merge messages such as this one:

,{"/var/l...",...},...]
2013-08-19 00:09:49.043 [info] <0.17972.0> 
"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728" 
needs_merge: 
[{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1153.bitcask.data",[{small_file,20506434}]},{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1152.bitcask.data",[{small_file,33393237}]},{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1151.bitcask.data",[{small_file,33123254}]},{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1150.bitcask.data",[{small_file,32505520}]},{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1149.
...
...
...

Yet only a single merge happened (and only around 20 minutes after we started
putting load on Riak):

2013-08-19 00:17:29.456 [info] <0.18964.14> Merged 
{["/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/712.bitcask.data","/var/lib/riak/
bitcask/388211372416021087647853783690262677096107081728/711.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/710.bitcask
.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/709.bitcask.data","/var/lib/riak/bitcask/38821137241602108764785378369026267709
6107081728/708.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/707.bitcask.data","/var/lib/riak/bitcask/388211
...
...
...
var/lib/riak/bitc
ask/388211372416021087647853783690262677096107081728/697.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/696.bitcask.dat
a","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/695.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107
081728/694.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/693.bitcask.data","/var/lib/riak/bitcask/38821137241602108764
7853783690262677096107081728/692.bitcask.data","/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/691.bitcask.data","/var/lib/riak/bitcas
k/38821137241602108...",...],...} in 1325.611982 seconds.

Is it reasonable for a merge to take more than 20 minutes?
Especially given that Riak's memory usage is growing much faster?
Will scaling the cluster from a single node to a 3-node cluster ease the
problem?

As for the server and usage specs:

- Virtual machine with around 8 virtual cores
- 12 GB of RAM
- 8 TB of storage composed of 4 x 2TB disks in RAID 10 (4TB available storage)
- ~150 keys, each a few tens of bytes long (using Riak CS for S3 storage)
- ~8MB value size for each key (raw file)
- ~22000 files held open by Riak (mostly hint files)
- Replication factor of 1
- Ring size is 64

I'll provide the logs if needed, though I doubt they'll prove useful.

Any ideas or advice will be appreciated.


Regards,

Idan Shinberg


System Architect

Idomoo Ltd.



Mob +972.54.562.2072

email [email protected]

web www.idomoo.com


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
