IIRC, Riak CS splits each object into ~1 MB blocks, and at the default n_val of 3 each block is stored three times, so you get roughly three keydir entries per MB of object data. So: 300k objects * 9 blocks * 3 replicas = 8.1M entries; at ~150 bytes per entry (to be conservative), that's roughly 1.2 GB. There will be some other static and per-entry overheads, but you should be able to hold considerably more than that in memory before you OOM, even on a single-node system.
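Spelling that out (a back-of-envelope sketch in the Erlang shell; all of the figures are the rough ones from this thread, not measurements):

    %% Rough keydir sizing from the thread's numbers.
    NumObjects = 300000,        %% total S3 PUTs
    BlocksPerObject = 9,        %% ~9 MB objects split into ~1 MB CS blocks
    NVal = 3,                   %% default replication factor
    BytesPerEntry = 150,        %% conservative per-entry keydir cost
    Entries = NumObjects * BlocksPerObject * NVal,   %% 8,100,000
    Entries * BytesPerEntry.                         %% 1,215,000,000 (~1.2 GB)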
Might be best to start again with all of the default files from before you started tuning and retry; there may be a subtle error in your configs that's causing an issue.

On Tue, Aug 20, 2013 at 4:31 PM, Idan Shinberg <[email protected]> wrote:

> Thank you all for your kind and quick answers.
>
> However, even on a 3-node or 5-node cluster we're still seeing memory
> bloat (it just grows much more slowly, as the load is distributed across
> more machines).
>
> It's important to stress that this is a read/append-only cluster: the
> data never expires, and from the moment the cluster is up we keep adding
> data in the form of S3 PUTs (of around 9 MB objects) until we reach
> around 300K PUTs.
>
> This is also why merges don't happen (no stale data).
>
> Has anyone come across this situation in the past?
> Does Riak even fit something like this?
>
> Regards,
>
> Idan Shinberg
> System Architect
> Idomoo Ltd.
>
> Mob +972.54.562.2072
> email [email protected]
> web www.idomoo.com
>
> On Tue, Aug 20, 2013 at 11:32 AM, Erik Søe Sørensen <[email protected]> wrote:
>
>> Your max file size is (far!) less than your small file size threshold,
>> which means that at each merge, *all* of the files will participate in
>> the merge. No wonder you need a lot of simultaneously open files... and
>> long merge times too, of course.
>> Try changing these parameters.
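>> For reference, here's a sketch of the be_blocks section using what I
>> recall to be Bitcask's stock merge settings (double-check them against
>> your release; the point is that small_file_threshold sits far below
>> max_file_size, so only genuinely small files get swept into a merge):
>>
>>     {be_blocks, riak_kv_bitcask_backend, [
>>         {max_file_size, 16#80000000},           %% 2 GB per data file
>>
>>         %% Trigger a merge if any of the following are true:
>>         {frag_merge_trigger, 60},               %% fragmentation >= 60%
>>         {dead_bytes_merge_trigger, 536870912},  %% dead bytes > 512 MB
>>
>>         %% Conditions for a file to be examined during a merge:
>>         {frag_threshold, 40},                   %% fragmentation >= 40%
>>         {dead_bytes_threshold, 134217728},      %% dead bytes > 128 MB
>>         {small_file_threshold, 10485760},       %% file is < 10 MB
>>
>>         {data_root, "/var/lib/riak/bitcask"}
>>     ]}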
>>
>> -------- Original message --------
>> From: Idan Shinberg <[email protected]>
>> Date:
>> To: riak-users <[email protected]>
>> Cc: Arik Katsav <[email protected]>, Assaf Fogel <[email protected]>
>> Subject: Riak Memory Bloat issues with RiakCS/BitCask
>>
>> Hi all,
>>
>> We have a ~300 GB Riak single-node cluster.
>> This seemed to work fine (merging worked well) until an open-files/open-ports
>> limit was reached (we've since raised both to 64K).
>> That error caused a crash that left corrupted hint files. We deleted the
>> hint files (and their corresponding data files) to allow Riak a clean
>> start (no errors upon start).
>>
>> However, merges have not really been working (taking forever to
>> complete) since then, therefore causing:
>>
>> * Huge bloat on disk (the data is around 150K objects of roughly 8 MB
>>   each, but the Riak storage used has already more than quadrupled in
>>   size, to around 1.2 TB)
>> * Huge bloat in memory, which eventually kills Riak itself (OOM killer)
>>
>> We're not doing anything complex, just using Riak and Riak CS to emulate
>> S3 access (and only that) for roughly 15 client writes per minute.
>>
>> Our merge settings (uber-low, but they worked correctly until a few days ago):
>>
>>     {riak_kv, [
>>         %% storage_backend specifies the Erlang module defining the
>>         %% storage mechanism that will be used on this node.
>>         {add_paths, ["/usr/lib64/riak-cs/lib/riak_cs-1.3.1/ebin"]},
>>         {storage_backend, riak_cs_kv_multi_backend},
>>         {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>>         {multi_backend_default, be_default},
>>         {multi_backend, [
>>             {be_default, riak_kv_eleveldb_backend, [
>>                 {max_open_files, 50},
>>                 {data_root, "/var/lib/riak/leveldb"}
>>             ]},
>>             {be_blocks, riak_kv_bitcask_backend, [
>>                 {max_file_size, 16#2000000},          %% 32 MB
>>
>>                 %% Trigger a merge if any of the following are true:
>>                 {frag_merge_trigger, 10},             %% fragmentation >= 10%
>>                 {dead_bytes_merge_trigger, 8388608},  %% dead bytes > 8 MB
>>
>>                 %% Conditions that determine if a file will be
>>                 %% examined during a merge:
>>                 {frag_threshold, 5},                  %% fragmentation >= 5%
>>                 {dead_bytes_threshold, 2097152},      %% dead bytes > 2 MB
>>                 {small_file_threshold, 16#80000000},  %% file is < 2 GB
>>
>>                 {data_root, "/var/lib/riak/bitcask"},
>>                 {log_needs_merge, true}
>>             ]}
>>         ]},
>>
>> As you've noticed, log_needs_merge is set to true and our logs do get
>> filled with needs_merge messages such as this one:
>>
>>     2013-08-19 00:09:49.043 [info] <0.17972.0>
>>     "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728"
>>     needs_merge:
>>     [{"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1153.bitcask.data",[{small_file,20506434}]},
>>      {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1152.bitcask.data",[{small_file,33393237}]},
>>      {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1151.bitcask.data",[{small_file,33123254}]},
>>      {"/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/1150.bitcask.data",[{small_file,32505520}]},
>>      ...]
>>
>> Yet only a single merge happened (and only around 20 minutes after we
>> started putting pressure on Riak):
>>
>>     2013-08-19 00:17:29.456 [info] <0.18964.14> Merged
>>     {["/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/712.bitcask.data",
>>       "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/711.bitcask.data",
>>       "/var/lib/riak/bitcask/388211372416021087647853783690262677096107081728/710.bitcask.data",
>>       ...],...} in 1325.611982 seconds.
>>
>> Is it reasonable for a merge to take more than 20 minutes?
>> Especially given that Riak's memory usage is bloating much faster?
>> Will scaling the cluster from a single node to a 3-node cluster ease
>> the problem?
>>
>> As for the server and usage specs:
>>
>> - Virtual machine with around 8 virtual cores
>> - 12 GB of RAM
>> - 8 TB of storage, composed of 4 x 2 TB disks in RAID 10 (4 TB available storage)
>> - ~150K keys, each several tens of bytes long (using Riak CS for S3 storage)
>> - ~8 MB value size for each key (raw file)
>> - ~22000 open files (mostly hint files) held by Riak
>> - Replication factor of 1
>> - Ring size of 64
>>
>> I'll provide the logs if needed, though I doubt they'll prove useful.
>>
>> Any ideas/advice will be appreciated.
>>
>> Regards,
>>
>> Idan Shinberg
>> System Architect
>> Idomoo Ltd.
>>
>> Mob +972.54.562.2072
>> email [email protected]
>> web www.idomoo.com
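On the merge-time question above: a rough scale check suggests 20+ minutes is about what you'd expect once every file in a partition participates in each merge. A sketch (the throughput figure is an assumption; the rest are the approximate numbers quoted above):

    %% Why a single merge can plausibly run for ~20 minutes here.
    DiskBytes = 1.2e12,                   %% ~1.2 TB currently on disk
    RingSize = 64,
    PerPartition = DiskBytes / RingSize,  %% ~1.9e10 bytes (~19 GB) per partition
    MergeIOBps = 30.0e6,                  %% assumed ~30 MB/s effective merge I/O
    PerPartition / MergeIOBps.            %% ~625 s to stream one partition once

A merge both reads the old files and writes the surviving data back out, so roughly doubling that lands near the 1325 seconds observed; with the current thresholds, nearly the whole partition is rewritten every time.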
