Idan,

I'll investigate this and see whether I can replicate the behavior, and hopefully get back to you with more information. Thanks for sharing the info.
Kelly

On Wed, May 22, 2013 at 3:23 AM, Idan Shinberg <[email protected]> wrote:

> Hey Kelly
>
> Thanks for getting back to me ...
>
> You were right to bring up the point - these settings were indeed
> applied gradually.
>
> I have thus started from scratch with the same settings mentioned above
> in place.
>
> I made 3 batches of 48 uploads of the same 32 MB files to 48 different
> keys in S3.
> I wound up with 48 keys in S3 (uploads overwrote old data), each
> 32 MB in size, for a total of 144 uploads.
>
> BTW, I also forgot to mention n_val is set to 1 in default_bucket_props.
> The Bitcask dir was around 5.5 GB, and after merges kicked in it shrank
> to 3.4 GB.
>
> Still, the actual data-set size should be 48 x 32 MB, which is 1.5 GB.
> I also noticed that each time I upload a file, 2x its size is
> automatically used, and I'm guessing that's related :-)
>
> The single Riak node is running on CentOS 6.3 with the 1.3.1 packaged
> version...
>
>
> Thanks
>
> Idan Shinberg
> idomoo
>
>
> On Wed, May 22, 2013 at 2:26 AM, Kelly McLaughlin <[email protected]> wrote:
>
>> Idan,
>>
>> Bitcask can sometimes be slow to reclaim space after deleting objects
>> from Riak CS. Are the settings you included the settings that have been in
>> place during all of your uploads and deletions? I am surprised that just a
>> few tens of uploads of 32 MB objects used up 15 GB of space. Can you be
>> more specific on a count of uploads? Also do you have any error output in
>> the riak or riak cs log files that may be related? Finally, which packages
>> are you using for your testing?
>>
>> Kelly
>>
>>
>> On Tue, May 21, 2013 at 2:18 PM, Idan Shinberg
>> <[email protected]> wrote:
>>
>>> Thus, I fear Riak never treats their data as "dead-bytes" and they
>>> never get merged.
>>>
>>> I created 2 buckets using s3cmd and made several tens of uploads of
>>> 32 MB files, deleting them right afterwards (with proper s3cmd
>>> commands, of course).
>>>
>>> I ended up with no buckets and no keys in my riak s3 database;
>>> however, the 64 partitions under /var/lib/riak/bitcask/ now occupy 15 GB
>>> worth of space.
>>>
>>> Several riak restarts did not trigger any merges, and my merge settings
>>> are set to impose very tough merge-triggering criteria, so I'm guessing
>>> the only reason the data is not being cleared is the fact that it's still
>>> in use ...
>>>
>>> Relevant riak-cs config:
>>>
>>>     %% == Garbage Collection ==
>>>
>>>     %% The number of seconds to retain the block
>>>     %% for an object after it has been deleted.
>>>     %% This leeway time is set to give the delete
>>>     %% indication time to propagate to all replicas.
>>>     %% 86400 is 24-hours.
>>>     {leeway_seconds, 30},
>>>
>>>     %% How often the garbage collection daemon
>>>     %% waits in-between gc batches.
>>>     %% 900 is 15-minutes.
>>>     {gc_interval, 60},
>>>
>>>     %% How long a move to the garbage
>>>     %% collection to do list can remain
>>>     %% failed, before we retry it.
>>>     %% 21600 is 6-hours.
>>>     {gc_retry_interval,300},
>>>
>>>
>>> Relevant Riak config:
>>>
>>>     {riak_kv, [
>>>         %% Storage_backend specifies the Erlang module defining the storage
>>>         %% mechanism that will be used on this node.
>>>         {add_paths, ["/usr/lib64/riak-cs/lib/riak_cs-1.3.1/ebin"]},
>>>         {storage_backend, riak_cs_kv_multi_backend},
>>>         {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>>>         {multi_backend_default, be_default},
>>>         {multi_backend, [
>>>             {be_default, riak_kv_eleveldb_backend, [
>>>                 {max_open_files, 50},
>>>                 {data_root, "/var/lib/riak/leveldb"}
>>>             ]},
>>>             {be_blocks, riak_kv_bitcask_backend, [
>>>
>>>                 {max_file_size, 16#4000000}, %% 64MB
>>>
>>>                 %% Trigger a merge if any of the following are true:
>>>                 {frag_merge_trigger, 10}, %% fragmentation >= 10%
>>>                 {dead_bytes_merge_trigger, 33554432}, %% dead bytes > 32 MB
>>>
>>>                 %% Conditions that determine if a file will be examined during a merge:
>>>                 {frag_threshold, 5}, %% fragmentation >= 5%
>>>                 {dead_bytes_threshold, 8388608}, %% dead bytes > 8 MB
>>>                 {small_file_threshold, 16#80000000}, %% file is < 2GB
>>>
>>>                 {data_root, "/var/lib/riak/bitcask"}
>>>             ]}
>>>         ]},
>>>
>>>     ...
>>>     ...
>>>     ...
>>>
>>>     {bitcask, [
>>>         %% Configure how Bitcask writes data to disk.
>>>         %% erlang: Erlang's built-in file API
>>>         %% nif: Direct calls to the POSIX C API
>>>         %%
>>>         %% The NIF mode provides higher throughput for certain
>>>         %% workloads, but has the potential to negatively impact
>>>         %% the Erlang VM, leading to higher worst-case latencies
>>>         %% and possible throughput collapse.
>>>         {io_mode, erlang},
>>>
>>>         {max_file_size, 16#4000000}, %% 64MB
>>>         {merge_window, always}, %% Span of hours during which merge is acceptable.
>>>
>>>         %% Trigger a merge if any of the following are true:
>>>         {frag_merge_trigger, 10}, %% fragmentation >= 10%
>>>         {dead_bytes_merge_trigger, 33554432}, %% dead bytes > 32 MB
>>>
>>>         %% Conditions that determine if a file will be examined during a merge:
>>>         {frag_threshold, 5}, %% fragmentation >= 5%
>>>         {dead_bytes_threshold, 8388608}, %% dead bytes > 8 MB
>>>         {small_file_threshold, 16#80000000}, %% file is < 2GB
>>>
>>>         {data_root, "/var/lib/riak/bitcask"}
>>>
>>>     ]},
>>>
>>> I do see merges taking place in riak's console.log, they're just not
>>> making that much of a difference ...
>>>
>>> Any idea what I might be missing here?
>>>
>>> Thanks
>>>
>>> Idan Shinberg
>>> idomoo
>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>
>
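
For reference, the sizes quoted in the thread can be cross-checked with a short Python sketch. This is only the arithmetic, not a model of how Bitcask merges or how Riak CS garbage collection behaves. The variable names are illustrative, and it assumes "MB" and "GB" mean 2^20 and 2^30 bytes.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Figures reported in the thread.
file_size = 32 * MB
distinct_keys = 48
batches = 3

# Data that should still be live after the overwrites (one copy per key).
live_bytes = distinct_keys * file_size
# Everything written across all three batches (144 uploads in total).
written_bytes = batches * distinct_keys * file_size

# Observed Bitcask directory sizes, as reported (approximate).
observed_before_merge = 5.5 * GB
observed_after_merge = 3.4 * GB

# Bitcask settings from the quoted config; Erlang's 16#... is hexadecimal.
max_file_size = 0x4000000
small_file_threshold = 0x80000000
dead_bytes_merge_trigger = 33554432
dead_bytes_threshold = 8388608

print(f"live data:         {live_bytes / GB:.2f} GB")
print(f"total written:     {written_bytes / GB:.2f} GB")
print(f"after-merge ratio: {observed_after_merge / live_bytes:.2f}x live data")
print(f"max_file_size:            {max_file_size // MB} MB")
print(f"small_file_threshold:     {small_file_threshold // GB} GB")
print(f"dead_bytes_merge_trigger: {dead_bytes_merge_trigger // MB} MB")
print(f"dead_bytes_threshold:     {dead_bytes_threshold // MB} MB")
```

The numeric settings do match their comments (64 MB, 2 GB, 32 MB, 8 MB), and the 48 live objects come to 1.5 GB against roughly 4.5 GB written in total, so the 3.4 GB left after merging is a little over twice the live data.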
