Hey Kelly

Thanks for getting back to me.

You were right to bring up that point: these settings were indeed
applied gradually.

I have therefore started from scratch, with the same settings from my
earlier message in place.
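
For reference, the byte thresholds in those Bitcask settings decode as shown in the small Python sketch below. It is only a convenience check, not Riak tooling: the dictionary and helper names are my own, and only the numeric values are taken from the config quoted in this thread.

```python
# Values copied from the Bitcask settings quoted in this thread.
# Erlang's 16#... notation is a base-16 integer literal.
SETTINGS = {
    "max_file_size": int("4000000", 16),
    "dead_bytes_merge_trigger": 33554432,
    "dead_bytes_threshold": 8388608,
    "small_file_threshold": int("80000000", 16),
}

MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


for name, value in SETTINGS.items():
    print(f"{name}: {value} bytes = {to_mib(value):g} MiB")
```

These agree with the comments in the config (64 MB, 32 MB, 8 MB and 2 GB respectively).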

I made 3 batches of 48 uploads each, uploading the same 32 MB files to 48
different keys in S3.
I wound up with 48 keys in S3 (later uploads overwrote the old data), each
32 MB in size, after a total of 144 uploads.

BTW, I also forgot to mention that n_val is set to 1 in default_bucket_props.
The Bitcask dir was around 5.5 GB, and after merges kicked in it shrank to
3.4 GB.
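
In case it helps anyone reproduce the measurement, here is a rough Python sketch of how the directory size can be summed, roughly what `du` reports for file contents. The function name is my own; the path is the Bitcask data_root from the config in this thread, and the script does nothing if that directory does not exist.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    root = "/var/lib/riak/bitcask"  # data_root from the config in this thread
    if os.path.isdir(root):
        print(f"{dir_size_bytes(root) / 1024 ** 3:.2f} GiB")
```

Running it before and after a merge gives the two figures to compare.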

Still, the actual data-set size should be 48 x 32 MB, which is 1.5 GB.
I also noticed that each time I upload a file, 2x its size is automatically
used, and I'm guessing that's related :-)
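
To spell out the arithmetic, here is a quick Python sanity check. It is only a sketch using the figures from this message; the variable names are my own, and it assumes 1 GB = 1024 MB.

```python
FILE_MB = 32   # size of each uploaded file
KEYS = 48      # distinct keys, each overwritten on every batch
BATCHES = 3    # number of upload batches

live_mb = KEYS * FILE_MB               # data that should still be live
written_mb = BATCHES * KEYS * FILE_MB  # everything ever written
dead_mb = written_mb - live_mb         # overwritten data, eligible for merge

observed_after_merge_gb = 3.4          # Bitcask dir size after merges
ratio = observed_after_merge_gb * 1024 / live_mb

print(f"live data:     {live_mb} MB ({live_mb / 1024:.1f} GB)")
print(f"total written: {written_mb} MB ({written_mb / 1024:.1f} GB)")
print(f"dead data:     {dead_mb} MB ({dead_mb / 1024:.1f} GB)")
print(f"observed / live after merge: {ratio:.2f}x")
```

So about 4.5 GB was written in total and 1.5 GB should be live, yet the directory still holds a bit more than twice the live size after merging, which is in the same ballpark as the 2x per-upload growth.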

The single Riak node is running on CentOS 6.3 with the 1.3.1 packaged version.


Thanks

Idan Shinberg
idomoo


On Wed, May 22, 2013 at 2:26 AM, Kelly McLaughlin <[email protected]> wrote:

> Idan,
>
> Bitcask can sometimes be slow to reclaim space after deleting objects from
> Riak CS. Are the settings you included the settings that have been in place
> during all of your uploads and deletions? I am surprised that just a few
> tens of uploads of 32 MB objects used up 15 GB of space. Can you be more
> specific on a count of uploads? Also do you have any error output in the
> riak or riak cs log files that may be related? Finally, which packages are
> you using for your testing?
>
> Kelly
>
>
> On Tue, May 21, 2013 at 2:18 PM, Idan Shinberg <[email protected]> wrote:
>
>> Thus, I fear Riak never treats their data as "dead bytes" and they never
>> get merged.
>>
>> I created 2 buckets using s3cmd and made several tens of uploads of
>> 32 MB files, deleting them right afterwards (with the proper s3cmd
>> commands, of course).
>>
>> I ended up with no buckets and no keys in my Riak S3 database. However,
>> the 64 partitions under /var/lib/riak/bitcask/ now occupy 15 GB of
>> space.
>>
>> Several Riak restarts did not trigger any merges, and my merge settings
>> impose very tough merge-triggering criteria, so I'm guessing the only
>> reason the data is not being cleared is that it's still in use ...
>>
>> Relevant riak-cs config:
>>
>>     %% == Garbage Collection ==
>>
>>     %% The number of seconds to retain the block
>>     %% for an object after it has been deleted.
>>     %% This leeway time is set to give the delete
>>     %% indication time to propagate to all replicas.
>>     %% 86400 is 24-hours.
>>     {leeway_seconds, 30},
>>
>>     %% How often the garbage collection daemon
>>     %% waits in-between gc batches.
>>     %% 900 is 15-minutes.
>>     {gc_interval, 60},
>>
>>     %% How long a move to the garbage
>>     %% collection to do list can remain
>>     %% failed, before we retry it.
>>     %% 21600 is 6-hours.
>>     {gc_retry_interval, 300},
>>
>>
>> Relevant Riak config:
>>
>> {riak_kv, [
>>     %% Storage_backend specifies the Erlang module defining the storage
>>     %% mechanism that will be used on this node.
>>     {add_paths, ["/usr/lib64/riak-cs/lib/riak_cs-1.3.1/ebin"]},
>>     {storage_backend, riak_cs_kv_multi_backend},
>>     {multi_backend_prefix_list, [{<<"0b:">>, be_blocks}]},
>>     {multi_backend_default, be_default},
>>     {multi_backend, [
>>         {be_default, riak_kv_eleveldb_backend, [
>>             {max_open_files, 50},
>>             {data_root, "/var/lib/riak/leveldb"}
>>         ]},
>>         {be_blocks, riak_kv_bitcask_backend, [
>>
>>             {max_file_size, 16#4000000}, %% 64MB
>>
>>             %% Trigger a merge if any of the following are true:
>>             {frag_merge_trigger, 10}, %% fragmentation >= 10%
>>             {dead_bytes_merge_trigger, 33554432}, %% dead bytes > 32 MB
>>
>>             %% Conditions that determine if a file will be examined during a merge:
>>             {frag_threshold, 5}, %% fragmentation >= 5%
>>             {dead_bytes_threshold, 8388608}, %% dead bytes > 8 MB
>>             {small_file_threshold, 16#80000000}, %% file is < 2GB
>>
>>             {data_root, "/var/lib/riak/bitcask"}
>>         ]}
>>     ]},
>>
>> ...
>> ...
>> ...
>>
>> {bitcask, [
>>     %% Configure how Bitcask writes data to disk.
>>     %%   erlang: Erlang's built-in file API
>>     %%      nif: Direct calls to the POSIX C API
>>     %%
>>     %% The NIF mode provides higher throughput for certain
>>     %% workloads, but has the potential to negatively impact
>>     %% the Erlang VM, leading to higher worst-case latencies
>>     %% and possible throughput collapse.
>>     {io_mode, erlang},
>>
>>     {max_file_size, 16#4000000}, %% 64MB
>>     {merge_window, always}, %% Span of hours during which merge is acceptable.
>>
>>     %% Trigger a merge if any of the following are true:
>>     {frag_merge_trigger, 10}, %% fragmentation >= 10%
>>     {dead_bytes_merge_trigger, 33554432}, %% dead bytes > 32 MB
>>
>>     %% Conditions that determine if a file will be examined during a merge:
>>     {frag_threshold, 5}, %% fragmentation >= 5%
>>     {dead_bytes_threshold, 8388608}, %% dead bytes > 8 MB
>>     {small_file_threshold, 16#80000000}, %% file is < 2GB
>>
>>     {data_root, "/var/lib/riak/bitcask"}
>>
>> ]},
>>
>> I do see merges taking place in Riak's console.log; they're just not
>> making that much of a difference ...
>>
>> Any idea what I might be missing here?
>>
>> Thanks
>>
>> Idan Shinberg
>> idomoo
>>
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>
