Hi,

We have an 8-node Riak v1.4.0 cluster writing data to Bitcask backends.

We've recently started running out of disk across all nodes, so we
implemented a 30-day sliding-window data retention policy. The policy
is enforced by a Go app that concurrently deletes documents that fall
outside the window.
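
For context, the app boils down to roughly the sketch below. This is a
simplified, illustrative version rather than the real code: the node
address, bucket, and keys are placeholders, and it assumes the set of
expired keys is already known.

package main

import (
    "fmt"
    "net/http"
    "sync"
)

const (
    riakURL = "http://127.0.0.1:8098" // placeholder node address
    bucket  = "documents"             // placeholder bucket name
    workers = 16                      // number of concurrent deleters
)

func main() {
    keys := make(chan string)
    var wg sync.WaitGroup

    // Pool of workers issuing plain HTTP DELETEs against Riak's HTTP API.
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            client := &http.Client{}
            for key := range keys {
                url := fmt.Sprintf("%s/buckets/%s/keys/%s", riakURL, bucket, key)
                req, err := http.NewRequest("DELETE", url, nil)
                if err != nil {
                    continue
                }
                resp, err := client.Do(req)
                if err != nil {
                    continue
                }
                resp.Body.Close() // 204 or 404 both mean the key is gone
            }
        }()
    }

    // In the real app the expired keys come from our own record of write
    // dates; these are just placeholders.
    for _, k := range []string{"doc-2013-06-01-0001", "doc-2013-06-01-0002"} {
        keys <- k
    }
    close(keys)
    wg.Wait()
}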

The problem is that even though the documents no longer seem to be
available (a GET on a deleted document returns the expected 404), disk
usage is not really shrinking and has been sitting at ~80% utilisation
across all nodes for almost a week.

At first I thought the large number of deletes being performed might be
causing fragmentation of the merge index, so I've been regularly
running forced compaction as documented here:
https://gist.github.com/rzezeski/3996286.

This has helped somewhat, but I suspect it has reached the limit of
what it can do, so I wonder whether there is further fragmentation
elsewhere that is not being compacted.

Could this be an issue? How can I tell whether merge indexes or
something else needs compaction/attention?
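
To make that question concrete: the only per-partition visibility I
have at the moment is from the OS side, along the lines of this rough
sketch, which just sums the *.bitcask.data file sizes under our
data_root (path as in the config below):

package main

import (
    "fmt"
    "os"
    "path/filepath"
    "strings"
)

func main() {
    dataRoot := "/var/lib/riak/bitcask" // data_root from the config below

    // Total size of the *.bitcask.data files per partition directory.
    sizes := make(map[string]int64)

    filepath.Walk(dataRoot, func(path string, info os.FileInfo, err error) error {
        if err != nil || info.IsDir() || !strings.HasSuffix(path, ".bitcask.data") {
            return nil
        }
        rel, relErr := filepath.Rel(dataRoot, path)
        if relErr != nil {
            return nil
        }
        partition := strings.SplitN(rel, string(os.PathSeparator), 2)[0]
        sizes[partition] += info.Size()
        return nil
    })

    for partition, total := range sizes {
        fmt.Printf("%-24s %8.1f MB\n", partition, float64(total)/(1<<20))
    }
}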

Our nodes were initially configured with the default settings for the
Bitcask backend, but when this all started I switched to the following
to see whether I could trigger compaction more frequently:

 {bitcask, [
             %% Configure how Bitcask writes data to disk.
             %%   erlang: Erlang's built-in file API
             %%      nif: Direct calls to the POSIX C API
             %%
             %% The NIF mode provides higher throughput for certain
             %% workloads, but has the potential to negatively impact
             %% the Erlang VM, leading to higher worst-case latencies
             %% and possible throughput collapse.
             {io_mode, erlang},

             {data_root, "/var/lib/riak/bitcask"},

             {frag_merge_trigger, 40},             %% trigger a merge if fragmentation > 40% (default is 60%)
             {dead_bytes_merge_trigger, 67108864}, %% trigger a merge if dead bytes in a single file > 64 MB (default is 512 MB)
             {frag_threshold, 20},                 %% include a file in the merge if its fragmentation >= 20% (default is 40%)
             {dead_bytes_threshold, 67108864}      %% include a file in the merge if its dead bytes > 64 MB (default is 128 MB)
           ]},

From my observations, this change did not make much of a difference.

The data we're inserting is hierarchical JSON that roughly falls into
the following size profile (in bytes):

Max: 10320
Min: 1981
Avg: 3707
Med: 2905
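
(For context against the 64 MB dead-bytes settings above: at the
~3.7 KB average value size, a single data file would need roughly
67108864 / 3707 ≈ 18,000 dead values before crossing that trigger,
ignoring per-entry overhead.)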

-- 
Ciao

Charl

"I will either find a way, or make one." -- Hannibal
