riak-admin vnode-status reports, for each vnode, the number of bitcask
files along with their fragmentation and dead bytes. Because it issues
blocking commands to every vnode, it can spike latencies, so it should
only be run off-peak.
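
For example, during a quiet period, on each node in turn:

    riak-admin vnode-status

The per-vnode backend status it prints is where the bitcask file
counts, fragmentation and dead-byte figures mentioned above show up.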

On Mon, Sep 16, 2013 at 7:36 AM, Alex Moore <amo...@basho.com> wrote:
> Hi Charl,
>
> The problem is that even though documents no longer seem to be
> available (doing a GET on a deleted document returns the expected 404),
> disk usage does not seem to be dropping much and has been sitting at
> ~80% utilisation across all nodes for almost a week.
>
> When you delete a document, a tombstone record is written to bitcask, and
> the reference to the key is removed from memory (which is why you get
> 404s).  The old entry isn't actually removed until the next bitcask merge.
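>
> To illustrate (a minimal sketch against the standalone bitcask Erlang
> API on a scratch directory, not against a live node; the path is made
> up):
>
>     %% delete/2 only writes a tombstone and drops the key from the
>     %% in-memory keydir; the on-disk entry survives until merge/1.
>     Ref = bitcask:open("/tmp/bc_demo", [read_write]),
>     ok = bitcask:put(Ref, <<"k">>, <<"v">>),
>     ok = bitcask:delete(Ref, <<"k">>),
>     not_found = bitcask:get(Ref, <<"k">>),   %% the 404 case
>     ok = bitcask:close(Ref),
>     ok = bitcask:merge("/tmp/bc_demo").      %% space reclaimed here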
>
> At first I thought the large number of deletes being performed might be
> causing fragmentation of the merge index, so I've been regularly
> running forced compaction as documented here:
> https://gist.github.com/rzezeski/3996286.
>
> That merge index is for Riak Search, not bitcask.
>
> There are ways of forcing a merge, but let's double check your settings/logs
> first. Can you send me your app.config and a console.log from one of your
> nodes?
>
> Thanks,
> Alex
>
>  --
> Alex Moore
> Sent with Airmail
>
> On September 16, 2013 at 4:43:07 AM, Charl Matthee (ch...@ntrippy.net)
> wrote:
>
> Hi,
>
> We have an 8-node Riak v1.4.0 cluster writing data to bitcask backends.
>
> We've recently started running out of disk across all nodes, so we
> implemented a 30-day sliding-window data retention policy. This policy
> is enforced by a Go app that concurrently deletes documents that fall
> outside the window.
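>
> Each delete boils down to the equivalent of Riak's HTTP DELETE (the
> host, bucket and key below are placeholders, and the app may well use
> protocol buffers instead):
>
>     curl -X DELETE http://127.0.0.1:8098/buckets/<bucket>/keys/<key>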
>
> The problem is that even though documents no longer seem to be
> available (doing a GET on a deleted document returns the expected 404),
> disk usage does not seem to be dropping much and has been sitting at
> ~80% utilisation across all nodes for almost a week.
>
> At first I thought the large number of deletes being performed might be
> causing fragmentation of the merge index, so I've been regularly
> running forced compaction as documented here:
> https://gist.github.com/rzezeski/3996286.
>
> This has helped somewhat, but I suspect it has reached the limit of
> what it can do, so I wonder whether there is further fragmentation
> elsewhere that is not being compacted.
>
> Could this be an issue? How can I tell whether merge indexes or
> something else needs compaction/attention?
>
> Our nodes were initially configured to run with the default settings
> for the bitcask backend but when this all started I switched to the
> following to try and see if I can trigger compaction more frequently:
>
> {bitcask, [
>     %% Configure how Bitcask writes data to disk.
>     %%   erlang: Erlang's built-in file API
>     %%   nif: Direct calls to the POSIX C API
>     %%
>     %% The NIF mode provides higher throughput for certain
>     %% workloads, but has the potential to negatively impact
>     %% the Erlang VM, leading to higher worst-case latencies
>     %% and possible throughput collapse.
>     {io_mode, erlang},
>
>     {data_root, "/var/lib/riak/bitcask"},
>
>     %% Trigger a merge if fragmentation is > 40% (default: 60%)
>     {frag_merge_trigger, 40},
>     %% Trigger a merge if dead bytes from keys > 64MB (default: 512MB)
>     {dead_bytes_merge_trigger, 67108864},
>     %% Include a file in the merge if its fragmentation >= 20% (default: 40%)
>     {frag_threshold, 20},
>     %% Include a file in the merge if its dead bytes > 64MB (default: 128MB)
>     {dead_bytes_threshold, 67108864}
> ]},
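>
> The bitcask section of app.config is loaded as the bitcask
> application's environment, so something like this from `riak attach`
> should confirm the new values actually took effect (the session below
> is illustrative):
>
>     (riak@node)1> application:get_env(bitcask, frag_merge_trigger).
>     {ok,40}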
>
> From my observations this change did not make much of a difference.
>
> The data we're inserting is hierarchical JSON data that roughly falls
> into the following size (in bytes) profile:
>
> Max: 10320
> Min: 1981
> Avg: 3707
> Med: 2905
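>
> (Rough arithmetic, not measured: at the ~3707-byte average, the 64MB
> dead_bytes_threshold corresponds to about 67108864 / 3707 ≈ 18,100
> dead entries in a single bitcask file before that file qualifies for a
> merge, so deletes spread thinly across many files can take a while to
> trip it.)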
>
> --
> Ciao
>
> Charl
>
> "I will either find a way, or make one." -- Hannibal
>

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
