riak-admin vnode-status can be used to get information about the number of
bitcask files, their fragmentation, and their dead bytes, but since it issues
a lot of blocking vnode commands it can spike latencies, so it should only be
run off-peak.
On Mon, Sep 16, 2013 at 7:36 AM, Alex Moore <amo...@basho.com> wrote:
> Hi Charl,
>
> > The problem is that even though documents seem to no longer be
> > available (doing a GET on a deleted document returns an expected 404),
> > the disk usage doesn't seem to be reducing much and has currently been
> > at ~80% utilisation across all nodes for almost a week.
>
> When you delete a document, a tombstone record is written to bitcask, and
> the reference to the key is removed from memory (which is why you get
> 404s). The old entry isn't actually removed until the next bitcask merge.
>
> > At first I thought the large amount of deletes being performed might be
> > causing fragmentation of the merge index, so I've been regularly
> > running forced compaction as documented here:
> > https://gist.github.com/rzezeski/3996286
>
> That merge index is for Riak Search, not bitcask.
>
> There are ways of forcing a merge, but let's double-check your
> settings/logs first. Can you send me your app.config and a console.log
> from one of your nodes?
>
> Thanks,
> Alex
>
> --
> Alex Moore
> Sent with Airmail
>
> On September 16, 2013 at 4:43:07 AM, Charl Matthee (ch...@ntrippy.net)
> wrote:
>
> > Hi,
> >
> > We have an 8-node riak v1.4.0 cluster writing data to bitcask backends.
> >
> > We've recently started running out of disk across all nodes and so
> > implemented a 30-day sliding-window data retention policy. This policy
> > is enforced by a Go app that concurrently deletes documents outside
> > the window.
> >
> > The problem is that even though documents seem to no longer be
> > available (doing a GET on a deleted document returns an expected 404),
> > the disk usage doesn't seem to be reducing much and has currently been
> > at ~80% utilisation across all nodes for almost a week.
> >
> > At first I thought the large amount of deletes being performed might be
> > causing fragmentation of the merge index, so I've been regularly
> > running forced compaction as documented here:
> > https://gist.github.com/rzezeski/3996286
> >
> > This has helped somewhat, but I suspect it has reached the limits of
> > what can be done, so I wonder if there is further fragmentation
> > elsewhere that is not being compacted.
> >
> > Could this be an issue? How can I tell whether merge indexes or
> > something else needs compaction/attention?
> >
> > Our nodes were initially configured to run with the default settings
> > for the bitcask backend, but when this all started I switched to the
> > following to try to trigger compaction more frequently:
> >
> > {bitcask, [
> >     %% Configure how Bitcask writes data to disk.
> >     %%   erlang: Erlang's built-in file API
> >     %%   nif: Direct calls to the POSIX C API
> >     %%
> >     %% The NIF mode provides higher throughput for certain
> >     %% workloads, but has the potential to negatively impact
> >     %% the Erlang VM, leading to higher worst-case latencies
> >     %% and possible throughput collapse.
> >     {io_mode, erlang},
> >
> >     {data_root, "/var/lib/riak/bitcask"},
> >
> >     %% Trigger a merge if fragmentation is > 40% (default: 60%)
> >     {frag_merge_trigger, 40},
> >     %% Trigger a merge if dead bytes for keys > 64 MB (default: 512 MB)
> >     {dead_bytes_merge_trigger, 67108864},
> >     %% Include files with fragmentation >= 20% (default: 40%)
> >     {frag_threshold, 20},
> >     %% Include files with dead bytes > 64 MB (default: 128 MB)
> >     {dead_bytes_threshold, 67108864}
> > ]},
> >
> > From my observations this change did not make much of a difference.
> >
> > The data we're inserting is hierarchical JSON data that roughly falls
> > into the following size (in bytes) profile:
> >
> > Max: 10320
> > Min: 1981
> > Avg: 3707
> > Med: 2905
> >
> > --
> > Ciao
> >
> > Charl
> >
> > "I will either find a way, or make one."
> > -- Hannibal

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
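[Editor's note: Alex's explanation of the delete path, that a delete appends a
tombstone and drops the key from the in-memory keydir while the old bytes stay
on disk until a merge, can be sketched with a toy model. This is an
illustration of the idea only, not Riak/bitcask source code; the class and
method names are invented for the sketch.]

```python
TOMBSTONE = b"__tombstone__"

class ToyBitcask:
    """Toy append-only data file plus in-memory key directory (a sketch,
    not bitcask's real on-disk format)."""

    def __init__(self):
        self.log = []      # append-only "data file": (key, value) entries
        self.keydir = {}   # key -> index of the live entry in self.log

    def put(self, key, value):
        self.log.append((key, value))
        self.keydir[key] = len(self.log) - 1

    def delete(self, key):
        # A tombstone is appended; the old value's bytes remain on disk.
        self.log.append((key, TOMBSTONE))
        # The key leaves the keydir, so reads now miss (the 404 Charl sees).
        self.keydir.pop(key, None)

    def get(self, key):
        if key not in self.keydir:
            return None
        return self.log[self.keydir[key]][1]

    def dead_entries(self):
        # Entries no longer referenced by the keydir: superseded values
        # and tombstones. These are the "dead bytes" merges reclaim.
        live = set(self.keydir.values())
        return sum(1 for i in range(len(self.log)) if i not in live)

    def merge(self):
        # Rewrite only live entries; only now is disk space reclaimed.
        new_log = [(k, self.log[i][1]) for k, i in self.keydir.items()]
        self.log = new_log
        self.keydir = {k: i for i, (k, _) in enumerate(new_log)}

store = ToyBitcask()
store.put("a", b"1")
store.put("b", b"2")
store.delete("a")
print(store.get("a"))        # None: the delete "took" from the reader's view
print(store.dead_entries())  # 2: a's old value plus its tombstone still on disk
store.merge()
print(store.dead_entries())  # 0: space reclaimed only after the merge
```

This matches the symptom in the thread: GETs 404 immediately, but disk
utilisation only drops when merges actually run.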
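[Editor's note: the trigger/threshold pairs in Charl's config can be read as:
a merge is kicked off when any single file crosses a *_merge_trigger, and once
merging, only files crossing a *_threshold are included. The sketch below is
my reading of those settings with Charl's values, not bitcask's actual merge
code; the file dicts are invented sample data.]

```python
# Charl's configured values (defaults in the thread: 60%, 512 MB, 40%, 128 MB).
FRAG_MERGE_TRIGGER = 40                      # percent of dead keys in a file
DEAD_BYTES_MERGE_TRIGGER = 64 * 1024 * 1024  # 67108864 bytes
FRAG_THRESHOLD = 20                          # percent
DEAD_BYTES_THRESHOLD = 64 * 1024 * 1024      # 67108864 bytes

def fragmentation_pct(dead_keys, total_keys):
    return 100.0 * dead_keys / total_keys if total_keys else 0.0

def should_start_merge(files):
    # Any single file over a *_merge_trigger kicks off a merge.
    return any(
        fragmentation_pct(f["dead_keys"], f["total_keys"]) > FRAG_MERGE_TRIGGER
        or f["dead_bytes"] > DEAD_BYTES_MERGE_TRIGGER
        for f in files
    )

def files_to_merge(files):
    # Once a merge runs, every file over a *_threshold is included.
    return [
        f for f in files
        if fragmentation_pct(f["dead_keys"], f["total_keys"]) >= FRAG_THRESHOLD
        or f["dead_bytes"] >= DEAD_BYTES_THRESHOLD
    ]

# Hypothetical per-file stats, of the kind riak-admin vnode-status reports.
files = [
    {"name": "1.bitcask.data", "dead_keys": 45, "total_keys": 100,
     "dead_bytes": 10 * 1024 * 1024},
    {"name": "2.bitcask.data", "dead_keys": 25, "total_keys": 100,
     "dead_bytes": 1 * 1024 * 1024},
    {"name": "3.bitcask.data", "dead_keys": 5, "total_keys": 100,
     "dead_bytes": 0},
]
print(should_start_merge(files))                   # True: file 1 is 45% fragmented
print([f["name"] for f in files_to_merge(files)])  # files 1 and 2 cross 20%
```

Lowering the triggers, as Charl did, makes merges start sooner, but merges
still only run in the configured merge window, which is one reason a lowered
trigger alone may "not make much of a difference".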