Wes,

I was away from Basho for five months.  I suspect there was confusion as to 
who should respond and how to respond.

I have two guesses as to your 5% growth:

Guess 1 … more metadata at the block level

Riak 2.0 contains a new leveldb feature called dynamic block size.  With this 
feature, leveldb starts constructing .sst table files differently to optimize 
memory usage for overall performance.  Each file has an overall index and a 
per-block index.  The new feature shifts more index information into the 
blocks to intentionally reduce the size of the file’s overall index.  The 
overall index must reside completely in memory while the file is open.  
Shifting index data to the blocks allows a greater number of files to be open 
simultaneously for a given amount of computer memory.  This helps performance 
because opening a file is a huge time cost.
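
To put rough numbers on that trade-off, here is a minimal sketch in Python.  
The memory budget and the per-file index sizes are made-up illustrative 
values, not measurements from leveldb, and max_open_files is a hypothetical 
helper, not a leveldb function.  It only shows why a smaller file-level index 
lets more files stay open within the same memory.

```python
def max_open_files(memory_budget_bytes, file_index_bytes):
    """Number of table files whose file-level index fits in the budget.

    Assumes, for illustration only, that the file-level index is the sole
    per-file memory cost while a file is open.
    """
    return memory_budget_bytes // file_index_bytes


MIB = 1024 * 1024
budget = 512 * MIB  # hypothetical memory set aside for open files

# Hypothetical index sizes before and after shifting index data into blocks.
files_before = max_open_files(budget, 4 * MIB)
files_after = max_open_files(budget, 1 * MIB)

print("open files with larger file-level index: ", files_before)
print("open files with smaller file-level index:", files_after)
```

The index data moved into the blocks still occupies disk, which is where the 
net increase in file size described below comes from.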

The content changes of both the file level index and the block level indexes 
typically create a net increase in file size, though the changes to your 
compression ratio can go either way.  We would have to look at sample .sst 
files from levels 3 and 4 with the sst_scan tool to make a valid assessment 
(https://github.com/basho/leveldb/blob/develop/tools/sst_scan.cc).

Technical details of dynamic block sizing are here:

     https://github.com/basho/leveldb/wiki/mv-dynamic-block-size


Guess 2 … weaker than Guess 1, but possible

I am going to guess that you are getting more, smaller .sst table files than 
before at levels 0 and 1.  More files means more disk space lost due to the 
difference between space needed and whole blocks allocated by the file system.  
There can be a slight reduction in compression of file metadata too, but that 
is a questionable contributor.  The impact is limited to levels 0 and 1, but 
that still adds up.
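
The allocation loss can be estimated with a short Python sketch.  It assumes 
a file system that rounds every file up to whole 4096-byte blocks; that block 
size is an assumption and should be replaced with your file system's actual 
value.  The helper names are hypothetical, and the script only walks a 
directory looking for files ending in .sst without assuming any particular 
directory layout.

```python
import os


def padded_size(size_bytes, fs_block_bytes=4096):
    """Size after rounding up to whole file system blocks."""
    return -(-size_bytes // fs_block_bytes) * fs_block_bytes


def estimated_slack(sizes, fs_block_bytes=4096):
    """Total bytes lost to rounding each file up to whole blocks."""
    return sum(padded_size(s, fs_block_bytes) - s for s in sizes)


def sst_sizes(root):
    """Collect the sizes of all .sst files found under root."""
    sizes = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".sst"):
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
    return sizes


# The same 10000 bytes stored as one file versus four smaller files.
one_file = estimated_slack([10000])
four_files = estimated_slack([2500, 2500, 2500, 2500])
print("slack, one file:  ", one_file)
print("slack, four files:", four_files)

# Usage against a real data directory (path is a placeholder):
#   sizes = sst_sizes("/path/to/leveldb/data")
#   print(len(sizes), "files,", estimated_slack(sizes), "bytes of slack")
```

Running it against a 1.4 node and a 2.0 node and comparing file counts and 
slack would show whether this guess accounts for any meaningful part of the 
difference.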

A bug was discovered in mid-December 2014 and a fix placed on a branch for 
subsequent releases.  The fix, too, was lost in the above confusion and is just 
now making its way into the 2.0.x and 2.1.x releases.

The missing fix is here:

   https://github.com/basho/leveldb/wiki/mv-sequential-tuning 


Matthew


> On Jul 16, 2015, at 2:06 PM, Wes Jossey <weston.jos...@gmail.com> wrote:
> 
> Nope. Never did. 
> 
> 
> 
>> On Jul 16, 2015, at 13:32, Matthew Von-Maszewski <matth...@basho.com> wrote:
>> 
>> Did you ever get a reply to this query?
>> 
>> Matthew
>> 
>>> On Jan 11, 2015, at 5:48 PM, Weston Jossey <weston.jos...@gmail.com> wrote:
>>> 
>>> Hi All,
>>> Just wanted to put out an observation and see if it's either just me, or 
>>> something expected.
>>> 
>>> I've begun updating our large Riak 1.4 cluster to Riak 2.0.  Each cluster 
>>> has 43TB spread evenly over 32 nodes.  The riak 2.0 test nodes, after 
>>> running for 14 days, have on average around 5% more disk usage (in terms of 
>>> size, not IOPS) than the riak 1.4 cluster. Given that the cluster is evenly 
>>> balanced, I'd expect all nodes to be roughly the same size (or at least 
>>> within a point or two).  
>>> 
>>> Is this expected?  Does this have something to do with the dynamic settings 
>>> for the leveldb configuration parameters that is built into Riak 2?
>>> 
>>> The issue isn't a big one.  I'm just curious if this is expected / 
>>> anticipated, as it'll probably be worth noting in the Riak documentation as 
>>> part of the upgrade process.
>>> 
>>> Thanks!
>>> -Wes
>>> _______________________________________________
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> 

