I think it depends on how large we expect the initially flushed HFile to be 
(just to state the obvious).
The current default matches the memstore flushsize, so if we mostly flush 
because of that limit the current default should be good.


If we have many column families, where one dominates, we want to decrease this 
to make sure that the smallest files - the ones created because we need to 
flush all CFs - are compacted first.
Not sure what a good default would be, or how much we could auto-configure this.


On the other hand maybe setting this to a very small amount might be a good 
default after all. The larger files will eventually be collected by the ratio 
based selection, and having this small will immediately pick abnormally tiny 
HFiles for compaction.

A good test might be to set this to 0 (so it's never used for file selection) 
and then see how this affects selection in common workloads.
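
To make the trade-off concrete, here is a rough sketch of how a minimum 
compaction size interacts with ratio-based selection. This is not the HBase 
code, only a simplified model of it: the function name, the sample file sizes 
(in MB) and the 1.2 ratio are assumptions for illustration, and the real policy 
also applies limits on the number of files, which are left out here.

```python
def select_for_compaction(sizes_mb, min_compact_size_mb, ratio=1.2):
    """Simplified model of ratio-based selection (hypothetical helper).

    sizes_mb: store file sizes ordered oldest to newest.
    Walks from the oldest file and skips it while it is larger than both
    the minimum compaction size and ratio * (total size of all newer files).
    Returns the sizes of the files that remain selected.
    """
    start = 0
    while start < len(sizes_mb):
        newer_total = sum(sizes_mb[start + 1:])
        threshold = max(min_compact_size_mb, ratio * newer_total)
        if sizes_mb[start] <= threshold:
            break  # this file and everything newer is selected
        start += 1
    return sizes_mb[start:]


if __name__ == "__main__":
    # One dominant CF produced the 1000 MB file; the tiny files come from
    # small CFs that were flushed along with it.
    files = [1000, 3, 1]
    for min_size in (0, 4, 128):
        print(min_size, select_for_compaction(files, min_size))
```

In this model, with the limit at 0 the 3 MB and 1 MB files are never selected 
by the ratio check alone, while a 4 MB limit picks them up right away and still 
leaves the 1000 MB file out.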


We'll probably not find defaults that are right for every workload.


-- Lars

________________________________
From: Stack <[email protected]>
To: HBase Dev List <[email protected]> 
Sent: Monday, June 24, 2013 8:59 AM
Subject: Re: [COMPACTIONS] Anyone seen hbase.hstore.compaction.min.size in 
trunk/0.95?


On Thu, Jun 20, 2013 at 3:43 PM, Stack <[email protected]> wrote:

> On Thu, Jun 20, 2013 at 2:41 PM, Sergey Shelukhin 
> <[email protected]> wrote:
>
>> Part of HBASE-7055 patch that we picked includes CompactionConfiguration
>> class, which uses a prefix for config values.
>> See ::getMinCompactSize on that class, it's still used in compaction.
>>
>>
> Thanks Sergey.  Found it.
>
> Now, should we do Nicolas's suggestion as a default; i.e. any file < 4MB
> is always added to compaction set (where currently, IIUC, any file <
> flushsize is  added to the compaction set)?
>
>
Ping on above question.  Any compactors have an opinion?
Thanks,
St.Ack
