[ https://issues.apache.org/jira/browse/HBASE-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149504#comment-17149504 ]
Michael Stack commented on HBASE-24632:
---------------------------------------
Thanks [~anoop.hbase]. Will wait on [~13.pankajkumar]'s input.
On small hfiles being quickly compacted away: I think this concern belongs
on HBASE-23634, but by default we generally pick up the small files first.
From the javadoc in RatioBasedCompactionPolicy (our default compaction policy
and the policy subclassed by the likes of DateTieredCompaction):
{code}
/**
* -- Default minor compaction selection algorithm:
* choose CompactSelection from candidates --
* First exclude bulk-load files if indicated in configuration.
* Start at the oldest file and stop when you find the first file that
* meets compaction criteria:
* (1) a recently-flushed, small file (i.e. <= minCompactSize)
* OR
* (2) within the compactRatio of sum(newer_files)
* Given normal skew, any newer files will also meet this criteria
* <p/>
* Additional Note:
* If fileSizes.size() >> maxFilesToCompact, we will recurse on
* compact(). Consider the oldest files first to avoid a
* situation where we always compact [end-threshold,end). Then, the
* last file becomes an aggregate of the previous compactions.
*
* normal skew:
*
*         older ----> newer (increasing seqID)
*     _
*    | |   _
*    | |  | |   _
*  --|-|- |-|- |-|---_-------_-------  minCompactSize
*    | |  | |  | |  | |  _  | |
*    | |  | |  | |  | | | | | |
*    | |  | |  | |  | | | | | |
* @param candidates pre-filtrate
* @return filtered subset
*/
{code}
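To make the rule in that javadoc concrete, here is a simplified sketch of the selection loop. It is not the actual RatioBasedCompactionPolicy code: the class and method names are made up for illustration, and it leaves out the bulk-load exclusion and the maxFilesToCompact window. It only shows the "start at the oldest file, stop at the first one that is small or within the ratio of the newer files" part.

```java
import java.util.Arrays;

// Simplified sketch of ratio-based minor compaction selection.
// Hypothetical names; NOT the HBase implementation.
class RatioSelectionSketch {

  /**
   * @param sizes          store file sizes in bytes, ordered oldest to newest
   * @param minCompactSize files at or below this size always qualify
   * @param ratio          a file qualifies if size <= ratio * sum(newer files)
   * @param minFiles       minimum number of files worth compacting
   * @return sizes of the selected files, or an empty array if too few qualify
   */
  public static long[] select(long[] sizes, long minCompactSize, double ratio, int minFiles) {
    int n = sizes.length;
    // suffix[i] = total size of files i..n-1, so suffix[i + 1] is the
    // sum of the files newer than file i.
    long[] suffix = new long[n + 1];
    for (int i = n - 1; i >= 0; i--) {
      suffix[i] = suffix[i + 1] + sizes[i];
    }
    // Walk from the oldest file, skipping files that are too big
    // relative to everything newer than them.
    int start = 0;
    while (start < n && n - start >= minFiles
        && sizes[start] > Math.max(minCompactSize, (long) (suffix[start + 1] * ratio))) {
      start++;
    }
    if (n - start < minFiles) {
      return new long[0]; // not enough files left to bother compacting
    }
    return Arrays.copyOfRange(sizes, start, n);
  }

  public static void main(String[] args) {
    // "Normal skew": big old files, small recently-flushed ones.
    long[] skewed = {1000, 400, 100, 12, 10, 8};
    System.out.println(Arrays.toString(select(skewed, 16, 1.2, 3)));
  }
}
```

With the sizes above (minCompactSize of 16, ratio of 1.2, at least 3 files), the three big old files are skipped and the run prints [12, 10, 8], i.e. the small recently-flushed files get picked up first.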
> Enable procedure-based log splitting as default in hbase3
> ---------------------------------------------------------
>
> Key: HBASE-24632
> URL: https://issues.apache.org/jira/browse/HBASE-24632
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Michael Stack
> Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Means changing this value in HConstants to false:
> public static final boolean DEFAULT_HBASE_SPLIT_COORDINATED_BY_ZK = true;
> Should probably also deprecate the current zk distributed split so we can
> clear out those classes too.