d-c-manning commented on code in PR #5545:
URL: https://github.com/apache/hbase/pull/5545#discussion_r1485004519


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreEngine.java:
##########
@@ -70,36 +72,17 @@ protected void createComponents(Configuration conf, HStore 
store, CellComparator
   }
 
   protected void createCompactor(Configuration conf, HStore store) throws 
IOException {
-    String className = conf.get(DEFAULT_COMPACTOR_CLASS_KEY, 
DEFAULT_COMPACTOR_CLASS.getName());
-    try {
-      compactor = ReflectionUtils.instantiateWithCustomCtor(className,
-        new Class[] { Configuration.class, HStore.class }, new Object[] { 
conf, store });
-    } catch (Exception e) {
-      throw new IOException("Unable to load configured compactor '" + 
className + "'", e);
-    }
+    createCompactor(conf, store, DEFAULT_COMPACTOR_CLASS_KEY, 
DEFAULT_COMPACTOR_CLASS.getName());
   }
 
   protected void createCompactionPolicy(Configuration conf, HStore store) 
throws IOException {
-    String className =
-      conf.get(DEFAULT_COMPACTION_POLICY_CLASS_KEY, 
DEFAULT_COMPACTION_POLICY_CLASS.getName());
-    try {
-      compactionPolicy = ReflectionUtils.instantiateWithCustomCtor(className,
-        new Class[] { Configuration.class, StoreConfigInformation.class },
-        new Object[] { conf, store });
-    } catch (Exception e) {
-      throw new IOException("Unable to load configured compaction policy '" + 
className + "'", e);
-    }
+    createCompactionPolicy(conf, store, DEFAULT_COMPACTION_POLICY_CLASS_KEY,

Review Comment:
   The default `CompactionConfiguration` would have 
`hbase.hstore.compaction.min == 3` for `HBASE_HSTORE_COMPACTION_MIN_KEY`. The 
hbase book (https://hbase.apache.org/book.html#compaction.parameters) has this 
comment:
   >The goal of tuning hbase.hstore.compaction.min is to avoid ending up with 
too many tiny StoreFiles to compact. Setting this value to 2 would cause a 
minor compaction each time you have two StoreFiles in a Store, and this is 
probably not appropriate.
   
   I confess I don't know why 2 is (probably) inappropriate. I assume it's 
because compacting immediately after every flush is perhaps "too much" 
compaction, and it is only a tradeoff between read performance and compaction 
load.
   
   But if that is at all true, then will we now be in that state by default 
with the dual file writer? If a compaction often leaves us with 2 files, then 
we will potentially compact on every flush (instead of after every 2 flushes, 
by default?)
   
   If someone was "inappropriate" and set `hbase.hstore.compaction.min` = `2`, 
would enabling DualFile writing cause us to always select the same 2 output 
files for compaction, and we would re-compact them infinitely?
   
   Do we need to add 1 to this `min` value if dual writing is enabled? Or 
otherwise change the compaction selection algorithm to be aware of live + 
historical files?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to