There was an issue where HBase regionservers had a higher likelihood of major compacting on startup: if you started up a lot of clusters at once that hadn't major compacted their files, the chore would not jitter and would compact everything right away, though only where necessary.
https://issues.apache.org/jira/browse/HBASE-17912

On Thu, Jun 28, 2018 at 1:31 PM Mingliang LIU <[email protected]> wrote:

> Marcell,
>
> In Hadoop side, the NameNode (NN) will not schedule block re-replication
> unless the DataNode (DN) has been claimed "dead". By default the interval
> is >10mins. Usually your DN should have restarted before being "dead" in
> NN. If that still is a concern, you can make that interval longer
> indirectly via configurations "dfs.namenode.heartbeat.recheck-interval" and
> "dfs.heartbeat.interval". The interval is calculated following this code
> <https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L290>.
>
> Thanks,
>
> On Thu, Jun 28, 2018 at 1:02 PM Marcell Ortutay <[email protected]> wrote:
>
> > Er, to I made a mistake in the above question ; the issue is not so much
> > the major compaction but rather that during restart (as nodes go up /
> > down), Hadoop and HBase attempt to rebalance blocks and regions, causing
> > unnecessary movement. So what I'm actually looking for is a way to avoid
> > the balancing for the duration of the restart, which would avoid the need
> > for major compaction afterwards.
> >
> > Marcell
> >
> > On Thu, Jun 28, 2018 at 12:55 PM, Marcell Ortutay <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I'm interested in ways to avoid a major compaction when restarting all the
> > > HBase region servers in a cluster (for example, for a version upgrade). Are
> > > there any recommended techniques for achieving this?
> > >
> > > Thanks,
> > > Marcell
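
For reference, here is a rough sketch of how the "dead" interval Mingliang mentions works out from the two settings. The formula (twice the recheck interval plus ten heartbeat intervals) and the default values (300000 ms and 3 s) are my reading of the linked DatanodeManager code and hdfs-default.xml, so treat them as assumptions and check them against your Hadoop version. The class and method names are made up for illustration.

```java
public class DeadNodeInterval {

    // Assumed formula: 2 * recheck interval (ms) + 10 * heartbeat interval (s, converted to ms).
    static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        // Assumed defaults: dfs.namenode.heartbeat.recheck-interval = 300000 ms,
        // dfs.heartbeat.interval = 3 s.
        long defaultMs = expireIntervalMs(300_000L, 3L);
        System.out.println("default: " + defaultMs / 60_000.0 + " min");

        // Hypothetical tuning: raise the recheck interval to 15 minutes
        // to give DataNodes more time to restart before being marked dead.
        long tunedMs = expireIntervalMs(900_000L, 3L);
        System.out.println("tuned:   " + tunedMs / 60_000.0 + " min");
    }
}
```

With the assumed defaults this comes to 10.5 minutes, which matches the ">10mins" in the quoted reply. Since the recheck interval is doubled in the formula, it is the setting that moves the result the most.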
