Question #1 seems better suited for the Ambari mailing list.

Have you checked whether the HDFS balancer (not the HBase balancer) was active
between the restart and the observed drop in locality?

For StochasticLoadBalancer, there is this cost factor:

    private static final String LOCALITY_COST_KEY =
        "hbase.master.balancer.stochastic.localityCost";

    private static final float DEFAULT_LOCALITY_COST = 25;

The default weight is very low. If you don't want to see a drop in locality,
you can increase the weight (to a ballpark of 500, for example) so that the
HBase balancer doesn't move many regions.
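
As a rough sketch (assuming you manage configs through Ambari on HDP), the
override would go into hbase-site.xml on the master, typically followed by a
restart of the HMaster so the balancer picks it up:

    <!-- Raise the locality cost weight from the default of 25 so the
         stochastic balancer is reluctant to make moves that hurt locality.
         500 is a ballpark figure; tune for your cluster. -->
    <property>
      <name>hbase.master.balancer.stochastic.localityCost</name>
      <value>500</value>
    </property>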

What you described in #2 (locality drop prevention) and #3 (rapid increase
of locality for moved regions) are two sides of the same coin: the point of
adding new nodes is to offload regions from the existing region servers, which
necessarily results in a drop in locality for the regions that move.

Cheers

On Thu, Jan 5, 2017 at 4:47 PM, Ganesh Viswanathan <gan...@gmail.com> wrote:

> Hello,
>
> I have three questions related to HBase major compactions:
>
> 1) During a scheduled maintenance event on the HBase cluster to add 2 new
> regionservers, Ambari said a restart of all HDFS nodes (both name and data)
> was required. In the logs, it looks like the HBase balancer became active
> after the two nodes were registered.
> Is it normal to restart all HDFS nodes to add a new node to the cluster?
> I am using HDP 2.4.
>
> 2) Should I turn off the HBase balancer before adding new nodes? If so,
> when should I turn it back on, and what would be the impact? Would it cause
> a large drop in locality again?
>
> 3) When all the nodes in the cluster were restarted with Ambari, locality
> dropped to ~13% and HBase was almost non-responsive. Only triggering a
> manual major compaction seemed to help improve the locality after this. But
> the data-locality increase is very gradual (about 4% every hour). Is there
> any way to speed up major compaction (increase the number of threads, etc.)
> in the HDP distribution?
>
>
> Thanks,
> Ganesh
>
