[ 
https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389133#comment-17389133
 ] 

Nick Dimiduk commented on HBASE-26147:
--------------------------------------

Not a comment on this feature as such, but on the problem space. For various 
reasons, I recently had to implement a "manual" balancer external to the HBase 
master. It is a Python script that scrapes the output of cluster status and 
calculates a balancer plan given our requirements. The interesting part, and 
maybe the part relevant to your interests, is what comes next. With a plan 
calculated, it simulates applying that plan to the known cluster state. 
Finally, it prints charts showing the cluster state before and after the 
balance is applied: stacked bar charts of total store file size by table, by 
host. I then used these visualizations to tune the parameters of my manual 
balancer until I was satisfied with the result.
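
To make the simulate-then-compare step concrete, here is a minimal sketch. It is not the actual script: the `Region` record, the plan format (region name mapped to destination host), the function names, and the sample data are all hypothetical, and plain-text bars stand in for the real stacked bar charts.

```python
from collections import defaultdict
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Region:
    # Hypothetical record; a real tool would fill this from cluster status.
    name: str
    table: str
    host: str
    store_file_mb: int


def simulate_plan(regions, plan):
    """Return a new region list with each planned move applied.

    `plan` maps region name -> destination host (an assumed format).
    The input list is left untouched, so nothing is actually moved.
    """
    unknown = set(plan) - {r.name for r in regions}
    if unknown:
        raise KeyError(f"plan references unknown regions: {sorted(unknown)}")
    return [replace(r, host=plan.get(r.name, r.host)) for r in regions]


def size_by_host_and_table(regions):
    """Total store file size in MB, keyed by host and then by table."""
    totals = defaultdict(lambda: defaultdict(int))
    for r in regions:
        totals[r.host][r.table] += r.store_file_mb
    return {host: dict(tables) for host, tables in totals.items()}


def print_chart(title, totals, mb_per_char=100):
    """Print one text bar per host; each table's segment uses its first letter."""
    print(title)
    for host in sorted(totals):
        bar = "".join(
            table[0] * (mb // mb_per_char)
            for table, mb in sorted(totals[host].items())
        )
        print(f"  {host:<8} {bar} ({sum(totals[host].values())} MB)")


# Made-up cluster state and plan, for illustration only.
regions = [
    Region("r1", "users", "host-a", 400),
    Region("r2", "users", "host-a", 300),
    Region("r3", "events", "host-a", 500),
    Region("r4", "events", "host-b", 200),
]
plan = {"r2": "host-b"}

before = size_by_host_and_table(regions)
after = size_by_host_and_table(simulate_plan(regions, plan))
print_chart("before", before)
print_chart("after", after)
```

Because the simulation only rewrites an in-memory copy of the state, the plan-generation parameters can be changed and the comparison rerun as often as needed without touching the cluster.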

I started implementing the visualization portion of the tool on HBASE-25865.

So, IMHO, having a dry-run feature is very helpful if it can be used together 
with such a visualization tool for making sense of the current state and the 
new proposed target state.

> Add dry run mode to hbase balancer
> ----------------------------------
>
>                 Key: HBASE-26147
>                 URL: https://issues.apache.org/jira/browse/HBASE-26147
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, master
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> It's often rather hard to know how the cost function changes you're making 
> will affect the balance of the cluster, and currently the only way to know is 
> to run it. If the cost decisions are not good, you may have just moved many 
> regions towards a non-ideal balance. Region moves themselves are not free for 
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without 
> actually executing any plans. This will allow an administrator to iterate on 
> their cost functions and use the balancer's logging to see how their changes 
> would affect the cluster. 


