[ https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389133#comment-17389133 ]
Nick Dimiduk commented on HBASE-26147:
--------------------------------------

Not a comment on this feature as such, but on the problem space.

For various reasons, I recently had to implement a "manual" balancer external to the HBase master. It is implemented as a python script that scrapes the output of cluster status and calculates a balancer plan given our requirements. The interesting part, and maybe relevant to your interests, is the next bit. With a plan calculated, it then simulates application of that plan to the known cluster state. Finally it prints out charts showing the cluster status before and after the balance is applied. Specifically, stacked bar charts that are total store file size by table, by host. I then used these visualizations to tune the parameters of my manual balancer until I was satisfied with the result.

I started implementing the visualization portion of the tool on HBASE-25865.

So, IMHO, having a dry-run feature is very helpful if it can be used together with such a visualization tool for making sense of the current state and the new proposed target state.

> Add dry run mode to hbase balancer
> ----------------------------------
>
>                 Key: HBASE-26147
>                 URL: https://issues.apache.org/jira/browse/HBASE-26147
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, master
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> It's often rather hard to know how the cost function changes you're making
> will affect the balance of the cluster, and currently the only way to know is
> to run it. If the cost decisions are not good, you may have just moved many
> regions towards a non-ideal balance. Region moves themselves are not free for
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without
> actually executing any plans.
> This will allow an administrator to iterate on
> their cost functions and use the balancer's logging to see how their changes
> would affect the cluster.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
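
As an illustration of the simulate-then-compare workflow described in the comment above, here is a minimal Python sketch. It is not the script from the comment and not part of HBase: the `Region` type, the `apply_plan` and `size_by_host_and_table` helpers, the host and table names, and all sizes are hypothetical, invented only to show the idea. It applies a plan of region moves to a copy of a known cluster state and totals store file size by host and by table, which is the data a stacked bar chart of "before" and "after" would be drawn from.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical snapshot of one region, e.g. scraped from cluster status."""
    table: str
    host: str
    store_file_mb: int


def apply_plan(state, plan):
    """Return a new state with each (region_name, dest_host) move applied.

    The input state is left untouched, so before and after can be compared.
    """
    simulated = dict(state)
    for region_name, dest_host in plan:
        simulated[region_name] = simulated[region_name]._replace(host=dest_host)
    return simulated


def size_by_host_and_table(state):
    """Total store file size (MB) per host, broken down by table."""
    totals = defaultdict(lambda: defaultdict(int))
    for region in state.values():
        totals[region.host][region.table] += region.store_file_mb
    return {host: dict(tables) for host, tables in totals.items()}


# Made-up cluster state: host-a carries almost all of the data.
before = {
    "r1": Region("users", "host-a", 400),
    "r2": Region("users", "host-a", 300),
    "r3": Region("events", "host-a", 500),
    "r4": Region("events", "host-b", 100),
}

# A candidate plan: move two regions from host-a to host-b.
plan = [("r2", "host-b"), ("r3", "host-b")]
after = apply_plan(before, plan)

for label, state in (("before", before), ("after", after)):
    print(label, size_by_host_and_table(state))
```

The per-host, per-table totals from `size_by_host_and_table` could be handed to any plotting library to draw the stacked bars; plotting is left out here so the sketch depends only on the standard library.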