[ 
https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389152#comment-17389152
 ] 

Bryan Beaudreault commented on HBASE-26147:
-------------------------------------------

Thank you both for the comments.

[~ndimiduk] that sounds great. I also wrote a standalone Java program that 
effectively ran the full balancer in its own process and printed a bunch of 
info. It was useful for some balancer iteration I needed to do, but it was 
super hacky, so I eventually threw it away. I like the idea of combining dry 
run mode with a visualization tool. Dry run mode itself is useful for some 
cases (just seeing how many regions move and the cost reduction), but in other 
cases you really need better data. I'd be interested to see how we can combine 
our two tools once they both land in master.

[~claraxiong] Yeah, I thought about that and love the idea. My current workflow 
is to edit hbase-site.xml, use update_config to load it into the master, and 
then run dry_run_balance; rinse and repeat. This is clunky, and you need to 
remember to revert your hbase-site.xml changes and re-run update_config before 
re-enabling the balancer. I think a separate Jira could be added to allow 
dry_run_balance to take arguments for temporarily modifying the cost functions 
on the fly.
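
As a rough illustration of the on-the-fly idea above, here is a minimal Java 
sketch. It assumes a hypothetical helper (DryRunOverrides.withOverrides, which 
is not an HBase API) that layers temporary cost multipliers over the live 
settings for a single dry run and leaves the live settings untouched. The two 
property names are stochastic balancer multipliers; the values are only 
illustrative.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not HBase code: temporary cost overrides for one dry run. */
public class DryRunOverrides {

    /** Returns a copy of base with overrides applied; base itself is not modified. */
    static Map<String, String> withOverrides(Map<String, String> base,
                                             Map<String, String> overrides) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for the settings currently loaded in the master (values illustrative).
        Map<String, String> live = new HashMap<>();
        live.put("hbase.master.balancer.stochastic.regionCountCost", "500");
        live.put("hbase.master.balancer.stochastic.moveCost", "7");

        // Stand-in for arguments passed to a single dry run.
        Map<String, String> overrides = new HashMap<>();
        overrides.put("hbase.master.balancer.stochastic.moveCost", "100");

        Map<String, String> dryRun = withOverrides(live, overrides);

        String key = "hbase.master.balancer.stochastic.moveCost";
        System.out.println("dry run " + key + " = " + dryRun.get(key));
        System.out.println("live    " + key + " = " + live.get(key));
    }
}
```

Because the live map is never mutated in this sketch, there is no revert step 
to forget before the balancer is re-enabled.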

Thanks for the pointer to 25973. I didn't have that one in my fork when I 
tested this. I've added it now.

> Add dry run mode to hbase balancer
> ----------------------------------
>
>                 Key: HBASE-26147
>                 URL: https://issues.apache.org/jira/browse/HBASE-26147
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, master
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> It's often rather hard to know how the cost function changes you're making 
> will affect the balance of the cluster, and currently the only way to know is 
> to run it. If the cost decisions are not good, you may have just moved many 
> regions towards a non-ideal balance. Region moves themselves are not free for 
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without 
> actually executing any plans. This will allow an administrator to iterate on 
> their cost functions and use the balancer's logging to see how their changes 
> would affect the cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
