[ 
https://issues.apache.org/jira/browse/HBASE-25865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340992#comment-17340992
 ] 

Nick Dimiduk commented on HBASE-25865:
--------------------------------------

The original scripts were written in Python, used Pandas to manipulate the raw 
data, and Plot.ly to render charts in a Jupyter notebook. I've started a PoC 
implementation using Vega-Lite to directly parse a JSON conversion of the 
{{ClusterMetrics}} object and create an interactive chart. Of note, Vega seems 
to handle the data volume produced by this cluster much better than Plot.ly did.
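
For illustration only, the per-table, per-host aggregation that feeds such a 
chart could look roughly like the sketch below. The record layout and field 
names ({{host}}, {{table}}, {{store_file_size_mb}}) are hypothetical and are not 
the actual {{ClusterMetrics}} JSON schema, and plain Python stands in for 
Pandas here.

```python
from collections import defaultdict

# Hypothetical flattened records, one per assigned region. The field names
# are assumptions for this sketch, not the real ClusterMetrics JSON layout.
regions = [
    {"host": "rs1", "table": "t1", "store_file_size_mb": 120},
    {"host": "rs1", "table": "t1", "store_file_size_mb": 80},
    {"host": "rs1", "table": "t2", "store_file_size_mb": 300},
    {"host": "rs2", "table": "t1", "store_file_size_mb": 50},
]


def summarize(records):
    """Return {(host, table): {"regions": count, "size_mb": total size}}."""
    summary = defaultdict(lambda: {"regions": 0, "size_mb": 0})
    for record in records:
        cell = summary[(record["host"], record["table"])]
        cell["regions"] += 1
        cell["size_mb"] += record["store_file_size_mb"]
    return dict(summary)


summary = summarize(regions)

# One row per (host, table) pair: the shape a grouped bar chart would consume.
for (host, table), cell in sorted(summary.items()):
    print(host, table, cell["regions"], cell["size_mb"])
```

Each resulting row carries the two quantities of interest, region count and 
total store file size, keyed by host and table.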

> Visualize current state of region assignment
> --------------------------------------------
>
>                 Key: HBASE-25865
>                 URL: https://issues.apache.org/jira/browse/HBASE-25865
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>
> After several months of debugging and tuning the balancer and normalizer on a 
> large production cluster, we found that working from visualizations of the 
> current region state was very useful for understanding behaviors and 
> quantifying improvements we made along the way. Specifically, we found that a 
> chart of total assigned region count and total assigned store file size, 
> per table per host, was immensely useful for tuning the balancer. 
> Histograms of store file size made understanding normalizer activity much 
> more intuitive.
> Our scripts would parse the output of the shell's {{status 'detailed'}} 
> command, extract the desired metric, and produce charts. I'd like to build 
> equivalent functionality into the master UI, with data coming directly 
> from the {{ClusterMetrics}} object and rendered as an interactive chart in 
> the browser.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
