[ https://issues.apache.org/jira/browse/HBASE-25865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340992#comment-17340992 ]
Nick Dimiduk commented on HBASE-25865:
--------------------------------------

The original scripts were written in Python, used Pandas to manipulate the raw data, and used Plot.ly to render charts in a Jupyter notebook. I've started a PoC implementation that uses Vega-Lite to directly parse a JSON conversion of the {{ClusterMetrics}} object and create an interactive chart. Of note, Vega seems to handle the data volume produced by this cluster much better than Plot.ly did.

> Visualize current state of region assignment
> --------------------------------------------
>
>                 Key: HBASE-25865
>                 URL: https://issues.apache.org/jira/browse/HBASE-25865
>             Project: HBase
>          Issue Type: New Feature
>          Components: master
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>
> After several months of debugging and tuning the balancer and normalizer on a
> large production cluster, we found that working from visualizations of the
> current region state was very useful for understanding behaviors and
> quantifying improvements we made along the way. Specifically, we found that a
> chart of total assigned region count and total assigned region store file
> size per table per host was immensely useful for tuning the balancer.
> Histograms of store file size made understanding normalizer activity much
> more intuitive.
> Our scripts would parse the output of the shell's {{status 'detailed'}}
> command, extract the desired metric, and produce charts. I'd like to build
> the equivalent functionality into the master UI, with data coming directly
> from the {{ClusterMetrics}} object and rendered as an interactive chart in
> the browser.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)