[ https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348563#comment-15348563 ]
Zhe Zhang commented on HDFS-10534:
----------------------------------

Thanks Andrew. I just reverted the change.

bq. Why not present a histogram rather than a single threshold like this? That way we don't add a new config, present more info, and don't require a restart to change this threshold.

In our case we are mostly interested in the 95th percentile because it serves as an alarm that 5% of DNs are becoming hot nodes and will likely cause job failures. A histogram is a nice idea actually. We can think about an appropriate granularity (e.g. every 5%?) for it. The only drawback is that it will add more content to the NN web UI and make it busier; I imagine it will be a table.

bq. This is also a metric that could be calculated in client-side JS from existing information.

True. But I think showing it on the NN web UI is more convenient for admins. We proposed the change because the median (50th percentile) is actually a poor metric for illustrating the imbalance level, especially in a busy cluster with, say, > 70% overall utilization. We therefore wanted a "better median".

bq. the config says it's a percentile, but it's really a quantile.

Good catch. We could change the config to be a real percentile, between 0 and 100. Per above, we could also show a histogram instead.

So overall I like the histogram idea. [~lewuathe] What are your thoughts?

> NameNode WebUI should display DataNode usage rate with a certain percentile
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-10534
>                 URL: https://issues.apache.org/jira/browse/HDFS-10534
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode, ui
>            Reporter: Zhe Zhang
>            Assignee: Kai Sasaki
>         Attachments: HDFS-10534.01.patch, HDFS-10534.02.patch,
> HDFS-10534.03.patch, HDFS-10534.04.patch, HDFS-10534.05.patch, Screen Shot
> 2016-06-23 at 6.25.50 AM.png
>
>
> In addition to *Min/Median/Max*, another meaningful metric for cluster
> balance is DN usage rate at a certain percentile (e.g. 90 or 95). We should
> add a config option, and another field on NN WebUI, to display this.
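
To make the percentile-versus-histogram trade-off discussed in this comment concrete, here is a minimal sketch in Java. It is not code from any of the attached patches: the class `DnUsageStats`, its method names, and the choice of the nearest-rank percentile definition are all assumptions made for illustration. It takes per-DataNode usage as percentages in the range 0 to 100, treats `p` as a real percentile (0 to 100) rather than a 0 to 1 quantile, and builds a histogram with a configurable bucket width such as the 5% granularity floated in the comment.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of HDFS): summarizes per-DataNode usage percentages. */
final class DnUsageStats {

    /**
     * Nearest-rank percentile of DataNode usage.
     * p is a real percentile in (0, 100], not a quantile in (0, 1].
     */
    static double percentile(double[] usagePct, double p) {
        if (usagePct.length == 0) {
            throw new IllegalArgumentException("no DataNodes");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = usagePct.clone();   // do not reorder the caller's array
        Arrays.sort(sorted);
        // 1-based nearest rank: smallest rank covering at least p% of the nodes.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    /**
     * Counts DataNodes per usage bucket. bucketWidthPct must divide 100 evenly;
     * bucket i covers [i * width, (i + 1) * width), and 100% falls in the last bucket.
     */
    static int[] histogram(double[] usagePct, int bucketWidthPct) {
        if (bucketWidthPct <= 0 || 100 % bucketWidthPct != 0) {
            throw new IllegalArgumentException("bucket width must divide 100");
        }
        int buckets = 100 / bucketWidthPct;
        int[] counts = new int[buckets];
        for (double u : usagePct) {
            int idx = (int) (u / bucketWidthPct);
            idx = Math.min(Math.max(idx, 0), buckets - 1); // clamp out-of-range values
            counts[idx]++;
        }
        return counts;
    }
}
```

With this sketch, `DnUsageStats.percentile(usage, 95)` gives the single "hot node" alarm value, while `DnUsageStats.histogram(usage, 5)` gives the 20-bucket table that would be rendered on the web UI without needing a configured threshold.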