[ 
https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348563#comment-15348563
 ] 

Zhe Zhang commented on HDFS-10534:
----------------------------------

Thanks Andrew. I just reverted the change.

bq. Why not present a histogram rather than a single threshold like this? That 
way we don't add a new config, present more info, and don't require a restart 
to change this threshold.
In our case we are mostly interested in the 95th percentile because it serves 
as an alarm that 5% of DNs are becoming hot nodes and will likely cause job 
failures. A histogram is actually a nice idea. We can think about an 
appropriate granularity (e.g. every 5%?) for it. The only drawback is that it 
will add more content to the NN web UI and make it busier -- I imagine it will 
be a table.
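
To make the granularity question concrete, here is a minimal sketch of the 
bucketing, assuming per-DN usage is available as a percentage in [0, 100]. 
The class and method names (DnUsageHistogram, histogram) are hypothetical and 
are not taken from any of the attached patches.

```java
// Hypothetical helper for illustration only; not part of any HDFS patch.
class DnUsageHistogram {

    /**
     * Counts DataNodes per usage bucket. Bucket i covers
     * [i * bucketWidth, (i + 1) * bucketWidth); the last bucket also
     * includes exactly 100%.
     */
    static int[] histogram(double[] usagePercents, int bucketWidth) {
        if (bucketWidth <= 0 || 100 % bucketWidth != 0) {
            throw new IllegalArgumentException("bucketWidth must divide 100 evenly");
        }
        int buckets = 100 / bucketWidth;
        int[] counts = new int[buckets];
        for (double u : usagePercents) {
            if (Double.isNaN(u) || u < 0 || u > 100) {
                throw new IllegalArgumentException("usage out of range: " + u);
            }
            // Clamp so that a fully used DN (100%) lands in the last bucket.
            int idx = Math.min((int) (u / bucketWidth), buckets - 1);
            counts[idx]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample values, one per DataNode.
        double[] usage = {12.0, 48.5, 51.0, 52.3, 97.0, 100.0};
        int width = 5;
        int[] counts = histogram(usage, width);
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                System.out.printf("%d-%d%%: %d DN(s)%n", i * width, (i + 1) * width, counts[i]);
            }
        }
    }
}
```

With a 5% width this yields 20 rows, which gives a feel for how much the 
table would add to the page; a 10% width would halve that.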

bq. This is also a metric that could be calculated in client-side JS from 
existing information.
True. But I think showing it on the NN web UI is more convenient for admins. 
We proposed the change because the median (50th percentile) is actually a poor 
metric for illustrating the imbalance level, especially in a busy cluster 
with, say, > 70% overall utilization. We therefore wanted a "better median".
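
As a rough illustration of why the median hides hot nodes, the sketch below 
computes a nearest-rank percentile over per-DN usage values. It assumes the 
percentile is given on a 0-100 scale; the names (DnUsagePercentile, 
percentile) and the sample numbers are made up for the example and do not 
reflect the actual patch.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of any HDFS patch.
class DnUsagePercentile {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     * p must be in (0, 100].
     */
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("p must be in (0, 100]: " + p);
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        // 1-based rank; multiply before dividing to keep integer cases exact.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up cluster: 18 DNs at 72% usage, 2 hot DNs at 98%.
        double[] usage = new double[20];
        Arrays.fill(usage, 72.0);
        usage[3] = 98.0;
        usage[11] = 98.0;
        System.out.println("median = " + percentile(usage, 50));
        System.out.println("p95    = " + percentile(usage, 95));
    }
}
```

In this made-up cluster the median stays at the common 72% value while the 
95th percentile picks up the hot nodes, which is the behavior we were after.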

bq. the config says it's a percentile, but it's really a quantile.
Good catch. We could change the config to take a real percentile between 0 and 
100. Per the above, we could also show a histogram instead.

So overall I like the histogram idea. [~lewuathe] What are your thoughts?

> NameNode WebUI should display DataNode usage rate with a certain percentile
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-10534
>                 URL: https://issues.apache.org/jira/browse/HDFS-10534
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode, ui
>            Reporter: Zhe Zhang
>            Assignee: Kai Sasaki
>         Attachments: HDFS-10534.01.patch, HDFS-10534.02.patch, 
> HDFS-10534.03.patch, HDFS-10534.04.patch, HDFS-10534.05.patch, Screen Shot 
> 2016-06-23 at 6.25.50 AM.png
>
>
> In addition to *Min/Median/Max*, another meaningful metric for cluster 
> balance is DN usage rate at a certain percentile (e.g. 90 or 95). We should 
> add a config option, and another field on NN WebUI, to display this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

