[GitHub] [hadoop] virajjasani commented on pull request #4107: HDFS-16521. DFS API to retrieve slow datanodes

GitBox Wed, 30 Mar 2022 04:46:23 -0700


virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1083038635

@iwasakims @ayushtkn Earlier I was thinking about adding SLOW_NODE to
`DatanodeReportType` so that ClientProtocol#getDatanodeReport could handle
retrieval of slow nodes, but the server-side implementation was getting a
bit more complicated with that approach. To keep this a separate and clean
workflow, I decided to add it as a new API in ClientProtocol. Other than
that, it is quite similar to the getDatanodeReport() API.
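
To make the two options concrete, here is a minimal sketch of the two API shapes. Every name in it (`ReportType`, `NodeInfo`, `ReportProtocol`, `InMemoryCluster`) is a hypothetical, simplified stand-in and not the real HDFS type or signature; it only contrasts an enum-selected report with a dedicated call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the real HDFS types.
enum ReportType { ALL, LIVE, DEAD, SLOW_NODE }

final class NodeInfo {
    final String host;
    final boolean alive;
    final boolean slow;

    NodeInfo(String host, boolean alive, boolean slow) {
        this.host = host;
        this.alive = alive;
        this.slow = slow;
    }
}

interface ReportProtocol {
    // Option A: one entry point; the enum value selects the filter.
    List<NodeInfo> getReport(ReportType type);

    // Option B: a dedicated call that only ever returns slow nodes.
    List<NodeInfo> getSlowNodeReport();
}

final class InMemoryCluster implements ReportProtocol {
    private final List<NodeInfo> nodes;

    InMemoryCluster(List<NodeInfo> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    @Override
    public List<NodeInfo> getReport(ReportType type) {
        List<NodeInfo> out = new ArrayList<>();
        for (NodeInfo n : nodes) {
            boolean match;
            switch (type) {
                case LIVE: match = n.alive; break;
                case DEAD: match = !n.alive; break;
                case SLOW_NODE: match = n.slow; break;
                default: match = true; // ALL
            }
            if (match) out.add(n);
        }
        return out;
    }

    @Override
    public List<NodeInfo> getSlowNodeReport() {
        List<NodeInfo> out = new ArrayList<>();
        for (NodeInfo n : nodes) {
            if (n.slow) out.add(n);
        }
        return out;
    }
}
```

In this toy model both calls return the same nodes; the difference is only whether slow-node retrieval shares the existing report code path or gets its own.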

When HDFS throughput is affected, it would be really helpful for operators to
check slow node details with the `dfsadmin -report` command, the same way they
already retrieve decommissioning, dead, and live nodes.

> How about enhancing metrics if the current information in the
SlowPeersReport is insufficient?

We can do this, but I believe that adding more info about slow nodes only when
required, i.e. through a user-triggered API (similar to ClientProtocol), would
have less overhead than continuously exposing additional details in the
metrics. WDYT?

> Thanks to
[JMXJsonServlet](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/jmx/JMXJsonServlet.java),
we can get metrics in JSON format via HTTP/HTTPS port of NameNode without
additional configuration.

Yes, this is helpful for sure, but only if the NameNode port is exposed to
downstream applications.
For instance, in a K8s cluster, NameNode port access might be restricted to
only the NameNode and DataNode pods/containers, so other service pods (e.g.
HBase service pods/containers) would not have access to the NameNode port and
hence would have no way to read metric values. I agree that exposing metrics is
definitely good for giving end customers a high-level view. Applications, on
the other hand, might or might not have access to values derived from metrics,
depending on the environment.
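
For reference, a rough sketch of what the JMX route looks like from a client, assuming the NameNode HTTP port is reachable. The host is hypothetical, 9870 is the Hadoop 3 default HTTP port, and the MBean name is my assumption of where the slow peers report is published, so please verify it against the `/jmx` output of your own NameNode. `NameNodeJmxProbe` is an illustrative helper, not part of Hadoop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class NameNodeJmxProbe {
    // Assumed MBean name; verify against the /jmx output of your NameNode.
    static final String STATUS_BEAN = "Hadoop:service=NameNode,name=NameNodeStatus";

    // Builds a JMXJsonServlet query URL using its "qry" parameter.
    static String buildUrl(String host, int port) {
        try {
            return "http://" + host + ":" + port + "/jmx?qry="
                + URLEncoder.encode(STATUS_BEAN, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Fetches the JSON body; fails with IOException if the port is not reachable.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would run something like `NameNodeJmxProbe.fetch(NameNodeJmxProbe.buildUrl("nn.example.internal", 9870))`. From a pod without network access to that port, the call fails with an `IOException`, which is the restriction described above.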

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
