[ 
https://issues.apache.org/jira/browse/HDFS-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201220#comment-14201220
 ] 

Aaron T. Myers commented on HDFS-7331:
--------------------------------------

{quote}
If I understand correctly, the metrics record failures for each DataXceiver 
endpoint. I think you are assuming that all accesses come from within the 
cluster. That assumption no longer holds when computation and storage are 
separated.

I have no strong opinion on this. I'm fine with using a HashMap in this jira and 
bounding its size down the road if problems occur.
{quote}

Got it. I'd suggest, then, that we keep the support for bounding the size in 
the patch and make the bound configurable, but have it default to unbounded. 
That way, if it ends up causing a problem for someone, they can configure a 
smaller size without having to deploy a new version of the software.
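
To make the suggestion concrete, here is a rough sketch of a count map whose 
bound is configurable and off by default. It is not the code from the attached 
patches. The class name, the method names, the "non-positive limit means 
unbounded" convention, and the choice to evict the least recently touched host 
are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; names and eviction policy are assumptions. */
class BoundedErrorCounts {
  private final Map<String, Long> counts;

  /** @param maxEntries assumed convention: a value <= 0 means unbounded. */
  BoundedErrorCounts(final int maxEntries) {
    // Access-ordered LinkedHashMap: the least recently touched host is the
    // eldest entry, so it is the one dropped when the bound is exceeded.
    this.counts = new LinkedHashMap<String, Long>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        return maxEntries > 0 && size() > maxEntries;
      }
    };
  }

  /** Adds one error for the given host. */
  synchronized void increment(String host) {
    counts.merge(host, 1L, Long::sum);
  }

  /** Returns the count for a host, or 0 if it is absent or was evicted. */
  synchronized long get(String host) {
    Long value = counts.get(host);
    return value == null ? 0L : value;
  }

  /** Returns the number of hosts currently tracked. */
  synchronized int hostCount() {
    return counts.size();
  }
}
```

With a limit of 0 this behaves like a plain map, so the default would match 
the unbounded HashMap behavior discussed above.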

This brings up a related point - I think the current patch groups the counts by 
remote address, which I believe includes the remote port. In the case of non-DN 
clients, this port can be anything in the ephemeral port range, so counts for 
the same host would be spread across many entries, which isn't very useful. I 
think it would be better to group by IP address only, and not include the port.
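
As a sketch of grouping by IP only, assuming the remote endpoint is available 
as a java.net.InetSocketAddress (I have not checked how the patch obtains it), 
something along these lines would do. The class and method names are 
hypothetical.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the code from the attached patches. */
class PerHostErrorCounts {
  private final Map<String, Long> counts = new HashMap<>();

  /** Builds the grouping key from the IP address only, ignoring the port. */
  static String hostKey(InetSocketAddress remote) {
    InetAddress addr = remote.getAddress();
    // getAddress() is null for unresolved addresses; fall back to the
    // host string in that case.
    return addr != null ? addr.getHostAddress() : remote.getHostString();
  }

  /** Records one network error against the remote host. */
  synchronized void recordError(InetSocketAddress remote) {
    counts.merge(hostKey(remote), 1L, Long::sum);
  }

  /** Returns the error count recorded for the given IP string. */
  synchronized long errorsFor(String ip) {
    return counts.getOrDefault(ip, 0L);
  }
}
```

Two connections from the same client on different ephemeral ports then land 
in the same entry instead of creating two.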

Do folks agree with the above?

> Add Datanode network counts to datanode jmx page
> ------------------------------------------------
>
>                 Key: HDFS-7331
>                 URL: https://issues.apache.org/jira/browse/HDFS-7331
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>            Priority: Minor
>         Attachments: HDFS-7331.001.patch, HDFS-7331.002.patch
>
>
> Add per-datanode counts to the datanode jmx page. For example, networkErrors 
> could be exposed like this:
> {noformat}
>   }, {
> ...
>     "DatanodeNetworkCounts" : "{\"dn1\":{\"networkErrors\":1}}",
> ...
>     "NamenodeAddresses" : 
> "{\"localhost\":\"BP-1103235125-127.0.0.1-1415057084497\"}",
>     "VolumeInfo" : 
> "{\"/tmp/hadoop-cwl/dfs/data/current\":{\"freeSpace\":3092725760,\"usedSpace\":28672,\"reservedSpace\":0}}",
>     "ClusterId" : "CID-4b38f2ae-5e58-4e15-b3cf-3ba3f46e724e"
>   }, {
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
