[ 
https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116671#comment-15116671
 ] 

Sangjin Lee commented on HDFS-9579:
-----------------------------------

I went over the existing version of the patch.

First, I don't think using a {{HashMap}} for the bytes read per distance is 
thread safe. Note that one thread (the owner) will modify this map in 
{{incrementBytesReadByDistance()}} while any thread can read the values off the 
map via {{getBytesReadByDistance()}} and {{visitAll()}}, all unsynchronized. 
The problems could range memory visibility, ConcurrentModificationException, 
and worse. We need to make this thread safe.

Another reservation I have with using a map: I'm a little concerned about 
memory implications. An additional map per {{StatisticsData}} can add up. Can 
we find a way of avoiding using a map? I know it may sound ugly, but one other 
option is to use individual long (volatile) variables. That can also address 
the thread safety. Thoughts?

Also, in NetworkTopology.java (lines 373-381) {{equals()}} and {{hashCode()}} 
are superfluous here as it does not modify the super behavior in any way.

> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-9579
>                 URL: https://issues.apache.org/jira/browse/HDFS-9579
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, 
> HDFS-9579.patch, MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight 
> as to the traffic volume for each network distance to distinguish cross-DC 
> traffic, local-DC-remote-rack, etc.
> FileSystem's existing {{bytesRead}} metrics tracks all the bytes read. To 
> provide additional metrics for each network distance, we can add additional 
> metrics to FileSystem level and have {{DFSInputStream}} update the value 
> based on the network distance between client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its 
> initialization. It doesn't need to resolve datanode's network location for 
> each read as {{DatanodeInfo}} already has the info.
> There are existing HDFS specific metrics such as {{ReadStatistics}} and 
> {{DFSHedgedReadMetrics}}. But these metrics are only accessible via 
> {{DFSClient}} or {{DFSInputStream}}. Not something that application framework 
> such as MR and Tez can get to. That is the benefit of storing these new 
> metrics in FileSystem.Statistics.
> This jira only includes metrics generation by HDFS. The consumption of these 
> metrics at MR and Tez will be tracked by separated jiras.
> We can add similar metrics for HDFS write scenario later if it is necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to