[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400985#comment-13400985 ]
Hadoop QA commented on HDFS-3170:
---------------------------------

+1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12533205/hdfs-3170.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 1 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//console

This message is automatically generated.

> Add more useful metrics for write latency
> -----------------------------------------
>
>                 Key: HDFS-3170
>                 URL: https://issues.apache.org/jira/browse/HDFS-3170
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Matthew Jacobs
>         Attachments: hdfs-3170.txt
>
>
> Currently, the only write-latency related metric we expose is the total
> amount of time taken by opWriteBlock. This is practically useless, since (a)
> different blocks may be wildly different sizes, and (b) if the writer is only
> generating data slowly, it will make a block write take longer by no fault of
> the DN. I would like to propose two new metrics:
> 1) *flush-to-disk time*: count how long it takes for each call to flush an
> incoming packet to disk (including the checksums).
> In most cases this will be close to 0, as it only flushes to buffer cache,
> but if the backing block device enters congested writeback, it can take much
> longer, which provides an interesting metric.
> 2) *round trip to downstream pipeline node*: track the round trip latency for
> the part of the pipeline between the local node and its downstream neighbors.
> When we add a new packet to the ack queue, save the current timestamp. When
> we receive an ack, update the metric based on how long since we sent the
> original packet. This gives a metric of the total RTT through the pipeline.
> If we also include this metric in the ack to upstream, we can subtract the
> amount of time due to the later stages in the pipeline and have an accurate
> count of this particular link.
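
For illustration only, the two proposed metrics could be sketched roughly as follows. This is not the code in hdfs-3170.txt. The class name WriteLatencySketch and all of its methods are hypothetical, timestamps are passed in by the caller (for example from System.nanoTime()) so the logic stays deterministic, and the sketch assumes each ack carries the downstream node's own RTT measurement, as the description suggests. Clamping the per-link value at zero is an extra assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the two proposed metrics; not the classes touched by the patch. */
class WriteLatencySketch {
    // Enqueue timestamp per packet sequence number, kept until the ack arrives.
    private final Map<Long, Long> enqueueTimeNanos = new HashMap<>();

    private long flushCount;
    private long flushTotalNanos;
    private long ackCount;
    private long totalRttNanos;
    private long linkRttNanos;

    /** Metric 1: record how long one flush of a packet (data plus checksums) took. */
    void recordFlush(long startNanos, long endNanos) {
        flushCount++;
        flushTotalNanos += endNanos - startNanos;
    }

    /** Metric 2, step 1: save the current timestamp when a packet joins the ack queue. */
    void packetEnqueued(long seqno, long nowNanos) {
        enqueueTimeNanos.put(seqno, nowNanos);
    }

    /**
     * Metric 2, step 2: on ack, compute the total RTT and subtract the RTT
     * reported by the downstream node (assumed to travel in the ack) to
     * isolate this link. Returns the per-link RTT, or -1 for an unknown seqno.
     */
    long ackReceived(long seqno, long nowNanos, long downstreamRttNanos) {
        Long sentAt = enqueueTimeNanos.remove(seqno);
        if (sentAt == null) {
            return -1; // ack for a packet that was never recorded
        }
        long total = nowNanos - sentAt;
        // Clamp at zero in case the downstream clock measured slightly more.
        long link = Math.max(0L, total - downstreamRttNanos);
        ackCount++;
        totalRttNanos += total;
        linkRttNanos += link;
        return link;
    }

    double avgFlushNanos() {
        return flushCount == 0 ? 0.0 : (double) flushTotalNanos / flushCount;
    }

    double avgTotalRttNanos() {
        return ackCount == 0 ? 0.0 : (double) totalRttNanos / ackCount;
    }

    double avgLinkRttNanos() {
        return ackCount == 0 ? 0.0 : (double) linkRttNanos / ackCount;
    }
}
```

A caller would wrap each flush with two timestamp reads and pass them to recordFlush, call packetEnqueued when adding a packet to the ack queue, and call ackReceived from the ack-handling path. The last node in a pipeline would pass 0 as the downstream RTT, so its total and per-link values coincide.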