[ https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122346#comment-16122346 ]
Konstantin Shvachko commented on MAPREDUCE-6931:
------------------------------------------------

[~dennishuo] I agree that the "Total Throughput" metric depends heavily on how you run the job. That is exactly what makes it a MapReduce metric, not an HDFS one. One can go to the YARN UI and divide the HDFS bytes written by the job time for any job, but that does not measure the HDFS write operation. I think we should just remove it.

> Fix TestDFSIO "Total Throughput" calculation
> --------------------------------------------
>
> Key: MAPREDUCE-6931
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: benchmarks, test
> Affects Versions: 2.8.0
> Reporter: Dennis Huo
> Priority: Trivial
>
> The new "Total Throughput" line added in
> https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as
> {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but
> {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the
> actual value:
> {code:java}
> String resultLines[] = {
>     "----- TestDFSIO ----- : " + testType,
>     " Date & time: " + new Date(System.currentTimeMillis()),
>     " Number of files: " + tasks,
>     " Total MBytes processed: " + df.format(toMB(size)),
>     " Throughput mb/sec: " + df.format(size * 1000.0 / (time * MEGA)),
>     "Total Throughput mb/sec: " + df.format(toMB(size) / ((float)execTime)),
>     " Average IO rate mb/sec: " + df.format(med),
>     " IO rate std deviation: " + df.format(stdDev),
>     " Test exec time sec: " + df.format((float)execTime / 1000),
>     "" };
> {code}
> The different calculated fields can also use toMB and a shared
> milliseconds-to-seconds conversion to make it easier to keep units consistent.
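
For illustration, the unit problem described in the quoted issue can be sketched as below. This is a minimal, standalone sketch and not the actual TestDFSIO patch: the class name ThroughputSketch and the helpers msToSec, buggyTotalThroughput and totalThroughputMBps are hypothetical, and MEGA is assumed to be 1024 * 1024 bytes.

```java
// Hypothetical standalone sketch; names other than toMB, MEGA, size and
// execTime do not come from the issue and are assumptions for illustration.
final class ThroughputSketch {
    // Assumed bytes-per-megabyte constant (1024 * 1024).
    static final long MEGA = 0x100000L;

    // Converts a byte count to megabytes.
    static double toMB(long bytes) {
        return (double) bytes / MEGA;
    }

    // Shared milliseconds-to-seconds conversion, as the issue suggests.
    static double msToSec(long millis) {
        return millis / 1000.0;
    }

    // The calculation quoted in the issue: megabytes divided by a time in
    // milliseconds, which yields MB per millisecond rather than MB/s.
    static double buggyTotalThroughput(long sizeBytes, long execTimeMs) {
        return toMB(sizeBytes) / ((float) execTimeMs);
    }

    // Same calculation with the time converted to seconds first.
    static double totalThroughputMBps(long sizeBytes, long execTimeMs) {
        return toMB(sizeBytes) / msToSec(execTimeMs);
    }

    public static void main(String[] args) {
        long size = 100L * MEGA;   // 100 MB processed
        long execTime = 2000L;     // job ran for 2000 ms
        System.out.println("buggy: " + buggyTotalThroughput(size, execTime));
        System.out.println("fixed: " + totalThroughputMBps(size, execTime));
    }
}
```

For 100 MB processed in 2000 ms, the two methods differ by a factor of 1000, which matches the discrepancy the reporter describes. Routing every field through the same toMB and msToSec helpers keeps the units consistent across the result lines.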