Hello, I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master and 4 slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because the "throughput" and "average IO rate" values are similar, I only post the "throughput" values from running the same command 3 times.
> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
  + with "dfs.replication = 1" => 33.60 / 31.48 / 30.95
  + with "dfs.replication = 2" => 26.40 / 20.99 / 21.70

I found something strange while reading the source code. The value of mapred.reduce.tasks is always forced to 1: runIOTest() calls job.setNumReduceTasks(1), and analyzeResult() reads reduceFile = new Path(WRITE_DIR, "part-00000"), i.e. it assumes a single reduce output file. I tested with other values of mapred.reduce.tasks, e.g. mapred.reduce.tasks = 2, and got almost the same result as with mapred.reduce.tasks = 1.
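Just to illustrate what I mean, here is a minimal stand-alone sketch (the class name is made up, this is not code from TestDFSIO itself) showing that the programmatic call overrides whatever mapred.reduce.tasks is set to in the configuration, which would explain why my setting made no difference:

import org.apache.hadoop.mapred.JobConf;

public class ReducerCountCheck {
    public static void main(String[] args) {
        JobConf job = new JobConf();
        // What hadoop-site.xml (or -D on the command line) asks for:
        job.setInt("mapred.reduce.tasks", 2);
        // What runIOTest() then does programmatically:
        job.setNumReduceTasks(1);
        // Prints 1: the programmatic call wins, so the job always runs with a
        // single reducer and its output ends up in the single file part-00000.
        System.out.println("reduce tasks actually used: " + job.getNumReduceTasks());
    }
}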
And I don't understand the line "double med = rate / 1000 / tasks". Shouldn't it be "double med = rate * tasks / 1000"? Can anyone give me a hint? Any help would be appreciated, thanks a lot!
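P.S. To show how I tried to reason about that line, here is a small stand-alone example (the class name and the input numbers are made up). The assumption that the accumulated "rate" is the sum of the per-task IO rates multiplied by 1000 is only my reading of the mapper code and may be wrong, which is exactly what I'd like someone to confirm:

public class AverageRateCheck {
    public static void main(String[] args) {
        // Pretend 4 map tasks measured per-file IO rates of 30, 32, 28 and 34 MB/s
        // (made-up numbers, only for the arithmetic).
        double[] perTaskMbSec = {30.0, 32.0, 28.0, 34.0};
        long tasks = perTaskMbSec.length;

        // My assumption: each task emits its rate multiplied by 1000 and the
        // reducer sums these values into the single "rate" field in part-00000.
        double rate = 0;
        for (double r : perTaskMbSec) {
            rate += r * 1000;                // rate = 124000 after the loop
        }

        double med = rate / 1000 / tasks;    // 31.0  -> the mean of the per-task rates
        double alt = rate * tasks / 1000;    // 496.0 -> 4 times the plain sum, not an average

        System.out.println("rate / 1000 / tasks = " + med);
        System.out.println("rate * tasks / 1000 = " + alt);
    }
}

Under that assumption the existing division by tasks gives a per-task average, but if the scaling or the accumulation is different from what I think, the formula could still be off.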