Hi,
I am running the smoke tests for Hadoop (2.4.1). For “terasort”, my test
command is below. The map phase completed very quickly because it was split
into many tasks, but the reduce phase takes a very long time and runs as only
1 reduce task. Is there a way to speed up the reduce phase by splitting the
large reduce job into many smaller ones and running them across the cluster,
like the map phase?
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort \
    /tmp/teragenout /tmp/terasortout
Job ID                  Name      State    Maps Total  Maps Completed  Reduce Total  Reduce Completed
job_1409876705457_0002  TeraSort  RUNNING  22352       22352           1             0
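A minimal sketch of the kind of change I am asking about, assuming the
standard mapreduce.job.reduces property is honored by terasort when passed as
a -D generic option ahead of the input and output paths. The value 32 is an
arbitrary placeholder, not a tuned number for this cluster, and the REDUCES
and CMD variable names are only for illustration.

```shell
# Hypothetical value: how many reduce tasks to request (placeholder, not tuned).
REDUCES=32

# Same jar and paths as my command above; the -D option goes before the paths.
CMD="bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort -Dmapreduce.job.reduces=${REDUCES} /tmp/teragenout /tmp/terasortout"

# Dry run: print the command for review. To actually submit it: eval "$CMD"
echo "$CMD"
```

If this is the right knob, I would expect "Reduce Total" in the job list to
show the requested number instead of 1.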
Regards
Arthur