Performance question

Mark Kerzner Sun, 19 Apr 2009 21:27:07 -0700

Hi,

I ran a Hadoop MapReduce job in local mode, reading from and writing to
HDFS, and it took 2.5 minutes. Essentially the same operations on the local
file system, without MapReduce, took half a minute. Is this to be expected?


Most of the time seemed to be lost inside the MapReduce operation itself.
For example, after these messages

09/04/19 23:23:01 INFO mapred.LocalJobRunner: reduce > reduce
09/04/19 23:23:01 INFO mapred.JobClient:  map 100% reduce 92%
09/04/19 23:23:04 INFO mapred.LocalJobRunner: reduce > reduce

it waited for a long time. The final output lines were

09/04/19 23:24:12 INFO mapred.LocalJobRunner: reduce > reduce
09/04/19 23:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
09/04/19 23:24:12 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost/output
09/04/19 23:24:13 INFO mapred.JobClient: Job complete: job_local_0001
09/04/19 23:24:13 INFO mapred.JobClient: Counters: 13
09/04/19 23:24:13 INFO mapred.JobClient:   File Systems
09/04/19 23:24:13 INFO mapred.JobClient:     HDFS bytes read=138103444
09/04/19 23:24:13 INFO mapred.JobClient:     HDFS bytes written=107357785
09/04/19 23:24:13 INFO mapred.JobClient:     Local bytes read=282509133
09/04/19 23:24:13 INFO mapred.JobClient:     Local bytes written=376697552
09/04/19 23:24:13 INFO mapred.JobClient:   Map-Reduce Framework
09/04/19 23:24:13 INFO mapred.JobClient:     Reduce input groups=184
09/04/19 23:24:13 INFO mapred.JobClient:     Combine output records=185
09/04/19 23:24:13 INFO mapred.JobClient:     Map input records=209
09/04/19 23:24:13 INFO mapred.JobClient:     Reduce output records=184
09/04/19 23:24:13 INFO mapred.JobClient:     Map output bytes=91863989
09/04/19 23:24:13 INFO mapred.JobClient:     Map input bytes=69051592
09/04/19 23:24:13 INFO mapred.JobClient:     Combine input records=185
09/04/19 23:24:13 INFO mapred.JobClient:     Map output records=209
09/04/19 23:24:13 INFO mapred.JobClient:     Reduce input records=184
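
As a rough check on the numbers above, here is a small Python sketch that works out the length of the pause and a few ratios from the counters. It is only arithmetic on the values quoted in the log; the variable names are illustrative, and the timestamp format (yy/MM/dd) is an assumption based on the date of this message.

```python
from datetime import datetime

# Assumed log timestamp layout: yy/MM/dd HH:mm:ss (matches 19 Apr 2009).
FMT = "%y/%m/%d %H:%M:%S"

# Last message before the pause and first message after it, from the log above.
last_before_wait = datetime.strptime("09/04/19 23:23:04", FMT)
first_after_wait = datetime.strptime("09/04/19 23:24:12", FMT)
wait_seconds = (first_after_wait - last_before_wait).total_seconds()

# Counter values copied from the job summary above.
map_input_bytes = 69051592
map_input_records = 209
map_output_bytes = 91863989
map_output_records = 209
local_bytes_written = 376697552

# Average record sizes: only 209 records, so each one is large.
avg_input_record = map_input_bytes / map_input_records
avg_output_record = map_output_bytes / map_output_records

# Local bytes written per byte of map output (a plain ratio, not a diagnosis).
local_write_ratio = local_bytes_written / map_output_bytes

print(f"pause between reduce messages: {wait_seconds:.0f} s")
print(f"average map input record:  {avg_input_record / 1024:.0f} KiB")
print(f"average map output record: {avg_output_record / 1024:.0f} KiB")
print(f"local bytes written per map output byte: {local_write_ratio:.2f}")
```

By this count, the pause between the two reduce messages accounts for 68 seconds of the run, and the job handles a small number of large records rather than many small ones.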
