I assume you have only 2 map and 2 reduce slots per tasktracker, which
totals 2 concurrent maps/reduces for your cluster. This means that with
more maps/reduces, they are serialized to run 2 at a time.
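If you want more concurrency on a box like yours, the per-tasktracker
slot counts are configurable. A minimal sketch, assuming a 0.18-era
hadoop-site.xml; the value of 8 is just an illustration (one slot per
core on your 8-core machine), not a recommendation:

  <!-- hadoop-site.xml: raise the per-tasktracker task limits -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>8</value>
  </property>

You'd need to restart the tasktracker for the new limits to take effect.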
Also, the -m is only a hint to the JobTracker: the actual number of maps
is driven by the number of input splits, so you might see fewer or more
maps than you specified on the command line.
The -r, however, is followed faithfully.
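The same hint/contract distinction shows up in the Java API. A minimal
sketch against the 0.18 JobConf API (WordCount.class here stands in for
whatever your job's main class is):

  import org.apache.hadoop.mapred.JobConf;

  // inside your job setup (e.g. main or run):
  JobConf conf = new JobConf(WordCount.class);
  conf.setNumMapTasks(8);     // only a hint; actual map count follows the input splits
  conf.setNumReduceTasks(8);  // honored exactly; the job gets 8 reduce tasks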
Arun
On Mar 4, 2009, at 2:46 PM, Sandy wrote:
Hello all,
For the sake of benchmarking, I ran the standard hadoop wordcount example
on an input file using 2, 4, and 8 mappers and reducers for my job.
In other words, I do:
time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 2 -r 2
sample.txt output
time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 4 -r 4
sample.txt output2
time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 8 -r 8
sample.txt output3
Strangely enough, this increase in mappers and reducers results in
slower running times!
- on 2 mappers and reducers it ran for 40 seconds
- on 4 mappers and reducers it ran for 60 seconds
- on 8 mappers and reducers it ran for 90 seconds!
Please note that the "sample.txt" file is identical in each of these runs.
I have the following questions:
- Shouldn't wordcount get -faster- with additional mappers and reducers,
instead of slower?
- If it does get faster for other people, why does it become slower for me?
I am running Hadoop in pseudo-distributed mode on a single 64-bit Mac Pro
with 2 quad-core processors, 16 GB of RAM, and 4 1TB HDs.
I would greatly appreciate it if someone could explain this behavior to
me, and tell me if I'm running this wrong. How can I change my settings
(if at all) to get wordcount running faster when I increase the number
of maps and reduces?
Thanks,
-SM