On 25-Nov-08, at 7:38 AM, Chris Quach wrote:
Hi,
I'm testing Hadoop to see if we could use it for complex calculations
next to the 'standard' implementation. I've set up a grid with 10 nodes,
and if I run the RandomTextWriter example only 2 nodes are used as
mappers, while I specified 10 mappers to be used. The other nodes are
used for storage, but I want them to also execute the map function.
(I've had this same behaviour with my own test program.)
Is there a way to tell the framework to use all available nodes as
mappers?
Thanks in advance,
Chris
Assuming you have more than two tasks to run in total, you're probably
seeing all nodes being used, but only 2 at once. If you're only
seeing two *tasks*, that's your problem: set mapred.map.tasks and
mapred.reduce.tasks.
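For example, a minimal sketch of those two job-level properties as they
might appear in a job configuration file. The values are illustrative
only: 10 matches the number of mappers you asked for, and the reduce
count is an assumption. As far as I know, mapred.map.tasks is only a
hint, and the input format decides the real number of map tasks from
its splits.

```xml
<!-- Illustrative values only; tune them for your job. -->
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>10</value>
</property>
```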
If that isn't it, make sure mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum are large enough in
hadoop-site.xml on each node. AFAIK, setting conf parameters within the
job or by command-line flags has no effect on these. If you use the
hadoop-ec2 tools, you can do this with hadoop-ec2-env.sh.
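A sketch of what that could look like in hadoop-site.xml on each node.
The value 4 is an assumption, so pick something that fits the cores and
memory of your machines. I believe the default for both is 2, and the
tasktrackers most likely need a restart before they pick up a change.

```xml
<!-- Per-node limits on concurrently running tasks (illustrative values). -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```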
Karl Anderson
http://monkey.org/~kra