From: [mailto:michael_se...@hotmail.com]
Sent: Monday, February 11, 2013 8:30 AM
To: user@hadoop.apache.org
Subject: Re: How can I limit reducers to one-per-node?
Adding a combiner step first then reduce?
On Feb 8, 2013, at 11:18 PM, Harsh J ha...@cloudera.com wrote:
Hey David,
There's no readily available way to do this today (you may be interested in MAPREDUCE-199 though) …
From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Monday, February 11, 2013 12:55 PM
To: user@hadoop.apache.org
Subject: Re: How can I limit reducers to one-per-node?
For crawler type apps, typically you direct all of the URLs to crawl from
I have a cluster of boxes with 3 reducers per node. I want to limit a
particular job to only run 1 reducer per node.
This job is network IO bound, gathering images from a set of webservers.
My job has certain parameters set to meet web politeness standards (e.g.
limit connects and
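The politeness logic David mentions isn't shown in the thread; as an illustration only (all names here are invented, not taken from his job), a minimal per-host minimum-interval limiter that a fetching task might call around each request could look like this in Python:

```python
import time
from collections import defaultdict

class PoliteScheduler:
    """Track the last request time per host and compute how long to wait
    so that successive requests to one host are at least min_interval apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable clock, handy for testing
        # -inf means "never contacted", so the first request waits 0 seconds
        self.last_request = defaultdict(lambda: float("-inf"))

    def wait_time(self, host):
        """Seconds to sleep before the next request to `host` is polite."""
        elapsed = self.clock() - self.last_request[host]
        return max(0.0, self.min_interval - elapsed)

    def record(self, host):
        """Call immediately after issuing a request to `host`."""
        self.last_request[host] = self.clock()
```

One common way to make this work in MapReduce crawlers is to partition URLs by host, so that a single reduce task owns all URLs for a given host and can enforce the interval locally.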
I think setting tasktracker.reduce.tasks.maximum to 1 may meet your requirement.
Best,
--
Nan Zhu
School of Computer Science,
McGill University
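For reference, in Hadoop 1.x the slot limit Nan mentions is normally spelled mapred.tasktracker.reduce.tasks.maximum; it goes in mapred-site.xml on each node and is read when the TaskTracker starts, so it caps reduce slots for every job on that node rather than for one particular job:

```xml
<!-- mapred-site.xml on each TaskTracker node -->
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
  <description>At most one reduce task runs concurrently on this
  TaskTracker. Applies to all jobs; takes effect only after a
  TaskTracker restart.</description>
</property>
```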
On Friday, 8 February, 2013 at 10:54 PM, David Parks wrote:
I have a cluster of boxes with 3 reducers per node. I want to limit a
particular job to only run 1 reducer per node
(using 15 m1.xlarge boxes which come with 3 reducer
slots configured by default).
From: Nan Zhu [mailto:zhunans...@gmail.com]
Sent: Saturday, February 09, 2013 10:59 AM
To: user@hadoop.apache.org
Subject: Re: How can I limit reducers to one-per-node?
to a different node, but in the last run 3
nodes had nothing to do, and 3 other nodes had 2 reduce tasks assigned.
From: Nan Zhu [mailto:zhunans...@gmail.com]
Sent: Saturday, February 09, 2013 11:31 AM
To: user@hadoop.apache.org
Subject: Re: How can I limit reducers to one-per-node?
I haven't used AWS MR before… if your instances are configured with 3 reducer slots, it means that 3 reducers can run at the same time.
Hey David,
There's no readily available way to do this today (you may be
interested in MAPREDUCE-199 though) but if your Job scheduler's not
doing multiple-assignments on reduce tasks, then only one is assigned
per TT heartbeat, which gives you almost what you're looking for: 1
reduce task per node.
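Harsh's point can be illustrated with a toy model (an idealized sketch, not the behavior of any real Hadoop scheduler): if nodes heartbeat in round-robin order and the scheduler hands out at most one pending reduce task per heartbeat, 15 reducers land one per node across 15 nodes; if it instead grants up to the node's full slot count per heartbeat, the earliest nodes soak up several reducers while later nodes get none, much like the uneven assignment David observed:

```python
from collections import Counter

def assign_reduces(num_nodes, num_tasks, slots_per_node, per_heartbeat):
    """Toy model: nodes heartbeat in round-robin order; on each heartbeat
    the scheduler grants up to `per_heartbeat` pending reduce tasks,
    limited by the node's free slots. Assumes total slots >= num_tasks."""
    free = {n: slots_per_node for n in range(num_nodes)}
    counts = Counter()
    pending = num_tasks
    beat = 0
    while pending > 0:
        node = beat % num_nodes
        grant = min(per_heartbeat, free[node], pending)
        free[node] -= grant
        counts[node] += grant
        pending -= grant
        beat += 1
    return counts

# single-assignment: 15 reducers over 15 nodes -> exactly one per node
single = assign_reduces(15, 15, 3, per_heartbeat=1)

# multiple-assignment: early heartbeats grab up to 3 tasks each, so a few
# nodes run several reducers and the rest get none
multi = assign_reduces(15, 15, 3, per_heartbeat=3)
```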