Hey David,

There's no readily available way to do this today (you may be interested in MAPREDUCE-199, though). However, if your job scheduler isn't doing multiple assignments of reduce tasks, then only one reduce task is assigned per TaskTracker (TT) heartbeat, which gives you almost what you're looking for: one reduce task per node, assigned roughly round-robin.
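
To illustrate the heartbeat behaviour, here is a toy simulation in Python. It is only a sketch, not Hadoop code: the function `assign_reduces`, its parameters, and the node names are hypothetical, and it assumes nodes heartbeat in a fixed order and that no task finishes during assignment, which a real cluster does not guarantee.

```python
from collections import Counter


def assign_reduces(nodes, slots_per_node, num_reduces, per_heartbeat=1):
    """Toy model of reduce assignment (hypothetical, not Hadoop's scheduler).

    Each node "heartbeats" in turn; on every heartbeat the scheduler hands
    out at most `per_heartbeat` pending reduce tasks, limited by free slots.
    """
    running = Counter({node: 0 for node in nodes})
    pending = num_reduces
    # Keep cycling through heartbeats while work remains and a slot is free.
    while pending > 0 and any(running[n] < slots_per_node for n in nodes):
        for node in nodes:
            free = slots_per_node - running[node]
            take = min(per_heartbeat, free, pending)
            running[node] += take
            pending -= take
    return dict(running)


nodes = ["tt1", "tt2", "tt3", "tt4"]

# One assignment per heartbeat: the 4 reduces spread out, one per node.
print(assign_reduces(nodes, slots_per_node=3, num_reduces=4))

# Multiple assignment enabled: the first node to heartbeat fills its slots.
print(assign_reduces(nodes, slots_per_node=3, num_reduces=4, per_heartbeat=3))
```

In this model, one reducer per node only holds while the job has no more reduce tasks than the cluster has nodes; with more, nodes start receiving a second task on the next round of heartbeats.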

On Sat, Feb 9, 2013 at 9:24 AM, David Parks <[email protected]> wrote:
> I have a cluster of boxes with 3 reducers per node. I want to limit a
> particular job to only run 1 reducer per node.
>
> This job is network IO bound, gathering images from a set of webservers.
>
> My job has certain parameters set to meet “web politeness” standards (e.g.
> limit connects and connection frequency).
>
> If this job runs from multiple reducers on the same node, those per-host
> limits will be violated. Also, this is a shared environment and I don’t
> want long running network bound jobs uselessly taking up all reduce slots.

--
Harsh J
