If this cluster is being used exclusively for this job, you could just set
mapred.tasktracker.map.tasks.maximum to 1 on each node -- it's a
per-TaskTracker setting, so each machine would then run at most one map
task at a time.
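For reference, a minimal sketch of what that would look like in
mapred-site.xml on each TaskTracker node (this assumes classic MRv1; the
TaskTrackers need a restart for the change to take effect):

```xml
<!-- mapred-site.xml (per TaskTracker, MRv1) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
  <description>At most one concurrent map task per node.</description>
</property>
```

Note this caps the whole TaskTracker, not a single job, which is why it
only makes sense if the cluster is dedicated to this workload.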


On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <kwi...@keithwiley.com> wrote:

> I'm running a program which in the streaming layer automatically
> multithreads and does so by automatically detecting the number of cores on
> the machine.  I realize this model is somewhat in conflict with Hadoop, but
> nonetheless, that's what I'm doing.  Thus, for even resource utilization,
> it would be nice to not only assign one mapper per core, but only one
> mapper per machine.  I realize that if I saturate the cluster none of this
> really matters, but consider the following example for clarity: 4-core
> nodes, 10-node cluster, thus 40 slots, fully configured across mappers and
> reducers (40 slots of each). Say I run this program with just two mappers.
> It would run much more efficiently (in essentially half the time) if I
> could force the two mappers to go to slots on two separate machines instead
> of running the risk that Hadoop may assign them both to the same machine.
>
> Can this be done?
>
> Thanks.
>
>
> ________________________________________________________________________________
> Keith Wiley     kwi...@keithwiley.com     keithwiley.com
> music.keithwiley.com
>
> "Yet mark his perfect self-contentment, and hence learn his lesson, that
> to be
> self-contented is to be vile and ignorant, and that to aspire is better
> than to
> be blindly and impotently happy."
>                                            --  Edwin A. Abbott, Flatland
>
> ________________________________________________________________________________
>
>