Alternatively, you can implement your own InputSplit and InputFormat, which let you control which nodes tasks are sent to and how many run per node. You can find detailed examples in Chapter 4 of the book "Professional Hadoop Solutions".

Yong
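In the MapReduce API, the hook for this is `InputFormat.getSplits()`: each split carries a list of preferred hosts, and by spreading splits across distinct hosts you can encourage (though not strictly force — locations are only scheduling hints) one map task per machine. The Hadoop classes themselves aren't reproduced here; below is a minimal, self-contained Java sketch of just the placement logic such a custom InputFormat could use. The names `OnePerHostPlanner` and `pickHosts` are illustrative, not Hadoop API.

```java
import java.util.*;

/**
 * Sketch of the placement logic a custom InputFormat's getSplits()
 * could apply: given each split's candidate replica hosts, choose a
 * distinct host per split so no machine is preferred by two splits.
 * (In real Hadoop, split locations are locality *hints* only.)
 */
public class OnePerHostPlanner {

    /** Greedily assign each split a host not used by an earlier split. */
    public static List<String> pickHosts(List<String[]> replicaHostsPerSplit) {
        Set<String> used = new HashSet<>();
        List<String> chosen = new ArrayList<>();
        for (String[] candidates : replicaHostsPerSplit) {
            String pick = null;
            for (String h : candidates) {
                if (!used.contains(h)) { pick = h; break; }
            }
            if (pick == null) {
                // Every replica host already has a split; fall back to the
                // first candidate (exclusive placement is no longer possible).
                pick = candidates[0];
            }
            used.add(pick);
            chosen.add(pick);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Two splits whose replicas overlap on nodeA: the planner
        // spreads them across nodeA and nodeC.
        List<String[]> replicas = Arrays.asList(
            new String[]{"nodeA", "nodeB"},
            new String[]{"nodeA", "nodeC"});
        System.out.println(pickHosts(replicas)); // [nodeA, nodeC]
    }
}
```

In a real InputFormat you would return the chosen host as the single entry of each split's location array; the JobTracker then tries, but does not guarantee, to schedule each task on its preferred node.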
> Subject: Re: Force one mapper per machine (not core)?
> From: kwi...@keithwiley.com
> Date: Tue, 28 Jan 2014 15:41:22 -0800
> To: user@hadoop.apache.org
>
> Yeah, it isn't, not even remotely, but thanks.
>
> On Jan 28, 2014, at 14:06 , Bryan Beaudreault wrote:
>
> > If this cluster is being used exclusively for this goal, you could just set
> > the mapred.tasktracker.map.tasks.maximum to 1.
> >
> > On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <kwi...@keithwiley.com> wrote:
> > I'm running a program which in the streaming layer automatically
> > multithreads and does so by automatically detecting the number of cores on
> > the machine. I realize this model is somewhat in conflict with Hadoop, but
> > nonetheless, that's what I'm doing. Thus, for even resource utilization, it
> > would be nice to not only assign one mapper per core, but only one mapper
> > per machine. I realize that if I saturate the cluster none of this really
> > matters, but consider the following example for clarity: 4-core nodes,
> > 10-node cluster, thus 40 slots, fully configured across mappers and
> > reducers (40 slots of each). Say I run this program with just two mappers.
> > It would run much more efficiently (in essentially half the time) if I
> > could force the two mappers to go to slots on two separate machines instead
> > of running the risk that Hadoop may assign them both to the same machine.
> >
> > Can this be done?
> >
> > Thanks.
>
> ________________________________________________________________________________
> Keith Wiley     kwi...@keithwiley.com     keithwiley.com    music.keithwiley.com
>
> "Luminous beings are we, not this crude matter."
>                                            --  Yoda
> ________________________________________________________________________________
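For reference, Bryan's suggestion in the quoted thread corresponds to the following entry in mapred-site.xml on each worker node of a Hadoop 1.x (MRv1) cluster; the TaskTrackers must be restarted for it to take effect, and it caps the whole node at one concurrent map task for every job, not just this one:

```xml
<!-- mapred-site.xml on every worker: at most one concurrent map task
     per TaskTracker, i.e. per machine -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```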