Hello,

This is my first time posting to this newsgroup. My question is really about MapReduce rather than HDFS itself.

To my understanding, the JobClient submits the Mapper and Reducer classes to the cluster in a uniform way. Can I assume it behaves like a uniform scheduler for all the tasks?

For example, suppose I have a 100-node cluster: 1 master (namenode) and 99 slaves (datanodes).
When I call
"JobClient.runJob(jconf)"
the JobClient uniformly distributes the Mapper and Reducer classes to all 99 nodes.
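
Just for concreteness, the kind of submission I mean is the plain old-API call below. The IdentityMapper/IdentityReducer and the command-line paths are only placeholders for my real job, so please treat it as a sketch:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;

    public class SubmitExample {
      public static void main(String[] args) throws Exception {
        JobConf jconf = new JobConf(SubmitExample.class);
        jconf.setJobName("submit-example");

        // Placeholder map/reduce classes; my real job sets its own here.
        jconf.setMapperClass(IdentityMapper.class);
        jconf.setReducerClass(IdentityReducer.class);

        // Matches the default TextInputFormat key/value types.
        jconf.setOutputKeyClass(LongWritable.class);
        jconf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(jconf, new Path(args[0]));
        FileOutputFormat.setOutputPath(jconf, new Path(args[1]));

        // Packages the job jar and configuration and hands it to the JobTracker,
        // which then assigns the individual map/reduce tasks to the TaskTrackers.
        JobClient.runJob(jconf);
      }
    }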

The slaves will all have the same hadoop-site.xml and hadoop-default.xml.
Here is my main concern: what if some of the nodes don't have the same hardware spec, such as memory or CPU speed? This can easily happen through different purchase batches and repairs over time.

Is there any way for the JobClient to be aware of this and submit a different number of tasks to different slaves at start-up?
For example, some slaves have 16-core CPUs instead of 8 cores. The problem I see is that on the 16-core machines only 8 of the cores get used. Something like the per-node setting sketched below is what I am after.
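
If I read the configuration right, each TaskTracker loads its own local hadoop-site.xml, so one idea would be to raise the slot counts only on the 16-core slaves, assuming mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum are the right knobs (the values below are just a sketch):

    <!-- hadoop-site.xml on a 16-core slave only; the 8-core slaves keep their usual values -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>16</value>
      <description>Maximum number of map tasks this TaskTracker runs at once.</description>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>8</value>
      <description>Maximum number of reduce tasks this TaskTracker runs at once.</description>
    </property>

But is that the intended way to handle heterogeneous nodes, or is the JobTracker supposed to work this out on its own?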

P.S. I'm looking into the JobClient source code and JobProfile/JobTracker to see if this can be done, but I'm not sure I'm on the right track.

If this topic belongs more on [EMAIL PROTECTED], please let me know and I'll post it to that list instead.

Regards,
-Andy
