The input splits themselves are not copied; only the information about
where the splits are located is copied to the jobtracker, so that it can
assign each split to a tasktracker that is local to it.
Check the Job Initialization section at
http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/
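
To make the locality point concrete, here is a minimal sketch in plain Python. It is not Hadoop code: `SplitInfo` and `pick_tracker` are hypothetical names invented for illustration, and the real jobtracker scheduling is more involved (for example, it also considers rack locality). The sketch only shows that a split can be described by metadata alone, and that the host list is enough to prefer a local tasktracker.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SplitInfo:
    """Hypothetical split description: metadata only, no file contents."""
    path: str
    start: int               # byte offset where the split begins
    length: int              # number of bytes in the split
    hosts: Tuple[str, ...]   # hosts that store the underlying data


def pick_tracker(split: SplitInfo, idle_trackers: List[str]) -> Optional[str]:
    """Prefer an idle tracker on a host holding the split, else any idle one."""
    for tracker in idle_trackers:
        if tracker in split.hosts:
            return tracker
    # No data-local tracker is idle: fall back to the first idle one, if any.
    return idle_trackers[0] if idle_trackers else None


split = SplitInfo("/data/logs.txt", 0, 64 * 1024 * 1024,
                  ("node2", "node5", "node7"))

local_choice = pick_tracker(split, ["node1", "node5"])   # node5 holds the data
remote_choice = pick_tracker(split, ["node1", "node3"])  # nobody local is idle
print(local_choice, remote_choice)
```

In the first call a data-local tracker is available and is chosen; in the second, none is, so the task would have to read its split over the network.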
For the JobClient to compute the input splits, doesn't it need to contact
the NameNode? Only the NameNode knows where the splits are, so how can the
client compute them without that additional call?
On Fri, Sep 27, 2013 at 1:41 AM, Sonal Goyal sonalgoy...@gmail.com wrote:
The input splits are not copied, only the
Technically, the block locations are provided by the InputSplit, which, in
the FileInputFormat case, gets them from the FileSystem interface.
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/InputSplit.html
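
As a rough sketch of that flow, the plain-Python model below computes splits on the client side by asking a filesystem-like lookup for block locations. Everything here is an assumption made for illustration: `BLOCKS`, `get_block_locations` and `compute_splits` are hypothetical names, the block table is made-up data, and taking the hosts of the first overlapping block is a simplification rather than a statement of what FileInputFormat does.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int, Tuple[str, ...]]  # (offset, length, hosts)

# Made-up block table standing in for metadata a filesystem would return.
BLOCKS: Dict[str, List[Block]] = {
    "/data/logs.txt": [
        (0, 128, ("node2", "node5")),
        (128, 128, ("node5", "node7")),
        (256, 44, ("node1", "node2")),
    ],
}


def get_block_locations(path: str, start: int, length: int) -> List[Block]:
    """Return the blocks of `path` that overlap [start, start + length)."""
    end = start + length
    return [b for b in BLOCKS[path] if b[0] < end and b[0] + b[1] > start]


def compute_splits(path: str, file_length: int, split_size: int):
    """Cut the file into byte ranges and attach hosts from block metadata."""
    splits = []
    offset = 0
    while offset < file_length:
        length = min(split_size, file_length - offset)
        blocks = get_block_locations(path, offset, length)
        hosts = blocks[0][2]  # simplification: hosts of first overlapping block
        splits.append((path, offset, length, hosts))
        offset += length
    return splits


splits = compute_splits("/data/logs.txt", file_length=300, split_size=128)
for s in splits:
    print(s)
```

Only offsets, lengths and host names move around in this model; no file data is read while the splits are computed.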
The thing to realize here is that the FileSystem implementation is