Re: Retrieve and compute input splits

2013-09-27 Thread Sonal Goyal
The input splits themselves are not copied; only the information about where the splits are located is copied to the jobtracker, so that it can assign tasks to tasktrackers that are local to each split. See the Job Initialization section at http://answers.oreilly.com/topic/459-anatomy-of-a-mapreduce-job-run-with-hadoop/
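
To illustrate the point that a split is only metadata, here is a minimal sketch in plain Java. It is not Hadoop's actual implementation: the names SplitSketch, Split and computeSplits are hypothetical, the host lists are passed in by hand rather than fetched from a file system, and it assumes one split per block (the real FileInputFormat also takes minimum and maximum split sizes into account).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not a Hadoop class.
final class SplitSketch {

    // A split describes where data lives. It holds no file bytes.
    static final class Split {
        final String path;        // file the split belongs to
        final long start;         // byte offset of the split within the file
        final long length;        // number of bytes covered by the split
        final List<String> hosts; // nodes holding a replica of the block

        Split(String path, long start, long length, List<String> hosts) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.hosts = hosts;
        }
    }

    // Assumption: hostsPerBlock has one entry per block, in block order.
    // Simplification: exactly one split per block.
    static List<Split> computeSplits(String path, long fileLength, long blockSize,
                                     List<List<String>> hostsPerBlock) {
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += blockSize) {
            int blockIndex = (int) (start / blockSize);
            // The last split may be shorter than a full block.
            long length = Math.min(blockSize, fileLength - start);
            splits.add(new Split(path, start, length, hostsPerBlock.get(blockIndex)));
        }
        return splits;
    }
}
```

Under these assumptions, a 300-byte file with a 128-byte block size gives three splits starting at offsets 0, 128 and 256, the last one 44 bytes long. Each carries only a path, a byte range and a host list, which is the kind of information a scheduler needs to place a task near its data.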

Re: Retrieve and compute input splits

2013-09-27 Thread Peyman Mohajerian
For the JobClient to compute the input splits, doesn't it need to contact the NameNode? Only the NameNode knows where the splits are, so how can it compute them without that additional call? On Fri, Sep 27, 2013 at 1:41 AM, Sonal Goyal sonalgoy...@gmail.com wrote: The input splits are not copied, only the

Re: Retrieve and compute input splits

2013-09-27 Thread Jay Vyas
Technically, the block locations are provided by the InputSplit, which in the FileInputFormat case is provided by the FileSystem interface. http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/InputSplit.html The thing to realize here is that the FileSystem implementation is