Can Mapper get paths of inputSplits ?

2011-05-12 Thread Mark question
Hi

   I'm using FileInputFormat, which logically divides files into splits
according to their sizes. Can the mapper get a pointer to these splits and
find out which split it is assigned?

   I tried looking at the Reporter class to see how it prints the logical
splits on the UI for each mapper, but it's an interface.

   E.g.:
Mapper1:  is assigned the logical split
hdfs://localhost:9000/user/Hadoop/input:23+24
Mapper2:  is assigned the logical split
hdfs://localhost:9000/user/Hadoop/input:0+23

 Then, inside map, I want to ask what the logical splits are, get the two
strings above, and know which one my current mapper is assigned.

 Thanks,
Mark
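
For readers who want to work with split strings like the two shown in the example, here is a minimal sketch of a parser for the path:start+length form. SplitSpec is a hypothetical helper made up for illustration, not a Hadoop class, and it assumes the string always ends in ":start+length" as in the example.

```java
import java.util.Objects;

/** Hypothetical helper (not part of Hadoop): holds one "path:start+length" split. */
final class SplitSpec {
    final String path;
    final long start;
    final long length;

    SplitSpec(String path, long start, long length) {
        this.path = Objects.requireNonNull(path);
        this.start = start;
        this.length = length;
    }

    /** Parses a string such as "hdfs://localhost:9000/user/Hadoop/input:0+23". */
    static SplitSpec parse(String text) {
        // The path itself contains ':' (scheme and port), so use the LAST colon.
        int colon = text.lastIndexOf(':');
        int plus = text.lastIndexOf('+');
        if (colon < 0 || plus < colon) {
            throw new IllegalArgumentException("expected path:start+length, got: " + text);
        }
        String path = text.substring(0, colon);
        long start = Long.parseLong(text.substring(colon + 1, plus));
        long length = Long.parseLong(text.substring(plus + 1));
        return new SplitSpec(path, start, length);
    }

    /** Offset of the first byte after this split. */
    long end() {
        return start + length;
    }

    @Override
    public String toString() {
        return path + ":" + start + "+" + length;
    }

    public static void main(String[] args) {
        SplitSpec s = parse("hdfs://localhost:9000/user/Hadoop/input:23+24");
        // prints: hdfs://localhost:9000/user/Hadoop/input covers bytes [23, 47)
        System.out.println(s.path + " covers bytes [" + s.start + ", " + s.end() + ")");
    }
}
```

Splitting on the last colon rather than the first matters because the HDFS URI already contains colons.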


Re: Can Mapper get paths of inputSplits ?

2011-05-12 Thread Owen O'Malley
On Thu, May 12, 2011 at 8:59 PM, Mark question markq2...@gmail.com wrote:

 Hi

   I'm using FileInputFormat which will split files logically according to
 their sizes into splits. Can the mapper get a pointer to these splits? and
 know which split it is assigned ?


Look at
http://hadoop.apache.org/common/docs/r0.20.203.0/mapred_tutorial.html#Task+JVM+Reuse

 In particular, map.input.file and map.input.offset are the configuration
parameters that you want.

-- Owen
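
A rough sketch of how a task could turn those configuration parameters back into the path:start+length string. It assumes the keys are named map.input.file, map.input.start and map.input.length, which is how the linked tutorial lists them as far as I recall; substitute whatever names your Hadoop version documents. CurrentSplit is a hypothetical helper, and the lookup is passed in as a plain function so the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper (not part of Hadoop): describes the split of the current task. */
final class CurrentSplit {

    /** Builds "file:start+length" from per-task configuration values. */
    static String describe(Function<String, String> conf) {
        // Key names are an assumption; check the tutorial for your Hadoop version.
        String file = conf.apply("map.input.file");
        String start = conf.apply("map.input.start");
        String length = conf.apply("map.input.length");
        if (file == null || start == null || length == null) {
            throw new IllegalStateException("split properties are not set for this task");
        }
        // parseLong validates that the offsets are numeric.
        return file + ":" + Long.parseLong(start) + "+" + Long.parseLong(length);
    }

    public static void main(String[] args) {
        // Stand-in for the configuration a running map task would see.
        Map<String, String> fake = new HashMap<>();
        fake.put("map.input.file", "hdfs://localhost:9000/user/Hadoop/input");
        fake.put("map.input.start", "23");
        fake.put("map.input.length", "24");
        System.out.println(describe(fake::get));
    }
}
```

In a mapper written against the old API, the same helper could presumably be called from configure(JobConf job) as CurrentSplit.describe(job::get). Note that this only covers the split assigned to the current task.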


Re: Can Mapper get paths of inputSplits ?

2011-05-12 Thread Mark question
Thanks for the reply Owen, I only knew about map.input.file.

 So is there no way to see the other possible splits (start+length)? For
example, some function that returns the map.input.file and map.input.offset
strings of the other mappers?

Thanks,
Mark


Re: Can Mapper get paths of inputSplits ?

2011-05-12 Thread Owen O'Malley
On Thu, May 12, 2011 at 9:23 PM, Mark question markq2...@gmail.com wrote:

  So there is no way I can see the other possible splits (start+length)? like
  some function that returns strings of map.input.file and map.input.offset of
  the other mappers ?


No, there isn't any way to do it using the public API.

The only way would be to look under the covers and read the split file
(job.split).

-- Owen


Re: Can Mapper get paths of inputSplits ?

2011-05-12 Thread Mark question
Thanks again Owen. Hopefully one last question:

   Which class fills in map.input.file and map.input.offset? I'd like to
extend it with a function that returns these strings.

Thanks,
Mark
