Can Mapper get paths of inputSplits ?
Hi,

I'm using FileInputFormat, which divides the input files into logical splits according to their size. Can the mapper get a pointer to these splits and know which split it has been assigned?

I tried looking at the Reporter class to see how it prints the logical split for each mapper on the UI, but it's an interface.

E.g.
Mapper1 is assigned the logical split hdfs://localhost:9000/user/Hadoop/input:23+24
Mapper2 is assigned the logical split hdfs://localhost:9000/user/Hadoop/input:0+23

Then, inside map, I want to ask what the logical splits are, get the two strings above, and know which one my current mapper is assigned.

Thanks,
Mark
Re: Can Mapper get paths of inputSplits ?
On Thu, May 12, 2011 at 8:59 PM, Mark question markq2...@gmail.com wrote:

> Hi I'm using FileInputFormat which will split files logically according to their sizes into splits. Can the mapper get a pointer to these splits? and know which split it is assigned ?

Look at http://hadoop.apache.org/common/docs/r0.20.203.0/mapred_tutorial.html#Task+JVM+Reuse

In particular, map.input.file and map.input.offset are the configuration parameters that you want.

-- Owen
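For reference, here is a minimal sketch of how a map task could rebuild its own path:start+length string from the job configuration. It assumes the old mapred API, where the lookup would be job::get inside configure(JobConf job); to keep the sketch runnable without Hadoop on the classpath, the lookup is modelled as a plain Function. As far as I can tell, the linked tutorial lists the split's offset and size under the keys map.input.start and map.input.length (rather than a single map.input.offset key), so those are the names used here. CurrentSplit and fromConf are hypothetical names, not Hadoop classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** The split assigned to the current map task, printed as path:start+length. */
final class CurrentSplit {
    final String file;
    final long start;
    final long length;

    CurrentSplit(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    /**
     * Builds the description from a key lookup. In an old-API mapper the
     * lookup would be job::get (assumption: the three keys below are set,
     * which should only be the case for file-based splits).
     */
    static CurrentSplit fromConf(Function<String, String> conf) {
        String file = conf.apply("map.input.file");
        String start = conf.apply("map.input.start");
        String length = conf.apply("map.input.length");
        if (file == null || start == null || length == null) {
            throw new IllegalStateException("split keys are not set in the configuration");
        }
        return new CurrentSplit(file, Long.parseLong(start), Long.parseLong(length));
    }

    @Override
    public String toString() {
        return file + ":" + start + "+" + length;
    }

    public static void main(String[] args) {
        // Stand-in for the JobConf of the task that got the second split.
        Map<String, String> fake = new HashMap<>();
        fake.put("map.input.file", "hdfs://localhost:9000/user/Hadoop/input");
        fake.put("map.input.start", "23");
        fake.put("map.input.length", "24");
        System.out.println(fromConf(fake::get));
        // prints hdfs://localhost:9000/user/Hadoop/input:23+24
    }
}
```

Passing the lookup as a Function keeps the parsing separate from Hadoop itself, so the same helper can be exercised with a plain Map in a unit test.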
Re: Can Mapper get paths of inputSplits ?
Thanks for the reply, Owen. I only knew about map.input.file.

So there is no way I can see the other possible splits (start+length)? Is there some function that returns the map.input.file and map.input.offset strings of the other mappers?

Thanks,
Mark
Re: Can Mapper get paths of inputSplits ?
On Thu, May 12, 2011 at 9:23 PM, Mark question markq2...@gmail.com wrote:

> So there is no way I can see the other possible splits (start+length)? like some function that returns strings of map.input.file and map.input.offset of the other mappers ?

No, there isn't any way to do it using the public API. The only way would be to look under the covers and read the split file (job.split).

-- Owen
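If the full list of split strings does reach the mapper by some other route (for example, the driver writing them somewhere the tasks can read; that route is an assumption, not something the public API provides), picking out the current task's entry is plain string parsing. A rough sketch follows; SplitSpec, parse and indexOfMine are hypothetical names, and the path:start+length layout is taken from the example strings in the original question.

```java
import java.util.Arrays;
import java.util.List;

/** A parsed path:start+length split description. */
final class SplitSpec {
    final String path;
    final long start;
    final long length;

    SplitSpec(String path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }

    /** Parses e.g. hdfs://localhost:9000/user/Hadoop/input:23+24. */
    static SplitSpec parse(String s) {
        // Search from the right: the path itself contains ':' (scheme, port).
        int plus = s.lastIndexOf('+');
        int colon = (plus < 0) ? -1 : s.lastIndexOf(':', plus);
        if (plus < 0 || colon < 0) {
            throw new IllegalArgumentException("expected path:start+length, got: " + s);
        }
        return new SplitSpec(
                s.substring(0, colon),
                Long.parseLong(s.substring(colon + 1, plus)),
                Long.parseLong(s.substring(plus + 1)));
    }

    boolean sameAs(SplitSpec other) {
        return path.equals(other.path) && start == other.start && length == other.length;
    }

    /** Returns the position of the current task's split in the list, or -1. */
    static int indexOfMine(List<String> allSplits, String mine) {
        SplitSpec me = parse(mine);
        for (int i = 0; i < allSplits.size(); i++) {
            if (parse(allSplits.get(i)).sameAs(me)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
                "hdfs://localhost:9000/user/Hadoop/input:0+23",
                "hdfs://localhost:9000/user/Hadoop/input:23+24");
        System.out.println(indexOfMine(all, "hdfs://localhost:9000/user/Hadoop/input:23+24"));
        // prints 1
    }
}
```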
Re: Can Mapper get paths of inputSplits ?
Thanks again, Owen. Hopefully my last question: which class fills in map.input.file and map.input.offset? I'd like to extend it with a function that returns these strings.

Thanks,
Mark