It is also pretty easy to override parts of TextInputFormat so that it emits
the file name as the key instead of the byte offset.
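
For the "map.input.file" workaround quoted below, the prefix matching could
look roughly like the following. This is only a minimal sketch: the class and
method names (InputDirMatcher, matchInputDir) are made up for illustration and
are not part of Hadoop, and it assumes the input dirs and the file path are
written in the same form (both fully qualified or both plain paths).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: finds which input directory
// a given input file path lives under, by prefix matching.
class InputDirMatcher {

    // Returns the first directory in inputDirs that is a path prefix of
    // inputFile, or null if none of them matches.
    static String matchInputDir(String inputFile, List<String> inputDirs) {
        if (inputFile == null) {
            return null;
        }
        for (String dir : inputDirs) {
            // Compare against "dir/" so that "/data/a" does not
            // accidentally match a file under "/data/ab".
            String prefix = dir.endsWith("/") ? dir : dir + "/";
            if (inputFile.startsWith(prefix)) {
                return dir;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("/data/a", "/data/ab");
        // In a real mapper the path would come from
        // job.get("map.input.file") inside configure(JobConf);
        // here it is a hard-coded example value.
        System.out.println(matchInputDir("/data/ab/part-00000", dirs));
    }
}
```

In the mapper you would do the lookup once in configure(JobConf) and keep the
result in a field, rather than repeating it for every record.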


On 10/12/07 10:19 AM, "Benjamin Reed" <[EMAIL PROTECTED]> wrote:

> We do this in Pig by using our own InputSplits.
> 
> ben
> 
> On Friday 12 October 2007, Owen O'Malley wrote:
>> On Oct 12, 2007, at 5:51 AM, Shailendra Mudgal wrote:
>>> I am adding two input dirs to a job. Both input dirs have the same
>>> <Key.class, Value.class>. Inside the map method I want to know which
>>> <key, value> pair has come from which input dir. How can I do this?
>>> Any help will be appreciated.
>> 
>> *sigh* We've _almost_ had that feature for a long time now, see
>> HADOOP-372.
>> 
>> The workaround is to use the information at:
>> http://wiki.apache.org/lucene-hadoop/TaskExecutionEnvironment
>> and get the "map.input.file" from the map's JobConf and match against
>> the prefix.
> 
> 
