I am using Hadoop Streaming. The input are multiple files.
Is there a way to get the current filename in mapper?

For example:
$HADOOP_HOME/bin/hadoop  \
jar $HADOOP_HOME/hadoop-streaming.jar \
    -input file1 \
    -input file2 \
    -output myOutputDir \
    -mapper mapper \
    -reducer reducer

In mapper:
while (<STDIN>){
  //how to tell the current line is from file1 or file2?
}




      

Reply via email to