[ https://issues.apache.org/jira/browse/MAPREDUCE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated MAPREDUCE-1697:
-----------------------------------------------

    Attachment: patch-1697-2.txt

This patch deprecates the -file option and updates the usage message to
describe the option's actual behavior.

> Document the behavior of -file option in streaming
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1697
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1697
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming, documentation
>    Affects Versions: 0.20.1
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: patch-1697-1.txt, patch-1697-2.txt, patch-1697.txt
>
>
> The behavior of the -file option in streaming is not documented anywhere.
> The behavior of -file is as follows:
> 1) All the files passed through the -file option are packaged into job.jar.
> 2) If the -file option is used for .class or .jar files, they are unjarred
> on the tasktracker and placed in
> ${mapred.local.dir}/taskTracker/jobcache/job_ID/jars/classes or /lib,
> respectively. Symlinks to the classes and lib directories are created in
> the cwd of the task, named "classes" and "lib". So the file names of the
> .class or .jar files do not appear in the cwd of the task.
> Paths to these files are automatically added to the classpath. The tricky
> part is that the Hadoop framework can pick up the .class or .jar files
> through the classpath, but the actual mapper script cannot. To access
> these .class or .jar files inside a script, use something like
> "java -cp lib/*:classes <ClassName>".
> 3) If the -file option is used for files other than .class or .jar (e.g.,
> .txt or .pl), these files are unjarred into
> ${mapred.local.dir}/taskTracker/jobcache/job_ID/jars/. Symlinks to these
> files are created in the cwd of the task. The symlinks have the same
> names as the files.
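
The three rules above can be sketched as a small lookup. This is only an
illustration of the description in this issue, not code from Hadoop or
from the patch: the function names file_placement and java_command, and
the default local_dir value, are hypothetical. The classpath in
java_command assumes a Unix-like task host (":" as separator) and lists
"classes" as a directory, since the Java "*" wildcard only matches jars.

```python
import posixpath


def file_placement(file_name, local_dir="/tmp/mapred/local", job_id="job_ID"):
    """Return (unjar_dir, symlink_name) for a file shipped with -file.

    local_dir stands in for ${mapred.local.dir}; its default here is a
    made-up value used only for illustration.
    """
    jars_dir = posixpath.join(
        local_dir, "taskTracker", "jobcache", job_id, "jars"
    )
    if file_name.endswith(".class"):
        # Rule 2: .class files go under jars/classes; the task cwd only
        # gets a "classes" symlink, not one per file.
        return posixpath.join(jars_dir, "classes"), "classes"
    if file_name.endswith(".jar"):
        # Rule 2: .jar files go under jars/lib; the task cwd only gets
        # a "lib" symlink.
        return posixpath.join(jars_dir, "lib"), "lib"
    # Rule 3: any other file is symlinked from the cwd under its own name.
    return jars_dir, posixpath.basename(file_name)


def java_command(class_name):
    """Build the argv a mapper script could use to run a shipped class."""
    return ["java", "-cp", "lib/*:classes", class_name]
```

For example, file_placement("mapper.pl") gives a symlink named
"mapper.pl" in the task cwd, while file_placement("Helper.class") gives
only the "classes" symlink, which is why a script has to build its own
classpath as java_command does.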

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
