[ 
https://issues.apache.org/jira/browse/HADOOP-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated HADOOP-2116:
----------------------------------

    Status: Open  (was: Patch Available)

In light of HADOOP-2570, I'm cancelling this patch.

Reasoning:

The *-file* option works by putting the script into the job's jar file by 
unjar-ing, copying and then jar-ing it again. (yuck!) 

This means that on the TaskTracker the script has moved from jobCache/work to 
jobCache/job_jar_xml (I propose we rename that to *private*, heh). Clearly 
user-scripts which rely on "../work/<script_name>" will break again...

Having said that, we need to debate whether this feature is an 
incompatible change. What do folks think?

If people say otherwise, we need to ensure all files in jobCache/private are 
symlinked into jobCache/work... ugh!
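
For illustration only, a minimal sketch of what that symlinking step could 
look like, using plain java.nio rather than any Hadoop API. The class name 
PrivateToWorkLinker and the method linkAll are hypothetical, and the directory 
names just follow the jobCache/private and jobCache/work layout discussed 
above; this is not what the TaskTracker does today.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Hypothetical helper, not part of Hadoop: links each regular file found in
// the private directory into the work directory under the same file name.
public class PrivateToWorkLinker {

    /** Returns the number of symlinks created in workDir. */
    public static int linkAll(Path privateDir, Path workDir) throws IOException {
        Files.createDirectories(workDir);
        int created = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(privateDir)) {
            for (Path target : entries) {
                // Assumption: only plain files (scripts, jars) need linking.
                if (!Files.isRegularFile(target)) {
                    continue;
                }
                Path link = workDir.resolve(target.getFileName());
                // Leave alone anything that already exists in work/.
                if (Files.exists(link, LinkOption.NOFOLLOW_LINKS)) {
                    continue;
                }
                Files.createSymbolicLink(link, target.toAbsolutePath());
                created++;
            }
        }
        return created;
    }
}
```

With something like this in place, a user-script that still refers to 
"../work/<script_name>" would resolve through the link to the file under 
jobCache/private.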

----

I'd like to take this opportunity to take a hard look at streaming's *-file* 
option too. The unjar/jar way is completely backwards! We _should_ rework the 
-file option to use the DistributedCache and the symlink option it provides.
So, user-scripts can simply be "./<script>" rather than "../work/<script>". 
Yes, the way to maintain compatibility (if we want) is to use the previous 
option of symlinking files into jobCache/work also. I'd strongly vote for this 
option.

Thoughts?

> Job.local.dir to be exposed to tasks
> ------------------------------------
>
>                 Key: HADOOP-2116
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2116
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.3
>         Environment: All
>            Reporter: Milind Bhandarkar
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.0
>
>         Attachments: patch-2116.txt, patch-2116.txt
>
>
> Currently, since all task cwds are created under a jobcache directory, users 
> that need a job-specific shared directory for use as scratch space, create 
> ../work. This is hacky, and will break when HADOOP-2115 is addressed. For 
> such jobs, hadoop mapred should expose job.local.dir via localized 
> configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
