[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861028#action_12861028
 ] 

Allen Wittenauer commented on MAPREDUCE-1288:
---------------------------------------------

Why would a task from an already running job not be able to find version-0?
Why is the task tracker removing content from the cache of a running job? If
the content moved or changed, shouldn't the job tracker be able to reschedule
tasks onto a task tracker that has a copy? Why can't the task tracker copy
the dcache from another task tracker that does have a copy?

That said, I'm not convinced that version-0 vs. version-1 is undefined in the
majority of cases. From what I've seen, different versions of a dcache are
usually backward compatible. As much as folks hate tunables, perhaps one is
the answer here: mapred.job.dcache.failonupdate.
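
To make the tunable idea concrete, here is a rough sketch of how such a flag
might gate a task's use of a changed cache file. Only the property name comes
from the suggestion above. The class, the method, the use of
java.util.Properties in place of the job configuration, the timestamp
comparison, and the default of "true" are all assumptions for illustration,
not existing Hadoop behavior.

```java
import java.util.Properties;

/** Hypothetical sketch; not part of Hadoop. */
final class DcacheUpdatePolicySketch {

    /** Property name proposed in the comment; it is not an existing setting. */
    static final String FAIL_ON_UPDATE = "mapred.job.dcache.failonupdate";

    /**
     * Decides whether a task may use a cache file whose modification time
     * differs from the one recorded when the job was submitted.
     * Assumption: the flag defaults to "true" (fail on any update).
     */
    static boolean mayUse(Properties conf, long expectedMtime, long actualMtime) {
        if (expectedMtime == actualMtime) {
            return true; // file unchanged since job submission
        }
        boolean failOnUpdate =
            Boolean.parseBoolean(conf.getProperty(FAIL_ON_UPDATE, "true"));
        return !failOnUpdate; // lenient jobs accept the newer version
    }

    public static void main(String[] args) {
        Properties lenient = new Properties();
        lenient.setProperty(FAIL_ON_UPDATE, "false");
        System.out.println(mayUse(new Properties(), 100L, 200L));
        System.out.println(mayUse(lenient, 100L, 200L));
    }
}
```

A job that trusts its cache files to stay backward compatible would set the
flag to false and keep running against the newer version.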



> DistributedCache localizes only once per cache URI
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1288
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.21.0
>
>
> As part of the file localization the distributed cache localizer creates a 
> copy of the file in the corresponding user's private directory. The 
> localization in DistributedCache assumes the key as the URI of the cachefile 
> and if it already exists in the map, the localization is not done again. This 
> means that another user cannot access the same distributed cache file. We 
> should change the key to include the username so that localization is done 
> for every user.
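
The key change proposed in the description can be sketched roughly as
follows. This is an illustration only, not the actual DistributedCache or
TaskTracker code: the class names, the local path layout, and the copy
counter are all hypothetical. The point is that a map keyed on (URI, user)
localizes once per user, where a map keyed on the URI alone would localize
only once overall.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch; names and path layout are assumptions. */
final class LocalizedCacheSketch {

    /** Composite key: the cache URI plus the user it is localized for. */
    static final class CacheKey {
        final URI uri;
        final String user;

        CacheKey(URI uri, String user) {
            this.uri = Objects.requireNonNull(uri);
            this.user = Objects.requireNonNull(user);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CacheKey)) return false;
            CacheKey other = (CacheKey) o;
            return uri.equals(other.uri) && user.equals(other.user);
        }

        @Override
        public int hashCode() {
            return Objects.hash(uri, user);
        }
    }

    private final Map<CacheKey, String> localized = new HashMap<>();
    private int copies = 0;

    /** Returns the local path, copying only on the first request per (URI, user). */
    String localize(URI uri, String user) {
        return localized.computeIfAbsent(new CacheKey(uri, user), k -> {
            copies++; // stands in for the real copy into the user's private directory
            return "/local/" + k.user + k.uri.getPath(); // hypothetical layout
        });
    }

    int copyCount() {
        return copies;
    }

    public static void main(String[] args) {
        LocalizedCacheSketch cache = new LocalizedCacheSketch();
        URI uri = URI.create("hdfs://nn/cache/dict.txt");
        System.out.println(cache.localize(uri, "alice"));
        System.out.println(cache.localize(uri, "alice")); // reuses alice's copy
        System.out.println(cache.localize(uri, "bob"));   // separate copy for bob
        System.out.println("copies made: " + cache.copyCount());
    }
}
```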

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
