[ https://issues.apache.org/jira/browse/MAPREDUCE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861028#action_12861028 ]

Allen Wittenauer commented on MAPREDUCE-1288:
---------------------------------------------

Why would a task from an already running job not be able to find version-0? Why is the task tracker removing content from the cache of a running job? If the content moved or is different, shouldn't the job tracker be able to reschedule tasks onto a task tracker that has a copy? Why can't the task tracker copy the dcache from another task tracker that does have a copy?

That said, I'm not convinced that, in the majority of cases, version-0 vs. version-1 is undefined. From what I've seen, most of the time different versions of a dcache are backward compatible.

As much as folks hate tunables, perhaps that is the answer here: mapred.job.dcache.failonupdate.

> DistributedCache localizes only once per cache URI
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1288
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1288
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: security, tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.21.0
>
>
> As part of the file localization the distributed cache localizer creates a
> copy of the file in the corresponding user's private directory. The
> localization in DistributedCache assumes the key as the URI of the cachefile
> and if it already exists in the map, the localization is not done again. This
> means that another user cannot access the same distributed cache file. We
> should change the key to include the username so that localization is done
> for every user.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
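
For illustration only, the change proposed in the quoted issue description (keying localization on user plus URI instead of URI alone) could be sketched roughly as below. This is not the Hadoop implementation: the class PerUserCacheSketch, its CacheKey type, the localize and copyCount methods, and the local path layout are all hypothetical names made up for this sketch, and the real file copy is replaced by a counter.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch; none of these names come from the Hadoop code base.
final class PerUserCacheSketch {

    /** Composite key: the same URI requested by two users yields two entries. */
    static final class CacheKey {
        final String user;
        final URI uri;

        CacheKey(String user, URI uri) {
            this.user = Objects.requireNonNull(user);
            this.uri = Objects.requireNonNull(uri);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CacheKey)) {
                return false;
            }
            CacheKey k = (CacheKey) o;
            return user.equals(k.user) && uri.equals(k.uri);
        }

        @Override
        public int hashCode() {
            return Objects.hash(user, uri);
        }
    }

    // Maps (user, URI) to the local path of that user's private copy.
    private final Map<CacheKey, String> localized = new HashMap<>();
    private int copies = 0;

    /** Returns the local path, copying only on the first request per (user, URI). */
    synchronized String localize(String user, URI uri) {
        return localized.computeIfAbsent(new CacheKey(user, uri), k -> {
            // Stands in for the real copy into the user's private directory.
            copies++;
            // Assumed path layout, chosen only so that paths differ per user.
            return "/local/" + k.user + "/distcache/"
                    + Integer.toHexString(k.uri.hashCode());
        });
    }

    /** Number of simulated copies performed so far. */
    synchronized int copyCount() {
        return copies;
    }
}
```

With a URI-only key, a second user asking for the same file would hit the existing map entry and be handed a path inside the first user's private directory. With the composite key, each user triggers one localization, and repeat requests from the same user reuse that user's copy.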