[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769148#action_12769148
 ] 

Vinod K V commented on MAPREDUCE-1140:
--------------------------------------


Here's the pseudocode that each task runs for distributed cache files:

{code}
// Runs at task start.
TaskSetup() {
  for (URI uri : list of cache uris) {
    localize(uri)
  }
}

// Runs at task end, even if TaskSetup() failed partway through.
TaskFinish() {
  for (URI uri : list of cache uris) {
    release(uri)
  }
}
{code}

The problem occurs when, out of N URIs to be localized, the first n succeed and
localization fails for the (n+1)th URI. In this case, the task still goes ahead
and calls release on *every* URI, even though URIs n+1 through N were never
successfully localized!

This may result in negative refcounts for some URIs. It may also lead to
deletion of files that are actually in use: a spurious release can bring a
file's refcount down to zero while another task still holds it.
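
The failure mode above can be reproduced with a small, self-contained Java
sketch. It is only an illustration, not TaskTracker code: the class name, the
refCounts map, the "bad:" prefix used to force a failure, and both runTask*
methods are hypothetical. The second method shows one possible remedy, namely
releasing only the URIs that were actually localized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CacheRefCountSketch {
    // Hypothetical per-URI reference counts, standing in for the real bookkeeping.
    static final Map<String, Integer> refCounts = new HashMap<>();

    // Assumption for the demo: any URI starting with "bad:" fails to localize.
    static void localize(String uri) {
        if (uri.startsWith("bad:")) {
            throw new IllegalStateException("localization failed: " + uri);
        }
        refCounts.merge(uri, 1, Integer::sum);
    }

    static void release(String uri) {
        refCounts.merge(uri, -1, Integer::sum);
    }

    // Mirrors the pseudocode: release is called for every URI, localized or not.
    static void runTaskReleasingAll(List<String> uris) {
        try {
            for (String uri : uris) {
                localize(uri);
            }
        } catch (IllegalStateException e) {
            // The task fails here, but cleanup below still runs.
        } finally {
            for (String uri : uris) {
                release(uri);
            }
        }
    }

    // Possible remedy: remember what was localized and release only that.
    static void runTaskReleasingLocalized(List<String> uris) {
        List<String> localized = new ArrayList<>();
        try {
            for (String uri : uris) {
                localize(uri);
                localized.add(uri);
            }
        } catch (IllegalStateException e) {
            // The task fails here, but cleanup below still runs.
        } finally {
            for (String uri : localized) {
                release(uri);
            }
        }
    }

    public static void main(String[] args) {
        List<String> uris = Arrays.asList("hdfs://a", "bad:b", "hdfs://c");

        runTaskReleasingAll(uris);
        System.out.println("release all:       " + refCounts);

        refCounts.clear();
        runTaskReleasingLocalized(uris);
        System.out.println("release localized: " + refCounts);
    }
}
```

With the first method, the URI that failed and the URI after it both end up
with a refcount of -1. With the second, only the successfully localized URI is
touched, and its count returns to 0.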

> Per cache-file refcount can become negative when tasks release 
> distributed-cache files
> --------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1140
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1140
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.2, 0.21.0, 0.22.0
>            Reporter: Vinod K V
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
