[
https://issues.apache.org/jira/browse/MAPREDUCE-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769148#action_12769148
]
Vinod K V commented on MAPREDUCE-1140:
--------------------------------------
Here's the pseudocode that each task runs with respect to distributed cache files:
{code}
TaskSetup() {
  for (URI uri : list of cache URIs) {
    localize(uri)
  }
}
TaskFinish() {
  // Runs for every URI, whether or not its localize() call succeeded.
  for (URI uri : list of cache URIs) {
    release(uri)
  }
}
{code}
The problem occurs when, out of N URIs to be localized, the first n succeed and
localization fails for the (n+1)th URI. In this case, the task goes ahead and
releases *every* URI, even though the (n+1)th URI failed and URIs n+2 through N
were never even localized!
This may result in negative refcounts for some URIs. For files that other tasks
have also localized, the spurious release can drop the refcount to zero and
cause deletion of files that are actually still in use.
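To make the failure mode concrete, here is a minimal, self-contained Java sketch. It is not the TaskTracker's actual code: the class CacheRefCountSketch, its methods, the failOn parameter and the hdfs://nn/... URIs are all hypothetical names made up for illustration. It models the per-URI refcount as a plain map, replays the release-everything flow from the pseudocode, and compares it with one possible guard that releases only what was actually localized.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of per-cache-file refcounting; not Hadoop's real classes. */
final class CacheRefCountSketch {
    // Per-URI reference counts, standing in for the tracker-side bookkeeping.
    private final Map<URI, Integer> refCounts = new HashMap<>();

    /** Simulated localization: throws for the URI equal to failOn (assumed test hook). */
    void localize(URI uri, URI failOn) {
        if (uri.equals(failOn)) {
            throw new IllegalStateException("localization failed for " + uri);
        }
        refCounts.merge(uri, 1, Integer::sum);
    }

    /** Decrements the count unconditionally, as the pseudocode's release() is assumed to. */
    void release(URI uri) {
        refCounts.merge(uri, -1, Integer::sum);
    }

    int count(URI uri) {
        return refCounts.getOrDefault(uri, 0);
    }

    /** Flow from the pseudocode: every URI is released, localized or not. */
    void runTaskBuggy(List<URI> uris, URI failOn) {
        try {
            for (URI uri : uris) {
                localize(uri, failOn);
            }
        } catch (IllegalStateException e) {
            // The task has failed; cleanup below still runs.
        } finally {
            for (URI uri : uris) {
                release(uri);
            }
        }
    }

    /** One possible guard: remember what was localized and release only that. */
    void runTaskGuarded(List<URI> uris, URI failOn) {
        List<URI> localized = new ArrayList<>();
        try {
            for (URI uri : uris) {
                localize(uri, failOn);
                localized.add(uri); // reached only if localize() succeeded
            }
        } catch (IllegalStateException e) {
            // The task has failed; cleanup below still runs.
        } finally {
            for (URI uri : localized) {
                release(uri);
            }
        }
    }

    public static void main(String[] args) {
        List<URI> uris = Arrays.asList(
                URI.create("hdfs://nn/cache/a.jar"),
                URI.create("hdfs://nn/cache/b.jar"),
                URI.create("hdfs://nn/cache/c.jar"));
        URI failing = uris.get(1); // localization of the 2nd URI fails

        CacheRefCountSketch buggy = new CacheRefCountSketch();
        buggy.runTaskBuggy(uris, failing);
        CacheRefCountSketch guarded = new CacheRefCountSketch();
        guarded.runTaskGuarded(uris, failing);

        for (URI uri : uris) {
            System.out.println(uri + " buggy=" + buggy.count(uri)
                    + " guarded=" + guarded.count(uri));
        }
    }
}
```

In this model the release-everything flow leaves b.jar and c.jar at -1, while the guarded flow leaves all three at 0. If another task had already localized c.jar (count 1), the buggy flow would drop it to 0 even though that other task still holds it, which is the premature-deletion case described above. Tracking successes per task is only one way to close the gap; the sketch does not claim to match the eventual patch.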
> Per cache-file refcount can become negative when tasks release
> distributed-cache files
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1140
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1140
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Affects Versions: 0.20.2, 0.21.0, 0.22.0
> Reporter: Vinod K V
>