[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395942#comment-13395942
 ] 

Robert Joseph Evans commented on MAPREDUCE-4342:
------------------------------------------------

A couple of comments.
 # Minor correction to the grammar. {code}LOG.warn("Local Cache is been deleted... Downloading the cache again");{code} should be {code}LOG.warn("Local Cache has been deleted... Downloading the cache again");{code}
 # Please run test-patch on it and post the results.
 # I believe that this problem also exists in trunk and branch 2.  It would be 
good to investigate and possibly file a JIRA, or post a patch for them as well.

It looks good, but it is not perfect.  It will work in the case where a single 
base distributed cache file or directory was deleted, but it will not work in 
the case where a file was corrupted, where a file in a cache archive was 
deleted, where new files were added, etc.  I agree that we want to be able to 
deal with a file being removed, but I personally think that prevention is 
preferable to recovery, although it may not be as backwards compatible.  I 
would prefer to see all of the files created in the distributed cache be marked 
as read-only.  If the files are part of a private cache and someone messes with 
them by modifying the permissions, then it is on their head, and they need to 
modify the original HDFS file to force it to download a new copy.
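
To make the read-only suggestion concrete, here is a minimal sketch of what marking a localized cache entry read-only could look like. It uses plain java.io.File rather than Hadoop's own localization code, and the names ReadOnlyCache and markReadOnly are hypothetical, not part of the patch.

```java
import java.io.File;

/** Hypothetical helper, for illustration only; not part of the attached patch. */
final class ReadOnlyCache {

    /**
     * Recursively clears the write bit (for the owner and everyone else)
     * on root and everything beneath it.
     *
     * @return true only if every permission change succeeded
     */
    static boolean markReadOnly(File root) {
        boolean ok = true;
        File[] children = root.listFiles(); // null when root is a plain file
        if (children != null) {
            for (File child : children) {
                // Non-short-circuit &= so every child is still visited after a failure.
                ok &= markReadOnly(child);
            }
        }
        // setWritable(false, false) removes write permission for all users.
        return root.setWritable(false, false) && ok;
    }
}
```

On a POSIX file system, clearing the write bit on a directory also stops non-privileged users from deleting or adding entries inside it, which covers the deleted-file and added-file cases. It does not stop someone who owns the files from changing the permissions back.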

Checking for corruption caused by FS/disk issues is a separate problem that we 
probably also want to look into, now that the data in the distributed cache can 
live for long periods of time.
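
One possible shape for such a corruption check, purely as a sketch: record a digest when the file is localized and compare it before reuse. The class CacheChecksum, its method names, and the choice of SHA-256 are all assumptions for illustration. The sketch uses only JDK classes (java.nio.file, so Java 7 or later) and does not reflect how the distributed cache currently tracks files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, for illustration only. */
final class CacheChecksum {

    /** Streams the file through SHA-256 and returns the raw digest bytes. */
    static byte[] digest(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return md.digest();
    }

    /**
     * Returns true if the file still exists as a regular file and its
     * current digest matches the one recorded at download time.
     */
    static boolean isIntact(Path file, byte[] expected)
            throws IOException, NoSuchAlgorithmException {
        return Files.isRegularFile(file)
                && MessageDigest.isEqual(expected, digest(file));
    }
}
```

A caller would store the result of digest() when the file is first downloaded and re-download whenever isIntact() returns false. Hashing large archives on every task launch is not free, so how often to check would need some thought.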
                
> Distributed Cache gives inconsistent result if cache files get deleted from 
> task tracker 
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4342
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4342
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0, 1.0.3, trunk
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>         Attachments: MAPREDUCE-4342-22-1.patch, MAPREDUCE-4342-22.patch
>
>



        
