[ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085503#comment-14085503 ]
zhihai xu commented on MAPREDUCE-5968:
--------------------------------------

Thanks for the comments. These are good findings.
1. Based on your suggestion, I can optimize the code: remove delWorkDir, and check whether the work directory exists before deleting it in the finally block.
2. Define "-work-" as a constant in the class, so it can be reused by both production and test code.

> Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5968
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.2.1
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: MAPREDUCE-5968.branch1.patch
>
>
> The work directory is not deleted in DistCache if an exception happens in
> downloadCacheObject. In downloadCacheObject, the cache file is first copied
> to a temporary work directory, and the work directory is then renamed to the
> final directory. If an IOException happens during the copy, the work
> directory is not deleted, which leaves garbage data in the local disk cache.
> For example, if an MR application uses the Distributed Cache to send a very
> large archive or file (50G) and the disk fills up during the copy, an
> IOException is triggered, the work directory is neither deleted nor renamed,
> and it keeps occupying a big chunk of disk space.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
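
The cleanup discussed in the comment above (copy into a work directory, rename on success, delete in the finally block only if the work directory still exists) could look roughly like the sketch below. This is not the attached patch: it uses java.nio.file instead of the Hadoop FileSystem API, and the names WorkDirCleanupSketch, Copier, download and WORK_DIR_MARKER are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class WorkDirCleanupSketch {
    // Hypothetical constant, echoing the "-work-" marker that the comment
    // proposes to share between production and test code.
    static final String WORK_DIR_MARKER = "-work-";

    /** Hypothetical stand-in for the copy step, which may fail (e.g. disk full). */
    interface Copier {
        void copyInto(Path workDir) throws IOException;
    }

    /**
     * Copies into a temporary work directory next to finalDir, then renames
     * it to finalDir. If anything fails before the rename, the finally block
     * removes the leftover work directory.
     */
    static Path download(Path finalDir, Copier copier) throws IOException {
        Path workDir = finalDir.resolveSibling(
                finalDir.getFileName() + WORK_DIR_MARKER + System.nanoTime());
        try {
            Files.createDirectories(workDir);
            copier.copyInto(workDir);        // may throw IOException
            Files.move(workDir, finalDir);   // after this, workDir is gone
            return finalDir;
        } finally {
            // No separate "delete" flag is needed: after a successful rename
            // the work directory no longer exists, so this check is false.
            if (Files.exists(workDir)) {
                deleteRecursively(workDir);
            }
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Checking for existence in the finally block replaces a flag such as delWorkDir, because a successful rename already removes the work directory path. One caveat of this sketch: if the cleanup itself throws, that exception replaces the original one.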