[ 
https://issues.apache.org/jira/browse/HADOOP-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700259#action_12700259
 ] 

Michael Andrews commented on HADOOP-5445:
-----------------------------------------

Just wondering if there is any interest in this; it hasn't been commented on 
in a while. 
The utility doesn't do very much on its own, but it becomes very useful when 
your environment consists of many map/reduce jobs that are chained together 
in a pipeline.  Each of our jobs potentially reports intermediate results 
and/or leaves them for another job to pick up.  These results are transient 
and should eventually be removed (they can be rebuilt later), but they need 
to stay around long enough for other jobs to pick them up.  In this context 
it is really helpful for a cron job to come along and clean up transient 
results that no one is using, so that disk space can be recovered. 
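
To make the cleanup rule concrete, here is a minimal sketch of the age check 
such a cron job might apply.  It is not the attached TmpReaper.java, and the 
class name, method name and paths are made up for illustration.  It works on 
a plain map of path to modification time so it runs without a cluster; 
against HDFS the times would presumably come from 
FileStatus.getModificationTime(). 

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: decides which transient outputs are old enough to delete. */
final class ReaperSketch {

    private static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    /**
     * Returns the paths whose last modification is more than maxAgeMillis
     * before nowMillis. modTimes maps a path to its modification time in
     * epoch milliseconds (a stand-in for what a file system would report).
     */
    static List<String> selectExpired(Map<String, Long> modTimes,
                                      long nowMillis, long maxAgeMillis) {
        List<String> expired = new ArrayList<String>();
        for (Map.Entry<String, Long> entry : modTimes.entrySet()) {
            long age = nowMillis - entry.getValue();
            if (age > maxAgeMillis) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();

        // Hypothetical intermediate outputs left behind by two pipeline jobs.
        Map<String, Long> modTimes = new LinkedHashMap<String, Long>();
        modTimes.put("/tmp/pipeline/job1-output", now - 10 * DAY_MILLIS);
        modTimes.put("/tmp/pipeline/job2-output", now - DAY_MILLIS / 24);

        // Only report what would be removed; a real reaper would delete here.
        for (String path : selectExpired(modTimes, now, 7 * DAY_MILLIS)) {
            System.out.println("would delete: " + path);
        }
    }
}
```

Keeping the selection separate from the delete makes a dry run easy, and the 
maximum age is the knob that has to be long enough for downstream jobs to 
pick the results up first. 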

> HDFS Tmpreaper
> --------------
>
>                 Key: HADOOP-5445
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5445
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>         Environment: CentOs 4/5, Java 1.5, Hadoop 0.17.3
>            Reporter: Michael Andrews
>            Priority: Minor
>             Fix For: 0.17.3
>
>         Attachments: DateDelta.java, TmpReaper.java
>
>
> Java implementation of the tmpreaper utility for HDFS.  Helps when you expect 
> processes to die before they can clean up.  I have Perl unit tests that can 
> be ported over to Java or Groovy if the Hadoop team is interested in this 
> utility.  One issue is that the unit tests set the modification time of test 
> files, which is unsupported in HDFS (as far as I can tell). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
