[ 
https://issues.apache.org/jira/browse/YARN-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781518#comment-15781518
 ] 

Jian He commented on YARN-5658:
-------------------------------

[~templedf], this is not just about HDFS: deleting a path from ZK is also a 
required use case for yarn-service-registry, so the implementation should be 
somewhat generic.
I think an option to clean up a path is useful.  One approach I have in mind is 
to leverage the getApplicationsToCleanup signal sent in the node heartbeat when 
the application finally completes, after which the NM where the AM container 
ran could do the post-application cleanup.  The difference from YARN-2261 is 
that instead of running in a separate container, the cleanup would run inside 
the NodeManager.  This approach also does not require significant code changes 
in the application.  YARN-2261 could still be used for more advanced use cases 
that the AM requires.  The problem with this approach is that if the NM 
crashes, the files may not get cleaned up, although YARN-2261 has the same 
problem.  For simplicity, maybe we can allow this to occur, warn the user in 
the UI that the cleanup did not complete successfully, and ask the user to do 
it manually.  Thoughts?
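
To make the proposal above concrete, here is a minimal sketch of how an 
NM-side hook could look.  Everything in it is hypothetical: PathCleaner, 
PostAppCleanupSketch, registerPath and onApplicationsToCleanup are 
illustrative names, not existing YARN classes or methods, and the real 
heartbeat plumbing is replaced by a plain method call.  It only shows the 
shape of the idea: cleaners keyed by URI scheme so HDFS and ZK can share one 
mechanism, and failures recorded so the UI could warn the user.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in YARN.
final class PostAppCleanupSketch {

    /** Deletes one target; a real implementation might wrap FileSystem or a ZK client. */
    interface PathCleaner {
        void delete(URI target) throws Exception;
    }

    // Cleaners keyed by URI scheme (e.g. "hdfs", "zk") so the hook stays generic.
    private final Map<String, PathCleaner> cleanersByScheme = new HashMap<>();
    // Targets each application asked to have removed once it finishes.
    private final Map<String, List<URI>> pathsByApp = new HashMap<>();
    // Targets that could not be deleted; these would drive the UI warning.
    private final List<URI> failed = new ArrayList<>();

    void registerCleaner(String scheme, PathCleaner cleaner) {
        cleanersByScheme.put(scheme, cleaner);
    }

    void registerPath(String appId, URI target) {
        pathsByApp.computeIfAbsent(appId, k -> new ArrayList<>()).add(target);
    }

    /** Stand-in for the NM reacting to the applications-to-clean-up list in a heartbeat. */
    void onApplicationsToCleanup(List<String> finishedAppIds) {
        for (String appId : finishedAppIds) {
            List<URI> targets = pathsByApp.remove(appId);
            if (targets == null) {
                continue; // this application registered nothing to clean
            }
            for (URI target : targets) {
                PathCleaner cleaner = cleanersByScheme.get(target.getScheme());
                try {
                    if (cleaner == null) {
                        throw new IllegalStateException("no cleaner for " + target);
                    }
                    cleaner.delete(target);
                } catch (Exception e) {
                    failed.add(target); // keep going; report the failure later
                }
            }
        }
    }

    List<URI> failedTargets() {
        return failed;
    }

    public static void main(String[] args) {
        PostAppCleanupSketch hook = new PostAppCleanupSketch();
        hook.registerCleaner("hdfs", t -> System.out.println("deleted " + t));
        hook.registerCleaner("zk", t -> { throw new Exception("ZK unreachable"); });
        hook.registerPath("app_1", URI.create("hdfs://nn/tmp/app_1/staging"));
        hook.registerPath("app_1", URI.create("zk://quorum/registry/app_1"));
        hook.onApplicationsToCleanup(List.of("app_1"));
        System.out.println("needs manual cleanup: " + hook.failedTargets());
    }
}
```

Note that the registered paths live only in memory here, so an NM crash 
would lose them.  That is the failure case described above, which would 
be surfaced to the user rather than handled automatically.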

> YARN should have a hook to delete a path from HDFS when an application ends
> ---------------------------------------------------------------------------
>
>                 Key: YARN-5658
>                 URL: https://issues.apache.org/jira/browse/YARN-5658
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>
> There are many cases when a client uploads data to HDFS and then needs to 
> subsequently clean it up, such as with the distributed cache.  It would be 
> helpful if YARN would do that cleanup automatically on job completion.
> The hook could be generic to a URI supported by {{FileSystem}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

