[
https://issues.apache.org/jira/browse/TEZ-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991308#comment-13991308
]
Hitesh Shah edited comment on TEZ-693 at 5/6/14 11:10 PM:
----------------------------------------------------------
[~kamrul] The above sounds good. One minor tweak, though: can we fix all the
code so that the following happens?
- All data generated by the framework goes into a Tez-specific (and
appId-specific) sub-dir that resides under the configured staging dir.
- DAG-specific data could go into a dagId-specific sub-dir of the above
dir.
- This keeps cleanup simple, since an rm on the top-level dir will
automatically delete all the data.
This could be split into multiple jiras if needed:
- The first to tackle would be ensuring that everything is written into the
correct dirs.
- The second jira could simply clean up everything on the last attempt.
- The third jira could do more complex DAG-specific cleanup on DAG
completion.
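
To make the proposed layout concrete, here is a minimal sketch of the nesting (staging dir, then a Tez-specific dir, then appId, then dagId) and of why a single recursive delete on the per-app dir is enough. It uses java.nio.file on a local filesystem purely for illustration; the class name, the ".tez" sub-dir name, and the helper methods are all hypothetical and are not taken from the Tez code base. On HDFS the equivalent delete would presumably go through Hadoop's FileSystem.delete(path, true).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: names and layout are assumptions, not Tez's actual API.
final class StagingLayoutSketch {
    // Hypothetical name for the Tez-specific sub-dir under the staging dir.
    static final String TEZ_SUBDIR = ".tez";

    // <stagingDir>/<tez sub-dir>/<appId>
    static Path appDir(Path stagingDir, String appId) {
        return stagingDir.resolve(TEZ_SUBDIR).resolve(appId);
    }

    // <stagingDir>/<tez sub-dir>/<appId>/<dagId>
    static Path dagDir(Path stagingDir, String appId, String dagId) {
        return appDir(stagingDir, appId).resolve(dagId);
    }

    // Recursive delete: children are removed before their parents.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse lexicographic order puts deeper paths first.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

With this nesting, per-DAG cleanup is deleteRecursively(dagDir(...)) and last-attempt cleanup is deleteRecursively(appDir(...)); the latter removes every DAG dir without the AM having to track them individually.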
> Deletion of DAG specific data after DAG completion
> --------------------------------------------------
>
> Key: TEZ-693
> URL: https://issues.apache.org/jira/browse/TEZ-693
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Mohammad Kamrul Islam
>
> Currently the client uploads some DAG-specific data to a remote directory
> specified by the user. The burden is on the client to clean up this data
> after the DAG completes. The post-DAG-completion code in the AM should be
> able to clean up this custom uploaded data.