[ https://issues.apache.org/jira/browse/HDFS-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhaoyunjiong updated HDFS-7044:
-------------------------------
    Attachment: Retention policy design.pdf

Attached a simple design document.

The major differences between HDFS-7044 and HDFS-6382 are as follows (please correct me if I'm wrong; I only just learned that HDFS-6382 is trying to solve the same problem):
# HDFS-6382 is a standalone daemon outside the NameNode, while HDFS-7044 will run inside the NameNode. I believe HDFS-7044 will be simpler and more efficient.
# HDFS-7044 allows users to set a policy based on access time or modify time, while HDFS-6382 supports only a single TTL.

> Support retention policy based on access time and modify time, use XAttr to store policy
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-7044
>                 URL: https://issues.apache.org/jira/browse/HDFS-7044
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: namenode
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>         Attachments: Retention policy design.pdf
>
>
> The basic idea is to set a retention policy on a directory based on access time and modify time, and to use an XAttr to store the policy.
> Files under a directory that has a retention policy will be deleted if they meet the retention rule.
> There are three rules:
> # access time
> #* If (accessTime + retentionTimeForAccess < now), the file will be deleted
> # modify time
> #* If (modifyTime + retentionTimeForModify < now), the file will be deleted
> # access time and modify time
> #* If (accessTime + retentionTimeForAccess < now && modifyTime + retentionTimeForModify < now), the file will be deleted

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
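
For illustration only, the three retention rules quoted in the issue description could be evaluated roughly as in the sketch below. This is not the code from the patch or the design document. The class name RetentionRuleSketch, the RetentionPolicy holder, the use of OptionalLong for an unset rule, and the millisecond unit are all assumptions made for the example. Only the variable names accessTime, modifyTime, retentionTimeForAccess, retentionTimeForModify, and now come from the description.

```java
import java.util.OptionalLong;

public class RetentionRuleSketch {

    /** Hypothetical policy holder; an empty field means that rule is not configured. */
    static final class RetentionPolicy {
        final OptionalLong retentionTimeForAccess; // assumed to be milliseconds
        final OptionalLong retentionTimeForModify; // assumed to be milliseconds

        RetentionPolicy(OptionalLong retentionTimeForAccess,
                        OptionalLong retentionTimeForModify) {
            this.retentionTimeForAccess = retentionTimeForAccess;
            this.retentionTimeForModify = retentionTimeForModify;
        }
    }

    /**
     * Returns true when the file meets the retention rule and may be deleted.
     * Rule 1: only the access retention is set.
     * Rule 2: only the modify retention is set.
     * Rule 3: both are set, and both conditions must hold.
     */
    static boolean shouldDelete(RetentionPolicy policy,
                                long accessTime, long modifyTime, long now) {
        boolean hasAccess = policy.retentionTimeForAccess.isPresent();
        boolean hasModify = policy.retentionTimeForModify.isPresent();
        if (!hasAccess && !hasModify) {
            return false; // no policy configured, so never delete
        }
        // An unset rule is treated as satisfied so that it does not block the other one.
        boolean accessExpired = !hasAccess
                || accessTime + policy.retentionTimeForAccess.getAsLong() < now;
        boolean modifyExpired = !hasModify
                || modifyTime + policy.retentionTimeForModify.getAsLong() < now;
        return accessExpired && modifyExpired;
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        long now = 100 * day;
        RetentionPolicy both =
                new RetentionPolicy(OptionalLong.of(30 * day), OptionalLong.of(60 * day));

        // Accessed 40 days ago, modified 50 days ago: only the access rule is met.
        System.out.println(shouldDelete(both, now - 40 * day, now - 50 * day, now)); // false
        // Accessed 40 days ago, modified 70 days ago: both rules are met.
        System.out.println(shouldDelete(both, now - 40 * day, now - 70 * day, now)); // true
    }
}
```

Because the comparison is a strict less-than, as written in the description, a file whose timestamp plus retention time equals now exactly is kept until a later check.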