[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156265#comment-14156265 ]
dhruba borthakur commented on HDFS-3107:
----------------------------------------

Hi KonstantinS, I have been following this jira, mostly as a passive observer. Could you please explain the use case for truncate? You may have already explained this earlier, but I would appreciate it a lot if you could elaborate again on why you need truncate. From your comments, it seems that you have a database layer on top of HDFS and that the database uses an HDFS file as its transaction log, but I am not able to understand the rest of the story.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch,
> HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf,
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf,
> editsStored
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the
> underlying storage when a transaction is aborted. Currently HDFS does not
> support truncate (a standard Posix operation) which is a reverse operation of
> append, which makes upper layer applications use ugly workarounds (such as
> keeping track of the discarded byte range per file in a separate metadata
> store, and periodically running a vacuum process to rewrite compacted files)
> to overcome this limitation of HDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)