[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235941#comment-13235941 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3107:
----------------------------------------------

The full truncate feature may be hard to implement.  Below are some ideas.

h4. (1) Support only block boundary truncate.
When the length is not a multiple of the block size, throw an exception.

This is very easy to implement.  The client could use it to support full 
truncate by (i) copying the data in the truncated block, (ii) truncating to 
the block boundary and (iii) appending the data back.

Example 1:  Suppose the file size is 290 and the block size is 100.  Then 
truncate(file, 180) can be done by (i) reading bytes from positions 101 to 
180, (ii) calling truncate(file, 100) and (iii) appending the 80 bytes back.
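
A minimal client-side sketch of this recipe, assuming a hypothetical 
block-boundary truncate(path, length) method on DistributedFileSystem (it 
does not exist in HDFS yet); the other calls are existing FileSystem APIs:

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ClientSideTruncate {
  /** Truncate file to newLength using only a block-boundary truncate. */
  public static void fullTruncate(DistributedFileSystem dfs, Path file,
      long newLength, long blockSize) throws IOException {
    long boundary = (newLength / blockSize) * blockSize;
    int tailLen = (int) (newLength - boundary);

    // (i) copy the bytes between the block boundary and the target length
    byte[] tail = new byte[tailLen];
    try (FSDataInputStream in = dfs.open(file)) {
      in.readFully(boundary, tail);
    }

    // (ii) truncate to the block boundary (hypothetical API)
    dfs.truncate(file, boundary);

    // (iii) append the saved bytes back; a crash here loses the tail,
    // which is exactly the atomicity problem addressed by (2) below
    try (FSDataOutputStream out = dfs.append(file)) {
      out.write(tail);
    }
  }
}
{code}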


h4. (2) Truncate with concat. (Sanjay's idea)

A problem with (1) is that it is not atomic: it may finish (ii) but fail at 
(iii).  A remedy is to add a parameter, so the API becomes 
truncate(file, length, concatFile), where length must be a multiple of the 
block size.  HDFS first truncates file to length and then concatenates the 
block in concatFile to the end of file.  Note that this is a namenode-only 
operation and can easily be made atomic.

Example 2: For the case in Example 1, the truncate can be done by first 
copying bytes 101 to 180 to a new file f and then calling truncate(file, 100, f).
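
Sketched in the same style as above, the client flow becomes one copy plus a 
single atomic namenode call; the three-argument truncate is the proposed 
(hypothetical) API, and the temporary-file naming is illustrative only:

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ConcatTruncate {
  public static void truncateWithConcat(DistributedFileSystem dfs, Path file,
      long newLength, long blockSize) throws IOException {
    long boundary = (newLength / blockSize) * blockSize;
    int tailLen = (int) (newLength - boundary);

    // Copy the tail bytes into a temporary file f; it holds a single
    // block, since tailLen < blockSize.
    Path f = new Path(file.getParent(), "." + file.getName() + ".tail");
    byte[] tail = new byte[tailLen];
    try (FSDataInputStream in = dfs.open(file)) {
      in.readFully(boundary, tail);
    }
    try (FSDataOutputStream out = dfs.create(f, true)) {
      out.write(tail);
    }

    // One namenode-only operation: truncate file to the block boundary
    // and concatenate f's block onto its end, atomically (proposed API).
    dfs.truncate(file, boundary, f);
  }
}
{code}

If the client crashes before the last call, file is untouched and only the 
temporary file f is left behind, which is the atomicity improvement over (1).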


h4. (3) Copy on truncate.
Support full truncate by first copying the last block to a new block, then 
truncating the replicas of the new block at the datanodes, and finally 
committing the truncate once the datanodes report the new block to the 
namenode.  Rollback is possible since the old block is still around.  This is 
harder to implement than (1) or (2).
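
As a rough illustration only, the namenode-side commit might be structured 
like the pseudocode below; every type and method name here is hypothetical, 
not an existing HDFS internal:

{code:java}
// Hypothetical copy-on-truncate flow (pseudocode, not real HDFS internals).
Block copyOnTruncate(INodeFile file, long newLength) throws IOException {
  Block oldBlock = file.getLastBlock();
  long newBlockLen = newLength % file.getPreferredBlockSize();

  // 1. Allocate a new block and ask the datanodes holding the old
  //    replicas to copy it, truncated to newBlockLen bytes.
  Block newBlock = allocateBlock(file);
  scheduleCopyAndTruncate(oldBlock, newBlock, newBlockLen);

  // 2. Commit only after enough datanodes report the new block.
  //    Until then the old block is untouched, so rollback is trivial.
  waitForBlockReports(newBlock);
  file.replaceLastBlock(oldBlock, newBlock, newBlockLen);
  deleteBlock(oldBlock);
  return newBlock;
}
{code}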

                
> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Lei Chang
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, which forces upper-layer applications to use ugly workarounds (such 
> as keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.
