[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236298#comment-13236298
 ] 

Zhanwei.Wang commented on HDFS-3107:
------------------------------------

One problem with truncate is "visibility". Since truncating a file requires 
acquiring the lease first, we do not need to worry about concurrent writes, but 
we do need to handle concurrent reads while a file is being truncated. The HDFS 
client caches some block info when it opens and reads a file, and those blocks 
may later be truncated. Furthermore, the socket and the HDFS client may buffer 
some data that will be truncated.

When I implemented the first version of the truncate prototype, if the block or 
data the client requested had been truncated, the datanode threw an exception 
and the client refreshed the metadata to check whether the data had been 
truncated or a real error had occurred. But this cannot prevent the client from 
reading buffered data.
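
To make the behaviour described above concrete, here is a minimal toy sketch in 
Java. It is not the prototype code and does not use any real HDFS API: 
ToyCluster, ToyReader, RangeUnavailableException and Outcome are all 
hypothetical names invented for illustration. It assumes one file held in 
memory, a reader that caches the file length at open time, and a simple 
read-ahead buffer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TruncateVisibilitySketch {

    /** Hypothetical exception raised when the requested range does not exist. */
    static class RangeUnavailableException extends IOException {
        RangeUnavailableException(String msg) { super(msg); }
    }

    /** Toy stand-in for namenode metadata plus datanode storage of one file. */
    static class ToyCluster {
        private byte[] data;

        ToyCluster(byte[] initial) { data = initial.clone(); }

        /** Plays the role of a metadata lookup. */
        synchronized long length() { return data.length; }

        synchronized void truncate(int newLength) {
            if (newLength < 0 || newLength > data.length) {
                throw new IllegalArgumentException("bad length " + newLength);
            }
            data = Arrays.copyOf(data, newLength);
        }

        synchronized byte[] read(long offset, int len) throws RangeUnavailableException {
            if (offset < 0 || len < 0 || offset + len > data.length) {
                throw new RangeUnavailableException("range not available");
            }
            return Arrays.copyOfRange(data, (int) offset, (int) offset + len);
        }
    }

    enum Outcome { OK, TRUNCATED, REAL_ERROR }

    /** Toy reader: caches the length seen at open time and keeps a buffer. */
    static class ToyReader {
        private final ToyCluster cluster;
        private long cachedLength;
        private byte[] buffer = new byte[0];
        private long bufferStart = 0;

        ToyReader(ToyCluster cluster) {
            this.cluster = cluster;
            this.cachedLength = cluster.length();
        }

        /** Fill the client-side buffer, as read-ahead might. */
        void prefetch(long offset, int len) throws IOException {
            buffer = cluster.read(offset, len);
            bufferStart = offset;
        }

        /** On failure, refresh metadata to tell truncation from a real error. */
        Outcome readFromDatanode(long offset, int len) {
            try {
                cluster.read(offset, len);
                return Outcome.OK;
            } catch (RangeUnavailableException e) {
                long fresh = cluster.length();
                boolean truncated = fresh < cachedLength && offset + len > fresh;
                cachedLength = fresh;
                return truncated ? Outcome.TRUNCATED : Outcome.REAL_ERROR;
            }
        }

        /** True if the range would be answered from the buffer, with no datanode check. */
        boolean servedFromBuffer(long offset, int len) {
            return offset >= bufferStart && offset + len <= bufferStart + buffer.length;
        }
    }

    public static void main(String[] args) throws IOException {
        ToyCluster cluster = new ToyCluster("0123456789".getBytes(StandardCharsets.US_ASCII));
        ToyReader reader = new ToyReader(cluster);
        reader.prefetch(4, 6);   // client buffers bytes 4..9
        cluster.truncate(5);     // another client truncates the file to 5 bytes
        System.out.println(reader.readFromDatanode(6, 2));  // TRUNCATED
        System.out.println(reader.servedFromBuffer(6, 2));  // true: stale bytes still readable
    }
}
```

In this sketch the datanode-side check catches the truncation, but the second 
call shows the remaining gap: a range already sitting in the client buffer 
never reaches the datanode, so the exception is never raised for it.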

Any comments or suggestions?

                
> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>            Reporter: Lei Chang
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append. This forces upper-layer applications to use ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.


        
