[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155816#comment-14155816 ]
Colin Patrick McCabe edited comment on HDFS-3107 at 10/2/14 1:12 AM:
---------------------------------------------------------------------

To summarize all of the above: [~jingzhao], [~sureshms], and I have made the point that this feature is still incomplete until it supports snapshots. Since snapshot support may involve fundamentally changing the design (e.g. copying the final partial block file versus using block recovery), we need to figure it out before merging to trunk or branch-2.

As far as I can see, there are two options here. We could roll snapshot support into this patch, or start a feature branch with the above commit and then do snapshot support (and whatever else is needed) in that feature branch. I'm fine with either option; I have absolutely no preference. I'm happy to review anything, and Jing has offered to help with snapshot support.

[~rvs]: I realize that getting truncate into a release is important to you. If people get this done by 2.6, I wouldn't oppose putting it in. But you will have to convince the release manager for 2.6 and propose it to the community. Since 2.6 is already a very big release, I think you will get pushback.

I also think you should evaluate writing length-delimited records instead of using truncate. Truncate is an operation that can fail, and relying on truncate to clean up mistakes will always be more fragile than writing in a format that can ignore torn records automatically. If a client gets an error from DFSOutputStream#write because it has dropped off the network, truncate is also going to fail for the same reason. And then you have a torn record.
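To illustrate the alternative suggested above, here is a minimal sketch of a length-delimited, checksummed record format. This is not HDFS or patch code; the FramedRecords class name and the framing layout (int length prefix, CRC32, payload) are assumptions chosen for illustration. The point is that a reader which validates the checksum can silently discard a torn record at the tail of a file, so no truncate is needed to recover from a failed write:

{code:java}
import java.io.*;
import java.util.zip.CRC32;

/**
 * Sketch of a length-delimited record format.
 * Each record is framed as: [int length][long crc32][payload bytes].
 * A torn record at the tail of the stream (truncated header, short
 * payload, or checksum mismatch) is detected and ignored on read.
 */
public class FramedRecords {

    /** Append one record: length prefix, CRC32 of the payload, payload. */
    public static void writeRecord(DataOutputStream out, byte[] payload)
            throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeInt(payload.length);
        out.writeLong(crc.getValue());
        out.write(payload);
    }

    /**
     * Read the next record, or return null at a clean EOF or a torn tail.
     * A torn tail is expected after a writer crash and is simply
     * discarded rather than repaired with truncate.
     */
    public static byte[] readRecord(DataInputStream in) throws IOException {
        int length;
        long expectedCrc;
        try {
            length = in.readInt();
            expectedCrc = in.readLong();
        } catch (EOFException e) {
            return null;           // clean EOF or torn header: stop here
        }
        if (length < 0) {
            return null;           // corrupt length field: treat as torn
        }
        byte[] payload = new byte[length];
        try {
            in.readFully(payload);
        } catch (EOFException e) {
            return null;           // payload cut short: torn record
        }
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        if (crc.getValue() != expectedCrc) {
            return null;           // checksum mismatch: torn/corrupt record
        }
        return payload;
    }
}
{code}

This sketch assumes a torn record only ever occurs at the tail of the file. A production format would likely also embed periodic sync markers (as Hadoop's SequenceFile does) so a reader can resynchronize past corruption in the middle of a stream.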
> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), which is the reverse operation of append. This makes upper-layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.