[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935707#comment-13935707 ]
Tsz Wo Nicholas Sze commented on HDFS-6087:
-------------------------------------------

> 1. A block cannot be read by others while under construction, until it is
> fully written and committed. ... It also does not support hflush.

> 2. Your proposal (if I understand it correctly) will potentially lead to a
> lot of small blocks if appends, fsyncs (and truncates) are used intensively.
> ...

I guess it won't lead to a lot of small blocks, since it does copy-on-write. However, there is going to be a lot of block copying if there are many appends, hsyncs, etc.

----

In addition, I think reading the last block would be a problem: if a reader opens a file and reads the last block slowly, and a writer then reopens the file for append and commits a new last block, the old last block may be deleted and become unavailable to the reader.

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended and an HDFS block can be reopened for append. This design introduces complexity, including lease recovery. If we design the HDFS block as immutable, append & truncate become very simple. The idea is that an HDFS block is immutable once it is committed to the namenode. If the block is not yet committed to the namenode, it is the HDFS client's responsibility to re-add it with a new block ID.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
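The last-block race raised in the comment can be sketched with a toy model. This is a minimal illustration, not HDFS code: the block store, IDs, and the copy-on-write append helper below are all hypothetical stand-ins for datanode storage and the proposal's immutable-block append.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the immutable-block proposal: appending copies the old last
// block into a fresh immutable block with a new ID and deletes the old one.
// A slow reader that resolved the old last-block ID before the append can
// then no longer find it -- the race described in the comment above.
public class ImmutableBlockRace {
    // blockId -> immutable contents (hypothetical stand-in for datanode storage)
    static final Map<Long, String> blockStore = new HashMap<>();
    static long nextId = 1;

    static long writeBlock(String data) {
        long id = nextId++;
        blockStore.put(id, data);
        return id;
    }

    // Copy-on-write append: new block = old data + suffix; old block deleted.
    static long appendCow(long oldId, String suffix) {
        long newId = writeBlock(blockStore.get(oldId) + suffix);
        blockStore.remove(oldId); // the old last block goes away
        return newId;
    }

    public static void main(String[] args) {
        long lastBlock = writeBlock("hello");  // file's committed last block
        long readerView = lastBlock;           // slow reader resolved this ID
        long newLast = appendCow(lastBlock, " world");
        // The writer sees the newly committed block...
        System.out.println(blockStore.get(newLast));            // hello world
        // ...but the slow reader's block ID now dangles.
        System.out.println(blockStore.containsKey(readerView)); // false
    }
}
```

In real HDFS the reader would hold a located block from the namenode; the sketch only shows why deleting the superseded last block invalidates that view unless old blocks are retained while readers reference them.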