[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936383#comment-13936383
 ] 

Guo Ruijing commented on HDFS-6087:
-----------------------------------

A write that does not start on a block boundary will trigger a block copy in the DN:

1) It won't lead to a lot of small blocks.
2) As in most file systems, hflush/hsync/truncate may cause performance 
degradation.

If we can design a zero-copy block copy, there is little performance 
degradation.

1) A block is defined as (block data file, block length).
2) The source block is already committed to the NN and is immutable.
3) A block data file can be created or appended to, but cannot be overwritten or truncated.
4) The block length may not be equal to the block data file length.
5) To copy a block, create a hard link to the block data file if the copy length = file length.
6) Copy the block data file if the copy length < file length.

Example:

1) Block 1:  (blockfile1, 32M) blockfile1(length: 32M)
2) Copy Block 1 to Block 2 with 32M

a) Hard-link blockfile1 to blockfile2.
b) Block 2:  (blockfile2, 32M) blockfile2(length: 32M)

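3) Write a 16M buffer to Block 2. blockfile1 and blockfile2 are hard links to 
the same file, so both grow to 48M, while the length recorded for Block 1 stays 32M: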

a) Block 1:  (blockfile1, 32M) blockfile1(length: 48M)
   
b) Block 2:  (blockfile2, 48M) blockfile2(length: 48M)

4) Copy Block 2 to Block 3 with 16M

a) Copy the first 16M of blockfile2 to blockfile3 (16M < 48M, so a hard link cannot be used).

b) Block 1:  (blockfile1, 32M) blockfile1(length: 48M)
   
c) Block 2:  (blockfile2, 48M) blockfile2(length: 48M)

d) Block 3:  (blockfile3, 16M) blockfile3(length: 16M)
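
The rules and the example above can be sketched in a few lines of Python. This is only an illustration of the hard-link-or-copy decision, not DataNode code: the names Block, copy_block, append_block and run_example are hypothetical, 1024 bytes stand in for 1M, and the check that an append starts at the end of the file is my own assumption about how rule 3 would be enforced.

```python
import os
import tempfile
from dataclasses import dataclass

M = 1024  # stand-in for 1M so the example stays small


@dataclass
class Block:
    path: str    # block data file
    length: int  # logical block length; may be smaller than the file length


def copy_block(src: Block, dst_path: str, copy_length: int) -> Block:
    """Copy the first copy_length bytes of src into a new block."""
    if copy_length > src.length:
        raise ValueError("cannot copy more than the source block length")
    file_length = os.path.getsize(src.path)
    if copy_length == file_length:
        # Rule 5: lengths match, so a hard link is enough (no data is copied).
        os.link(src.path, dst_path)
    else:
        # Rule 6: only a prefix is wanted, so copy that prefix into a new file.
        with open(src.path, "rb") as fin, open(dst_path, "wb") as fout:
            remaining = copy_length
            while remaining > 0:
                chunk = fin.read(min(remaining, 64 * 1024))
                if not chunk:
                    break
                fout.write(chunk)
                remaining -= len(chunk)
    return Block(dst_path, copy_length)


def append_block(block: Block, data: bytes) -> None:
    """Append to a block whose data file ends exactly at the block length."""
    # Assumption: appending is only safe when the block covers the whole file.
    if os.path.getsize(block.path) != block.length:
        raise ValueError("block does not end at the end of its data file")
    with open(block.path, "ab") as f:
        f.write(data)
    block.length += len(data)


def run_example(directory: str):
    """Replay steps 1) to 4) of the example and return the three blocks."""
    path1 = os.path.join(directory, "blockfile1")
    with open(path1, "wb") as f:
        f.write(b"a" * (32 * M))
    block1 = Block(path1, 32 * M)
    block2 = copy_block(block1, os.path.join(directory, "blockfile2"), 32 * M)
    append_block(block2, b"b" * (16 * M))
    block3 = copy_block(block2, os.path.join(directory, "blockfile3"), 16 * M)
    return block1, block2, block3


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for b in run_example(d):
            print(os.path.basename(b.path), b.length, os.path.getsize(b.path))
```

Note that Block 1 keeps its recorded length of 32M even though its data file grows; a reader of Block 1 would have to stop at the block length rather than at the end of the file.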

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended to and an HDFS 
> block can be reopened for append. This design introduces complexity, including 
> lease recovery. If we design the HDFS block as immutable, append and truncate 
> become very simple. The idea is that an HDFS block is immutable once it is 
> committed to the namenode. If the block is not committed to the namenode, it is 
> the HDFS client's responsibility to re-add it with a new block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
