[ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934485#comment-13934485 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

I plan to remove the snapshot part, add one workflow for write/append/truncate, 
and add more workflows for exception handling in the design proposal.

The basic idea:

1) Blocks are immutable. Once a block is committed to the NN, we copy the block 
instead of appending to it, and commit the copy to the NN.

2) Before a block is committed to the NN, it is the client's responsibility to 
re-add it on failure, and other clients cannot read that block, so we don't need 
a generationStamp to recover the block.

3) After a block is committed to the NN, the file length is updated in the NN, 
so clients never see uncommitted blocks.

4) Write/append/truncate share the same logic.
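
The four ideas above can be sketched in a small in-memory model. This is an 
illustrative simulation only, not real HDFS code: names like copyBlock mirror 
the proposal, while the maps, lists, and tiny block size are invented here, 
and splitting at BLOCK_SIZE boundaries is omitted for brevity.

```java
import java.util.*;

public class ImmutableAppend {
    static final int BLOCK_SIZE = 4;                    // tiny size for illustration
    final Map<Long, byte[]> dataNode = new HashMap<>(); // stands in for DN storage
    final List<Long> committed = new ArrayList<>();     // NN's block list for the file
    long fileLength = 0;                                // NN-visible (committed) length
    long nextId = 1;

    // copyBlock(oldBlockID, newBlockID, length): copy a committed block's
    // prefix into a fresh, uncommitted block that the client may extend.
    long copyBlock(long oldId, int len) {
        long newId = nextId++;
        dataNode.put(newId, Arrays.copyOf(dataNode.get(oldId), len));
        return newId;
    }

    // Append never mutates a committed block: it copies the partial tail
    // block (if any) into a new ID, extends the copy, and commits the swap.
    void append(byte[] bytes) {
        int tail = (int) (fileLength % BLOCK_SIZE);
        int keep = committed.size();
        long workId;
        if (tail > 0) {
            keep--;                                     // old tail block will be replaced
            workId = copyBlock(committed.get(keep), tail);
        } else {
            workId = nextId++;
            dataNode.put(workId, new byte[0]);
        }
        byte[] old = dataNode.get(workId);              // uncommitted: invisible to readers
        byte[] merged = Arrays.copyOf(old, old.length + bytes.length);
        System.arraycopy(bytes, 0, merged, old.length, bytes.length);
        dataNode.put(workId, merged);
        // commitBlock: swap in the new block and publish the new file length
        while (committed.size() > keep) committed.remove(committed.size() - 1);
        committed.add(workId);
        fileLength += bytes.length;
    }

    byte[] read() {                                     // readers see committed bytes only
        byte[] out = new byte[(int) fileLength];
        int pos = 0;
        for (long id : committed) {
            byte[] b = dataNode.get(id);
            int n = Math.min(b.length, out.length - pos);
            System.arraycopy(b, 0, out, pos, n);
            pos += n;
        }
        return out;
    }
}
```

Note that after the second append the original tail block is still intact in 
DN storage; only the NN's block list changed, which is what makes generation 
stamps unnecessary in this model.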
 
1. On any failure before commit, including pipeline failure, the client retries 
with a new block ID. The design proposal thereby tries to remove the 
generationStamp.

2. An extra copyBlock(oldBlockID, newBlockID, length) operation is used for 
append and truncate.

3. commitBlock: a) the block becomes immutable, b) all blocks after the offset 
are removed to implement truncate & append, and c) the file length is updated.

4. If a block is not committed to the namenode, the file length is not updated 
and clients cannot read the block.

5. I will add more failure scenarios.
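
Steps 2 and 3 can be sketched for the truncate path as well. Again, this is a 
hypothetical in-memory model, not HDFS code: blocks after the truncate offset 
are dropped, and if the offset lands mid-block the surviving prefix is copied 
into a new block ID, so no committed block is ever modified in place.

```java
import java.util.*;

public class ImmutableTruncate {
    static final int BLOCK_SIZE = 4;                    // tiny size for illustration
    final Map<Long, byte[]> dataNode = new HashMap<>(); // stands in for DN storage
    final List<Long> committed = new ArrayList<>();     // NN's block list for the file
    long fileLength = 0;
    long nextId = 1;

    // Helper (illustration only): load a byte array as committed blocks.
    void writeCommitted(byte[] bytes) {
        for (int off = 0; off < bytes.length; off += BLOCK_SIZE) {
            long id = nextId++;
            dataNode.put(id, Arrays.copyOfRange(bytes, off,
                    Math.min(bytes.length, off + BLOCK_SIZE)));
            committed.add(id);
        }
        fileLength = bytes.length;
    }

    void truncate(long newLen) {
        int fullBlocks = (int) (newLen / BLOCK_SIZE);
        int tail = (int) (newLen % BLOCK_SIZE);
        // commitBlock b): remove all blocks after the offset ...
        while (committed.size() > fullBlocks + (tail > 0 ? 1 : 0))
            committed.remove(committed.size() - 1);
        if (tail > 0) {
            // copyBlock(oldBlockID, newBlockID, length): keep only the prefix
            long oldId = committed.remove(fullBlocks);
            long newId = nextId++;
            dataNode.put(newId, Arrays.copyOf(dataNode.get(oldId), tail));
            committed.add(newId);
        }
        fileLength = newLen;                            // ... c) update file length
    }
}
```

Truncating "abcdefghij" (blocks abcd|efgh|ij) to length 6 drops the "ij" block 
and replaces the "efgh" block with a fresh block holding "ef", while the old 
"efgh" block itself is left untouched.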

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf
>
>
> In the existing implementation, an HDFS file can be appended and an HDFS block 
> can be reopened for append. This design introduces complexity, including lease 
> recovery. If we design HDFS blocks as immutable, append & truncate become very 
> simple. The idea is that an HDFS block is immutable once the block is 
> committed to the namenode. If the block is not committed to the namenode, it 
> is the HDFS client's responsibility to re-add it with a new block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)