[ 
https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936391#comment-13936391
 ] 

Guo Ruijing commented on HDFS-6087:
-----------------------------------

issue: The last block is not available for reading.


solution 1: if the block is referenced by a client, the NN moves the block to 
the remove list only after the block is unreferenced by the client.

1) GetBlockLocations with a Reference option
2) Client copies the block to a local buffer
3) A new RPC message, UnreferenceBlocks, is sent to the NN
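
The reference-counting idea behind these three steps could be sketched as 
follows. This is a minimal illustration only: the class and method names 
(BlockReferenceTracker, reference, unreference, requestDelete) are 
hypothetical, not actual HDFS code, and real NN state would live in the 
BlockManager.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of solution 1: the NN counts client references per
// block and moves a block to the remove list only once it is unreferenced.
class BlockReferenceTracker {
    private final Map<Long, Integer> refCount = new HashMap<>();
    private final Set<Long> pendingDelete = new HashSet<>();
    private final Set<Long> removeList = new HashSet<>();

    // Step 1: GetBlockLocations served with the Reference option.
    synchronized void reference(long blockId) {
        refCount.merge(blockId, 1, Integer::sum);
    }

    // Step 3: client sends the UnreferenceBlocks RPC after copying the block.
    synchronized void unreference(long blockId) {
        int n = refCount.merge(blockId, -1, Integer::sum);
        if (n <= 0) {
            refCount.remove(blockId);
            // If deletion was requested while referenced, schedule it now.
            if (pendingDelete.remove(blockId)) {
                removeList.add(blockId);
            }
        }
    }

    // A delete request arrives for the block.
    synchronized void requestDelete(long blockId) {
        if (refCount.containsKey(blockId)) {
            pendingDelete.add(blockId); // defer until unreferenced
        } else {
            removeList.add(blockId);    // no readers: remove immediately
        }
    }

    synchronized boolean isScheduledForRemoval(long blockId) {
        return removeList.contains(blockId);
    }
}
```

With this shape, a delete that races with a reader is simply deferred: the 
block lands on the remove list at unreference time instead of delete time.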

solution 2: the block is moved to trash in the DN and its deletion is delayed.

In the existing implementation, blocks are deleted in the DN after the 
heartbeat response is returned to the DN (blocks are deleted lazily).

If a block is being read by a client when it is requested to be deleted, the 
DN should delete the block only after the read completes.

In most cases, the client can read the last block:

1) The client requests block location information.

2) The HDFS client copies blocks to a local buffer.

3) A heartbeat response requests block deletion (blocks are deleted lazily).

4) The HDFS application slowly reads data from the local buffer.

For the race condition between 2) and 3), we can delay block deletion.

Even if the block is deleted, the client can request new block information.
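
The DN-side behavior in solution 2 could be sketched like this. Again a 
minimal, hypothetical illustration (BlockTrash, the 60-second grace period, 
and the method names are assumptions, not HDFS code): a deleted block goes to 
trash, and a periodic sweep purges it only once it is old enough and no 
reader still holds it open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of solution 2: the DN trashes a deleted block and
// physically removes it only after a grace period, and never while a
// client is still reading it.
class BlockTrash {
    private static final long GRACE_MS = 60_000; // assumed delay
    private final Map<Long, Long> trash = new HashMap<>();      // blockId -> time trashed
    private final Map<Long, Integer> readers = new HashMap<>(); // blockId -> open readers

    synchronized void openForRead(long blockId) {
        readers.merge(blockId, 1, Integer::sum);
    }

    synchronized void closeReader(long blockId) {
        readers.merge(blockId, -1, Integer::sum);
    }

    // Heartbeat response asked the DN to delete this block: trash it instead.
    synchronized void moveToTrash(long blockId, long nowMs) {
        trash.put(blockId, nowMs);
    }

    // Periodic sweep: returns true when the block file may be deleted.
    synchronized boolean purge(long blockId, long nowMs) {
        Long trashedAt = trash.get(blockId);
        if (trashedAt == null) {
            return false; // not in trash (or already purged)
        }
        boolean inUse = readers.getOrDefault(blockId, 0) > 0;
        if (!inUse && nowMs - trashedAt >= GRACE_MS) {
            trash.remove(blockId);
            return true;  // old and unread: safe to delete
        }
        return false;     // still being read or within the grace period
    }
}
```

The delay covers the race between steps 2) and 3) above, and an open read 
keeps the trashed block alive past the grace period until the reader closes.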

I like solution 2.


> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended and an HDFS 
> block can be reopened for append. This design introduces complexity, 
> including lease recovery. If we design the HDFS block as immutable, append 
> & truncate become very simple. The idea is that an HDFS block is immutable 
> once the block is committed to the namenode. If the block is not committed 
> to the namenode, it is the HDFS client's responsibility to re-add it with a 
> new block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
