[jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.

Tsz Wo Nicholas Sze (JIRA) Tue, 05 Jan 2016 07:01:04 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083176#comment-15083176
 ]


Tsz Wo Nicholas Sze commented on HDFS-8999:
-------------------------------------------

> Is it strange? If we gonna do this, does it mean addBlock(..) can apply the 
> same change?

[~walter.k.su], your example is interesting.  As you mentioned, addBlock(..) 
waits for the second-last block.  With this change, close() still waits for 
the second-last block, so the two methods are the same in this sense.

Indeed, we could change addBlock(..) to wait only for the third-last block.  
However, we don't see a need for that at the moment.
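
To make the comparison concrete, here is a minimal sketch of the rule under 
discussion, assuming a simplified model in which each block of a file has one 
of three states.  FileProgressSketch, canProceed and trailingAllowed are 
made-up names for illustration; this is not the actual NameNode code.

```java
import java.util.List;

/** Simplified model of the file-progress check; not the real HDFS classes. */
final class FileProgressSketch {
    /** Hypothetical, reduced set of block states. */
    enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    /**
     * Returns true if every block except the last trailingAllowed blocks is
     * COMPLETE, and each of those trailing blocks is at least COMMITTED.
     */
    static boolean canProceed(List<State> blocks, int trailingAllowed) {
        int firstTrailing = Math.max(0, blocks.size() - trailingAllowed);
        for (int i = 0; i < blocks.size(); i++) {
            State s = blocks.get(i);
            if (i < firstTrailing) {
                // Older blocks must already have been completed.
                if (s != State.COMPLETE) {
                    return false;
                }
            } else if (s == State.UNDER_CONSTRUCTION) {
                // Trailing blocks may be COMMITTED but not still being written.
                return false;
            }
        }
        return true;
    }
}
```

In this model, trailingAllowed = 1 corresponds to waiting for the second-last 
block, and trailingAllowed = 2 corresponds to waiting only for the third-last 
block.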

> If block size is small or client writes lots of small files, we have lots of 
> committed blocks. ...

It is correct that, within a short period of time, we may have a lot of 
committed blocks.  This is the problem we are trying to solve here: within a 
short period of time, a datanode sends one accumulated block receipt instead 
of a separate block receipt for each block, in order to reduce the number of 
RPCs to the NN.
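
As an illustration of the batching idea only, the sketch below accumulates 
received block IDs and sends them as a single report.  ReceiptBatcherSketch 
and its methods are hypothetical names; the real datanode reporting code is 
more involved and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of accumulating block receipts; not the real DataNode code. */
final class ReceiptBatcherSketch {
    private final List<Long> pending = new ArrayList<>();
    private int rpcsSent = 0;

    /** Records that a block was received, without contacting the NN yet. */
    void blockReceived(long blockId) {
        pending.add(blockId);
    }

    /**
     * Sends everything accumulated so far as one report and returns the
     * block IDs it contained.  Nothing is sent if there is nothing pending.
     */
    List<Long> flush() {
        if (pending.isEmpty()) {
            return new ArrayList<>();
        }
        List<Long> report = new ArrayList<>(pending);
        pending.clear();
        rpcsSent++; // one RPC, no matter how many blocks are in the report
        return report;
    }

    int rpcsSent() {
        return rpcsSent;
    }
}
```

Until flush() is called, the NN in this model has not heard about any of the 
pending blocks, which is why many blocks can sit in the committed state for a 
short while.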

> ... And, what's the meaning of "minRepl"? Why we need "committed" and 
> "completed"? ...

Historically, the notion of minRepl existed before we introduced the notions of 
COMMITTED and COMPLETE blocks for append.  These two states are still useful 
for append after the change.
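
For readers less familiar with these states, here is a rough sketch of how 
they relate to minRepl.  It assumes a simplified model: a block is COMMITTED 
once the client has committed it, and COMPLETE once, in addition, at least 
minRepl replicas have been reported by datanodes.  BlockStateSketch and its 
methods are hypothetical names, not the HDFS implementation.

```java
/** Toy model of COMMITTED vs. COMPLETE; not the real HDFS block classes. */
final class BlockStateSketch {
    /** Hypothetical, reduced set of block states. */
    enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    private final int minRepl;
    private int reportedReplicas = 0;
    private boolean committedByClient = false;

    BlockStateSketch(int minRepl) {
        this.minRepl = minRepl;
    }

    /** The client declares that it has finished writing the block. */
    void commit() {
        committedByClient = true;
    }

    /** A datanode reports that it holds a finalized replica. */
    void replicaReported() {
        reportedReplicas++;
    }

    State state() {
        if (!committedByClient) {
            return State.UNDER_CONSTRUCTION;
        }
        // Committed, but COMPLETE only once minRepl replicas are reported.
        return reportedReplicas >= minRepl ? State.COMPLETE : State.COMMITTED;
    }
}
```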

> Namenode need not wait for {{blockReceived}} for the last block before 
> completing a file.
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8999
>                 URL: https://issues.apache.org/jira/browse/HDFS-8999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Jitendra Nath Pandey
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h8999_20151228.patch
>
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment 
> from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to 
> announce the replica is safe. Looking into the code, now we have
>    # NameNode knows the DataNodes involved when initially setting up the 
> writing pipeline
>    # If any DataNode fails during the writing, client bumps the GS and 
> finally reports all the DataNodes included in the new pipeline to NameNode 
> through the updatePipeline RPC.
>    # When the client received the ack for the last packet of the block (and 
> before the client tries to close the file on NameNode), the replica has been 
> finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, 
> the NameNode already knows the latest replicas for the block. Currently the 
> checkReplication call only counts in all the replicas that NN has already 
> received the block_received msg, but based on the above #2 and #3, it may be 
> safe to also count in all the replicas in the 
> BlockUnderConstructionFeature#replicas?
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
