[ 
https://issues.apache.org/jira/browse/HDFS-16569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532790#comment-17532790
 ] 

Xiaoqiao He commented on HDFS-16569:
------------------------------------

Thanks [~yuanbo] for your detailed explanation. 
It is an interesting proposal. IIRC, some other distributed file systems have 
implemented something similar to your proposal, such as 
Tectonic (https://www.usenix.org/conference/fast21/presentation/pan). 
I believe it would improve performance of the write flow. I am not sure whether 
there are historical reasons or security considerations that argue against 
following the above proposal.
{quote}You mean to say, since the client has already reported the block 
location, no need to wait for the datanode to report it again, right? So, the 
answer is No, that is per design, the namenode doesn't blindly trust a client, 
and there are other reasons as well. So, it requires at least one datanode to 
confirm that it has received the block with same amount of data as the client 
claims.{quote}
[~ayushtkn] Thanks Ayush for pointing out that the NameNode cannot fully trust 
the client. +1 from me; this is a basic assumption of the original HDFS design. 
Another point: it could introduce a security issue. For example, a fake client 
reporting random information to the NameNode could cause subsequent reads to 
fail and a number of missing blocks to show up on the NameNode side. I am 
concerned about whether we can trade off the initial assumptions while trying 
to find the best solution. Anyway, it is valuable to have more discussion.
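
To make the trust point concrete, here is a minimal sketch of the rule described in the quote above. It is an illustrative model only, not the actual NameNode code: the class name, method names, and the length-matching check are all assumptions made for this example. The only behavior it models is that a block stays COMMITTED until at least the minimum number of datanodes confirm it, and that locations claimed by the client alone do not count.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the COMMITTED -> COMPLETE rule; not real HDFS code. */
class BlockCompletionSketch {
    enum State { COMMITTED, COMPLETE }

    private final int minReplication;      // e.g. "minimum = 1" in the log line
    private final long committedLength;    // block length claimed by the client
    private final Set<String> confirmedNodes = new HashSet<>();

    BlockCompletionSketch(int minReplication, long committedLength) {
        this.minReplication = minReplication;
        this.committedLength = committedLength;
    }

    /** A datanode reports a received replica (for example via an IBR). */
    void onDatanodeReport(String datanodeId, long reportedLength) {
        // Only count replicas whose length matches what the client committed.
        if (reportedLength == committedLength) {
            confirmedNodes.add(datanodeId);
        }
    }

    /** Locations claimed by the client are unverified, so they are ignored. */
    void onClientClaim(Set<String> claimedNodes) {
        // Intentionally a no-op in this model.
    }

    State state() {
        return confirmedNodes.size() >= minReplication
                ? State.COMPLETE
                : State.COMMITTED;
    }
}
```

In this model, a client claim alone leaves the block in COMMITTED (numNodes = 0 < minimum = 1), and a single matching datanode report moves it to COMPLETE. The proposal in this issue would amount to letting the client claim count toward the confirmed set, which is exactly where the trust concern arises.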
Thanks [~yuanbo] and [~ayushtkn].

> Consider attaching block location info from client when closing a completed 
> file
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-16569
>                 URL: https://issues.apache.org/jira/browse/HDFS-16569
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Yuanbo Liu
>            Priority: Major
>
> When a file is finished, the client will not close it until the DNs send 
> RECEIVED_BLOCK via IBR or the client times out. We can always see this kind of 
> log in the namenode:
> {code:java}
> is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 1) in file{code}
> Since the client already has the last block's locations, it is not necessary 
> to rely on the IBR from the DN when closing the file.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
