[ 
https://issues.apache.org/jira/browse/HDFS-16261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17457543#comment-17457543
 ] 

Xiaoqiao He commented on HDFS-16261:
------------------------------------

Thanks [~bbeaudreault] for your detailed analysis and continued, solid work.
IMO, I prefer to solve this issue on the DataNode side. Of course, it has some 
pros and cons compared to handling it on the NameNode side, as you mentioned above.
a. The NameNode could postpone sending the invalidate command to the DataNode to 
avoid exceptions and retries. The delay would probably be a single configured 
value. But different blocks and access patterns take different amounts of time 
to read, so a static delay can hardly cover all cases.
b. IMO, the DataNode has all the information about which blocks are being read 
locally, and we could use this to decide whether to execute `DNA_INVALIDATE` 
immediately or postpone it. For every block to be invalidated, first check 
whether it is in the set of active BlockSenders (between receiving op_read and 
closing the stream); if yes, wait to execute the invalidation until all 
corresponding streams close.
This is just one option; comments and discussion are welcome. Thanks again.
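
To make option (b) concrete, here is a minimal sketch of the bookkeeping it 
implies. It assumes a per-block count of open read streams. The class and 
method names (DeferredInvalidator, readerOpened, readerClosed, invalidate) are 
hypothetical and are not existing DataNode APIs, and the "deleted" list only 
stands in for the real replica deletion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of deferred invalidation; not actual HDFS code. */
class DeferredInvalidator {
    // Number of currently open read streams per block ID.
    private final Map<Long, Integer> activeReaders = new HashMap<>();
    // Blocks whose invalidation arrived while readers were still open.
    private final Set<Long> pending = new HashSet<>();
    // Blocks "deleted" so far, in order (placeholder for real deletion).
    private final List<Long> deleted = new ArrayList<>();

    /** Called when a read stream for the block is opened (after op_read). */
    synchronized void readerOpened(long blockId) {
        activeReaders.merge(blockId, 1, Integer::sum);
    }

    /** Called when a read stream for the block is closed. */
    synchronized void readerClosed(long blockId) {
        // Decrement the count; drop the entry when the last reader closes.
        Integer remaining = activeReaders.computeIfPresent(
            blockId, (id, n) -> n > 1 ? n - 1 : null);
        // Run the deferred invalidation once no reader is left.
        if (remaining == null && pending.remove(blockId)) {
            deleted.add(blockId);
        }
    }

    /**
     * Handles an invalidation request for one block.
     * Returns true if the block was deleted immediately,
     * false if deletion was deferred until its readers close.
     */
    synchronized boolean invalidate(long blockId) {
        if (activeReaders.containsKey(blockId)) {
            pending.add(blockId);
            return false;
        }
        deleted.add(blockId);
        return true;
    }

    /** Snapshot of the blocks deleted so far. */
    synchronized List<Long> deletedBlocks() {
        return new ArrayList<>(deleted);
    }
}
```

With this sketch, invalidating a block that has two open streams returns false, 
and the block is only deleted when the second stream closes. A real 
implementation would probably also want an upper bound on the wait, so that a 
stuck reader cannot keep a replica around forever.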

> Configurable grace period around invalidation of replaced blocks
> ----------------------------------------------------------------
>
>                 Key: HDFS-16261
>                 URL: https://issues.apache.org/jira/browse/HDFS-16261
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a block is moved with REPLACE_BLOCK, the new location is recorded in the 
> NameNode and the NameNode instructs the old host to invalidate the block 
> using DNA_INVALIDATE. As it stands today, this invalidation is async but 
> tends to happen relatively quickly.
> I'm working on a feature for HBase which enables efficient healing of 
> locality through Balancer-style low level block moves (HBASE-26250). One 
> issue is that HBase tends to keep open long running DFSInputStreams and 
> moving blocks from under them causes lots of warns in the RegionServer and 
> increases long tail latencies due to the necessary retries in the DFSClient.
> One way I'd like to fix this is to provide a configurable grace period on 
> async invalidations. This would give the DFSClient enough time to refresh 
> block locations before hitting any errors.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
