[ https://issues.apache.org/jira/browse/HDFS-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027652#comment-16027652 ]

Konstantin Shvachko commented on HDFS-11708:
--------------------------------------------

Hey guys, I reproduced the failure of 
{{TestPread#testPreadFailureWithChangedBlockLocations}} without the change to 
{{DFSInputStream}}, but only when I run the test case individually. If I run the 
entire {{TestPread}} suite, everything passes. Could you fix this, please?

I see now that {{refreshLocatedBlock()}} uses cached locations. It seems like a 
rather fancy way to update a block variable that has gone stale. I would have 
preferred something more explicit, but oh well. Two observations:
# Should you call {{refreshLocatedBlock()}} after {{chooseDataNode()}}, not 
before? {{chooseDataNode()}} can update the block, but we would still be working 
with the old locations outside of it.
# In this case we will call {{refreshLocatedBlock()}} twice on each iteration, 
once inside {{chooseDataNode()}} and once outside. Is it possible to avoid this 
overhead? Something like updating the {{LocatedBlock block}} parameter inside 
{{chooseDataNode()}} by setting new values rather than re-assigning a new 
reference, which would make {{block}} an input/output parameter rather than 
input-only. (See the sketch after this list.)
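
To illustrate the input/output-parameter idea (stand-in types only, not the real HDFS classes): re-assigning the parameter inside the method is invisible to the caller, while mutating its fields is not.

{code:java}
// Stand-in types, not the actual DFSInputStream/LocatedBlock code --
// just to show why re-assigning the parameter does not refresh the
// caller's variable, while an in-place update does.
class Block {
  String[] locations;
  Block(String[] locations) { this.locations = locations; }
}

public class InOutParamSketch {
  // Re-assignment: the new reference is lost when the method returns.
  static void refreshByReassign(Block block) {
    block = new Block(new String[] {"DN1", "DN3"});
  }

  // In-place update: the caller's object now carries the new locations.
  static void refreshInPlace(Block block) {
    block.locations = new String[] {"DN1", "DN3"};
  }

  public static void main(String[] args) {
    Block b = new Block(new String[] {"DN1", "DN2"});
    refreshByReassign(b);
    System.out.println(String.join(",", b.locations)); // DN1,DN2 -- still stale
    refreshInPlace(b);
    System.out.println(String.join(",", b.locations)); // DN1,DN3 -- refreshed
  }
}
{code}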

> positional read will fail if replicas moved to different DNs after stream is 
> opened
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-11708
>                 URL: https://issues.apache.org/jira/browse/HDFS-11708
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.3
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>              Labels: release-blocker
>         Attachments: HDFS-11708-01.patch, HDFS-11708-02.patch, 
> HDFS-11708-03.patch, HDFS-11708-04.patch, HDFS-11708-05.patch
>
>
> Scenario:
> 1. File was written to DN1, DN2 with RF=2.
> 2. A file stream was opened for reading and kept open. Block locations are [DN1, DN2].
> 3. One of the replicas (DN2) was moved to another datanode (DN3) due to datanode 
> death/balancing/etc.
> 4. The latest block locations in the NameNode will be DN1 and DN3, in the 'same order'.
> 5. DN1 went down, but is not yet detected as dead by the NameNode.
> 6. The client starts reading using the positional read API "read(pos, buf[], offset, 
> length)".


