[ https://issues.apache.org/jira/browse/HDFS-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027652#comment-16027652 ]
Konstantin Shvachko commented on HDFS-11708:
--------------------------------------------

Hey guys, I reproduced the failure of {{TestPread#testPreadFailureWithChangedBlockLocations}} without the change to {{DFSInputStream}}, but only if I run the test case individually. If I run the entire {{TestPread}}, everything passes. Could you fix this, please?

I see now that {{refreshLocatedBlock()}} uses cached locations. It seems like a rather fancy way to update a block variable that got stale. Would've preferred something more explicit, but oh well.

Two observations:
# Should you call {{refreshLocatedBlock()}} after {{chooseDataNode()}}, not before? {{chooseDataNode()}} can update the block, but we will still work with the old locations outside.
# So in this case we will call {{refreshLocatedBlock()}} twice on each iteration, once inside {{chooseDataNode()}} and then once outside. Is it possible to avoid this overhead? Something like updating the {{LocatedBlock block}} parameter inside {{chooseDataNode()}} by setting new values rather than re-assigning a new reference, which will make {{block}} an input/output parameter rather than input-only.

> positional read will fail if replicas moved to different DNs after stream is opened
> -----------------------------------------------------------------------------------
>
> Key: HDFS-11708
> URL: https://issues.apache.org/jira/browse/HDFS-11708
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.7.3
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
> Priority: Critical
> Labels: release-blocker
> Attachments: HDFS-11708-01.patch, HDFS-11708-02.patch, HDFS-11708-03.patch, HDFS-11708-04.patch, HDFS-11708-05.patch
>
>
> Scenario:
> 1. File was written to DN1, DN2 with RF=2
> 2. File stream opened to read and kept. Block Locations are [DN1,DN2]
> 3. One of the replicas (DN2) moved to another datanode (DN3) due to datanode dead/balancing/etc.
> 4. Latest block locations in NameNode will be DN1 and DN3 in the 'same order'
> 5. DN1 went down, but not yet detected as dead in NameNode.
> 6. Client starts reading using positional read API "read(pos, buf[], offset, length)"
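
The second observation in the comment (treating {{block}} as an input/output parameter) can be illustrated with a minimal, self-contained sketch. This is not the actual {{DFSInputStream}} code: the {{Block}} class, its {{updateFrom()}} method, and both {{choose...}} methods are hypothetical stand-ins, assumed here only to show the difference between re-assigning a parameter and updating the caller's object in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InOutBlockSketch {

    /** Hypothetical stand-in for LocatedBlock: a block id plus mutable replica locations. */
    static final class Block {
        final long id;
        private List<String> locations;

        Block(long id, List<String> locations) {
            this.id = id;
            this.locations = new ArrayList<>(locations);
        }

        List<String> getLocations() {
            return locations;
        }

        /** Copies the locations of a fresher copy into this object (hypothetical helper). */
        void updateFrom(Block fresh) {
            this.locations = new ArrayList<>(fresh.locations);
        }
    }

    /** Input-only style: re-assigning the parameter changes only the local reference. */
    static String chooseByReassigning(Block block, Block fresh) {
        block = fresh; // the caller's object is untouched and stays stale
        return block.getLocations().get(0);
    }

    /** Input/output style: the caller's object receives the fresh locations. */
    static String chooseByUpdatingInPlace(Block block, Block fresh) {
        block.updateFrom(fresh);
        return block.getLocations().get(0);
    }

    public static void main(String[] args) {
        // Locations mirror the scenario: opened with [DN1, DN2], now [DN1, DN3].
        Block fresh = new Block(1L, Arrays.asList("DN1", "DN3"));

        Block a = new Block(1L, Arrays.asList("DN1", "DN2"));
        chooseByReassigning(a, fresh);
        System.out.println(a.getLocations()); // prints [DN1, DN2]

        Block b = new Block(1L, Arrays.asList("DN1", "DN2"));
        chooseByUpdatingInPlace(b, fresh);
        System.out.println(b.getLocations()); // prints [DN1, DN3]
    }
}
```

With the in-place variant the caller sees the refreshed locations after a single call, so a second refresh outside would not be needed in this toy model. Whether that carries over to the real client code depends on how {{LocatedBlock}} instances are shared and cached, which this sketch does not model.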