[ 
https://issues.apache.org/jira/browse/HDFS-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760201#comment-15760201
 ] 

Uma Maheswara Rao G commented on HDFS-11248:
--------------------------------------------

Thank you [~rakeshr] for finding this issue.

I am still thinking about this: when we find sources/targets for only a 
portion of the blocks, how about recording that information in 
storageMovementsMonitor when adding the item? 
For example, we could change Map<Long, Long> storageMovementAttemptedItems --> 
Map<Long, ItemInfo> storageMovementAttemptedItems.
Here ItemInfo can contain timestamp and isAllBlocksCoveredToSatisfy (boolean).
If isAllBlocksCoveredToSatisfy is false, it means we did not send all blocks 
for movement. So, when processing this item, we can consider giving it another 
try.
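
A minimal sketch of what I have in mind, only to illustrate the idea. The 
class name AttemptedItemsSketch and the methods add and 
needsRetryOnCompletion are hypothetical and not existing SPS code. Only the 
map name, ItemInfo, timestamp and isAllBlocksCoveredToSatisfy come from the 
proposal above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the proposal; not the actual SPS monitor class. */
class AttemptedItemsSketch {

  /** Per-item tracking info replacing the bare Long timestamp. */
  static final class ItemInfo {
    // Time at which the movement for this item was handed off.
    final long timestamp;
    // false means only a portion of the blocks were sent for movement.
    final boolean isAllBlocksCoveredToSatisfy;

    ItemInfo(long timestamp, boolean isAllBlocksCoveredToSatisfy) {
      this.timestamp = timestamp;
      this.isAllBlocksCoveredToSatisfy = isAllBlocksCoveredToSatisfy;
    }
  }

  // trackId -> ItemInfo (previously trackId -> timestamp).
  private final Map<Long, ItemInfo> storageMovementAttemptedItems =
      new HashMap<>();

  /** Records an attempted item (hypothetical method name). */
  void add(long trackId, long now, boolean allBlocksCovered) {
    storageMovementAttemptedItems.put(
        trackId, new ItemInfo(now, allBlocksCovered));
  }

  /**
   * Called when processing a finished item (hypothetical method name).
   * Removes the item and returns true if it should get another try
   * because not all of its blocks were sent for movement.
   */
  boolean needsRetryOnCompletion(long trackId) {
    ItemInfo info = storageMovementAttemptedItems.remove(trackId);
    return info != null && !info.isAllBlocksCoveredToSatisfy;
  }
}
```

With something like this, the retry decision stays on the NN side and no 
extra flag has to travel to the DN.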

Can we think along these lines? I am a bit concerned about communicating 
these details to the DN just for this retry reason. 


> [SPS]: Handle partial block location movements
> ----------------------------------------------
>
>                 Key: HDFS-11248
>                 URL: https://issues.apache.org/jira/browse/HDFS-11248
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-11248-HDFS-10285-00.patch
>
>
> This jira is to handle partial block location movements due to unavailability 
> of target nodes for the matching storage type. 
> For example, say only A(disk,archive), B(disk) and C(disk,archive) are live 
> nodes, with A & C having the archive storage type. Say we have a block with 
> locations {{A(disk), B(disk), C(disk)}}. Again assume the user changed the 
> storage policy to COLD. Now, SPS internally starts preparing the src-target 
> pairing like {{src=> (A, B, C) and target=> (A, C)}} and sends 
> BLOCK_STORAGE_MOVEMENT to the coordinator. SPS skips B because it doesn't 
> have archive media, which implicitly indicates that it should retry after 
> some time to satisfy all block locations. On receiving the movement command, 
> the coordinator pairs the src-target nodes to schedule the actual physical 
> movements like {{movetask=> (A, A), (B, C)}}. Here ideally it should do 
> {{(C, C)}} instead of {{(B, C)}}, but it mistakenly pairs source B with 
> target C, which creates a problem.
> IMHO, the implicit assumption that a retry is needed is creating confusion 
> and leads to coding mistakes. One idea to fix this problem is to create a new 
> {{retryNeeded}} flag to make it more readable. With this, SPS will prepare 
> only the matching pairs, so dummy source slots are avoided, like 
> {{src=> (A, C) and target=> (A, C)}}, and mark {{retryNeeded=true}} to convey 
> that this {{trackId}} has only partial block movements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
