[ https://issues.apache.org/jira/browse/HDFS-13165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441057#comment-16441057 ]

Daryn Sharp edited comment on HDFS-13165 at 4/17/18 3:47 PM:
-------------------------------------------------------------

I've asked a few times why a new command (i.e. the newest name in this patch is 
{{DNA_MOVEBLOCK}}) is required to move blocks across storages instead of using 
the existing {{DNA_TRANSFER}}.  I've heard mention of delHint issues, but I'm 
unclear why that's a problem.

The delHint is needed to avoid split-brain fights between the NN and the 
balancer.  Both the old and new locations may equally satisfy the block 
placement policy, so the delHint ensures the NN won't delete the new replica 
created by the balancer.  For a storage movement, the placement policy takes 
the storage policy into account while computing invalidations, so the new 
location won't be invalidated.
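
To make the delHint mechanics concrete, here is a minimal sketch (not the actual 
{{BPServiceActor}} code) of the IBR entry a receiving datanode builds after a 
balancer copy, using the real {{ReceivedDeletedBlockInfo}} type; the block 
numbers and UUID below are made up.

{code:java}
// Minimal sketch: building the IBR entry that carries a delHint after a
// balancer copy.  Requires hadoop-hdfs on the classpath; values are made up.
import org.apache.hadoop.hdfs.protocol.Block;
import org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo;
import org.apache.hadoop.hdfs.server.protocol.ReceivedDeletedBlockInfo.BlockStatus;

public class DelHintSketch {
  public static void main(String[] args) {
    // The replica that was just copied onto this datanode.
    Block moved = new Block(1073741825L, 134217728L, 1001L);

    // delHint = UUID of the source datanode whose replica the NN should
    // invalidate, so the freshly written copy isn't the one deleted.
    String sourceDnUuid = "5c3b1c9e-0000-0000-0000-000000000001";

    ReceivedDeletedBlockInfo rdbi =
        new ReceivedDeletedBlockInfo(moved, BlockStatus.RECEIVED_BLOCK, sourceDnUuid);

    // The real code queues this on the incremental block report for the NN;
    // here we just print it.
    System.out.println("IBR entry with delHint: " + rdbi);
  }
}
{code}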

I'm struggling to understand why any datanode changes are needed to move blocks 
between storages when the DN already has all the necessary functionality.  
Sure, {{DNA_TRANSFER}} could be optimized to short-circuit for local moves, a la 
{{DNA_REPLACE}}, by using {{FsDatasetSpi#moveBlockAcrossStorage}}, but that's just 
an optimization.  Using the existing commands also avoids version 
incompatibilities.
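
For illustration only, a hedged sketch of what that short-circuit could look 
like on the DN side.  The {{LocalDataset}} and {{BlockSender}} interfaces below 
are made-up stand-ins, not the real DataNode classes; the actual local move 
would go through {{FsDatasetSpi#moveBlockAcrossStorage}}, whose exact signature 
varies by Hadoop version.

{code:java}
import java.io.IOException;

// Hedged sketch: short-circuiting a transfer command whose target is the local
// node into a local cross-storage move, instead of adding a new DN command.
public class TransferShortCircuitSketch {

  /** Made-up stand-in for the DN's dataset layer (real code: FsDatasetSpi). */
  interface LocalDataset {
    void moveBlockAcrossStorage(long blockId, String targetStorageType) throws IOException;
  }

  /** Made-up stand-in for streaming a block to another datanode. */
  interface BlockSender {
    void transferTo(long blockId, String targetDnUuid, String targetStorageType)
        throws IOException;
  }

  private final String localDnUuid;
  private final LocalDataset dataset;
  private final BlockSender sender;

  TransferShortCircuitSketch(String localDnUuid, LocalDataset dataset, BlockSender sender) {
    this.localDnUuid = localDnUuid;
    this.dataset = dataset;
    this.sender = sender;
  }

  /** Handle a transfer: same node but different storage type => local move. */
  void handleTransfer(long blockId, String targetDnUuid, String targetStorageType)
      throws IOException {
    if (localDnUuid.equals(targetDnUuid)) {
      // Local target: no network copy needed, just relink the replica onto the
      // requested storage (what FsDatasetSpi#moveBlockAcrossStorage does).
      dataset.moveBlockAcrossStorage(blockId, targetStorageType);
    } else {
      // Remote target: fall back to the normal transfer path.
      sender.transferTo(blockId, targetDnUuid, targetStorageType);
    }
  }
}
{code}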

What am I missing?

 



> [SPS]: Collects successfully moved block details via IBR
> --------------------------------------------------------
>
>                 Key: HDFS-13165
>                 URL: https://issues.apache.org/jira/browse/HDFS-13165
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>            Priority: Major
>         Attachments: HDFS-13165-HDFS-10285-00.patch, 
> HDFS-13165-HDFS-10285-01.patch, HDFS-13165-HDFS-10285-02.patch, 
> HDFS-13165-HDFS-10285-03.patch, HDFS-13165-HDFS-10285-04.patch, 
> HDFS-13165-HDFS-10285-05.patch, HDFS-13165-HDFS-10285-06.patch, 
> HDFS-13165-HDFS-10285-07.patch, HDFS-13165-HDFS-10285-08.patch, 
> HDFS-13166-HDFS-10285-07.patch
>
>
> This task is to make use of the existing IBR to get moved block details and to 
> remove the unwanted future-tracking logic that exists in the 
> BlockStorageMovementTracker code; it is no longer needed since file-level 
> tracking is maintained at the NN itself.
> Following comments taken from HDFS-10285, 
> [here|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16347472&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16347472]
> Comment-3)
> {quote}BPServiceActor
> Is it actually sending back the moved blocks? Aren’t IBRs sufficient?{quote}
> Comment-21)
> {quote}
> BlockStorageMovementTracker
> Many data structures are riddled with non-threadsafe race conditions and risk 
> of CMEs.
> Ex. The moverTaskFutures map. Adding new blocks and/or adding to a block's 
> list of futures is synchronized. However the run loop does an unsynchronized 
> block get, unsynchronized future remove, unsynchronized isEmpty, possibly 
> another unsynchronized get, only then does it do a synchronized remove of the 
> block. The whole chunk of code should be synchronized.
> Is the problematic moverTaskFutures even needed? It's aggregating futures 
> per-block for seemingly no reason. Why track all the futures at all instead 
> of just relying on the completion service? As best I can tell, it's used:
> # Only to determine if a future from the completion service should be 
> ignored during shutdown. Shutdown sets the running boolean to false and 
> clears the entire datastructure, so why not use the running boolean as a 
> check just a little further down?
> # As synchronization to sleep up to 2 seconds before performing a blocking 
> moverCompletionService.take, but only when it thinks there are no active 
> futures. I'll ignore the missed notify race that the bounded wait masks, but 
> the real question is why not just do the blocking take?
> Why all the complexity? Am I missing something?
> BlocksMovementsStatusHandler
> Suffers the same type of thread-safety issues as StoragePolicySatisfyWorker. Ex. 
> blockIdVsMovementStatus is inconsistently synchronized. It does synchronize to 
> return an unmodifiable list, which sadly does nothing to protect the caller 
> from CME.
> handle is iterating over a non-thread-safe list.
> {quote}
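
A hedged sketch of the simpler pattern Comment-21 argues for: submit moves to a 
completion service and let one loop do the blocking {{take()}}, guarded by a 
{{running}} flag, with no shared per-block futures map.  The class and result 
types are illustrative, not the actual {{BlockStorageMovementTracker}} code.

{code:java}
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hedged sketch only: rely on the blocking CompletionService.take() plus a
// volatile 'running' flag instead of a shared, partially synchronized
// per-block futures map.  Names are illustrative, not the real SPS code.
public class MoveTrackerSketch implements Runnable {

  /** Illustrative per-move result; the real code reports block move status. */
  static final class MoveResult {
    final long blockId;
    final boolean success;
    MoveResult(long blockId, boolean success) {
      this.blockId = blockId;
      this.success = success;
    }
  }

  private final ExecutorService movers = Executors.newFixedThreadPool(4);
  private final CompletionService<MoveResult> completion =
      new ExecutorCompletionService<>(movers);
  private volatile boolean running = true;

  /** Submit one block-move task; no extra future bookkeeping is kept. */
  void submitMove(long blockId) {
    completion.submit(() -> {
      // ... perform the actual cross-storage move here ...
      return new MoveResult(blockId, true);
    });
  }

  @Override
  public void run() {
    while (running) {
      try {
        // Blocking take(): no bounded wait, no unsynchronized map reads.
        Future<MoveResult> done = completion.take();
        MoveResult result = done.get();
        if (!running) {
          return; // shutting down: drop the result instead of reporting it
        }
        // ... hand 'result' to the status handler / IBR path ...
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      } catch (ExecutionException e) {
        // An individual move failed; log and keep consuming completions.
      }
    }
  }

  void shutdown() {
    running = false;
    movers.shutdownNow();
  }
}
{code}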


