[ 
https://issues.apache.org/jira/browse/HDFS-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006581#comment-17006581
 ] 

Xiaoqiao He commented on HDFS-15068:
------------------------------------

v005 LGTM. I ran the failed unit tests locally and most of them pass; the 
exception is #TestRedudantBlocks, whose failure does not seem related to this 
change. Ping [~elgoiri], [~iwasakims], [~weichiu]: would you like to take 
another look if you have time?

> DataNode could meet deadlock if invoke refreshVolumes when register
> -------------------------------------------------------------------
>
>                 Key: HDFS-15068
>                 URL: https://issues.apache.org/jira/browse/HDFS-15068
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Xiaoqiao He
>            Assignee: Aiphago
>            Priority: Critical
>         Attachments: HDFS-15068.001.patch, HDFS-15068.002.patch, 
> HDFS-15068.003.patch, HDFS-15068.004.patch, HDFS-15068.005.patch
>
>
> DataNode can deadlock when `dfsadmin -reconfig datanode host:ipc_port start` 
> is invoked to trigger #refreshVolumes while the DataNode is registering.
> 1. DataNode#refreshVolumes acquires the DataNode instance ownable 
> {{synchronizer}} on entering the method, then tries to acquire the 
> BPOfferService {{readlock}} when it calls `bpos.getNamespaceInfo()` in the 
> following code segment.
> {code:java}
>         for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) {
>           nsInfos.add(bpos.getNamespaceInfo());
>         }
> {code}
> 2. BPOfferService#registrationSucceeded (invoked by #register when the 
> DataNode starts, or by #reregister during processCommandFromActor) acquires 
> the BPOfferService {{writelock}} first, then tries to acquire the DataNode 
> instance ownable {{synchronizer}} in the following method, i.e. the same two 
> locks in the opposite order.
> {code:java}
>   synchronized void bpRegistrationSucceeded(DatanodeRegistration bpRegistration,
>       String blockPoolId) throws IOException {
>     id = bpRegistration;
>     if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) {
>       throw new IOException("Inconsistent Datanode IDs. Name-node returned "
>           + bpRegistration.getDatanodeUuid()
>           + ". Expecting " + storage.getDatanodeUuid());
>     }
>     
>     registerBlockPoolWithSecretManager(bpRegistration, blockPoolId);
>   }
> {code}
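
The lock-order inversion described in steps 1 and 2 can be reproduced in 
isolation. The following is only a minimal sketch, not the DataNode code: the 
class name, thread names, and the two lock objects are hypothetical stand-ins 
for the DataNode instance monitor and the BPOfferService read/write lock. It 
assumes a JVM that supports monitoring of ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockOrderInversionSketch {
  static final String REFRESH = "refreshVolumes-sketch";
  static final String REGISTER = "registrationSucceeded-sketch";

  /** Returns true if both sketch threads are reported as deadlocked in time. */
  static boolean reproduce(long timeoutMillis) {
    // Hypothetical stand-ins for the two locks named in the description.
    final Object dataNodeMonitor = new Object();
    final ReentrantReadWriteLock bposLock = new ReentrantReadWriteLock();
    // Ensures each thread holds its first lock before trying the second.
    final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

    // Path 1: instance monitor first, then the read lock.
    Thread refresh = new Thread(() -> {
      synchronized (dataNodeMonitor) {
        bothHoldFirstLock.countDown();
        awaitQuietly(bothHoldFirstLock);
        bposLock.readLock().lock();   // blocks while the writer holds the lock
        bposLock.readLock().unlock();
      }
    }, REFRESH);

    // Path 2: write lock first, then the instance monitor.
    Thread register = new Thread(() -> {
      bposLock.writeLock().lock();
      try {
        bothHoldFirstLock.countDown();
        awaitQuietly(bothHoldFirstLock);
        synchronized (dataNodeMonitor) {
          // never reached: the monitor is held by the other thread
        }
      } finally {
        bposLock.writeLock().unlock();
      }
    }, REGISTER);

    // Daemon threads so the stuck pair does not keep the JVM alive.
    refresh.setDaemon(true);
    register.setDaemon(true);
    refresh.start();
    register.start();

    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      // Covers cycles through both monitors and ownable synchronizers.
      long[] ids = mx.findDeadlockedThreads();
      if (ids != null) {
        int found = 0;
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
          if (info != null && (REFRESH.equals(info.getThreadName())
              || REGISTER.equals(info.getThreadName()))) {
            found++;
          }
        }
        if (found == 2) {
          return true;
        }
      }
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  private static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("deadlock observed: " + reproduce(5000));
  }
}
```

The latch only makes the interleaving deterministic; in the DataNode the same 
cycle would depend on the reconfig request and the registration overlapping in 
time.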



