[ 
https://issues.apache.org/jira/browse/HDFS-4466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-4466:
-----------------------------

    Description: 
In HDFS-3374, new synchronization was added to 
AbstractDelegationTokenSecretManager.ExpiredTokenRemover to make sure the 
ExpiredTokenRemover thread can be interrupted in time. Without it, 
TestDelegation fails intermittently because the MiniDFSCluster thread could be 
shut down before the tokenRemover thread. 
However, as Todd pointed out in HDFS-3374, that patch introduced a potential 
deadlock:
{quote}
   * FSNamesystem.saveNamespace (holding FSN lock) calls 
DTSM.saveSecretManagerState (which takes DTSM lock)
   * ExpiredTokenRemover.run (holding DTSM lock) calls rollMasterKey calls 
updateCurrentKey calls logUpdateMasterKey which takes FSN lock
So if there is a concurrent saveNamespace at the same time as the expired token 
remover runs, it might make the NN deadlock.
{quote}

This JIRA tracks the removal of the possible deadlock from 
AbstractDelegationTokenSecretManager. 
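
The lock-order inversion quoted above can be reproduced in isolation. The following is only a minimal sketch and not the attached patch: the class LockOrderSketch, its two ReentrantLock fields, and its method names are hypothetical stand-ins for the FSN lock, the DTSM lock, and the two call paths. It uses a timed tryLock so the demonstration reports the stuck threads instead of hanging, and then shows one possible way to break the cycle, assuming the key roll and the edit-log write do not have to happen under the same lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins for the FSN lock and the DTSM lock.
    private final ReentrantLock fsnLock = new ReentrantLock();
    private final ReentrantLock dtsmLock = new ReentrantLock();

    /**
     * Takes 'first', then tries 'second' with a timeout while the other thread
     * does the same in the opposite order. Returns true if 'second' was NOT obtained.
     */
    private static boolean blocked(ReentrantLock first, ReentrantLock second,
                                   CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();   // force the problematic interleaving
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();       // keep 'first' held until both attempts are over
            return !got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return true;
        } finally {
            first.unlock();
        }
    }

    /** Inverted order: returns how many of the two threads could not get their second lock. */
    public int runInvertedOrder() {
        CountDownLatch held = new CountDownLatch(2);
        CountDownLatch tried = new CountDownLatch(2);
        AtomicInteger stuck = new AtomicInteger();
        // Models saveNamespace: FSN lock first, then DTSM lock.
        Thread save = new Thread(() -> {
            if (blocked(fsnLock, dtsmLock, held, tried)) stuck.incrementAndGet();
        });
        // Models ExpiredTokenRemover.run: DTSM lock first, then FSN lock.
        Thread remover = new Thread(() -> {
            if (blocked(dtsmLock, fsnLock, held, tried)) stuck.incrementAndGet();
        });
        save.start();
        remover.start();
        join(save);
        join(remover);
        return stuck.get();
    }

    /** One possible way out: the remover drops the DTSM lock before it needs the FSN lock. */
    public boolean runWithDtsmReleasedFirst() {
        Thread save = new Thread(() -> {
            fsnLock.lock();
            try {
                dtsmLock.lock();
                try { /* save secret manager state */ } finally { dtsmLock.unlock(); }
            } finally {
                fsnLock.unlock();
            }
        });
        Thread remover = new Thread(() -> {
            dtsmLock.lock();
            try { /* roll the master key in memory */ } finally { dtsmLock.unlock(); }
            fsnLock.lock();
            try { /* log the new key */ } finally { fsnLock.unlock(); }
        });
        save.start();
        remover.start();
        join(save);
        join(remover);
        return !save.isAlive() && !remover.isAlive();
    }

    private static void join(Thread t) {
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        LockOrderSketch s = new LockOrderSketch();
        System.out.println("stuck with inverted order: " + s.runInvertedOrder());
        System.out.println("completed with DTSM released first: " + s.runWithDtsmReleasedFirst());
    }
}
```

With plain lock() calls in place of the timed tryLock, the first scenario would hang both threads, which is the NN deadlock described in the quote. In the second scenario no thread waits for one lock while holding the other in the opposite order, so no cycle can form.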

  was:Per discussion in HDFS-3374, this JIRA is to track the change of removing 
the possible deadlock from AbstractDelegationTokenSecretManager. 

    
> Remove the deadlock from AbstractDelegationTokenSecretManager
> -------------------------------------------------------------
>
>                 Key: HDFS-4466
>                 URL: https://issues.apache.org/jira/browse/HDFS-4466
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode, security
>    Affects Versions: 1.2.0
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>         Attachments: HDFS-4466.branch-1.patch
