[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583299#comment-13583299
 ] 

Suresh Srinivas commented on HDFS-4222:
---------------------------------------

bq. Perhaps init-ed in the same places where getPermissionChecker is being 
invoked, or ideally at a higher level to avoid all command methods from having 
"to do the right".
The only problem is that if subsequent methods are not passed the 
FSPermissionChecker, they might (due to a bug) end up calling 
getPermissionChecker themselves, and inside the lock at that. With parameter 
passing, the likelihood of that is low.

bq. "lockGroups" would internally fetch the groups and then make them immutable 
in the UGI....
We should certainly explore this in a subsequent jira.
                
> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to 
> use LDAP and LDAP has issues
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4222
>                 URL: https://issues.apache.org/jira/browse/HDFS-4222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.0.0, 0.23.3, 2.0.0-alpha
>            Reporter: Xiaobo Peng
>            Assignee: Xiaobo Peng
>            Priority: Minor
>         Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, 
> HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to look up directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP issues 
> (including LDAP access issues caused by networking problems) while holding the 
> single FSNamesystem lock. That leaves the NN unresponsive and causes it to 
> lose heartbeats from DNs.
> FSNamesystem calls access LDAP when they instantiate FSPermissionChecker. 
> That instantiation can be moved out of the lock scope, since it does not need 
> the FSNamesystem lock. After the move, a DFS client hang will no longer 
> affect other threads by hogging the single lock. This is especially helpful 
> when separate RPC servers are used for ClientProtocol and DatanodeProtocol, 
> since DatanodeProtocol calls do not need to access LDAP. So even if DFS 
> clients hang due to LDAP issues, the NN will still be able to process 
> requests (including heartbeats) from DNs.
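
The change described above can be illustrated with a minimal, self-contained 
Java sketch. It is not the attached patch: the PermissionChecker and 
Namesystem classes, the getFileInfo/getFileInfoInt methods, and the 
Supplier-based group lookup are all hypothetical stand-ins for 
FSPermissionChecker, FSNamesystem, and the LDAP-backed group mapping. The 
sketch only shows the ordering: build the checker (the potentially slow 
lookup) before taking the lock, then pass it as a parameter to the code that 
runs under the lock.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

class LockScopeSketch {

    /** Hypothetical stand-in for FSPermissionChecker. */
    static final class PermissionChecker {
        final String user;
        final List<String> groups;

        PermissionChecker(String user, Supplier<List<String>> groupLookup) {
            this.user = user;
            // Potentially slow call (e.g. an LDAP query). It runs in the
            // constructor, so the constructor must not be called under the lock.
            this.groups = groupLookup.get();
        }

        boolean inGroup(String requiredGroup) {
            return groups.contains(requiredGroup);
        }
    }

    /** Hypothetical stand-in for FSNamesystem with its single global lock. */
    static final class Namesystem {
        final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
        final Supplier<List<String>> groupLookup;

        Namesystem(Supplier<List<String>> groupLookup) {
            this.groupLookup = groupLookup;
        }

        /** Number of read holds the current thread has on the lock. */
        int readHoldCount() {
            return fsLock.getReadHoldCount();
        }

        boolean getFileInfo(String user, String requiredGroup) {
            // Step 1: build the checker BEFORE acquiring the lock, so a hung
            // group lookup blocks only this caller.
            PermissionChecker pc = new PermissionChecker(user, groupLookup);

            // Step 2: take the lock only for the in-memory work.
            fsLock.readLock().lock();
            try {
                return getFileInfoInt(pc, requiredGroup);
            } finally {
                fsLock.readLock().unlock();
            }
        }

        // The checker arrives as a parameter, so this method has no reason
        // to create one while the lock is held.
        boolean getFileInfoInt(PermissionChecker pc, String requiredGroup) {
            return pc.inGroup(requiredGroup);
        }
    }

    public static void main(String[] args) {
        Namesystem[] ref = new Namesystem[1];
        ref[0] = new Namesystem(() -> {
            // Report whether the lock is held while the "LDAP" lookup runs.
            System.out.println("read holds during lookup: " + ref[0].readHoldCount());
            return Arrays.asList("hadoop", "users");
        });
        System.out.println("access granted: " + ref[0].getFileInfo("alice", "hadoop"));
    }
}
```

In this sketch the lookup reports zero read holds, because the checker is 
constructed before the lock is acquired. Passing the checker into 
getFileInfoInt as a parameter mirrors the point made in the comment above: 
methods that receive a checker are unlikely to create another one inside the 
lock.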

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
