[ https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581637#comment-13581637 ]
Xiaobo Peng commented on HDFS-4222:
-----------------------------------

Suresh,

Thanks a lot. I should finish what you suggested within 2 days.

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4222
>                 URL: https://issues.apache.org/jira/browse/HDFS-4222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.23.3
>            Reporter: Xiaobo Peng
>            Assignee: Xiaobo Peng
>            Priority: Minor
>         Attachments: hdfs-4222-branch-0.23.3.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to look up directory information through LDAP,
> FSNamesystem calls made on behalf of DFS clients might hang because of LDAP
> issues (including LDAP access problems caused by networking issues) while
> holding the single FSNamesystem lock. This leaves the NN unresponsive and
> causes it to lose heartbeats from DNs.
>
> FSNamesystem calls access LDAP when they instantiate FSPermissionChecker.
> The instantiation does not need the FSNamesystem lock, so it can be moved out
> of the lock scope. After the move, a hung DFS client call no longer affects
> other threads by hogging the single lock. This is especially helpful when
> separate RPC servers are used for ClientProtocol and DatanodeProtocol, since
> DatanodeProtocol calls do not need to access LDAP. So even if DFS client calls
> hang due to LDAP issues, the NN can still process requests (including
> heartbeats) from DNs.
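
The lock-scope change described in the issue can be illustrated with a toy model. This is only a sketch and not the attached patch: `LockScopeSketch`, `GroupLookup`, `PermissionChecker`, `Namesystem`, and every method name below are hypothetical stand-ins. A single `ReentrantLock` stands in for the FSNamesystem lock, whose real form differs between Hadoop versions, and a latch simulates an LDAP lookup that hangs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockScopeSketch {
    /** Hypothetical stand-in for an LDAP-backed group lookup that may block. */
    interface GroupLookup {
        String[] groupsOf(String user) throws InterruptedException;
    }

    /** Hypothetical stand-in for FSPermissionChecker: its constructor resolves groups. */
    static final class PermissionChecker {
        final String user;
        final String[] groups;

        PermissionChecker(String user, GroupLookup lookup) throws InterruptedException {
            this.user = user;
            this.groups = lookup.groupsOf(user); // may hang if LDAP is unhealthy
        }
    }

    /** Toy namesystem guarded by one lock (an assumption made for this sketch). */
    static final class Namesystem {
        private final ReentrantLock fsLock = new ReentrantLock();
        private final GroupLookup lookup;
        private int heartbeats;

        Namesystem(GroupLookup lookup) { this.lookup = lookup; }

        /** Before: the checker is built while the lock is held. */
        void clientCallLookupInsideLock(String user) throws InterruptedException {
            fsLock.lock();
            try {
                PermissionChecker pc = new PermissionChecker(user, lookup);
                doGuardedWork(pc);
            } finally {
                fsLock.unlock();
            }
        }

        /** After: the checker is built first, so a slow lookup holds no lock. */
        void clientCallLookupOutsideLock(String user) throws InterruptedException {
            PermissionChecker pc = new PermissionChecker(user, lookup);
            fsLock.lock();
            try {
                doGuardedWork(pc);
            } finally {
                fsLock.unlock();
            }
        }

        private void doGuardedWork(PermissionChecker pc) {
            // Permission check and namespace operation would go here.
        }

        /** DataNode heartbeat: needs the lock but never touches the group lookup. */
        boolean handleHeartbeat(long timeoutMs) throws InterruptedException {
            if (!fsLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // lock is hogged by a stuck client call
            }
            try {
                heartbeats++;
                return true;
            } finally {
                fsLock.unlock();
            }
        }
    }

    /** Runs one client call whose lookup hangs, then tries a heartbeat meanwhile. */
    static boolean heartbeatSucceedsDuringSlowLookup(boolean lookupInsideLock) {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Namesystem ns = new Namesystem(user -> {
            entered.countDown(); // signal that the simulated LDAP call has started
            release.await();     // simulate LDAP hanging until released
            return new String[] {"users"};
        });
        Thread client = new Thread(() -> {
            try {
                if (lookupInsideLock) ns.clientCallLookupInsideLock("alice");
                else ns.clientCallLookupOutsideLock("alice");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        client.setDaemon(true);
        client.start();
        try {
            entered.await();
            boolean ok = ns.handleHeartbeat(200);
            release.countDown();
            client.join();
            return ok;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lookup inside lock:  heartbeat ok = " + heartbeatSucceedsDuringSlowLookup(true));
        System.out.println("lookup outside lock: heartbeat ok = " + heartbeatSucceedsDuringSlowLookup(false));
    }
}
```

In this model the heartbeat times out when the lookup runs inside the lock and goes through when the lookup runs before the lock is taken, which is the behavior the description argues for. The client call itself still hangs in both variants; only its effect on other threads changes.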