[ 
https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502521#comment-13502521
 ] 

Xiaobo Peng commented on HDFS-4222:
-----------------------------------

Sorry, the previous comment was not formatted well. Here it is again with proper formatting.

The following code snippets show a simple way to change FSNamesystem::renameTo 
in branch-0.23.4. Changes to other methods are similar.

//////////// existing code
{code:borderStyle=solid}
  /** Rename src to dst */
  void renameTo(String src, String dst, Options.Rename... options)
      throws IOException, UnresolvedLinkException {
...
    writeLock();
    try {
      renameToInternal(src, dst, options);
      if (auditLog.isInfoEnabled() && isExternalInvocation()) {
        resultingStat = dir.getFileInfo(dst, false); 
      }
    } finally {
      writeUnlock();
    }
...
  }


  private void renameToInternal(String src, String dst,
      Options.Rename... options) throws IOException {
...
    if (isPermissionEnabled) {
      checkParentAccess(src, FsAction.WRITE);
      checkAncestorAccess(dst, FsAction.WRITE);
    }
...
  }


  private FSPermissionChecker checkParentAccess(String path, FsAction access
      ) throws AccessControlException, UnresolvedLinkException {
    return checkPermission(path, false, null, access, null, null);
  }


  private FSPermissionChecker checkPermission(String path, boolean doCheckOwner,
      FsAction ancestorAccess, FsAction parentAccess, FsAction access,
      FsAction subAccess) throws AccessControlException,
      UnresolvedLinkException {
    FSPermissionChecker pc = new FSPermissionChecker(
        fsOwner.getShortUserName(), supergroup);
    if (!pc.isSuper) {
      dir.waitForReady();
      readLock();
      try {
        pc.checkPermission(path, dir.rootDir, doCheckOwner,
            ancestorAccess, parentAccess, access, subAccess);
      } finally {
        readUnlock();
      } 
    }
    return pc;
  }
{code}

//////////// proposed changes
{code:borderStyle=solid}
  /** Rename src to dst */
  void renameTo(String src, String dst, Options.Rename... options)
      throws IOException, UnresolvedLinkException {
...
    FSPermissionChecker pc = new FSPermissionChecker(
        fsOwner.getShortUserName(), supergroup);

    writeLock();
    try {
      renameToInternal(pc, src, dst, options);
      if (auditLog.isInfoEnabled() && isExternalInvocation()) {
        resultingStat = dir.getFileInfo(dst, false); 
      }
    } finally {
      writeUnlock();
    }
...
  }


  private void renameToInternal(FSPermissionChecker pc, String src, String dst,
      Options.Rename... options) throws IOException {
...
    if (isPermissionEnabled) {
      checkParentAccess(pc, src, FsAction.WRITE);
      checkAncestorAccess(pc, dst, FsAction.WRITE);
    }
...
  }


  private FSPermissionChecker checkParentAccess(FSPermissionChecker pc,
      String path, FsAction access
      ) throws AccessControlException, UnresolvedLinkException {
    return checkPermission(pc, path, false, null, access, null, null);
  }


  private FSPermissionChecker checkPermission(FSPermissionChecker pc,
      String path, boolean doCheckOwner,
      FsAction ancestorAccess, FsAction parentAccess, FsAction access,
      FsAction subAccess) throws AccessControlException,
      UnresolvedLinkException {
    if (!pc.isSuper) {
      dir.waitForReady();
      readLock();
      try {
        pc.checkPermission(path, dir.rootDir, doCheckOwner,
            ancestorAccess, parentAccess, access, subAccess);
      } finally {
        readUnlock();
      } 
    }
    return pc;
  }
{code}
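
The proposed change above boils down to one pattern: do the potentially slow lookup before taking the namesystem lock, then pass the result in. Below is a minimal, self-contained sketch of that pattern. It is not HDFS code: `LockScopeSketch`, `Checker`, and the `Predicate`-based lookup are hypothetical stand-ins for FSNamesystem, FSPermissionChecker, and the LDAP-backed user/group resolution, and the `lockHeldDuringLookup` flag exists only to make the behavior observable.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

/** Hypothetical stand-in for FSNamesystem; not the real HDFS class. */
class LockScopeSketch {

    /** Hypothetical stand-in for FSPermissionChecker. */
    static final class Checker {
        final String user;
        final boolean isSuper;

        Checker(String user, Predicate<String> superUserLookup) {
            this.user = user;
            // Models the potentially slow, LDAP-backed resolution.
            this.isSuper = superUserLookup.test(user);
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Predicate<String> superUserLookup;
    private String lastRename = "";
    private boolean lockHeldDuringLookup = false;

    LockScopeSketch(Predicate<String> lookup) {
        // Wrap the lookup so we can record whether it ever runs
        // while the current thread holds the write lock.
        this.superUserLookup = user -> {
            if (lock.isWriteLockedByCurrentThread()) {
                lockHeldDuringLookup = true;
            }
            return lookup.test(user);
        };
    }

    /** Builds the checker first, then takes the write lock. */
    void renameTo(String user, String src, String dst) {
        Checker pc = new Checker(user, superUserLookup); // outside the lock
        lock.writeLock().lock();
        try {
            renameToInternal(pc, src, dst);
        } finally {
            lock.writeLock().unlock();
        }
    }

    private void renameToInternal(Checker pc, String src, String dst) {
        if (!pc.isSuper) {
            // A real permission check would go here; it would use only
            // in-memory state, so it is safe to run under the lock.
        }
        lastRename = src + " -> " + dst;
    }

    String lastRename() {
        return lastRename;
    }

    boolean lockHeldDuringLookup() {
        return lockHeldDuringLookup;
    }
}
```

With this shape, a lookup that blocks only stalls the calling thread; other threads can still acquire the lock. Calling `new LockScopeSketch(u -> "hdfs".equals(u)).renameTo("alice", "/a", "/b")` performs the lookup before the write lock is taken, so `lockHeldDuringLookup()` stays false.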
                
> NN is unresponsive and loses heartbeats of DNs when Hadoop is configured to use 
> LDAP and LDAP has issues
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4222
>                 URL: https://issues.apache.org/jira/browse/HDFS-4222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.3
>            Reporter: Xiaobo Peng
>            Assignee: Xiaobo Peng
>            Priority: Minor
>
> For Hadoop clusters configured to look up directory information via LDAP, 
> FSNamesystem calls made on behalf of DFS clients might hang due to LDAP issues 
> (including LDAP access issues caused by networking problems) while holding the 
> single FSNamesystem lock. That leaves the NN unresponsive and causes it to 
> lose heartbeats from DNs.
> FSNamesystem calls access LDAP when they instantiate FSPermissionChecker. 
> The instantiation could be moved out of the lock scope since it does not 
> need the FSNamesystem lock. After the move, a hung DFS client call will no 
> longer block other threads by holding the single lock. This is especially 
> helpful when we use separate RPC servers for ClientProtocol and 
> DatanodeProtocol, since the calls for DatanodeProtocol do not need to access 
> LDAP. So even if DFS client calls hang due to LDAP issues, the NN will still 
> be able to process the requests (including heartbeats) from DNs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
