[ 
https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689755#comment-16689755
 ] 

Wangda Tan commented on YARN-9030:
----------------------------------

[~suma.shivaprasad], it seems the logic of verifyAndCreateRemoteDir is wrong, 
here's my local edit of the method:

{code} 
  /**
   * Verify and create the remote log directory.
   */
  public void verifyAndCreateRemoteLogDir() {
    // Checking the existence of the TLD
    FileSystem remoteFS;
    try {
      remoteFS = getFileSystem(conf);
    } catch (IOException e) {
      throw new YarnRuntimeException("Unable to get Remote FileSystem instance",
          e);
    }
    boolean remoteExists = true;
    Path remoteRootLogDir = getRemoteRootLogDir();
    try {
      FsPermission perms = remoteFS.getFileStatus(remoteRootLogDir)
          .getPermission();
      if (!perms.equals(TLDIR_PERMISSIONS)) {
        LOG.warn("Remote Root Log Dir [" + remoteRootLogDir
            + "] already exist, but with incorrect permissions. "
            + "Expected: [" + TLDIR_PERMISSIONS + "], Found: [" + perms + "]."
            + " The cluster may have problems with multiple users.");
      }
    } catch (FileNotFoundException e) {
      remoteExists = false;
    } catch (IOException e) {
      throw new YarnRuntimeException(
          "Failed to check permissions for dir [" + remoteRootLogDir + "]", e);
    }

    try {
      Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(),
          remoteFS.getWorkingDirectory());

      if (!remoteExists) {
        LOG.warn("Remote Root Log Dir [" + remoteRootLogDir
            + "] does not exist. Attempting to create it.");
      }

      remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS));

      // Not possible to query FileSystem API to check if it supports
      // chmod, chown etc. Hence resorting to catching exceptions here.
      // Remove when FS APi is ready
      try {
        remoteFS.setPermission(qualified, new FsPermission(TLDIR_PERMISSIONS));
      } catch (UnsupportedOperationException use) {
        LOG.info("Unable to set permissions for configured filesystem since"
            + " it does not support this", remoteFS.getScheme());
        fsSupportsChmod = false;
      }

      UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
      String primaryGroupName = null;
      try {
        primaryGroupName = loginUser.getPrimaryGroupName();
      } catch (IOException e) {
        LOG.warn("No primary group found. The remote root log directory"
            + " will be created with the HDFS superuser being its group "
            + "owner. JobHistoryServer may be unable to read the directory.");
      }
      // set owner on the remote directory only if the primary group exists
      if (primaryGroupName != null) {
        try {
          remoteFS.setOwner(qualified, loginUser.getShortUserName(),
              primaryGroupName);
        } catch (UnsupportedOperationException use) {
          LOG.info("File System does not support setting user/group" + remoteFS
              .getScheme(), use);
        }
      }
    } catch (IOException e) {
      throw new YarnRuntimeException(
          "Failed to create remoteLogDir [" + remoteRootLogDir + "]", e);
    }
  }
{code}

Several issues of previosu code: 
1)
{code}
          Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(),
          remoteFS.getWorkingDirectory());
{code}
Need to be called in anycase (you're right, but should place it under try).

2) 
{code}
       remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS));
{code} 
Need to be called in anycase. (Not only when remoteExists == false).

3) Removed duplicated else block at the end.

4) Removed unused parameters. 

Does this make sense to you?

> Log aggregation changes to handle filesystems which do not support permissions
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9030
>                 URL: https://issues.apache.org/jira/browse/YARN-9030
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Suma Shivaprasad
>            Assignee: Suma Shivaprasad
>            Priority: Major
>         Attachments: YARN-9030.1.patch
>
>
> Some cloud storages like ADLS do not support permissions in which case they 
> throw an UnsupportedOperationException. Log aggregation should hanlde these 
> case and not set permissions for log aggregation base dir/ sub dirs 
> {noformat}
> 2018-11-12 15:37:28,726 WARN  logaggregation.LogAggregationService 
> (LogAggregationService.java:initApp(209)) - Application failed to init 
> aggregation
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to check 
> permissions for dir [abfs://testc...@test.blob.core.windows.net/app-logs]
>         at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.verifyAndCreateRemoteLogDir(LogAggregationFileController.java:277)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:238)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:204)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:347)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to