[ https://issues.apache.org/jira/browse/YARN-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689755#comment-16689755 ]
Wangda Tan commented on YARN-9030: ---------------------------------- [~suma.shivaprasad], it seems the logic of verifyAndCreateRemoteDir is wrong, here's my local edit of the method: {code} /** * Verify and create the remote log directory. */ public void verifyAndCreateRemoteLogDir() { // Checking the existence of the TLD FileSystem remoteFS; try { remoteFS = getFileSystem(conf); } catch (IOException e) { throw new YarnRuntimeException("Unable to get Remote FileSystem instance", e); } boolean remoteExists = true; Path remoteRootLogDir = getRemoteRootLogDir(); try { FsPermission perms = remoteFS.getFileStatus(remoteRootLogDir) .getPermission(); if (!perms.equals(TLDIR_PERMISSIONS)) { LOG.warn("Remote Root Log Dir [" + remoteRootLogDir + "] already exist, but with incorrect permissions. " + "Expected: [" + TLDIR_PERMISSIONS + "], Found: [" + perms + "]." + " The cluster may have problems with multiple users."); } } catch (FileNotFoundException e) { remoteExists = false; } catch (IOException e) { throw new YarnRuntimeException( "Failed to check permissions for dir [" + remoteRootLogDir + "]", e); } try { Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(), remoteFS.getWorkingDirectory()); if (!remoteExists) { LOG.warn("Remote Root Log Dir [" + remoteRootLogDir + "] does not exist. Attempting to create it."); } remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS)); // Not possible to query FileSystem API to check if it supports // chmod, chown etc. Hence resorting to catching exceptions here. // Remove when FS APi is ready try { remoteFS.setPermission(qualified, new FsPermission(TLDIR_PERMISSIONS)); } catch (UnsupportedOperationException use) { LOG.info("Unable to set permissions for configured filesystem since" + " it does not support this", remoteFS.getScheme()); fsSupportsChmod = false; } UserGroupInformation loginUser = UserGroupInformation.getLoginUser(); String primaryGroupName = null; try { primaryGroupName = loginUser.getPrimaryGroupName(); } catch (IOException e) { LOG.warn("No primary group found. The remote root log directory" + " will be created with the HDFS superuser being its group " + "owner. JobHistoryServer may be unable to read the directory."); } // set owner on the remote directory only if the primary group exists if (primaryGroupName != null) { try { remoteFS.setOwner(qualified, loginUser.getShortUserName(), primaryGroupName); } catch (UnsupportedOperationException use) { LOG.info("File System does not support setting user/group" + remoteFS .getScheme(), use); } } } catch (IOException e) { throw new YarnRuntimeException( "Failed to create remoteLogDir [" + remoteRootLogDir + "]", e); } } {code} Several issues of previosu code: 1) {code} Path qualified = remoteRootLogDir.makeQualified(remoteFS.getUri(), remoteFS.getWorkingDirectory()); {code} Need to be called in anycase (you're right, but should place it under try). 2) {code} remoteFS.mkdirs(qualified, new FsPermission(TLDIR_PERMISSIONS)); {code} Need to be called in anycase. (Not only when remoteExists == false). 3) Removed duplicated else block at the end. 4) Removed unused parameters. Does this make sense to you? > Log aggregation changes to handle filesystems which do not support permissions > ------------------------------------------------------------------------------ > > Key: YARN-9030 > URL: https://issues.apache.org/jira/browse/YARN-9030 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Suma Shivaprasad > Assignee: Suma Shivaprasad > Priority: Major > Attachments: YARN-9030.1.patch > > > Some cloud storages like ADLS do not support permissions in which case they > throw an UnsupportedOperationException. Log aggregation should hanlde these > case and not set permissions for log aggregation base dir/ sub dirs > {noformat} > 2018-11-12 15:37:28,726 WARN logaggregation.LogAggregationService > (LogAggregationService.java:initApp(209)) - Application failed to init > aggregation > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to check > permissions for dir [abfs://testc...@test.blob.core.windows.net/app-logs] > at > org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.verifyAndCreateRemoteLogDir(LogAggregationFileController.java:277) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:238) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:204) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:347) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:69) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org