[ https://issues.apache.org/jira/browse/YARN-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351059#comment-16351059 ]
Jason Lowe commented on YARN-7879:
----------------------------------

bq. We are allowing file cache to be mounted in docker container as read only in YARN-7815.

If we are mounting a file cache directory into a container, then I assume the user running in the Docker container should have the right to read every file under that file cache directory. I do not see the security concern there if that's the case, but maybe I'm missing a key scenario that would be problematic?

bq. The risk of exposing filename is marginally small, but I'd like to confirm that it is not a problem even if the filename contains sensitive information exposed in docker containers.

The only way I can see this being an issue specific to Docker is if something untrusted runs within the Docker container as a different user (but still in the hadoop group, or its equivalent for the Docker container) and pokes around for the filename. That thing would have to probe for filenames, since there is no read access on the filecache top-level directory, only group-execute permissions. However, I would argue that if the user is running untrusted things within the Docker container, it is simply much easier to access the sensitive files _as the user_. Then there would be access to the file's contents in addition to the filename.

bq. Can cache directory contain subdirectories to prevent this arrangement from working?

Yes, if the cache directory manager is being used there can be subdirectories to limit the total number of entries in a single directory. In those cases the intermediate directories are set up with similar 0755 permissions so the NM user can access them easily; see ContainerLocalizer#createParentDirs and the sketch at the end of this comment.

This patch restores the usercache permissions behavior from before YARN-2185 went in. YARN-2185 wasn't about addressing directory permissions, but it carried a sidecar permission change that broke the NM's ability to reuse non-public localized resources. Therefore I'd like to see this go in so we aren't regressing functionality, and if there are concerns or improvements for how usercache permissions are handled, we should address those in a separate JIRA. Either that, or we revert YARN-2185, remove the unrelated permissions change, recommit it, and still end up addressing any usercache permissions concerns in a separate JIRA. ;-)
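For illustration, here is a minimal sketch of the permission scheme described above using plain java.nio.file. It is not the actual ContainerLocalizer#createParentDirs code, and the paths and class name are hypothetical:

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical illustration of the filecache permission layout, not YARN code.
public class FilecachePermsSketch {
  public static void main(String[] args) throws IOException {
    // Top-level filecache dir: full owner access, group-execute only (0710).
    // A group member (e.g. the NM user in the hadoop group) can traverse to a
    // path it already knows, but cannot list the directory to discover names.
    Path filecache = Files.createDirectories(
        Paths.get("/tmp/usercache/hadoopuser/appcache/app_0001/filecache"));
    Set<PosixFilePermission> topLevel =
        PosixFilePermissions.fromString("rwx--x---");  // 0710
    Files.setPosixFilePermissions(filecache, topLevel);

    // Intermediate dirs created by the cache directory manager: 0755, so the
    // NM user can reach localized resources beneath them without owning them.
    Path intermediate = Files.createDirectory(filecache.resolve("0"));
    Files.setPosixFilePermissions(intermediate,
        PosixFilePermissions.fromString("rwxr-xr-x"));  // 0755
  }
}
{code}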

> NM user is unable to access the application filecache due to permissions
> -------------------------------------------------------------------------
>
>                 Key: YARN-7879
>                 URL: https://issues.apache.org/jira/browse/YARN-7879
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Shane Kumpf
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-7879.001.patch
>
>
> I noticed the following log entries where localization was being retried on several MR AM files.
> {code}
> 2018-02-02 02:53:02,905 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Resource /hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/11/job.jar is missing, localizing it again
> 2018-02-02 02:53:42,908 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Resource /hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/13/job.xml is missing, localizing it again
> {code}
> The cluster is configured to use LCE and {{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is set to a user ({{hadoopuser}}) that is in the {{hadoop}} group. The user has a umask of {{0002}}. The cluster is configured with {{fs.permissions.umask-mode=022}}, coming from {{core-default}}. Setting the local-user to {{nobody}}, who is not a login user or in the {{hadoop}} group, produces the same results.
> {code}
> [hadoopuser@y7001 ~]$ umask
> 0002
> [hadoopuser@y7001 ~]$ id
> uid=1003(hadoopuser) gid=1004(hadoopuser) groups=1004(hadoopuser),1001(hadoop)
> {code}
> The cause of the log entry was tracked down to a simple {{!file.exists()}} call in {{LocalResourcesTrackerImpl#isResourcePresent}}.
> {code}
>   public boolean isResourcePresent(LocalizedResource rsrc) {
>     boolean ret = true;
>     if (rsrc.getState() == ResourceState.LOCALIZED) {
>       File file = new File(rsrc.getLocalPath().toUri().getRawPath().
>           toString());
>       if (!file.exists()) {
>         ret = false;
>       } else if (dirsHandler != null) {
>         ret = checkLocalResource(rsrc);
>       }
>     }
>     return ret;
>   }
> {code}
> The Resources Tracker runs as the NM user, in this case {{yarn}}. The files being retried are in the filecache. The directories in the filecache are all owned by the local-user with the local-user's primary group and have 700 perms, which makes them unreadable by the {{yarn}} user.
> {code}
> [root@y7001 ~]# ls -la /hadoop-yarn/usercache/hadoopuser/appcache/application_1517540536531_0001/filecache
> total 0
> drwx--x---. 6 hadoopuser hadoop     46 Feb  2 03:06 .
> drwxr-s---. 4 hadoopuser hadoop     73 Feb  2 03:07 ..
> drwx------. 2 hadoopuser hadoopuser 61 Feb  2 03:05 10
> drwx------. 3 hadoopuser hadoopuser 21 Feb  2 03:05 11
> drwx------. 2 hadoopuser hadoopuser 45 Feb  2 03:06 12
> drwx------. 2 hadoopuser hadoopuser 41 Feb  2 03:06 13
> {code}
> I saw YARN-5287, but that appears to be related to a restrictive umask and the usercache itself. I was unable to locate any other known issues that seemed relevant. Is the above already known? A configuration issue?
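As an aside on the {{isResourcePresent}} snippet quoted above: a bare {{file.exists()}} check cannot distinguish a genuinely missing resource from one hidden behind an ancestor directory the caller cannot traverse, which is exactly the NM-as-{{yarn}} case described here. The following is an illustrative sketch of that distinction only, not the YARN-7879 patch; the class and method names are hypothetical:

{code}
import java.io.File;

// Illustrative sketch: tell "resource genuinely missing" apart from "resource
// unreadable because an ancestor directory is not traversable by this user".
public class ResourcePresenceSketch {
  static boolean looksMissing(File file) {
    if (file.exists()) {
      return false;
    }
    // Walk up to the deepest ancestor that is visible to this user.
    File ancestor = file.getParentFile();
    while (ancestor != null && !ancestor.exists()) {
      ancestor = ancestor.getParentFile();
    }
    if (ancestor == null) {
      return true;  // nothing on the path exists at all
    }
    // If that ancestor is not traversable (no execute bit for us, e.g. the
    // 0700 filecache subdirectories above), exists() on the child proves
    // nothing: the resource may be present but inaccessible to the NM user.
    return ancestor.canExecute();
  }
}
{code}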