[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011879#comment-15011879 ]
Jason Lowe commented on MAPREDUCE-6550:
---------------------------------------

Thanks for the patch, Robert!

I'm a bit worried about the working directory having wide-open permissions. Literally anyone on the cluster can go in and start rearranging the contents of that directory, since they have write permission to it. I think we need to at least set the sticky bit on it so it behaves like /tmp: users can create their own stuff, but they can't fiddle with other users' stuff.

Do we really want to support only the proxy user setup? I'm wondering whether there's a scenario where admins want to aggregate the logs to save on the namespace but don't want to allow the proxy behaviors. As long as they're OK with fetching the logs via the log server and not via direct HDFS access by the user, it would still work despite the logs being owned by the YARN user.

> archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6550
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6550
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>   Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-6550.001.patch
>
>
> The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell
> app. When using the DefaultContainerExecutor, this means that the job
> will actually run as the Yarn user, so the resulting har files are owned by
> the Yarn user instead of the original owner. The permissions are also now
> world-readable.
> In the below example, the archived logs are owned by 'yarn' instead of 'paul'
> and are now world-readable:
> {noformat}
> [root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
> ...
> drwxrwx--- - paul hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
> drwxr-xr-x - yarn hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
> -rw-r--r-- 3 yarn hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
> -rw-r--r-- 3 yarn hadoop 1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
> -rw-r--r-- 3 yarn hadoop 24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
> -rw-r--r-- 3 yarn hadoop 8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
> drwxrwx--- - paul hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
> -rw-r----- 3 paul hadoop 1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
> -rw-r----- 3 paul hadoop 4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
> ...
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
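
For readers unfamiliar with the sticky-bit suggestion in the comment above, here is a minimal sketch of what mode 1777 means, using a local temporary directory rather than HDFS. It is not taken from the patch; the function names make_sticky_world_writable and describe are hypothetical and exist only for this illustration. On HDFS, the comparable setting is an octal mode with a leading 1, for example hdfs dfs -chmod 1777 on the directory in question.

```python
import os
import stat
import tempfile


def make_sticky_world_writable(path):
    """Set mode 1777 on path: rwx for everyone, plus the sticky bit.

    Hypothetical helper for illustration only; it acts on the local
    filesystem, not on HDFS.
    """
    os.chmod(path, 0o1777)
    # S_IMODE keeps the permission bits, including setuid/setgid/sticky.
    return stat.S_IMODE(os.stat(path).st_mode)


def describe(mode):
    """Report whether a mode is world-writable and whether it is sticky."""
    return {
        "world_writable": bool(mode & stat.S_IWOTH),
        "sticky": bool(mode & stat.S_ISVTX),
    }


if __name__ == "__main__":
    # Stand-in for the tool's working directory (assumed, not the real path).
    workdir = tempfile.mkdtemp(prefix="archive-logs-demo-")
    try:
        mode = make_sticky_world_writable(workdir)
        print(oct(mode), describe(mode))
    finally:
        os.rmdir(workdir)
```

With the sticky bit set, any user can still create entries in the directory, but only an entry's owner, the directory's owner, or the superuser can delete or rename that entry, which is the /tmp-like behavior the comment asks for.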