[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
           Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2!

> archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6550
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6550
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>             Fix For: 2.8.0
>
>         Attachments: MAPREDUCE-6550.001.patch, MAPREDUCE-6550.002.patch, MAPREDUCE-6550.003.patch, MAPREDUCE-6550.004.patch, MAPREDUCE-6550.005.patch
>
>
> The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell app. When using the DefaultContainerExecutor, this means that the job will actually run as the Yarn user, so the resulting har files are owned by the Yarn user instead of the original owner. The permissions are also now world-readable.
> In the below example, the archived logs are owned by 'yarn' instead of 'paul' and are now world-readable:
> {noformat}
> [root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
> ...
> drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
> drwxr-xr-x   - yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
> -rw-r--r--   3 yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
> -rw-r--r--   3 yarn hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
> -rw-r--r--   3 yarn hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
> -rw-r--r--   3 yarn hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
> drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
> -rw-r-----   3 paul hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
> -rw-r-----   3 paul hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
> ...
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
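The symptom in the listing above can be checked mechanically. The following is a minimal sketch, not part of the patch: it parses `hdfs dfs -ls -R` style lines and flags entries whose owner differs from the user directory they sit under, or whose mode grants read access to "other". The helper names (`parse_listing`, `expected_owner`, `problems`) are hypothetical, and the `/tmp/logs/<user>/...` layout is assumed from the example listing only.

```python
import re
from typing import Iterator, List, NamedTuple, Optional


class Entry(NamedTuple):
    mode: str   # e.g. "drwxr-xr-x"
    owner: str
    group: str
    path: str


# Columns as printed by `hdfs dfs -ls`: mode, replication, owner, group,
# size, date, time, path. A trailing "+" on the mode marks an ACL.
LINE_RE = re.compile(
    r"^(?P<mode>[-dl][-rwxsStT]{9})\+?\s+\S+\s+(?P<owner>\S+)\s+(?P<group>\S+)"
    r"\s+\d+\s+\S+\s+\S+\s+(?P<path>\S+)$"
)


def parse_listing(text: str) -> Iterator[Entry]:
    """Yield one Entry per well-formed listing line; skip everything else."""
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            yield Entry(match["mode"], match["owner"], match["group"], match["path"])


def expected_owner(path: str, root: str = "/tmp/logs") -> Optional[str]:
    """Assumed layout: the first path component under `root` is the user."""
    prefix = root.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    return path[len(prefix):].split("/", 1)[0] or None


def problems(entry: Entry) -> List[str]:
    """Describe ownership and world-readability problems for one entry."""
    found = []
    owner = expected_owner(entry.path)
    if owner is not None and entry.owner != owner:
        found.append(f"owner is {entry.owner!r}, expected {owner!r}")
    if entry.mode[7] == "r":  # index 7 is the "other" read bit
        found.append("world-readable")
    return found


SAMPLE = """\
drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r--r--   3 yarn hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
"""

flagged = [(e.path, problems(e)) for e in parse_listing(SAMPLE) if problems(e)]
for path, issues in flagged:
    print(path, "->", "; ".join(issues))
```

Only the `.har` directory and its `part-0` file are reported here; the application directory owned by 'paul' with mode `drwxrwx---` passes both checks.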
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Attachment: MAPREDUCE-6550.005.patch

The 005 patch increases some of the test timeouts, including the one that failed (due to a timeout).
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Attachment: MAPREDUCE-6550.004.patch

The 004 patch fixes the conf checkstyle thing.
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Attachment: MAPREDUCE-6550.003.patch

That's also a good idea, and should be more efficient too. The 003 patch sets the umask for the 'hadoop archive' MR job to 027 so we get 640 files and 750 dirs.
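The arithmetic behind "umask 027 gives 640 files and 750 dirs" can be worked through in a few lines. This is an illustrative sketch only: it assumes the usual convention that files are requested with mode 666 and directories with 777 before the umask is applied, and `apply_umask` is a hypothetical helper, not a Hadoop API.

```python
import stat


def apply_umask(requested_mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask & 0o777


UMASK = 0o027  # value named in the 003 patch comment

# Assumed defaults before masking: 666 for files, 777 for directories.
file_mode = apply_umask(0o666, UMASK)
dir_mode = apply_umask(0o777, UMASK)

print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))
print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))
```

The umask removes group write and all access for "other", so the owner and group can still read the archive but it is no longer world-readable.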
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Attachment: MAPREDUCE-6550.002.patch

Thanks for taking a look, Jason. Those sound like good ideas. The 002 patch fixes checkstyle warnings, sets the sticky bit on the working dir, and adds a {{-noProxy}} option.
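For readers unfamiliar with the sticky bit mentioned for the working dir: it is the extra mode bit (octal 1000) that, on a directory writable by several users, restricts deleting or renaming an entry to that entry's owner, the directory owner, or the superuser. The sketch below only shows how the bit appears in a mode value and in an `ls`-style string. The mode 1777 is an assumption for illustration; the message does not say which mode the patch actually sets.

```python
import stat


def describe_dir(mode: int) -> str:
    """Render a directory mode the way `ls` would print it."""
    return stat.filemode(stat.S_IFDIR | mode)


# Hypothetical working-dir mode: world-writable plus the sticky bit.
WORKING_DIR_MODE = 0o1777

print(describe_dir(0o777))             # no sticky bit
print(describe_dir(WORKING_DIR_MODE))  # trailing "t" marks the sticky bit
print(bool(WORKING_DIR_MODE & stat.S_ISVTX))
```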
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-6550:
---------------------------------------
    Description: 
The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell app. When using the DefaultContainerExecutor, this means that the job will actually run as the Yarn user, so the resulting har files are owned by the Yarn user instead of the original owner. The permissions are also now world-readable.

In the below example, the archived logs are owned by 'yarn' instead of 'paul' and are now world-readable:
{noformat}
[root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
...
drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r--r--   3 yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
-rw-r--r--   3 yarn hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
-rw-r--r--   3 yarn hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
-rw-r--r--   3 yarn hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
-rw-r-----   3 paul hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
-rw-r-----   3 paul hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
...
{noformat}

  was:
The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell app. When using the DistributedContainerExecutor, this means that the job will actually run as the Yarn user, so the resulting har files are owned by the Yarn user instead of the original owner. The permissions are also now world-readable.

In the below example, the archived logs are owned by 'yarn' instead of 'paul' and are now world-readable:
{noformat}
[root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
...
drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r--r--   3 yarn hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
-rw-r--r--   3 yarn hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
-rw-r--r--   3 yarn hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
-rw-r--r--   3 yarn hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
drwxrwx---   - paul hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
-rw-r-----   3 paul hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
-rw-r-----   3 paul hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
...
{noformat}
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Attachment: MAPREDUCE-6550.001.patch

The patch fixes the user problem by using UGI to proxy as the correct user. It fixes the permissions problem by setting the correct permissions on the files. Other than those changes, the bulk of the changes in the patch are due to moving some things around and indenting.

I've updated the unit tests to check for the permissions and also verified in a cluster that it behaves correctly.
[jira] [Updated] (MAPREDUCE-6550) archive-logs tool changes log ownership to the Yarn user when using DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated MAPREDUCE-6550:
------------------------------------
    Status: Patch Available  (was: Open)