[ https://issues.apache.org/jira/browse/HDDS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Soumitra Sulav updated HDDS-584:
--------------------------------
    Description: 
YARN jobs fail on OzoneFS with the following exception:
{code:java}
java.io.IOException: The ownership on the staging directory /tmp/hadoop-yarn/staging/hdfs/.staging is not as expected. It is owned by . The directory must be owned by the submitter hdfs or hdfs
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:152)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:113)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:151)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
{code}
The example job was run with the following command, both as user root and as user hdfs:
{code:java}
hadoop jar /usr/hdp/3.0.0.0-1634/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /hosts /tmp/hosts
{code}
The YARN/MR job checks the ownership of the user's staging directory and, if 
the owner does not match the user submitting the job, throws the above 
exception.

The ownership check happens in the following file:
[https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmissionFiles.java#L144]
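
The check described above can be sketched roughly as follows. This is a simplified, dependency-free paraphrase for illustration, not the actual Hadoop source: the class name `StagingOwnershipCheck`, the method `verifyOwner`, and its parameters are hypothetical, and the real method works on Hadoop `FileStatus` and `UserGroupInformation` objects rather than plain strings. The error message is modeled on the exception text in this report.

```java
import java.io.IOException;

/**
 * Simplified paraphrase of the staging-directory ownership check.
 * Hypothetical names; not the Hadoop implementation.
 */
final class StagingOwnershipCheck {

  /** Throws if fileOwner matches neither form of the submitter's user name. */
  static void verifyOwner(String stagingDir, String fileOwner,
                          String shortUserName, String fullUserName)
      throws IOException {
    boolean ownedBySubmitter = fileOwner.equals(shortUserName)
        || fileOwner.equalsIgnoreCase(fullUserName);
    if (!ownedBySubmitter) {
      throw new IOException("The ownership on the staging directory "
          + stagingDir + " is not as expected. It is owned by " + fileOwner
          + ". The directory must be owned by the submitter "
          + shortUserName + " or " + fullUserName);
    }
  }

  public static void main(String[] args) {
    String dir = "/tmp/hadoop-yarn/staging/hdfs/.staging";
    try {
      // An empty owner string, as in this report, can never match the
      // submitter, so the check fails even for the user who created the
      // directory.
      verifyOwner(dir, "", "hdfs", "hdfs");
      System.out.println("ownership OK");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With an empty owner, the sketch produces a message with the same "It is owned by ." gap seen in the stack trace, which suggests the failure comes from the file system returning no owner rather than from a wrong owner.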

On HDDS/OzoneFS, the staging area is created as expected, but with no owner:
{code:java}
[root@hcatest-4 ~]# hdfs dfs -ls -R /tmp
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/ozone-0.3.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/10/08 10:11:11 INFO conf.Configuration: Removed undeclared tags:
drwxrwxrwx - 0 2018-10-04 11:20 /tmp/entity-file-history
drwxrwxrwx - 0 2018-10-04 11:20 /tmp/entity-file-history/active
drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn
drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging
drwxrwxrwx - 0 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs
drwxrwxrwx - 0 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging
drwxrwxrwx - 0 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002
-rw-rw-rw- 1 316239 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.jar
-rw-rw-rw- 1 104 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.split
-rw-rw-rw- 1 23 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.splitmetainfo
-rw-rw-rw- 1 213088 2018-10-05 11:56 /tmp/hadoop-yarn/staging/hdfs/.staging/job_1538654387547_0002/job.xml
drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root
drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging
drwxrwxrwx - 0 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001
-rw-rw-rw- 1 316239 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.jar
-rw-rw-rw- 1 104 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.split
-rw-rw-rw- 1 23 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.splitmetainfo
-rw-rw-rw- 1 213679 2018-10-05 08:55 /tmp/hadoop-yarn/staging/root/.staging/job_1538654387547_0001/job.xml
{code}


> OzoneFS with HDP failing to run YARN jobs
> -----------------------------------------
>
>                 Key: HDDS-584
>                 URL: https://issues.apache.org/jira/browse/HDDS-584
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Filesystem
>    Affects Versions: 0.3.0
>         Environment: OS - RHEL7.3
> Openstack based VMs : 3 Node HDP, 3 Node Ozone
>            Reporter: Soumitra Sulav
>            Priority: Major
>


