[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061772#comment-14061772 ]

Hadoop QA commented on MAPREDUCE-5957:
--

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12655644/MAPREDUCE-5957.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesAttempts
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesTasks
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServices

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4735//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4735//console

This message is automatically generated.
AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
---

Key: MAPREDUCE-5957
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch

With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified.

{noformat}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
    ... 10 more
{noformat}

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061782#comment-14061782 ]

Sangjin Lee commented on MAPREDUCE-5957:

Likely to be an existing issue? Kicking it off one more time.
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061875#comment-14061875 ]

Hadoop QA commented on MAPREDUCE-5957:
--

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12655644/MAPREDUCE-5957.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient:

        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesAttempts
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobs
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesTasks
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
        org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServices

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4736//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4736//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5790) Default map hprof profile options do not work
[ https://issues.apache.org/jira/browse/MAPREDUCE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061931#comment-14061931 ]

Hudson commented on MAPREDUCE-5790:
---

FAILURE: Integrated in Hadoop-Yarn-trunk #613 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/613/])
MAPREDUCE-5790. Made it easier to enable hprof profile options by default. Contributed by Gera Shegalov. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1610578)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestJobConf.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobsWithProfiler.java

Default map hprof profile options do not work
-

Key: MAPREDUCE-5790
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5790
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.3.0
Environment: java version 1.6.0_31
    Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
    Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
Reporter: Andrew Wang
Assignee: Gera Shegalov
Priority: Blocker
Fix For: 2.5.0
Attachments: MAPREDUCE-5790.v01.patch, MAPREDUCE-5790.v02.patch

I have an MR job doing the following:

{code}
Job job = Job.getInstance(conf);
// Enable profiling
job.setProfileEnabled(true);
// The task range is passed as a String, e.g. "0" for task 0 only
job.setProfileTaskRange(true, "0");
job.setProfileTaskRange(false, "0");
{code}

When I run this job, some of my map tasks fail with this error message:

{noformat}
org.apache.hadoop.util.Shell$ExitCodeException: /data/5/yarn/nm/usercache/hdfs/appcache/application_1394482121761_0012/container_1394482121761_0012_01_41/launch_container.sh: line 32: $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx825955249 -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA ${mapreduce.task.profile.params} org.apache.hadoop.mapred.YarnChild 10.20.212.12 43135 attempt_1394482121761_0012_r_00_0 41 1>/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41/stdout 2>/var/log/hadoop-yarn/container/application_1394482121761_0012/container_1394482121761_0012_01_41/stderr : bad substitution
{noformat}

It looks like ${mapreduce.task.profile.params} is not being substituted correctly.

--
This message was sent by Atlassian JIRA (v6.2#6252)
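Editorial sketch for the report above: the shell error is consistent with a placeholder that was never expanded before the launch command was written out, since bash does not accept dots in a parameter name and rejects `${mapreduce.task.profile.params}` as a bad substitution. The Java below is only an illustration of that failure mode. The class `PlaceholderSketch`, its `expand` method, and the hprof option value are hypothetical and are not Hadoop's actual expansion code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified placeholder expander (not Hadoop's implementation).
final class PlaceholderSketch {
    // Matches ${name} where name contains no closing brace.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with props.get(name); unknown names are left verbatim. */
    static String expand(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = props.get(m.group(1));
            if (replacement == null) {
                // No definition available: the literal ${...} survives into the output.
                replacement = m.group(0);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String command = "java ${mapreduce.task.profile.params} org.apache.hadoop.mapred.YarnChild";
        Map<String, String> props = new HashMap<>();

        // Without a definition the placeholder reaches the shell script unchanged,
        // which is the situation the error message above shows.
        System.out.println(expand(command, props));

        // With a definition (illustrative value only) the placeholder disappears.
        props.put("mapreduce.task.profile.params", "-agentlib:hprof=cpu=samples");
        System.out.println(expand(command, props));
    }
}
```

If this reading is right, the remedy is to make sure the property has a value wherever the default is resolved, which matches the committed change touching mapred-default.xml.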
[jira] [Commented] (MAPREDUCE-5790) Default map hprof profile options do not work
[ https://issues.apache.org/jira/browse/MAPREDUCE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062074#comment-14062074 ]

Hudson commented on MAPREDUCE-5790:
---

FAILURE: Integrated in Hadoop-Hdfs-trunk #1805 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1805/])
MAPREDUCE-5790. Made it easier to enable hprof profile options by default. Contributed by Gera Shegalov. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1610578)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestJobConf.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobsWithProfiler.java
[jira] [Commented] (MAPREDUCE-5790) Default map hprof profile options do not work
[ https://issues.apache.org/jira/browse/MAPREDUCE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062172#comment-14062172 ]

Hudson commented on MAPREDUCE-5790:
---

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1832 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1832/])
MAPREDUCE-5790. Made it easier to enable hprof profile options by default. Contributed by Gera Shegalov. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1610578)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/conf/TestJobConf.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobsWithProfiler.java
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062220#comment-14062220 ]

Sangjin Lee commented on MAPREDUCE-5957:

I don't think the unit test failures are related to this patch. A jenkins run with a different patch has the same unit test failures (https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4733/). It's not clear yet whether these are related to jenkins (the port in question being held onto) or a more legitimate unit test issue.
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062285#comment-14062285 ]

Jason Lowe commented on MAPREDUCE-5957:
---

Thanks Sangjin for the report and the patch. I'm pretty sure the test failures are unrelated as I've seen other recent JIRAs also complain about the same set of test failures.

I think the proposed solution is reasonable and would probably work in most cases but will it cover all possible cases? In particular I'm thinking of a case where an output format/committer creates threads in its constructor or does reflection on its own with the current TCCL to load other classes. In those cases don't we need to have the TCCL set as well? I'm not a fan of flip-flopping the classloader around, but I'm wondering if we really should do that to cover whatever crazy stuff user code might try to do.
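Editorial sketch of the case Jason raises above: loading a class through an explicit loader does not change the thread context classloader (TCCL), so a constructor that consults the TCCL still sees the old one. Everything here is hypothetical. `Probe` stands in for a user output format, and an empty `URLClassLoader` stands in for the job classloader; no Hadoop classes are involved.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class TcclConstructorSketch {
    /** Stand-in for a user class whose constructor consults the TCCL. */
    static final class Probe {
        final ClassLoader seenByConstructor =
            Thread.currentThread().getContextClassLoader();
    }

    /**
     * Loads Probe by name through the given loader and instantiates it,
     * deliberately without touching the TCCL, then reports which loader
     * the constructor observed.
     */
    static ClassLoader loaderSeenByConstructor(ClassLoader jobLoader) {
        try {
            Class<?> cls = Class.forName(Probe.class.getName(), true, jobLoader);
            Probe probe = (Probe) cls.getDeclaredConstructor().newInstance();
            return probe.seenByConstructor;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical job classloader: no extra URLs, delegates to this class's loader.
        ClassLoader jobLoader =
            new URLClassLoader(new URL[0], TcclConstructorSketch.class.getClassLoader());
        ClassLoader seen = loaderSeenByConstructor(jobLoader);
        // The constructor observes the thread's existing TCCL, not jobLoader.
        System.out.println("constructor saw job loader: " + (seen == jobLoader));
    }
}
```

Any class the constructor then tries to load through that TCCL would be resolved without the job's jars, which is the gap the question points at.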
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062412#comment-14062412 ]

Sangjin Lee commented on MAPREDUCE-5957:

Thanks for your comment, Jason. I've been going back and forth between the two approaches on this.

On the one hand, if we make the job classloader available early (but hold back setting the TCCL), this one-time change would be sufficient to cover future changes when another instance of custom class loading is needed. That's why I gravitated to that approach.

It is true that if the custom class needs to load another class via the TCCL in its constructor (or in a couple of other methods that get called by MRAppMaster), then it is a problem. It's a fairly uncommon scenario, but I can't say it should never happen.

Surrounding custom class loading with setting and unsetting of the job classloader (both as the configuration classloader and as the TCCL) does solve that problem. And I can't think of a case where making it available as the TCCL during that time period would cause a different type of problem (along the lines of MAPREDUCE-5751). Even if this thing invokes jetty initialization for example, it would have compelled its own copy of jetty and things would stay consistent within that class namespace.

The main problem I have with this approach is that it's a bit more expensive to maintain. Every time new code is added to load a custom class in MRAppMaster, we *must* remember to wrap it with setting and unsetting the job classloader. It's perfectly doable, but it leaves room for mistakes.

I can bring up the second version of the patch that implements the other approach. Shall we discuss that a little more then? Let me know your thoughts.
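Editorial sketch of the "set and unset" approach discussed in Sangjin's comment above: a small helper that installs a loader as the TCCL for the duration of one call and restores the previous loader in a `finally` block. The helper name `withContextLoader` and the class `ContextLoaderScope` are hypothetical, not taken from the patch. The configuration side (Hadoop's `Configuration.setClassLoader`) is left out to keep the sketch free of dependencies.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

final class ContextLoaderScope {
    /**
     * Runs the action with the given loader installed as the thread context
     * classloader, and restores the previous one even if the action throws.
     */
    static <T> T withContextLoader(ClassLoader loader, Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the job classloader.
        ClassLoader jobLoader =
            new URLClassLoader(new URL[0], ContextLoaderScope.class.getClassLoader());

        // Code run inside the scope (for example, constructing a custom
        // output format) observes the job loader as the TCCL.
        ClassLoader seen = withContextLoader(
            jobLoader, () -> Thread.currentThread().getContextClassLoader());

        System.out.println("inside scope, TCCL is job loader: " + (seen == jobLoader));
        System.out.println("after scope, TCCL is job loader: "
            + (Thread.currentThread().getContextClassLoader() == jobLoader));
    }
}
```

Funnelling every custom-class call site through one helper like this is one way to reduce the maintenance risk mentioned in the comment, although each new call site still has to remember to use it.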
[jira] [Updated] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated MAPREDUCE-5957:
---

    Status: Open (was: Patch Available)
[jira] [Updated] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated MAPREDUCE-5957: --- Status: Patch Available (was: Open) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified.
{noformat}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
    ... 10 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated MAPREDUCE-5957: --- Attachment: MAPREDUCE-5957.patch Uploaded a patch based on the alternate approach (setting and unsetting the job classloader on key sections of MRAppMaster code). The sections of code that need this treatment are any place that uses Configuration.getClass() to load a potentially non-hadoop class or any place that exercises methods on those objects that are instantiated this way. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
    ... 10 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062815#comment-14062815 ] Jason Lowe commented on MAPREDUCE-5957: --- Yeah, seeing the second patch I can understand why you initially gravitated to the first approach. I agree that it is quite likely things will break as the code is maintained. Not only do we have to flip-flop each time a new custom class is loaded but also each time we invoke a method on a user-provided instance. So if a new committer method is added or what-not that needs to be called before we finally throw the classloader switch for good then that's another necessary flip-flop. That's annoying and probably won't be remembered when the new method invocation is added. If we do stick with the flip-flop approach then it'd be nice if we had a nice way to wrap such code with a common flip-flop wrapper. I'm thinking something akin to how UserGroupInformation.doAs works so we can wrap code with common logic and don't have to copy-n-paste the wrapper everywhere. However the wrapped code has to be marshaled in the form of a Runnable or what-not, so that might not be much better in the end. So I guess it comes down to weighing the likelihood this will ever be needed in practice or if simply setting the conf classloader will just work. I'm not as worried about the reflection case since that's probably rare to do without already leveraging Hadoop's conf class property processing. However I'm worried about thread creation since I could see a case where a committer/speculator/whatever creates some threads in their constructor or some other method we invoke before throwing the final classloader switch. If those threads end up inheriting the system TCCL instead of the job classloader TCCL then we have a problem. 
If that indeed is what would happen then it comes down to how likely is it that user code will want to create threads in constructors/methods that are invoked before the final classloader switch. I don't know offhand, unfortunately. If we can convince ourselves the thread use by user methods invoked before the final loader switch is either a non-issue or super unlikely then I think we should go for the simpler fix to set the conf loader early. Otherwise I think we may be stuck doing the flip flop case just for correctness sake, and hopefully we can make that as painless and obvious as possible. Which reminds me, the CommitterEventHandler creates a thread with which it invokes methods on the output committer. That thread is going to have the system TCCL since it was created before the final classloader switch, correct? If so would we also have problems if the output committer does lazy thread creation or class loading when commit methods are invoked as the job runs? Seems like the CommitterEventHandler event handling thread needs the job classloader, if specified. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat} org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389) Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895) at
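The thread-creation concern raised in the comment above rests on a documented JVM behavior: a new thread takes its context classloader (TCCL) from the thread that creates it. The following is a minimal sketch of that behavior only. The class name TcclInheritanceDemo and the empty URLClassLoader standing in for the job classloader are illustrative assumptions, not code from the patch or from MRAppMaster.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; not part of Hadoop.
class TcclInheritanceDemo {

    /**
     * Makes 'loader' the caller's TCCL, creates a child thread, and returns
     * the TCCL that the child observed. The caller's TCCL is restored afterwards.
     */
    static ClassLoader loaderSeenByNewThread(ClassLoader loader) throws InterruptedException {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        final ClassLoader[] seen = new ClassLoader[1];
        current.setContextClassLoader(loader);
        try {
            // The child is constructed while 'loader' is the parent's TCCL,
            // so it inherits 'loader' as its own TCCL.
            Thread child = new Thread(() -> seen[0] = Thread.currentThread().getContextClassLoader());
            child.start();
            child.join(); // join() makes the child's write to seen[0] visible here
        } finally {
            current.setContextClassLoader(original);
        }
        return seen[0];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the job classloader (assumption: any distinct loader will do).
        ClassLoader jobLoader = new URLClassLoader(new URL[0], TcclInheritanceDemo.class.getClassLoader());
        ClassLoader seen = loaderSeenByNewThread(jobLoader);
        System.out.println("child thread saw the job loader: " + (seen == jobLoader));
    }
}
```

Read the other way around, a thread created while the system loader is still the creator's TCCL keeps that loader unless it is set explicitly, which is the situation described for the CommitterEventHandler thread.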
[jira] [Updated] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-5969: - Attachment: (was: MAPREDUCE-5969.branch1.patch) Private non-Archive Files' size add twice in Distributed Cache directory size calculation. -- Key: MAPREDUCE-5969 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu The size of private non-archive files is added twice in the Distributed Cache directory size calculation. The list of private non-archive files is passed in with the -files command-line option. The Distributed Cache directory size is used to check whether the total size of the cached files exceeds the cache size limit; the default limit is 10G. I added logging to addCacheInfoUpdate and setSize in TrackerDistributedCacheManager.java and tested with the following command: hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar /tmp/zxu/test_in/ /tmp/zxu/test_out This adds two files to the distributed cache: WordCount.java and wordcount.jar. WordCount.java is 2395 bytes and wordcount.jar is 3865 bytes, so the total should be 6260 bytes.
The log shows that the size of each file is added twice, once before the download to the local node and again after it, so the total file count becomes 4 instead of 2:

addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local
addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local
addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local

In the code, for a private non-archive file, the size is first added in getLocalCache:

    if (!isArchive) {
      //for private archives, the lengths come over RPC from the
      //JobLocalizer since the JobLocalizer is the one who expands
      //archives and gets the total length
      lcacheStatus.size = fileStatus.getLen();
      LOG.info("getLocalCache: " + localizedPath + " size = " + lcacheStatus.size);
      // Increase the size and sub directory count of the cache
      // from baseDirSize and baseDirNumberSubDir.
      baseDirManager.addCacheInfoUpdate(lcacheStatus);
    }

The size is added a second time in setSize:

    synchronized (status) {
      status.size = size;
      baseDirManager.addCacheInfoUpdate(status);
    }

The fix is not to add the file size for a private non-archive file after the download (downloadCacheObject).
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated MAPREDUCE-5969: - Attachment: MAPREDUCE-5969.branch1.patch Private non-Archive Files' size add twice in Distributed Cache directory size calculation. -- Key: MAPREDUCE-5969 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5969.branch1.patch The size of private non-archive files is added twice in the Distributed Cache directory size calculation. The list of private non-archive files is passed in with the -files command-line option. The Distributed Cache directory size is used to check whether the total size of the cached files exceeds the cache size limit; the default limit is 10G. I added logging to addCacheInfoUpdate and setSize in TrackerDistributedCacheManager.java and tested with the following command: hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar /tmp/zxu/test_in/ /tmp/zxu/test_out This adds two files to the distributed cache: WordCount.java and wordcount.jar. WordCount.java is 2395 bytes and wordcount.jar is 3865 bytes, so the total should be 6260 bytes.
The log shows that the size of each file is added twice, once before the download to the local node and again after it, so the total file count becomes 4 instead of 2:

addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local
addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local
addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local

In the code, for a private non-archive file, the size is first added in getLocalCache:

    if (!isArchive) {
      //for private archives, the lengths come over RPC from the
      //JobLocalizer since the JobLocalizer is the one who expands
      //archives and gets the total length
      lcacheStatus.size = fileStatus.getLen();
      LOG.info("getLocalCache: " + localizedPath + " size = " + lcacheStatus.size);
      // Increase the size and sub directory count of the cache
      // from baseDirSize and baseDirNumberSubDir.
      baseDirManager.addCacheInfoUpdate(lcacheStatus);
    }

The size is added a second time in setSize:

    synchronized (status) {
      status.size = size;
      baseDirManager.addCacheInfoUpdate(status);
    }

The fix is not to add the file size for a private non-archive file after the download (downloadCacheObject).
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062900#comment-14062900 ] Hadoop QA commented on MAPREDUCE-5957: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655890/MAPREDUCE-5957.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4737//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4737//console This message is automatically generated. 
AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified.
{noformat}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
    ... 10 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062903#comment-14062903 ] zhihai xu commented on MAPREDUCE-5969: -- I updated the patch. The new patch subtracts the old file size before adding the new file size for private non-archive files. With the new patch, the accounting works even if the size of the same file is changed multiple times. Private non-Archive Files' size add twice in Distributed Cache directory size calculation. -- Key: MAPREDUCE-5969 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5969.branch1.patch The size of private non-archive files is added twice in the Distributed Cache directory size calculation. The list of private non-archive files is passed in with the -files command-line option. The Distributed Cache directory size is used to check whether the total size of the cached files exceeds the cache size limit; the default limit is 10G. I added logging to addCacheInfoUpdate and setSize in TrackerDistributedCacheManager.java and tested with the following command: hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar /tmp/zxu/test_in/ /tmp/zxu/test_out This adds two files to the distributed cache: WordCount.java and wordcount.jar. WordCount.java is 2395 bytes and wordcount.jar is 3865 bytes, so the total should be 6260 bytes.
The log shows that the size of each file is added twice, once before the download to the local node and again after it, so the total file count becomes 4 instead of 2:

addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local
addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local
addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local

In the code, for a private non-archive file, the size is first added in getLocalCache:

    if (!isArchive) {
      //for private archives, the lengths come over RPC from the
      //JobLocalizer since the JobLocalizer is the one who expands
      //archives and gets the total length
      lcacheStatus.size = fileStatus.getLen();
      LOG.info("getLocalCache: " + localizedPath + " size = " + lcacheStatus.size);
      // Increase the size and sub directory count of the cache
      // from baseDirSize and baseDirNumberSubDir.
      baseDirManager.addCacheInfoUpdate(lcacheStatus);
    }

The size is added a second time in setSize:

    synchronized (status) {
      status.size = size;
      baseDirManager.addCacheInfoUpdate(status);
    }

The fix is not to add the file size for a private non-archive file after the download (downloadCacheObject).
-- This message was sent by Atlassian JIRA (v6.2#6252)
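The updated-patch comment in this message describes subtracting the previously recorded size before adding the new one. A minimal sketch of that accounting idea follows, assuming a hypothetical CacheSizeAccounting class keyed by file path. It is not the code in TrackerDistributedCacheManager or in the attached patch, only an illustration using the two file sizes from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-directory bookkeeping; not Hadoop code.
class CacheSizeAccounting {
    private final Map<String, Long> recorded = new HashMap<>();
    private long totalSize;

    /**
     * Records 'size' for 'path'. If a size was already recorded for the same
     * path, it is subtracted first, so repeated calls replace rather than add.
     */
    void setSize(String path, long size) {
        Long old = recorded.put(path, size);
        if (old != null) {
            totalSize -= old;
        }
        totalSize += size;
    }

    long totalSize() {
        return totalSize;
    }

    int fileCount() {
        return recorded.size();
    }

    public static void main(String[] args) {
        CacheSizeAccounting dir = new CacheSizeAccounting();
        // First update: sizes known before the download.
        dir.setSize("WordCount.java", 2395);
        dir.setSize("wordcount.jar", 3865);
        // Second update for the same files: after the download.
        dir.setSize("WordCount.java", 2395);
        dir.setSize("wordcount.jar", 3865);
        // prints "6260 bytes in 2 files"
        System.out.println(dir.totalSize() + " bytes in " + dir.fileCount() + " files");
    }
}
```

Keying the bookkeeping by path is a design choice of this sketch: it makes a second update idempotent without the caller having to know whether the file was already counted.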
[jira] [Commented] (MAPREDUCE-5969) Private non-Archive Files' size add twice in Distributed Cache directory size calculation.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062908#comment-14062908 ] Hadoop QA commented on MAPREDUCE-5969: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12655931/MAPREDUCE-5969.branch1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4738//console This message is automatically generated. Private non-Archive Files' size add twice in Distributed Cache directory size calculation. -- Key: MAPREDUCE-5969 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5969 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Reporter: zhihai xu Assignee: zhihai xu Attachments: MAPREDUCE-5969.branch1.patch The size of private non-archive files is added twice in the Distributed Cache directory size calculation. The list of private non-archive files is passed in with the -files command-line option. The Distributed Cache directory size is used to check whether the total size of the cached files exceeds the cache size limit; the default limit is 10G. I added logging to addCacheInfoUpdate and setSize in TrackerDistributedCacheManager.java and tested with the following command: hadoop jar ./wordcount.jar org.apache.hadoop.examples.WordCount -files hdfs://host:8022/tmp/zxu/WordCount.java,hdfs://host:8022/tmp/zxu/wordcount.jar /tmp/zxu/test_in/ /tmp/zxu/test_out This adds two files to the distributed cache: WordCount.java and wordcount.jar. WordCount.java is 2395 bytes and wordcount.jar is 3865 bytes, so the total should be 6260 bytes.
The log shows that the size of each file is added twice, once before the download to the local node and again after it, so the total file count becomes 4 instead of 2:

addCacheInfoUpdate size: 6260 num: 2 baseDir: /mapred/local
addCacheInfoUpdate size: 8683 num: 3 baseDir: /mapred/local
addCacheInfoUpdate size: 12588 num: 4 baseDir: /mapred/local

In the code, for a private non-archive file, the size is first added in getLocalCache:

    if (!isArchive) {
      //for private archives, the lengths come over RPC from the
      //JobLocalizer since the JobLocalizer is the one who expands
      //archives and gets the total length
      lcacheStatus.size = fileStatus.getLen();
      LOG.info("getLocalCache: " + localizedPath + " size = " + lcacheStatus.size);
      // Increase the size and sub directory count of the cache
      // from baseDirSize and baseDirNumberSubDir.
      baseDirManager.addCacheInfoUpdate(lcacheStatus);
    }

The size is added a second time in setSize:

    synchronized (status) {
      status.size = size;
      baseDirManager.addCacheInfoUpdate(status);
    }

The fix is not to add the file size for a private non-archive file after the download (downloadCacheObject).
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
Gera Shegalov created MAPREDUCE-5970: Summary: Provide a boolean switch to enable MR-AM profiling Key: MAPREDUCE-5970 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
[ https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-5970: - Release Note: (was: MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM) Provide a boolean switch to enable MR-AM profiling -- Key: MAPREDUCE-5970 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Minor -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
[ https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-5970: - Description: MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM Provide a boolean switch to enable MR-AM profiling -- Key: MAPREDUCE-5970 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Minor MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
[ https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-5970: - Attachment: MAPREDUCE-5970.v01.patch v01 patch for review. Provide a boolean switch to enable MR-AM profiling -- Key: MAPREDUCE-5970 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Minor Attachments: MAPREDUCE-5970.v01.patch MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
[ https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-5970: - Status: Patch Available (was: Open) Provide a boolean switch to enable MR-AM profiling -- Key: MAPREDUCE-5970 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970 Project: Hadoop Map/Reduce Issue Type: Improvement Components: applicationmaster, client Affects Versions: 2.4.1 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Minor Attachments: MAPREDUCE-5970.v01.patch MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5910) MRAppMaster should handle Resync from RM instead of shutting down.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated MAPREDUCE-5910: --- Status: Open (was: Patch Available) MRAppMaster should handle Resync from RM instead of shutting down. -- Key: MAPREDUCE-5910 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5910 Project: Hadoop Map/Reduce Issue Type: Task Components: applicationmaster Reporter: Rohith Assignee: Rohith Attachments: MAPREDUCE-5910.1.patch, MAPREDUCE-5910.2.patch The ApplicationMasterService currently sends a resync response, to which the AM responds by shutting down. The MRAppMaster is expected to change so that it resyncs with the RM instead. Resync means resetting the allocate RPC sequence number to 0 and having the AM send its entire set of outstanding requests to the RM. Note that if the AM is making its first allocate call to the RM, things should proceed as normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once. -- This message was sent by Atlassian JIRA (v6.2#6252)
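The resync behavior described in this issue can be sketched in a few lines: reset the sequence number to 0, re-send every outstanding request, and tolerate container completions that are reported more than once. Everything below is a hypothetical illustration. The class ResyncSketch, its method names, and the use of plain strings for resources and container IDs are assumptions made for the sketch, not the YARN allocate protocol or the attached patches.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical AM-side bookkeeping; not the real RMContainerAllocator.
class ResyncSketch {
    private int responseId;                                                  // allocate sequence number
    private final Map<String, Integer> outstanding = new LinkedHashMap<>();  // all asks still wanted
    private final Set<String> changed = new HashSet<>();                     // asks not yet sent
    private final Set<String> completedSeen = new HashSet<>();               // completions already handled

    /** Sets the number of containers wanted for a resource and marks it as unsent. */
    void request(String resource, int containers) {
        outstanding.put(resource, containers);
        changed.add(resource);
    }

    /** Normal heartbeat: send only the asks that changed and advance the sequence number. */
    Map<String, Integer> nextAllocateAsk() {
        Map<String, Integer> ask = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : outstanding.entrySet()) {
            if (changed.contains(e.getKey())) {
                ask.put(e.getKey(), e.getValue());
            }
        }
        changed.clear();
        responseId++;
        return ask;
    }

    /** Resync: reset the sequence number to 0 and re-send everything outstanding. */
    Map<String, Integer> resync() {
        responseId = 0;
        changed.clear();
        return new LinkedHashMap<>(outstanding);
    }

    /** Returns true the first time a completion is seen, false for a repeat report. */
    boolean onContainerCompleted(String containerId) {
        return completedSeen.add(containerId);
    }

    int responseId() {
        return responseId;
    }
}
```

For example, after request("map", 3), one heartbeat, and request("reduce", 1), a call to resync() would return both asks and leave responseId() at 0, while a second onContainerCompleted call with the same ID would return false.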
[jira] [Commented] (MAPREDUCE-5957) AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063151#comment-14063151 ] Sangjin Lee commented on MAPREDUCE-5957: Thanks for the feedback Jason. I do like the idea of providing a convenience method that wraps the logic of setting and unsetting the classloader around code. I think it will lead to easier and more obvious handling of this. I'll update the patch to do that (lambda would have been very nice btw). Also, a good catch about the committer event handler thread creation. I'll also handle that scenario. I think the latest patch should cover at least all these known cases, except for one. It's the composite service's serviceInit() and serviceStart(). In case of the speculator, it's added as a service to the parent composite service. As such, super.serviceInit() and super.serviceStart() will invoke its serviceInit() and serviceStart(). If it loads classes via configuration classloader or TCCL, it will break. Note that normal classloading will still work. For that matter, if any of these classes loads other classes in the normal way, that still works fine. AM throws ClassNotFoundException with job classloader enabled if custom output format/committer is used --- Key: MAPREDUCE-5957 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5957 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: MAPREDUCE-5957.patch, MAPREDUCE-5957.patch, MAPREDUCE-5957.patch With the job classloader enabled, the MR AM throws ClassNotFoundException if a custom output format class is specified. 
{noformat}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:374)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1459)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:222)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:469)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: Class com.foo.test.TestOutputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
    ... 10 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
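The convenience method discussed in this comment, one that wraps a piece of code with the set and unset of the classloader in the style of UserGroupInformation.doAs, might look roughly like the sketch below. The names ClassLoaderScope and callWithClassLoader are hypothetical and are not taken from the patch. The sketch also switches only the thread context classloader; switching the Configuration's classloader, which the discussion covers as well, is left out to keep the example self-contained.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; the actual patch may structure this differently.
final class ClassLoaderScope {
    private ClassLoaderScope() {}

    /**
     * Runs 'action' with 'loader' as the calling thread's context classloader
     * and restores the previous loader afterwards, even if the action throws.
     */
    static <T> T callWithClassLoader(ClassLoader loader, Callable<T> action) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.call();
        } finally {
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the job classloader (assumption for the demo).
        ClassLoader jobLoader = new java.net.URLClassLoader(new java.net.URL[0]);
        ClassLoader insideCall = callWithClassLoader(jobLoader,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println("job loader active inside the call: " + (insideCall == jobLoader));
        System.out.println("previous loader restored afterwards: "
                + (Thread.currentThread().getContextClassLoader() != jobLoader));
    }
}
```

Putting the restore in a finally block keeps a failing user method from leaving the job classloader set on a framework thread, which is the kind of mistake the copy-and-paste version of the flip-flop would invite.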
[jira] [Commented] (MAPREDUCE-5970) Provide a boolean switch to enable MR-AM profiling
[ https://issues.apache.org/jira/browse/MAPREDUCE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063165#comment-14063165 ]

Hadoop QA commented on MAPREDUCE-5970:
--

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12655973/MAPREDUCE-5970.v01.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4739//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4739//console

This message is automatically generated.
Provide a boolean switch to enable MR-AM profiling
--

Key: MAPREDUCE-5970
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5970
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, client
Affects Versions: 2.4.1
Reporter: Gera Shegalov
Assignee: Gera Shegalov
Priority: Minor
Attachments: MAPREDUCE-5970.v01.patch

MR task profiling can be enabled with a simple switch {{mapreduce.task.profile=true}}. We can analogously have {{yarn.app.mapreduce.am.profile}} for MR-AM.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
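To illustrate what such a boolean switch could mean on the client side, here is a minimal sketch that appends profiler options to the AM's JVM arguments only when the proposed key is set to true. It is not taken from MAPREDUCE-5970.v01.patch: the class `AmProfileSketch`, the method `amJvmArgs`, and the profiler option string are placeholders, and a plain `Map` stands in for the Hadoop configuration so the example stays self-contained. Only the key name {{yarn.app.mapreduce.am.profile}} comes from the issue description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; names other than the config key are assumptions. */
final class AmProfileSketch {
    /** Key proposed in the issue description. */
    static final String AM_PROFILE = "yarn.app.mapreduce.am.profile";

    /** Placeholder profiler options; the real patch may use different ones. */
    static final String PLACEHOLDER_PROFILE_OPTS = "-agentlib:hprof=cpu=samples";

    private AmProfileSketch() {}

    /** Returns the AM JVM arguments, with profiler options added if the switch is on. */
    static List<String> amJvmArgs(Map<String, String> conf, List<String> baseArgs) {
        List<String> args = new ArrayList<>(baseArgs);
        // Missing or non-"true" values leave profiling off, mirroring a boolean default of false.
        if (Boolean.parseBoolean(conf.getOrDefault(AM_PROFILE, "false"))) {
            args.add(PLACEHOLDER_PROFILE_OPTS);
        }
        return args;
    }
}
```

With the key unset, the base arguments pass through unchanged; setting it to true adds exactly one extra argument, so enabling AM profiling takes a single setting in the same way {{mapreduce.task.profile=true}} does for tasks.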