[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Target Version/s: 3.0.0, 2.5.0 (was: trunk, 2.1.0-beta, 1.3.0) > Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus > > > Key: MAPREDUCE-5363 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 1.1.2, 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Akira AJISAKA > Labels: newbie > Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch, > MAPREDUCE-5363-3.patch > > > The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is > {code} > Returns enum Status.SUCESS or Status.FAILURE. > @return task tracker status > {code} > The actual values that the Status enum can take are > FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message was sent by Atlassian JIRA (v6.2#6252)
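The documentation bug described above can be illustrated with a small stand-alone sketch. The enum below is a local mirror of the five actual values for illustration only, not the Hadoop TaskCompletionEvent.Status class itself, and the javadoc shows roughly what the corrected wording should say:

```java
// Local mirror of the Status values for illustration -- NOT the Hadoop class.
public class StatusDocSketch {
    enum Status { FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED }

    /**
     * Corrected doc, roughly as the patch intends: not "SUCESS or FAILURE".
     * @return the task completion status, one of Status.FAILED, KILLED,
     *         SUCCEEDED, OBSOLETE or TIPFAILED.
     */
    static Status getStatus() { return Status.SUCCEEDED; }

    public static void main(String[] args) {
        // Five possible values, not the two the old javadoc claimed.
        System.out.println(Status.values().length);
    }
}
```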
[jira] [Updated] (MAPREDUCE-5363) Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus
[ https://issues.apache.org/jira/browse/MAPREDUCE-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated MAPREDUCE-5363: - Priority: Minor (was: Major) > Fix doc and spelling for TaskCompletionEvent#getTaskStatus and getStatus > > > Key: MAPREDUCE-5363 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5363 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 1.1.2, 2.1.0-beta >Reporter: Sandy Ryza >Assignee: Akira AJISAKA >Priority: Minor > Labels: newbie > Attachments: MAPREDUCE-5363-1.patch, MAPREDUCE-5363-2.patch, > MAPREDUCE-5363-3.patch > > > The doc for TaskCompletionEvent#get(Task)Status in both MR1 and MR2 is > {code} > Returns enum Status.SUCESS or Status.FAILURE. > @return task tracker status > {code} > The actual values that the Status enum can take are > FAILED, KILLED, SUCCEEDED, OBSOLETE, TIPFAILED -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3677) If "hadoop.security.authorization" is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976262#comment-13976262 ] Chen He commented on MAPREDUCE-3677: "hadoop.security.authorization" is for secure RPC accessibility. I am not sure why you need to start the nodemanager as root, but I did some investigation based on Hadoop 0.23.10 and changed hadoop/bin/yarn a little. I am using Java 1.7.0_45, and it reports the illegal argument "-jvm". I commented out the following lines in hadoop/bin/yarn:

#if [[ $EUID -eq 0 ]]; then
#  YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"
#else
YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
#fi

It works fine. Feel free to make any comments. If there is no response, I will close this JIRA in 3 days. > If "hadoop.security.authorization" is set to true, NM is not starting. > -- > > Key: MAPREDUCE-3677 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: nodemanager >Affects Versions: 0.23.0 >Reporter: Ramgopal N >Assignee: Chen He > > I have the hadoop cluster set up with the root user. Accidentally I set > hadoop.security.authorization to true. I have not set any permissions in > policy.xml. When I try to start the NM with the root user, it throws > the following error: > Exception in thread "main" java.lang.NoClassDefFoundError: nodemanager > Caused by: java.lang.ClassNotFoundException: nodemanager > at java.net.URLClassLoader$1.run(URLClassLoader.java:200) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:188) > at java.lang.ClassLoader.loadClass(ClassLoader.java:303) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > at java.lang.ClassLoader.loadClass(ClassLoader.java:248) > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316) > Could not find the main class: nodemanager. Program will exit. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976215#comment-13976215 ] Hudson commented on MAPREDUCE-5852: --- SUCCESS: Integrated in Hadoop-trunk-Commit #5548 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5548/]) MAPREDUCE-5852. Prepare MapReduce codebase for JUnit 4.11. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1589006) * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestEvents.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestAMInfos.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestFail.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestKill.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRApp.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppComponentDependencies.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestMRClientService.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRMContainerAllocator.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestStagingCleanup.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/commit/TestCommitterEventHandler.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestMapReduceChildJVM.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttempt.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskAttemptContainerRequest.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/webapp/TestAMWebApp.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestJobClientGetJob.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapred/TestMRWithDistributedCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/TestRPCFactories.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/TestRecordFactory.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/test/java/org/apache/hadoop/mapreduce/v2/util/TestMRApps.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestClusterStatus.java * /hadoop/common/trunk/hadoop-mapre
[jira] [Updated] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5852: - Resolution: Fixed Fix Version/s: 2.5.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed this patch to trunk and branch-2. > Prepare MapReduce codebase for JUnit 4.11. > -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Fix For: 3.0.0, 2.5.0 > > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976179#comment-13976179 ] Chris Nauroth commented on MAPREDUCE-5852: -- Thanks, Arpit. I'll commit this soon. BTW, I meant to mention that HADOOP-10503 contains some comments with more explanation of the need for these changes. > Prepare MapReduce codebase for JUnit 4.11. > -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976158#comment-13976158 ] Arpit Agarwal commented on MAPREDUCE-5852: -- +1 for the patch. Nice catch in {{TestMRApps}}. > Prepare MapReduce codebase for JUnit 4.11. > -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5812) Make task context available to OutputCommitter.isRecoverySupported()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976117#comment-13976117 ] Hadoop QA commented on MAPREDUCE-5812: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12641126/MAPREDUCE-5812.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4541//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4541//console This message is automatically generated. 
> Make task context available to OutputCommitter.isRecoverySupported() > - > > Key: MAPREDUCE-5812 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5812 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.3.0 >Reporter: Mohammad Kamrul Islam >Assignee: Mohammad Kamrul Islam > Attachments: MAPREDUCE-5812.1.patch, MAPREDUCE-5812.2.patch, > MAPREDUCE-5812.3.patch > > > Background > == > Systems like Hive provide their own version of OutputCommitter. A custom > implementation of isRecoverySupported() requires the task context: from > taskContext.getConfiguration(), Hive checks whether a Hive-specific > property is set. Based on the property value, it returns true or > false. However, in the current OutputCommitter.isRecoverySupported(), there > is no way to get the task config. As a result, users can't turn the > MRAM recovery feature on or off. > Proposed resolutions: > === > 1. Pass the task context into isRecoverySupported(). > Pros: Easy and clean. > Cons: Possible backward compatibility issue due to API changes. (Is it true?) > 2. Call outputCommitter.setupTask(taskContext) from the MRAM: the new > OutputCommitter will store the context in a class-level variable and use it > from isRecoverySupported(). > Pros: No API changes, so no backward compatibility issue. This call can be made > from MRAppMaster.getOutputCommitter() for the old-API case. > Cons: Might not be a very clean solution due to the class-level variable. > Please give your comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5756) CombineFileInputFormat.getSplits() including directories in its results
[ https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976105#comment-13976105 ] Ashutosh Chauhan commented on MAPREDUCE-5756: - What's the decision on this one? Is it expected to make it in the 2.5 timeframe? > CombineFileInputFormat.getSplits() including directories in its results > --- > > Key: MAPREDUCE-5756 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Jason Dere > > Trying to track down HIVE-6401, where we see some "is not a file" errors > because getSplits() is giving us directories. I believe the culprit is > FileInputFormat.listStatus(): > {code} > if (recursive && stat.isDirectory()) { > addInputPathRecursively(result, fs, stat.getPath(), > inputFilter); > } else { > result.add(stat); > } > {code} > Which seems to be allowing directories to be added to the results if > recursive is false. Is this meant to return directories? If not, I think it > should look like this: > {code} > if (stat.isDirectory()) { > if (recursive) { > addInputPathRecursively(result, fs, stat.getPath(), > inputFilter); > } > } else { > result.add(stat); > } > {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
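The proposed fix can be sketched against a simplified in-memory stand-in (the Node class below is hypothetical, not a Hadoop type): a directory is recursed into only when recursive is set, and is never itself added to the result list:

```java
import java.util.ArrayList;
import java.util.List;

public class ListStatusSketch {
    // Hypothetical in-memory stand-in for a FileStatus tree.
    static class Node {
        final String name;
        final List<Node> children;  // null => plain file
        Node(String name, List<Node> children) { this.name = name; this.children = children; }
        boolean isDirectory() { return children != null; }
    }

    // Proposed logic: directories are recursed into when `recursive` is set,
    // and never added to `result` themselves.
    static void listStatus(Node dir, boolean recursive, List<String> result) {
        for (Node stat : dir.children) {
            if (stat.isDirectory()) {
                if (recursive) {
                    listStatus(stat, true, result);  // addInputPathRecursively analogue
                }
            } else {
                result.add(stat.name);
            }
        }
    }

    static List<String> demo(boolean recursive) {
        Node sub = new Node("sub", List.of(new Node("b.txt", null)));
        Node root = new Node("root", List.of(new Node("a.txt", null), sub));
        List<String> out = new ArrayList<>();
        listStatus(root, recursive, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // non-recursive: only the top-level file, no directory leaks
        System.out.println(demo(true));  // recursive: files from the whole tree
    }
}
```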
[jira] [Commented] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976092#comment-13976092 ] Hadoop QA commented on MAPREDUCE-5852: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12641105/MAPREDUCE-5852.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 67 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-tools/hadoop-distcp hadoop-tools/hadoop-streaming. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4540//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4540//console This message is automatically generated. > Prepare MapReduce codebase for JUnit 4.11. 
> -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5812) Make task context available to OutputCommitter.isRecoverySupported()
[ https://issues.apache.org/jira/browse/MAPREDUCE-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated MAPREDUCE-5812: - Attachment: MAPREDUCE-5812.3.patch New patch to address the findbugs warning. > Make task context available to OutputCommitter.isRecoverySupported() > - > > Key: MAPREDUCE-5812 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5812 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.3.0 >Reporter: Mohammad Kamrul Islam >Assignee: Mohammad Kamrul Islam > Attachments: MAPREDUCE-5812.1.patch, MAPREDUCE-5812.2.patch, > MAPREDUCE-5812.3.patch > > > Background > == > Systems like Hive provide their own version of OutputCommitter. A custom > implementation of isRecoverySupported() requires the task context: from > taskContext.getConfiguration(), Hive checks whether a Hive-specific > property is set. Based on the property value, it returns true or > false. However, in the current OutputCommitter.isRecoverySupported(), there > is no way to get the task config. As a result, users can't turn the > MRAM recovery feature on or off. > Proposed resolutions: > === > 1. Pass the task context into isRecoverySupported(). > Pros: Easy and clean. > Cons: Possible backward compatibility issue due to API changes. (Is it true?) > 2. Call outputCommitter.setupTask(taskContext) from the MRAM: the new > OutputCommitter will store the context in a class-level variable and use it > from isRecoverySupported(). > Pros: No API changes, so no backward compatibility issue. This call can be made > from MRAppMaster.getOutputCommitter() for the old-API case. > Cons: Might not be a very clean solution due to the class-level variable. > Please give your comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
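Proposed resolution 2 can be sketched roughly as follows, using simplified stand-ins for the Hadoop types; the class names and the property name below are hypothetical, chosen only to show how a committer could capture the context at setup time and consult it later without any API signature change:

```java
import java.util.HashMap;
import java.util.Map;

public class RecoveryCommitterSketch {
    // Hypothetical stand-in for TaskAttemptContext / its Configuration.
    static class TaskContext {
        final Map<String, String> conf = new HashMap<>();
        String get(String key, String dflt) { return conf.getOrDefault(key, dflt); }
    }

    // Hypothetical stand-in for a Hive-style custom OutputCommitter.
    static class CustomOutputCommitter {
        private TaskContext context;  // stored at setup time (the class-level variable)

        void setupTask(TaskContext ctx) { this.context = ctx; }

        boolean isRecoverySupported() {
            // Without a stored context there is no config to consult;
            // fall back to "recovery not supported".
            if (context == null) return false;
            return Boolean.parseBoolean(context.get("hypothetical.recovery.enabled", "false"));
        }
    }

    public static void main(String[] args) {
        CustomOutputCommitter committer = new CustomOutputCommitter();
        TaskContext ctx = new TaskContext();
        ctx.conf.put("hypothetical.recovery.enabled", "true");
        committer.setupTask(ctx);  // in the proposal, MRAppMaster would make this call
        System.out.println(committer.isRecoverySupported());
    }
}
```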
[jira] [Commented] (MAPREDUCE-5841) uber job doesn't terminate on getting mapred job kill
[ https://issues.apache.org/jira/browse/MAPREDUCE-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975992#comment-13975992 ] Sangjin Lee commented on MAPREDUCE-5841: Any comments on the bug and/or the proposed fix? Thanks! > uber job doesn't terminate on getting mapred job kill > - > > Key: MAPREDUCE-5841 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5841 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.3.0 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Attachments: mapreduce-5841.patch > > > If you issue a "mapred job -kill" against a uberized job, the job (and the > yarn application) state transitions to KILLED, but the application master > process continues to run. The job actually runs to completion despite the > killed status. > This can be easily reproduced by running a sleep job: > {noformat} > hadoop jar hadoop-mapreduce-client-jobclient-2.3.0-tests.jar sleep -m 1 -r 0 > -mt 30 > {noformat} > Issue a kill with "mapred job -kill \[job-id\]". The UI will show the job > (app) is in the KILLED state. However, you can see the application master is > still running. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5852: - Attachment: MAPREDUCE-5852.1.patch > Prepare MapReduce codebase for JUnit 4.11. > -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
Chris Nauroth created MAPREDUCE-5852: Summary: Prepare MapReduce codebase for JUnit 4.11. Key: MAPREDUCE-5852 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor Attachments: MAPREDUCE-5852.1.patch HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the MapReduce code needs some minor updates to fix deprecation warnings before the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (MAPREDUCE-5852) Prepare MapReduce codebase for JUnit 4.11.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-5852: - Status: Patch Available (was: Open) > Prepare MapReduce codebase for JUnit 4.11. > -- > > Key: MAPREDUCE-5852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5852 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: MAPREDUCE-5852.1.patch > > > HADOOP-10503 upgrades the entire Hadoop repo to use JUnit 4.11. Some of the > MapReduce code needs some minor updates to fix deprecation warnings before > the upgrade. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-3111) Fix log serving in NodeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975832#comment-13975832 ] Mit Desai commented on MAPREDUCE-3111: -- Thanks [~jlowe] for correcting the status > Fix log serving in NodeManager > -- > > Key: MAPREDUCE-3111 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3111 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: security >Affects Versions: 0.23.0 >Reporter: Luke Lu >Assignee: Omkar Vinit Joshi > Labels: security, webapp > > Just noticed that the current log serving is using the raw writer (instead of > Hamlet) to serve logs without escaping html. > It's actually easier/cleaner to use Hamlet to serve logs: > {code} > pre._(buffer); > {code} > which takes care of content escaping automatically. > I will make raw writer access package private for framework use only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (MAPREDUCE-5851) Enable regular expression in the DistCp input
[ https://issues.apache.org/jira/browse/MAPREDUCE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Qi reassigned MAPREDUCE-5851: - Assignee: Yan Qi > Enable regular expression in the DistCp input > - > > Key: MAPREDUCE-5851 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5851 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: distcp >Reporter: Yan Qi >Assignee: Yan Qi >Priority: Minor > Labels: distcp, expression, regular > > DistCp doesn't support regular expression as the input. If the files to copy > are in the different locations, it is quite verbose to put a long list of > inputs in the command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (MAPREDUCE-3111) Fix log serving in NodeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened MAPREDUCE-3111: --- This was fixed by MAPREDUCE-4283, as that patch started using Hamlet's pre method to emit the log data. Reopening to close this as a duplicate for proper tracking of when this was fixed. > Fix log serving in NodeManager > -- > > Key: MAPREDUCE-3111 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3111 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: security >Affects Versions: 0.23.0 >Reporter: Luke Lu >Assignee: Omkar Vinit Joshi > Labels: security, webapp > > Just noticed that the current log serving is using the raw writer (instead of > Hamlet) to serve logs without escaping html. > It's actually easier/cleaner to use Hamlet to serve logs: > {code} > pre._(buffer); > {code} > which takes care of content escaping automatically. > I will make raw writer access package private for framework use only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-3111) Fix log serving in NodeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved MAPREDUCE-3111. --- Resolution: Duplicate Fixed by MAPREDUCE-4283. > Fix log serving in NodeManager > -- > > Key: MAPREDUCE-3111 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3111 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: security >Affects Versions: 0.23.0 >Reporter: Luke Lu >Assignee: Omkar Vinit Joshi > Labels: security, webapp > > Just noticed that the current log serving is using the raw writer (instead of > Hamlet) to serve logs without escaping html. > It's actually easier/cleaner to use Hamlet to serve logs: > {code} > pre._(buffer); > {code} > which takes care of content escaping automatically. > I will make raw writer access package private for framework use only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAPREDUCE-5851) Enable regular expression in the DistCp input
Yan Qi created MAPREDUCE-5851: - Summary: Enable regular expression in the DistCp input Key: MAPREDUCE-5851 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5851 Project: Hadoop Map/Reduce Issue Type: New Feature Components: distcp Reporter: Yan Qi Priority: Minor DistCp doesn't support regular expression as the input. If the files to copy are in the different locations, it is quite verbose to put a long list of inputs in the command. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975768#comment-13975768 ] David Rosenstrauch commented on MAPREDUCE-5402: --- No problem! Happy to help get this fixed! FYI, if you're interested, here are the stats for another (supposedly more optimized) run of job 2. No real improvement in execution time, though.

Total number of files: 17,197
Number of files copied: 17,132
Number of files skipped: 65
Number of bytes copied: 1,202,347,772,147
Number of mappers: 512
Split ratio: 10
Max chunks tolerable: 10,000
Number of dynamic-chunk-files created: 4416
Run time: 54mins, 51sec

> DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, > MAPREDUCE-5402.3.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) > This constant is actually set quite low for production use. (See a > description of my use case below.) 
And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. > As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally see any. > If this limit has no particular logic behind it, then it should be > overridable - or even better: removed altogether. 
After all, I'm not sure I > see any need for it. Even if numMaps * splitRatio resulted in an > extraordinarily large number, if the code were modified so that the number of > chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then > there would be no need for MAX_CHUNKS_TOLERABLE. In this worst-case scenario > where the product of numMaps and splitRatio is large, capping the number of > chunks at the number of files (numberOfChunks = numberOfFiles) would result > in 1 file per chunk - the maximum parallelization possible. That may not be > the best-tuned solution for some users, but I would think that it should be > left up to the user to deal with the potential consequence of not having > tuned their job properly. Certainly that would be better than having an > arbitrary hard-coded limit that *prevents* proper parallelization when > dealing with large files and/or large numbers of mappers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5402) DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE
[ https://issues.apache.org/jira/browse/MAPREDUCE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975749#comment-13975749 ] Tsuyoshi OZAWA commented on MAPREDUCE-5402: --- Thank you for testing, David! [~qwertymaniac], can you take a look and review a latest patch? > DynamicInputFormat should allow overriding of MAX_CHUNKS_TOLERABLE > -- > > Key: MAPREDUCE-5402 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5402 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: distcp, mrv2 >Reporter: David Rosenstrauch >Assignee: Tsuyoshi OZAWA > Attachments: MAPREDUCE-5402.1.patch, MAPREDUCE-5402.2.patch, > MAPREDUCE-5402.3.patch > > > In MAPREDUCE-2765, which provided the design spec for DistCpV2, the author > describes the implementation of DynamicInputFormat, with one of the main > motivations cited being to reduce the chance of long-tails where a few > leftover mappers run much longer than the rest. > However, I today ran into a situation where I experienced exactly such a long > tail using DistCpV2 and DynamicInputFormat. And when I tried to alleviate > the problem by overriding the number of mappers and the split ratio used by > the DynamicInputFormat, I was prevented from doing so by the hard-coded limit > set in the code by the MAX_CHUNKS_TOLERABLE constant. (Currently set to 400.) > This constant is actually set quite low for production use. (See a > description of my use case below.) And although MAPREDUCE-2765 states that > this is an "overridable maximum", when reading through the code there does > not actually appear to be any mechanism available to override it. > This should be changed. It should be possible to expand the maximum # of > chunks beyond this arbitrary limit. > For example, here is the situation I ran into today: > I ran a distcpv2 job on a cluster with 8 machines containing 128 map slots. > The job consisted of copying ~2800 files from HDFS to Amazon S3. 
I overrode > the number of mappers for the job from the default of 20 to 128, so as to > more properly parallelize the copy across the cluster. The number of chunk > files created was calculated as 241, and mapred.num.entries.per.chunk was > calculated as 12. > As the job ran on, it reached a point where there were only 4 remaining map > tasks, which had each been running for over 2 hours. The reason for this was > that each of the 12 files that those mappers were copying were quite large > (several hundred megabytes in size) and took ~20 minutes each. However, > during this time, all the other 124 mappers sat idle. > In theory I should be able to alleviate this problem with DynamicInputFormat. > If I were able to, say, quadruple the number of chunk files created, that > would have made each chunk contain only 3 files, and these large files would > have gotten distributed better around the cluster and copied in parallel. > However, when I tried to do that - by overriding mapred.listing.split.ratio > to, say, 10 - DynamicInputFormat responded with an exception ("Too many > chunks created with splitRatio:10, numMaps:128. Reduce numMaps or decrease > split-ratio to proceed.") - presumably because I exceeded the > MAX_CHUNKS_TOLERABLE value of 400. > Is there any particular logic behind this MAX_CHUNKS_TOLERABLE limit? I > can't personally see any. > If this limit has no particular logic behind it, then it should be > overridable - or even better: removed altogether. After all, I'm not sure I > see any need for it. Even if numMaps * splitRatio resulted in an > extraordinarily large number, if the code were modified so that the number of > chunks got calculated as Math.min( numMaps * splitRatio, numFiles), then > there would be no need for MAX_CHUNKS_TOLERABLE. 
In this worst-case scenario > where the product of numMaps and splitRatio is large, capping the number of > chunks at the number of files (numberOfChunks = numberOfFiles) would result > in 1 file per chunk - the maximum parallelization possible. That may not be > the best-tuned solution for some users, but I would think that it should be > left up to the user to deal with the potential consequence of not having > tuned their job properly. Certainly that would be better than having an > arbitrary hard-coded limit that *prevents* proper parallelization when > dealing with large files and/or large numbers of mappers. -- This message was sent by Atlassian JIRA (v6.2#6252)
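The capping idea proposed in the thread above can be sketched in a few lines. This is an illustrative sketch only; the class and method names are hypothetical and are not from the DistCp source.

```java
// Illustrative sketch of the proposal above: derive the chunk count as
// Math.min(numMaps * splitRatio, numFiles) instead of enforcing a
// hard-coded MAX_CHUNKS_TOLERABLE. Names here are hypothetical.
public class ChunkCountSketch {
    static int computeNumChunks(int numMaps, int splitRatio, int numFiles) {
        // Cap at the number of files: in the worst case this yields one
        // file per chunk, the maximum parallelization possible.
        return Math.min(numMaps * splitRatio, numFiles);
    }

    public static void main(String[] args) {
        // The reporter's scenario: 128 maps, split ratio 10, ~2800 files.
        // 128 * 10 = 1280 chunks, well under the file count, so no cap applies.
        System.out.println(computeNumChunks(128, 10, 2800));
    }
}
```

With this formula the exception quoted above ("Too many chunks created with splitRatio:10, numMaps:128") would never be needed, since the chunk count can never exceed the file count.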
[jira] [Commented] (MAPREDUCE-3111) Fix log serving in NodeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975656#comment-13975656 ] Mit Desai commented on MAPREDUCE-3111: -- I did not hear anything since last couple of days. I was working on a similar issue YARN-1932 and I don't think this is a problem anymore. Closing this for now. Feel free to reopen this jira if the issue is not resolved. > Fix log serving in NodeManager > -- > > Key: MAPREDUCE-3111 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3111 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: security >Affects Versions: 0.23.0 >Reporter: Luke Lu >Assignee: Omkar Vinit Joshi > Labels: security, webapp > > Just noticed that the current log serving is using the raw writer (instead of > Hamlet) to serve logs without escaping html. > It's actually easier/cleaner to use Hamlet to serve logs: > {code} > pre._(buffer); > {code} > which takes care of content escaping automatically. > I will make raw writer access package private for framework use only. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-3111) Fix log serving in NodeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai resolved MAPREDUCE-3111. -- Resolution: Fixed Fix Version/s: (was: 0.24.0) > Fix log serving in NodeManager > -- > > Key: MAPREDUCE-3111 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3111 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: security >Affects Versions: 0.23.0 >Reporter: Luke Lu >Assignee: Omkar Vinit Joshi > Labels: security, webapp > > Just noticed that the current log serving is using the raw writer (instead of > Hamlet) to serve logs without escaping html. > It's actually easier/cleaner to use Hamlet to serve logs: > {code} > pre._(buffer); > {code} > which takes care of content escaping automatically. > I will make raw writer access package private for framework use only. -- This message was sent by Atlassian JIRA (v6.2#6252)
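The escaping that {{pre._(buffer)}} performs automatically can be illustrated with a minimal stand-in. The helper below is hypothetical, written only to show why serving raw log bytes without escaping is unsafe; it is not Hamlet's actual implementation.

```java
// Minimal illustration of why log content must be HTML-escaped before
// being served. escapeHtml is a hypothetical stand-in for the escaping
// that Hamlet's pre._(buffer) applies automatically; a raw writer skips it.
public class EscapeSketch {
    static String escapeHtml(String s) {
        // '&' must be replaced first so the entities inserted below
        // are not themselves re-escaped.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // A log line containing markup is rendered as text, not executed.
        String log = "<script>alert('xss')</script>";
        System.out.println(escapeHtml(log));
    }
}
```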
[jira] [Commented] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975646#comment-13975646 ] Chen He commented on MAPREDUCE-4339: I will close this issue since it can not be regenerated. Open if necessary. > pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is > included in the setting environment. > - > > Key: MAPREDUCE-4339 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: examples, job submission, mrv2, scheduler >Affects Versions: 0.23.0 > Environment: Ubuntu Server 11.04, Hadoop 0.23.0, >Reporter: srikanth ayalasomayajulu > Labels: hadoop > Fix For: 0.23.0 > > Original Estimate: 48h > Remaining Estimate: 48h > > Tried to include default capacity scheduler in hadoop and tried to run an > example pi program. The job hangs and no more output is getting displayed. > Starting Job > 2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - > Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC > 2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:(95)) - Connecting to ResourceManager at > localhost/127.0.0.1:8030 > 2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC > (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy > for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol > 2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:(99)) - Connected to ResourceManager at > localhost/127.0.0.1:8030 > 2012-06-12 22:10:02,727 WARN conf.Configuration > (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. > Instead, use fs.defaultFS > 2012-06-12 22:10:02,728 WARN conf.Configuration > (Configuration.java:handleDeprecation(343)) - > mapred.used.genericoptionsparser is deprecated. 
Instead, use > mapreduce.client.genericoptionsparser.used > 2012-06-12 22:10:02,831 INFO input.FileInputFormat > (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10 > 2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter > (JobSubmitter.java:submitJobInternal(362)) - number of splits:10 > 2012-06-12 22:10:03,044 INFO mapred.YARNRunner > (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster > capability = memory: 2048 > 2012-06-12 22:10:03,286 INFO mapred.YARNRunner > (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch > container for ApplicationMaster is : $JAVA_HOME/bin/java > -Dlog4j.configuration=container-log4j.properties > -Dyarn.app.mapreduce.container.log.dir= > -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA > -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout > 2>/stderr > 2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application > application_1339507608976_0002 to ResourceManager > 2012-06-12 22:10:03,432 INFO mapreduce.Job > (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002 > 2012-06-12 22:10:04,443 INFO mapreduce.Job > (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0% -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (MAPREDUCE-4339) pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is included in the setting environment.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He resolved MAPREDUCE-4339. Resolution: Cannot Reproduce > pi example job hangs on when run on hadoop 0.23.0 when capacity scheduler is > included in the setting environment. > - > > Key: MAPREDUCE-4339 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4339 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: examples, job submission, mrv2, scheduler >Affects Versions: 0.23.0 > Environment: Ubuntu Server 11.04, Hadoop 0.23.0, >Reporter: srikanth ayalasomayajulu > Labels: hadoop > Fix For: 0.23.0 > > Original Estimate: 48h > Remaining Estimate: 48h > > Tried to include default capacity scheduler in hadoop and tried to run an > example pi program. The job hangs and no more output is getting displayed. > Starting Job > 2012-06-12 22:10:02,524 INFO ipc.YarnRPC (YarnRPC.java:create(47)) - > Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC > 2012-06-12 22:10:02,538 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:(95)) - Connecting to ResourceManager at > localhost/127.0.0.1:8030 > 2012-06-12 22:10:02,539 INFO ipc.HadoopYarnRPC > (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy > for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol > 2012-06-12 22:10:02,665 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:(99)) - Connected to ResourceManager at > localhost/127.0.0.1:8030 > 2012-06-12 22:10:02,727 WARN conf.Configuration > (Configuration.java:handleDeprecation(326)) - fs.default.name is deprecated. > Instead, use fs.defaultFS > 2012-06-12 22:10:02,728 WARN conf.Configuration > (Configuration.java:handleDeprecation(343)) - > mapred.used.genericoptionsparser is deprecated. 
Instead, use > mapreduce.client.genericoptionsparser.used > 2012-06-12 22:10:02,831 INFO input.FileInputFormat > (FileInputFormat.java:listStatus(245)) - Total input paths to process : 10 > 2012-06-12 22:10:02,900 INFO mapreduce.JobSubmitter > (JobSubmitter.java:submitJobInternal(362)) - number of splits:10 > 2012-06-12 22:10:03,044 INFO mapred.YARNRunner > (YARNRunner.java:createApplicationSubmissionContext(279)) - AppMaster > capability = memory: 2048 > 2012-06-12 22:10:03,286 INFO mapred.YARNRunner > (YARNRunner.java:createApplicationSubmissionContext(355)) - Command to launch > container for ApplicationMaster is : $JAVA_HOME/bin/java > -Dlog4j.configuration=container-log4j.properties > -Dyarn.app.mapreduce.container.log.dir= > -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA > -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/stdout > 2>/stderr > 2012-06-12 22:10:03,370 INFO mapred.ResourceMgrDelegate > (ResourceMgrDelegate.java:submitApplication(304)) - Submitted application > application_1339507608976_0002 to ResourceManager > 2012-06-12 22:10:03,432 INFO mapreduce.Job > (Job.java:monitorAndPrintJob(1207)) - Running job: job_1339507608976_0002 > 2012-06-12 22:10:04,443 INFO mapreduce.Job > (Job.java:monitorAndPrintJob(1227)) - map 0% reduce 0% -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975608#comment-13975608 ] Jason Lowe edited comment on MAPREDUCE-5848 at 4/21/14 2:54 PM: bq. On the positive side, the AM should know the containers was on the short-list to be killed from previous preemption messages it received so maybe it could count a failure of a container "doomed" by preemption as a kill? Or simply postpone the decision on FAIL/KILL. Not sure... Yes, the AM should definitely know, and I think the change in the patch is good just not sufficient. As for postponing the decision, we may have to do just that. To resolve the general case of SIGTERM potentially causing failures in the task which should be ignored in light of the kill, the AM may need to wait until it receives the container status from the RM to distinguish the cases. Haven't thought through all of the ramifications of doing that, and I suspect there could be some long delays for some corner cases (e.g.: node fails as task fails, takes the RM a while to expire the node in order to send the container status). was (Author: jlowe): bq. On the positive side, the AM should know the containers was on the short-list to be killed from previous preemption messages it received so maybe it could count a failure of a container "doomed" by preemption as a kill? Or simply postpone the decision on FAIL/KILL. Not sure... Yes, the AM should definitely know, and I think the change in the patch is good just not sufficient. As for postponing the decision, we may have to do just that. To resolve the general case of SIGTERM potentially causing failures in the task which should be ignored in light of the kill, the AM may need to wait until it receives the container status from the RM to distinguish the cases. 
Haven't thought through all of the ramifications of doing that, and I suspect there could be some long delays for some corner cases (e.g.: node fails as task fails, takes the RM a while to expire the node in order to send the container status). > MapReduce counts forcibly preempted containers as FAILED > > > Key: MAPREDUCE-5848 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Carlo Curino >Assignee: Subramaniam Krishnan > Attachments: YARN-1958.patch > > > The MapReduce AM is considering a forcibly preempted container as FAILED, > while I think it should be considered as KILLED (i.e., not count against the > maximum number of failures). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975608#comment-13975608 ] Jason Lowe commented on MAPREDUCE-5848: --- bq. On the positive side, the AM should know the containers was on the short-list to be killed from previous preemption messages it received so maybe it could count a failure of a container "doomed" by preemption as a kill? Or simply postpone the decision on FAIL/KILL. Not sure... Yes, the AM should definitely know, and I think the change in the patch is good just not sufficient. As for postponing the decision, we may have to do just that. To resolve the general case of SIGTERM potentially causing failures in the task which should be ignored in light of the kill, the AM may need to wait until it receives the container status from the RM to distinguish the cases. Haven't thought through all of the ramifications of doing that, and I suspect there could be some long delays for some corner cases (e.g.: node fails as task fails, takes the RM a while to expire the node in order to send the container status). > MapReduce counts forcibly preempted containers as FAILED > > > Key: MAPREDUCE-5848 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Carlo Curino >Assignee: Subramaniam Krishnan > Attachments: YARN-1958.patch > > > The MapReduce AM is considering a forcibly preempted container as FAILED, > while I think it should be considered as KILLED (i.e., not count against the > maximum number of failures). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-5848) MapReduce counts forcibly preempted containers as FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975600#comment-13975600 ] Jason Lowe commented on MAPREDUCE-5848: --- Yes, FileSystem is doing this. See the ClientFinalizer in FileSystem. One example where we've seen this race occur is when tasks start running outside their memory boundaries and the NM kills them. Sometimes the task kills cleanly and the AM gets the useful container status added by the NM explaining why the container was killed. Other times the task ends up failing first due to actions caused by the SIGTERM processing and the useful container status arrives too late. > MapReduce counts forcibly preempted containers as FAILED > > > Key: MAPREDUCE-5848 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5848 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Carlo Curino >Assignee: Subramaniam Krishnan > Attachments: YARN-1958.patch > > > The MapReduce AM is considering a forcibly preempted container as FAILED, > while I think it should be considered as KILLED (i.e., not count against the > maximum number of failures). -- This message was sent by Atlassian JIRA (v6.2#6252)
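The policy the issue argues for, counting a forcibly preempted container as KILLED rather than FAILED, can be sketched as a simple classification step. The enum and method below are illustrative, not the actual MR AM code; the exit-status value mirrors YARN's ContainerExitStatus.PREEMPTED but should be treated as an assumption here.

```java
// Sketch of the policy discussed above: a container that exited because
// of forcible preemption is classified as KILLED (not counted against
// the task's failure limit), while other abnormal exits remain FAILED.
// Names are illustrative, not from the MapReduce ApplicationMaster.
enum TaskAttemptOutcome { FAILED, KILLED }

public class PreemptionPolicySketch {
    // Assumed to match YARN's ContainerExitStatus.PREEMPTED.
    static final int PREEMPTED_EXIT_STATUS = -102;

    // Classifies an abnormal (non-zero) container exit.
    static TaskAttemptOutcome classifyAbnormalExit(int containerExitStatus) {
        return containerExitStatus == PREEMPTED_EXIT_STATUS
                ? TaskAttemptOutcome.KILLED   // preemption: do not penalize the job
                : TaskAttemptOutcome.FAILED;  // genuine failure
    }
}
```

The race described in the comments remains: if the SIGTERM makes the task fail before the container status (carrying the preemption exit code) reaches the AM, a classification based solely on exit status arrives too late, which is why the thread discusses postponing the FAIL/KILL decision.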
[jira] [Updated] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Kim updated MAPREDUCE-4718: Target Version/s: 0.23.3, 1.0.3 (was: 1.0.3, 0.23.3, 2.0.0-alpha, 2.0.1-alpha) > MapReduce fails If I pass a parameter as a S3 folder > > > Key: MAPREDUCE-4718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: job submission >Affects Versions: 1.0.0, 1.0.3 > Environment: Hadoop with default configurations >Reporter: Benjamin Kim > > I'm running a wordcount MR as follows > hadoop jar WordCount.jar wordcount.WordCountDriver > s3n://bucket/wordcount/input s3n://bucket/wordcount/output > > s3n://bucket/wordcount/input is a s3 object that contains other input files. > However I get following NPE error > 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% > 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% > 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : > attempt_201210021853_0001_m_01_0, Status : FAILED > java.lang.NullPointerException > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) > at java.io.BufferedInputStream.close(BufferedInputStream.java:451) > at java.io.FilterInputStream.close(FilterInputStream.java:155) > at org.apache.hadoop.util.LineReader.close(LineReader.java:83) > at > org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at 
org.apache.hadoop.mapred.Child.main(Child.java:249) > MR runs fine if i specify more specific input path such as > s3n://bucket/wordcount/input/file.txt > MR fails if I pass s3 folder as a parameter > In summary, > This works > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ > This doesn't work > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ > (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
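The NullPointerException in the stack trace above is thrown from NativeS3FsInputStream.close, which suggests close() dereferences an inner stream that was never initialized. A generic defensive pattern for such wrappers looks like the sketch below; this is an illustration of the pattern only, not the actual NativeS3FileSystem code or its eventual fix.

```java
import java.io.IOException;
import java.io.InputStream;

// Generic null-guarded, idempotent close() for a stream wrapper, shown
// to illustrate how the NPE above can be avoided. This class is
// hypothetical; it is not the NativeS3FsInputStream implementation.
public class GuardedStream {
    private InputStream in; // may be null if the underlying open never succeeded

    public GuardedStream(InputStream in) {
        this.in = in;
    }

    // Safe even when 'in' is null or close() is called twice,
    // unlike an unconditional in.close().
    public void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
        }
    }
}
```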
[jira] [Assigned] (MAPREDUCE-3677) If "hadoop.security.authorization" is set to true, NM is not starting.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He reassigned MAPREDUCE-3677: -- Assignee: Chen He > If "hadoop.security.authorization" is set to true, NM is not starting. > -- > > Key: MAPREDUCE-3677 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3677 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: nodemanager >Affects Versions: 0.23.0 >Reporter: Ramgopal N >Assignee: Chen He > > I have the hadoop cluster setup with root user.Accidentally i have set > hadoop.security.authorization to true.I have not set any permissions in > policy.xml.When i am trying to start the NM with root user ...it is throwing > the following error > Exception in thread "main" java.lang.NoClassDefFoundError: nodemanager > Caused by: java.lang.ClassNotFoundException: nodemanager > at java.net.URLClassLoader$1.run(URLClassLoader.java:200) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:188) > at java.lang.ClassLoader.loadClass(ClassLoader.java:303) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) > at java.lang.ClassLoader.loadClass(ClassLoader.java:248) > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316) > Could not find the main class: nodemanager. Program will exit. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975570#comment-13975570 ] Chen He commented on MAPREDUCE-4718: Or close it if it is not a problem for 1.x either. > MapReduce fails If I pass a parameter as a S3 folder > > > Key: MAPREDUCE-4718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: job submission >Affects Versions: 1.0.0, 1.0.3 > Environment: Hadoop with default configurations >Reporter: Benjamin Kim > > I'm running a wordcount MR as follows > hadoop jar WordCount.jar wordcount.WordCountDriver > s3n://bucket/wordcount/input s3n://bucket/wordcount/output > > s3n://bucket/wordcount/input is a s3 object that contains other input files. > However I get following NPE error > 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% > 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% > 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : > attempt_201210021853_0001_m_01_0, Status : FAILED > java.lang.NullPointerException > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) > at java.io.BufferedInputStream.close(BufferedInputStream.java:451) > at java.io.FilterInputStream.close(FilterInputStream.java:155) > at org.apache.hadoop.util.LineReader.close(LineReader.java:83) > at > org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > MR runs fine if i specify more specific input path such as > s3n://bucket/wordcount/input/file.txt > MR fails if I pass s3 folder as a parameter > In summary, > This works > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ > This doesn't work > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ > (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAPREDUCE-4718) MapReduce fails If I pass a parameter as a S3 folder
[ https://issues.apache.org/jira/browse/MAPREDUCE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975567#comment-13975567 ] Chen He commented on MAPREDUCE-4718: Hi [~benkimkimben] Thank you for the reply. Since it is not a problem for 2.x, would you mind remove 2.x from the target version? > MapReduce fails If I pass a parameter as a S3 folder > > > Key: MAPREDUCE-4718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4718 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: job submission >Affects Versions: 1.0.0, 1.0.3 > Environment: Hadoop with default configurations >Reporter: Benjamin Kim > > I'm running a wordcount MR as follows > hadoop jar WordCount.jar wordcount.WordCountDriver > s3n://bucket/wordcount/input s3n://bucket/wordcount/output > > s3n://bucket/wordcount/input is a s3 object that contains other input files. > However I get following NPE error > 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% > 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% > 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : > attempt_201210021853_0001_m_01_0, Status : FAILED > java.lang.NullPointerException > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) > at java.io.BufferedInputStream.close(BufferedInputStream.java:451) > at java.io.FilterInputStream.close(FilterInputStream.java:155) > at org.apache.hadoop.util.LineReader.close(LineReader.java:83) > at > org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > MR runs fine if i specify more specific input path such as > s3n://bucket/wordcount/input/file.txt > MR fails if I pass s3 folder as a parameter > In summary, > This works > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ > This doesn't work > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ > (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)