[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618011#comment-13618011 ]

Sergey Tryuber commented on MAPREDUCE-3859:
---
What I know is that the bug is still present in CDH4.1 MR1, so we had to patch the Capacity Scheduler there as well.

CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
Key: MAPREDUCE-3859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: capacity-sched
Affects Versions: 1.0.0
Environment: CDH3u1
Reporter: Sergey Tryuber
Attachments: test-to-fail.patch.txt

Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity. Jobs which use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
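The 9-slot ceiling in the MAPREDUCE-3859 description is what plain integer arithmetic gives if tasks of 3 slots are only admitted while they fit under the queue's base capacity of 10. The sketch below shows only that arithmetic. It is not the CapacityScheduler code, the class and method names are hypothetical, and the idea that the base capacity is the limit being applied is an assumption made for illustration.

```java
// Illustrative arithmetic only; this is NOT the CapacityScheduler implementation.
// SlotMath and maxSlotsUsed are hypothetical names.
class SlotMath {
    /**
     * Largest number of slots that equally sized tasks can occupy under a limit,
     * assuming a task is admitted only if it fits entirely under that limit.
     */
    static int maxSlotsUsed(int slotLimit, int slotsPerTask) {
        if (slotLimit < 0 || slotsPerTask <= 0) {
            throw new IllegalArgumentException("limit must be >= 0 and task size > 0");
        }
        // Integer division drops the remainder that no whole task can fill.
        return (slotLimit / slotsPerTask) * slotsPerTask;
    }

    public static void main(String[] args) {
        int baseCapacity = 10; // queue A capacity from the report
        int slotsPerTask = 3;  // map slots per task from the report
        System.out.println(maxSlotsUsed(baseCapacity, slotsPerTask)); // prints 9
    }
}
```

With a limit of 10 and 3 slots per task the result is 9, which matches the reported behaviour. A limit that included any of the extra capacity would allow at least one more task.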
[jira] [Commented] (MAPREDUCE-4875) coverage fixing for org.apache.hadoop.mapred
[ https://issues.apache.org/jira/browse/MAPREDUCE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618063#comment-13618063 ]

Hudson commented on MAPREDUCE-4875:
---
Integrated in Hadoop-Hdfs-0.23-Build #568 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/568/])
MAPREDUCE-4875. coverage fixing for org.apache.hadoop.mapred (Aleksey Gorshkov via bobby) (Revision 1462527)
Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1462527
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobEndNotifier.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/QueueManager.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskStatus.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestClock.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobConf.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobInfo.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestOldMethodsJobID.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestQueue.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestSkipBadRecords.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLogAppender.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/mapred-queues.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestIFile.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMultiFileSplit.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestNetworkedJob.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestQueueConfigurationParser.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestStatisticsCollector.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextOutputFormat.java

coverage fixing for org.apache.hadoop.mapred
Key: MAPREDUCE-4875
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4875
Project: Hadoop Map/Reduce
Issue Type: Test
Components: test
Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
Reporter: Aleksey Gorshkov
Assignee: Aleksey Gorshkov
Fix For: 3.0.0, 0.23.7, 2.0.5-beta
Attachments: MAPREDUCE-4875-branch-0.23-b.patch, MAPREDUCE-4875-branch-0.23.patch,
[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618219#comment-13618219 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:
Wasn't clear, so commenting: This is closed as duplicate of MAPREDUCE-3688.

If AM fails due to overrunning resource limits, error not visible through UI
Key: MAPREDUCE-3949
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.24.0, 0.23.2
Reporter: Todd Lipcon
Priority: Minor

I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Reopened] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-3949:
Assignee: Ravi Prakash

Just caught up with all the discussion at MAPREDUCE-3688; it seems like this isn't fixed completely. Reopening.

If AM fails due to overrunning resource limits, error not visible through UI
Key: MAPREDUCE-3949
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.24.0, 0.23.2
Reporter: Todd Lipcon
Assignee: Ravi Prakash
Priority: Minor

I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Updated] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3949:
---
Summary: If AM fails due to overrunning resource limits, error not visible through UI sometimes (was: If AM fails due to overrunning resource limits, error not visible through UI)

If AM fails due to overrunning resource limits, error not visible through UI sometimes
--
Key: MAPREDUCE-3949
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.24.0, 0.23.2
Reporter: Todd Lipcon
Assignee: Ravi Prakash
Priority: Minor

I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618236#comment-13618236 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:
[~raviprak] says [on MAPREDUCE-3688|https://issues.apache.org/jira/browse/MAPREDUCE-3688?focusedCommentId=13606901&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606901]:

bq. From my testing on trunk, I notice that even for the case where the AM goes over container limits (which I trigger with -Dyarn.app.mapreduce.am.resource.mb=512 -Dyarn.app.mapreduce.am.command-opts=-Xmx3500m on a sleep job), sometimes the error is propagated back and sometimes it's not. Can you please corroborate this? When State == FinalState == FAILED, the error is propagated back. However, about half the time, State == FINISHED and FinalState == KILLED, in which case there is no message anywhere to help me. Not in the diagnostics, and there are no logs.

If AM fails due to overrunning resource limits, error not visible through UI sometimes
--
Key: MAPREDUCE-3949
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.24.0, 0.23.2
Reporter: Todd Lipcon
Assignee: Ravi Prakash
Priority: Minor

I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Commented] (MAPREDUCE-3688) Need better Error message if AM is killed/throws exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618238#comment-13618238 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3688:
Ravi, I reopened MAPREDUCE-3949 and assigned it to you. Can you please add info and perhaps work on it?

Need better Error message if AM is killed/throws exception
--
Key: MAPREDUCE-3688
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am, mrv2
Affects Versions: 0.23.1
Reporter: David Capwell
Assignee: Sandy Ryza
Fix For: 0.23.2
Attachments: mapreduce-3688-h0.23-v01.patch, mapreduce-3688-h0.23-v02.patch

We need better error messages in the UI if the AM gets killed or throws an Exception. If the following error gets thrown:

java.lang.NumberFormatException: For input string: 9223372036854775807l // last char is an L

then the UI should show this exception. Instead I get the following:

Application application_1326504761991_0018 failed 1 times due to AM Container for appattempt_1326504761991_0018_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException
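For readers wondering why the trailing L in the MAPREDUCE-3688 example matters: an `l` or `L` suffix is valid on a long literal in Java source, but `Long.parseLong` rejects it in a string. The sketch below reproduces only that parsing behaviour. The class and method names are hypothetical, and nothing here claims to show where the value was parsed in Hadoop.

```java
// Minimal reproduction of the parse failure quoted in the issue description.
// LongSuffixParse and describe are hypothetical names used for illustration.
class LongSuffixParse {
    /** Returns a short description of what Long.parseLong does with the text. */
    static String describe(String text) {
        try {
            return "parsed " + Long.parseLong(text);
        } catch (NumberFormatException e) {
            // The exception message carries the offending input string.
            return "NumberFormatException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE as plain digits parses successfully.
        System.out.println(describe("9223372036854775807"));
        // The same digits with a trailing 'l' are rejected.
        System.out.println(describe("9223372036854775807l"));
    }
}
```

The exception message names the offending input, which is the detail the reporter wanted surfaced in the UI in place of the generic exit-code text.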
[jira] [Created] (MAPREDUCE-5120) Allow app master to use tracing async dispatcher
Sandy Ryza created MAPREDUCE-5120:
-
Summary: Allow app master to use tracing async dispatcher
Key: MAPREDUCE-5120
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5120
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

YARN-366 proposes an option to add traces to events so that exceptions can report an event's lineage. This JIRA would add a MapReduce config option that would allow the MR app master to use the tracing async dispatcher as well.
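As a rough illustration of what "adding traces to events" could mean, the sketch below records the stack trace at the point an event is enqueued and attaches it to any exception the handler later throws. This is not the YARN-366 patch or any Hadoop API. Every name in it is hypothetical, and it drains the queue on the calling thread for brevity, whereas a real async dispatcher would hand events to a separate thread.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical sketch; not the YARN dispatcher and not the YARN-366 design.
class TracingDispatcherSketch {
    /** An event that remembers the call site where it was enqueued. */
    static final class TracedEvent {
        final String name;
        final Throwable origin; // carries the enqueue-time stack trace

        TracedEvent(String name) {
            this.name = name;
            this.origin = new Throwable("event " + name + " enqueued here");
        }
    }

    private final Queue<TracedEvent> queue = new ArrayDeque<>();

    void enqueue(String name) {
        queue.add(new TracedEvent(name));
    }

    /** Handles queued events in order on the calling thread. */
    void drain(Consumer<String> handler) {
        TracedEvent event;
        while ((event = queue.poll()) != null) {
            try {
                handler.accept(event.name);
            } catch (RuntimeException failure) {
                // Attach the enqueue site so the failure shows where the event came from.
                failure.addSuppressed(event.origin);
                throw failure;
            }
        }
    }

    public static void main(String[] args) {
        TracingDispatcherSketch dispatcher = new TracingDispatcherSketch();
        dispatcher.enqueue("JOB_INIT");
        try {
            dispatcher.drain(name -> {
                throw new IllegalStateException("handler failed on " + name);
            });
        } catch (IllegalStateException failure) {
            // Prints the handler failure followed by the suppressed enqueue-site trace.
            failure.printStackTrace();
        }
    }
}
```

Capturing a stack trace for every event has a cost, which is presumably why the proposal frames tracing as an option and not as the default.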