[jira] [Created] (MAPREDUCE-5120) Allow app master to use tracing async dispatcher
Sandy Ryza created MAPREDUCE-5120:

Summary: Allow app master to use tracing async dispatcher
Key: MAPREDUCE-5120
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5120
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza

YARN-366 proposes an option to add traces to events so that exceptions can report an event's lineage. This JIRA would add a MapReduce config option that allows the MR app master to use the tracing async dispatcher as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
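To make the idea of an event "lineage" concrete, here is a minimal sketch. It is not the YARN-366 design or the Hadoop dispatcher API: the class `TracedEvent`, its fields, and the event names are all hypothetical, and the sketch assumes only that each event remembers the event whose handling created it, plus the stack at creation time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Hadoop's AsyncDispatcher or the YARN-366 patch.
final class TracedEvent {
    final String name;
    final TracedEvent parent;              // event whose handling created this one, or null
    final StackTraceElement[] createdAt;   // where this event was constructed

    TracedEvent(String name, TracedEvent parent) {
        this.name = name;
        this.parent = parent;
        this.createdAt = new Throwable().getStackTrace();
    }

    /** Event names from the root event down to this one. */
    List<String> lineage() {
        List<String> names = new ArrayList<>();
        for (TracedEvent e = this; e != null; e = e.parent) {
            names.add(0, e.name);
        }
        return names;
    }

    public static void main(String[] args) {
        TracedEvent init = new TracedEvent("JOB_INIT", null);
        TracedEvent schedule = new TracedEvent("TASK_SCHEDULE", init);
        TracedEvent launch = new TracedEvent("ATTEMPT_LAUNCH", schedule);
        // A handler that fails on 'launch' could attach this chain to its exception.
        System.out.println(launch.lineage());
    }
}
```

Capturing a stack trace per event has a cost, which is presumably why the issue frames tracing as an opt-in config option.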
[jira] [Commented] (MAPREDUCE-3688) Need better Error message if AM is killed/throws exception
[ https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618238#comment-13618238 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3688:

Ravi, I reopened MAPREDUCE-3949 and assigned it to you. Can you please add info and perhaps work on it?

> Need better Error message if AM is killed/throws exception
>
> Key: MAPREDUCE-3688
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am, mrv2
> Affects Versions: 0.23.1
> Reporter: David Capwell
> Assignee: Sandy Ryza
> Fix For: 0.23.2
> Attachments: mapreduce-3688-h0.23-v01.patch, mapreduce-3688-h0.23-v02.patch
>
> We need better error messages in the UI if the AM gets killed or throws an Exception.
> If the following error gets thrown:
> java.lang.NumberFormatException: For input string: "9223372036854775807l" // last char is an L
> then the UI should show this exception. Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container for appattempt_1326504761991_0018_01 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException
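The exception quoted in the issue is easy to reproduce outside Hadoop. A minimal sketch, assuming a plain `Long.parseLong` call is what rejects the value (the report does not show the actual call site); the class and method names here are illustrative only.

```java
// Illustrative only: not the AM's actual parsing code.
final class TrailingLDemo {
    /** Returns the parse failure message, or null if the input parses as a long. */
    static String parseFailure(String raw) {
        try {
            Long.parseLong(raw);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE itself parses; the trailing 'l' is what gets rejected.
        System.out.println(parseFailure("9223372036854775807"));
        System.out.println(parseFailure("9223372036854775807l"));
    }
}
```

The point of the issue is that this message, which pinpoints the bad input, never reaches the UI; the user sees only the generic container exit code.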
[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618236#comment-13618236 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:

[~raviprak] says [on MAPREDUCE-3688|https://issues.apache.org/jira/browse/MAPREDUCE-3688?focusedCommentId=13606901&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606901]:

bq. From my testing on trunk, I notice that even for the case where the AM goes over container limits (which I trigger with -Dyarn.app.mapreduce.am.resource.mb=512 -Dyarn.app.mapreduce.am.command-opts="-Xmx3500m" on a sleep job), sometimes the error is propagated back and sometimes it's not. Can you please corroborate this? When State == FinalState == FAILED, the error is propagated back. However, about half the time, State == FINISHED and FinalState == KILLED, in which case there is no message anywhere to help me. Not in the diagnostics, and there are no logs.

> If AM fails due to overrunning resource limits, error not visible through UI sometimes
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.24.0, 0.23.2
> Reporter: Todd Lipcon
> Assignee: Ravi Prakash
> Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
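The reproduction in the quoted comment works because the AM's maximum heap (`-Xmx3500m`) is far larger than the container it is given (512 MB). A minimal sketch of that comparison follows. The helper class `AmHeapCheck` is hypothetical and is not part of Hadoop; it assumes the heap is set with a single `-Xmx` flag, and it only shows that an overrun is possible, not that the container will actually be killed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Hadoop class: compares -Xmx against the container size.
final class AmHeapCheck {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in MB (rounded down), or -1 if no -Xmx flag is present. */
    static long heapMb(String commandOpts) {
        Matcher m = XMX.matcher(commandOpts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "g": return value * 1024;
            case "m": return value;
            case "k": return value / 1024;
            default:  return value / (1024 * 1024); // no suffix means bytes
        }
    }

    /** True when the configured maximum heap alone is larger than the container. */
    static boolean heapExceedsContainer(int containerMb, String commandOpts) {
        return heapMb(commandOpts) > containerMb;
    }

    public static void main(String[] args) {
        // Values taken from the comment: resource.mb=512, command-opts="-Xmx3500m".
        System.out.println(heapExceedsContainer(512, "-Xmx3500m"));
    }
}
```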
[jira] [Updated] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3949:

Summary: If AM fails due to overrunning resource limits, error not visible through UI sometimes (was: If AM fails due to overrunning resource limits, error not visible through UI)

> If AM fails due to overrunning resource limits, error not visible through UI sometimes
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.24.0, 0.23.2
> Reporter: Todd Lipcon
> Assignee: Ravi Prakash
> Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Reopened] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-3949:

Assignee: Ravi Prakash

Just caught up with all the discussion at MAPREDUCE-3688; it seems this isn't fixed completely. Reopening.

> If AM fails due to overrunning resource limits, error not visible through UI
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.24.0, 0.23.2
> Reporter: Todd Lipcon
> Assignee: Ravi Prakash
> Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI
[ https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618219#comment-13618219 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:

It wasn't clear, so commenting: this was closed as a duplicate of MAPREDUCE-3688.

> If AM fails due to overrunning resource limits, error not visible through UI
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.24.0, 0.23.2
> Reporter: Todd Lipcon
> Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused the AM's container to get killed, but nowhere accessible through the web UI showed these diagnostics. I had to go view the NM's logs via ssh before I could figure out what had happened to my application.
[jira] [Commented] (MAPREDUCE-4875) coverage fixing for org.apache.hadoop.mapred
[ https://issues.apache.org/jira/browse/MAPREDUCE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618063#comment-13618063 ]

Hudson commented on MAPREDUCE-4875:

Integrated in Hadoop-Hdfs-0.23-Build #568 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/568/])
MAPREDUCE-4875. coverage fixing for org.apache.hadoop.mapred (Aleksey Gorshkov via bobby) (Revision 1462527)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1462527
Files :
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobEndNotifier.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/QueueManager.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskStatus.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestClock.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobConf.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobInfo.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestOldMethodsJobID.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestQueue.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestSkipBadRecords.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLogAppender.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/mapred-queues.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestIFile.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMultiFileSplit.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestNetworkedJob.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestQueueConfigurationParser.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestStatisticsCollector.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextOutputFormat.java

> coverage fixing for org.apache.hadoop.mapred
>
> Key: MAPREDUCE-4875
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4875
> Project: Hadoop Map/Reduce
> Issue Type: Test
> Components: test
> Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
> Reporter: Aleksey Gorshkov
> Assignee: Aleksey Gorshkov
> Fix For: 3.0.0, 0.23.7, 2.0.5-beta
> Attachments: MAPREDUCE-4875-branch-0.23-b.patch, MAPREDUCE-4875-branch-0.23.patc
[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618011#comment-13618011 ]

Sergey Tryuber commented on MAPREDUCE-3859:

What I know is that the bug is still present in CDH4.1 MR1, so we had to patch the Capacity Scheduler there as well...

> CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
>
> Key: MAPREDUCE-3859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: capacity-sched
> Affects Versions: 1.0.0
> Environment: CDH3u1
> Reporter: Sergey Tryuber
> Attachments: test-to-fail.patch.txt
>
> Imagine we have a queue A with a capacity of 10 slots and 20 as extra-capacity: jobs which use 3 map slots will never consume more than 9 slots, regardless of how many free slots there are on the cluster.
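The 9-slot figure in the report follows from simple integer arithmetic. The sketch below models only that arithmetic, under the assumption that a task is admitted while the queue's used slots plus the task's slots stay within some limit; it is not the CapacityScheduler's code, and the class and method names are hypothetical.

```java
// Hypothetical model of the arithmetic in the report; not CapacityScheduler code.
final class SlotMath {
    /** Slots in use once no further task of the given size fits under the limit. */
    static int maxSlotsUsed(int limit, int slotsPerTask) {
        int used = 0;
        while (used + slotsPerTask <= limit) {
            used += slotsPerTask;   // admit one more task
        }
        return used;
    }

    public static void main(String[] args) {
        // Limit fixed at the queue's base capacity of 10, with 3 slots per task.
        System.out.println(maxSlotsUsed(10, 3));
        // The same job if the 20-slot figure were the effective ceiling instead.
        System.out.println(maxSlotsUsed(20, 3));
    }
}
```

With the limit pinned at 10, a fourth 3-slot task would need 12 slots, so usage stops at 9. That matches the behaviour the reporter describes when the queue's extra capacity is never taken into account.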