[jira] [Created] (MAPREDUCE-5120) Allow app master to use tracing async dispatcher

2013-03-30 Thread Sandy Ryza (JIRA)
Sandy Ryza created MAPREDUCE-5120:
-

 Summary: Allow app master to use tracing async dispatcher
 Key: MAPREDUCE-5120
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5120
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


YARN-366 proposes an option to add traces to events so that exceptions can 
report an event's lineage.  This JIRA would add a MapReduce config option that 
allows the MR app master to use the tracing async dispatcher as well. 
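
To make the idea of event lineage concrete, here is a minimal sketch in plain 
Java. It is not the YARN-366 patch or any MR app master code: the class 
TracedEvent, its methods, and the event names in the usage note are all 
hypothetical and chosen only for illustration. Each event keeps a reference to 
the event whose handling produced it, so a failure can report the whole chain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: not the YARN-366 or MR app master code.
final class TracedEvent {
    private final String name;
    // Event whose handling produced this one; null for a root event.
    private final TracedEvent cause;

    TracedEvent(String name, TracedEvent cause) {
        this.name = name;
        this.cause = cause;
    }

    // Returns the chain of event names from the root event down to this one.
    List<String> lineage() {
        List<String> chain = new ArrayList<>();
        for (TracedEvent e = this; e != null; e = e.cause) {
            chain.add(e.name);
        }
        Collections.reverse(chain);
        return chain;
    }

    // Wraps a handler failure so the exception message carries the lineage.
    RuntimeException failure(Throwable t) {
        return new RuntimeException(
            "Error handling " + name + "; lineage: " + String.join(" -> ", lineage()), t);
    }
}
```

With made-up events A, B (caused by A) and C (caused by B), calling 
C.failure(...) yields a message ending in "lineage: A -> B -> C". A real 
dispatcher would have to decide where the parent link is recorded and how much 
overhead is acceptable, which is presumably why the feature is proposed as an 
option.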

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-3688) Need better Error message if AM is killed/throws exception

2013-03-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618238#comment-13618238
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-3688:


Ravi, I reopened MAPREDUCE-3949 and assigned it to you. Can you please add info 
there and perhaps work on it?

> Need better Error message if AM is killed/throws exception
> --
>
> Key: MAPREDUCE-3688
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mr-am, mrv2
>Affects Versions: 0.23.1
>Reporter: David Capwell
>Assignee: Sandy Ryza
> Fix For: 0.23.2
>
> Attachments: mapreduce-3688-h0.23-v01.patch, 
> mapreduce-3688-h0.23-v02.patch
>
>
> We need better error messages in the UI if the AM gets killed or throws an 
> Exception.
> If the following error gets thrown: 
> java.lang.NumberFormatException: For input string: "9223372036854775807l" // 
> last char is an L
> then the UI should display this exception.  Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container 
> for appattempt_1326504761991_0018_01
> exited with exitCode: 1 due to: Exception from container-launch: 
> org.apache.hadoop.util.Shell$ExitCodeException
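
The exception quoted in the description can be reproduced with the JDK alone. 
The sketch below is only an illustration of the parse failure, not code from 
the AM or from the attached patches; the class and method names are 
hypothetical. A trailing "l" is valid in a Java long literal in source code but 
is rejected by Long.parseLong.

```java
// Illustration only: reproduces the parse failure using just the JDK.
final class TrailingLDemo {
    // Returns the exception message, or null if the string parses as a long.
    static String parseError(String s) {
        try {
            Long.parseLong(s);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE followed by the letter 'l': rejected by the parser.
        System.out.println(parseError("9223372036854775807l"));
        // The same digits without the suffix parse successfully.
        System.out.println(parseError("9223372036854775807"));
    }
}
```

The first call returns the exception message, which includes the offending 
input string; the second returns null because the digits alone fit in a long. 
This is the kind of message the reporter would like to see surfaced in the UI 
instead of the generic ExitCodeException.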



[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-03-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618236#comment-13618236
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:


[~raviprak] says [on 
MAPREDUCE-3688|https://issues.apache.org/jira/browse/MAPREDUCE-3688?focusedCommentId=13606901&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13606901]:
bq. From my testing on trunk, I notice that even for the case where the AM goes 
over container limits (which I trigger with 
-Dyarn.app.mapreduce.am.resource.mb=512 
-Dyarn.app.mapreduce.am.command-opts="-Xmx3500m" on a sleep job), sometimes the 
error is propagated back and sometimes it's not. Can you please corroborate 
this? When State == FinalState == FAILED, the error is propagated back. However, 
about half the time, State == FINISHED and FinalState == KILLED, in which case 
there is no message anywhere to help me: not in the diagnostics, and there are 
no logs.

> If AM fails due to overrunning resource limits, error not visible through UI 
> sometimes
> --
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.23.2
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.



[jira] [Updated] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI sometimes

2013-03-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-3949:
---

Summary: If AM fails due to overrunning resource limits, error not visible 
through UI sometimes  (was: If AM fails due to overrunning resource limits, 
error not visible through UI)




[jira] [Reopened] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI

2013-03-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-3949:


  Assignee: Ravi Prakash

Just caught up with all the discussion on MAPREDUCE-3688; it seems this isn't 
fixed completely. Reopening.

> If AM fails due to overrunning resource limits, error not visible through UI
> 
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.23.2
>Reporter: Todd Lipcon
>Assignee: Ravi Prakash
>Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.



[jira] [Commented] (MAPREDUCE-3949) If AM fails due to overrunning resource limits, error not visible through UI

2013-03-30 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618219#comment-13618219
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-3949:


This wasn't clear, so commenting: this is closed as a duplicate of MAPREDUCE-3688.

> If AM fails due to overrunning resource limits, error not visible through UI
> 
>
> Key: MAPREDUCE-3949
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3949
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.23.2
>Reporter: Todd Lipcon
>Priority: Minor
>
> I had a case where an MR AM eclipsed the configured memory limit. This caused 
> the AM's container to get killed, but nowhere accessible through the web UI 
> showed these diagnostics. I had to go view the NM's logs via ssh before I 
> could figure out what had happened to my application.



[jira] [Commented] (MAPREDUCE-4875) coverage fixing for org.apache.hadoop.mapred

2013-03-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618063#comment-13618063
 ] 

Hudson commented on MAPREDUCE-4875:
---

Integrated in Hadoop-Hdfs-0.23-Build #568 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/568/])
MAPREDUCE-4875. coverage fixing for org.apache.hadoop.mapred (Aleksey 
Gorshkov via bobby) (Revision 1462527)

 Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1462527
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapred/TestTaskAttemptListenerImpl.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobEndNotifier.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/QueueManager.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskStatus.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestClock.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobConf.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestJobInfo.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestOldMethodsJobID.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestQueue.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestSkipBadRecords.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLog.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapred/TestTaskLogAppender.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/resources/mapred-queues.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestIFile.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestMultiFileSplit.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestNetworkedJob.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestQueueConfigurationParser.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestStatisticsCollector.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextInputFormat.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestTextOutputFormat.java


> coverage fixing for org.apache.hadoop.mapred
> 
>
> Key: MAPREDUCE-4875
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4875
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Aleksey Gorshkov
>Assignee: Aleksey Gorshkov
> Fix For: 3.0.0, 0.23.7, 2.0.5-beta
>
> Attachments: MAPREDUCE-4875-branch-0.23-b.patch, 
> MAPREDUCE-4875-branch-0.23.patc

[jira] [Commented] (MAPREDUCE-3859) CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs

2013-03-30 Thread Sergey Tryuber (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618011#comment-13618011
 ] 

Sergey Tryuber commented on MAPREDUCE-3859:
---

As far as I know, the bug is still present in CDH4.1 MR1, so we had to patch 
the Capacity Scheduler there as well.

> CapacityScheduler incorrectly utilizes extra-resources of queue for 
> high-memory jobs
> 
>
> Key: MAPREDUCE-3859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3859
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: capacity-sched
>Affects Versions: 1.0.0
> Environment: CDH3u1
>Reporter: Sergey Tryuber
> Attachments: test-to-fail.patch.txt
>
>
> Imagine we have a queue A with a capacity of 10 slots and 20 as 
> extra-capacity. Jobs which use 3 map slots will never consume more than 9 
> slots, regardless of how many free slots there are on the cluster.
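
The numbers in the description can be checked with a small model. The sketch 
below is not CapacityScheduler code, and it rests on two assumptions that the 
report does not state: that each task of such a job occupies 3 slots at once, 
and that a task is admitted only while the queue's usage stays within some 
limit. The class and method names are hypothetical.

```java
// Hypothetical model of slot admission; not CapacityScheduler code.
final class SlotLimitSketch {
    // Largest number of slots that can be in use if a task needing
    // slotsPerTask slots is admitted only while usage stays within limit.
    static int maxUsableSlots(int limit, int slotsPerTask) {
        int used = 0;
        while (used + slotsPerTask <= limit) {
            used += slotsPerTask;
        }
        return used;
    }

    public static void main(String[] args) {
        // Checked against the base capacity of 10 slots.
        System.out.println(maxUsableSlots(10, 3));
        // Checked against a larger limit, here assumed to be 10 + 20 slots.
        System.out.println(maxUsableSlots(30, 3));
    }
}
```

Under these assumptions a limit of 10 admits three tasks (3 + 3 + 3 = 9 slots) 
and rejects the fourth, which would need 12. That matches the reported ceiling 
of 9 and would be consistent with the check being made against the base 
capacity rather than a limit that includes the extra capacity, though the 
report itself does not confirm the cause.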
