[jira] [Updated] (MAPREDUCE-6641) TestTaskAttempt fails in trunk

2016-02-20 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated MAPREDUCE-6641:
--
Attachment: org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt-output.txt

Attaching a log.

> TestTaskAttempt fails in trunk
> --
>
> Key: MAPREDUCE-6641
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6641
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Tsuyoshi Ozawa
> Attachments: org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt-output.txt
>
>
> {code}
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.917 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> testMRAppHistoryForTAFailedInAssigned(org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt)  Time elapsed: 12.732 sec  <<< FAILURE!
> java.lang.AssertionError: No Ta Started JH Event
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6641) TestTaskAttempt fails in trunk

2016-02-20 Thread Tsuyoshi Ozawa (JIRA)
Tsuyoshi Ozawa created MAPREDUCE-6641:
-

 Summary: TestTaskAttempt fails in trunk
 Key: MAPREDUCE-6641
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6641
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Tsuyoshi Ozawa


{code}
Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.917 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
testMRAppHistoryForTAFailedInAssigned(org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt)  Time elapsed: 12.732 sec  <<< FAILURE!
java.lang.AssertionError: No Ta Started JH Event
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)
{code}





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2016-02-20 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-5044:
--
Target Version/s: 2.8.0, 2.7.3  (was: 2.8.0)
  Status: Patch Available  (was: Open)

> Have AM trigger jstack on task attempts that timeout before killing them
> 
>
> Key: MAPREDUCE-5044
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am
>Affects Versions: 2.1.0-beta
>Reporter: Jason Lowe
>Assignee: Gera Shegalov
> Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, 
> MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, 
> MAPREDUCE-5044.v06.patch, MAPREDUCE-5044.v07.local.patch, Screen Shot 
> 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png
>
>
> When an AM expires a task attempt it would be nice if it triggered a jstack 
> output via SIGQUIT before killing the task attempt.  This would be invaluable 
> for helping users debug their hung tasks, especially if they do not have 
> shell access to the nodes.





[jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them

2016-02-20 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-5044:
--
Attachment: MAPREDUCE-5044.v07.local.patch

Thanks, [~jira.shegalov], for all of the work already done on this JIRA.

I have upmerged the latest patch and integrated it with the 
{{SignalContainerRequest}} that was added as part of YARN-445 and its children.

[~mingma], [~xgong], [~jlowe], [~jira.shegalov], would you please take a look?

I would like to see the functionality in this JIRA implemented. We occasionally 
see containers time out, and it would help if users got direct feedback in the 
form of a jstack so they can debug their applications.

IIUC, YARN-445 and its children put in place the infrastructure for a {{Client 
-> RM -> NM -> Container}} signal path. However, in order to automatically dump 
the jstack when a container times out, we still need an {{AM -> NM -> 
Container}} signal path. This JIRA (MAPREDUCE-5044 along with YARN-1515) adds 
this signal path along with the ability to send multiple signals per call.
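
To make the intended ordering concrete, here is a minimal, self-contained sketch of "dump threads first, then kill" on task timeout. It is not the patch and does not use the YARN API: {{Signal}}, {{ContainerSignaler}}, {{RecordingSignaler}} and {{expireAttempt}} are hypothetical names invented for illustration, and the grace period between the two signals is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeoutDumpSketch {

    /** Hypothetical signal kinds for this sketch; not the YARN enum. */
    enum Signal { THREAD_DUMP, KILL }

    /** Hypothetical stand-in for the AM -> NM -> Container signal channel. */
    interface ContainerSignaler {
        void signal(String containerId, Signal signal);
    }

    /** Records signals in the order they are sent instead of delivering them. */
    static final class RecordingSignaler implements ContainerSignaler {
        final List<String> sent = new ArrayList<>();

        @Override
        public void signal(String containerId, Signal signal) {
            sent.add(containerId + ":" + signal);
        }
    }

    /**
     * Called when a task attempt has timed out: request a thread dump,
     * give the JVM a moment to write it, then kill the container.
     */
    static void expireAttempt(ContainerSignaler nm, String containerId,
                              long dumpGraceMillis) {
        nm.signal(containerId, Signal.THREAD_DUMP);
        try {
            Thread.sleep(dumpGraceMillis);
        } catch (InterruptedException e) {
            // Preserve the interrupt and still proceed to the kill.
            Thread.currentThread().interrupt();
        }
        nm.signal(containerId, Signal.KILL);
    }

    public static void main(String[] args) {
        RecordingSignaler nm = new RecordingSignaler();
        expireAttempt(nm, "container_01", 10L);
        System.out.println(nm.sent);
    }
}
```

The only point of the sketch is the ordering guarantee: the dump request always precedes the kill, so the jstack output has a chance to reach the task logs.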

I think sending multiple signals per call could be split into a separate JIRA.




