[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606941#comment-13606941
 ] 

Sandy Ryza commented on MAPREDUCE-3688:
---------------------------------------

Hi Ravi,

I haven't come across the issue that you mentioned, i.e. I've gotten the proper 
diagnostic message when the NM kills a container for going over resource 
limits, but my testing has been limited.  Sounds like some sort of bug 
with the NM state machine?

The part I've been looking into is related to Koji's work: making it so that 
any errors that containers spit out to stdout/stderr on startup get added to 
the diagnostics.

As the focus of this JIRA has shifted between a few related but separate 
issues, I think at this point it makes the most sense to file new JIRAs (or 
subtasks?) for the specific changes we want to make.  Would it work for you if 
I pick up the logs and you work on the over-resource-limits message?
                
> Need better Error message if AM is killed/throws exception
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-3688
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.1
>            Reporter: David Capwell
>            Assignee: Sandy Ryza
>             Fix For: 0.23.2
>
>         Attachments: mapreduce-3688-h0.23-v01.patch, 
> mapreduce-3688-h0.23-v02.patch
>
>
> We need better error messages in the UI if the AM gets killed or throws an 
> Exception.
> If the following error gets thrown: 
> java.lang.NumberFormatException: For input string: "9223372036854775807l" // 
> last char is an L
> then the UI should show this exception.  Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container 
> for appattempt_1326504761991_0018_000001
> exited with exitCode: 1 due to: Exception from container-launch: 
> org.apache.hadoop.util.Shell$ExitCodeException
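
For reference, here is a minimal sketch of how the exception quoted in the 
description can arise. It assumes the value was parsed with Long.parseLong, 
which the issue does not actually state; the class and method names are 
hypothetical and are not taken from the Hadoop code base.

```java
public class ParseLongRepro {
    // Returns the exception text for an unparsable input, or null if it parses.
    static String parseError(String value) {
        try {
            Long.parseLong(value);
            return null;
        } catch (NumberFormatException e) {
            // toString() yields "<exception class>: <message>".
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // A trailing 'l' is legal in a Java long literal but not in parseLong input.
        System.out.println(parseError("9223372036854775807l"));
        // Without the suffix the same digits parse as Long.MAX_VALUE.
        System.out.println(parseError("9223372036854775807"));
    }
}
```

The first call yields the text that the description says should reach the UI, 
rather than only the Shell$ExitCodeException from the container launch.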

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
