[ https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606941#comment-13606941 ]
Sandy Ryza commented on MAPREDUCE-3688:
---------------------------------------

Hi Ravi,

I haven't come across the issue that you mentioned, i.e. I've gotten the proper diagnostic message when the NM kills a container for going over resource limits, but my testing has been limited. Sounds like some sort of bug with the NM state machine?

The part I've been looking into is related to Koji's work: making it so that any errors that containers spit out to stdout/stderr on startup get added to the diagnostics.

As the focus of this JIRA has shifted between a few related but separate issues, my opinion is that at this point it makes the most sense to file new JIRAs (or subtasks?) for the specific changes we want to make. Does me working on picking up the logs and you working on the over-resource-limits message work for you?

> Need better Error message if AM is killed/throws exception
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-3688
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.1
>            Reporter: David Capwell
>            Assignee: Sandy Ryza
>             Fix For: 0.23.2
>
>         Attachments: mapreduce-3688-h0.23-v01.patch,
> mapreduce-3688-h0.23-v02.patch
>
>
> We need better error messages in the UI if the AM gets killed or throws an
> Exception.
> If the following error gets thrown:
> java.lang.NumberFormatException: For input string: "9223372036854775807l" //
> last char is an L
> then the UI should say this exception. Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container
> for appattempt_1326504761991_0018_000001
> exited with exitCode: 1 due to: Exception from container-launch:
> org.apache.hadoop.util.Shell$ExitCodeException

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira