[ https://issues.apache.org/jira/browse/YARN-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sandy Ryza updated YARN-499:
----------------------------
    Summary: On container failure, include logs in diagnostics  (was: On container failure, surface logs to client)

> On container failure, include logs in diagnostics
> -------------------------------------------------
>
>                 Key: YARN-499
>                 URL: https://issues.apache.org/jira/browse/YARN-499
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.0.3-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: YARN-499.patch
>
>
> When a container fails, the only way to diagnose it is to look at the logs.
> ContainerStatuses include a diagnostic string that is reported back to the
> resource manager by the node manager.
>
> Currently in MR2, I believe whatever is sent to the task's standard out is
> added to the diagnostics string, but for MR, standard out is redirected to a
> file called stdout. In MR1, this string was populated with the last few
> lines of the task's stdout file and was printed to the console, allowing for
> easy debugging.
>
> Handling this would help to soothe the infuriating problem of an AM dying for
> a mysterious reason before setting a tracking URL (MAPREDUCE-3688).
>
> This could be done in one of two ways.
> * Use tee to send MR's standard out to both the stdout file and standard out.
> This requires modifying ShellCmdExecutor to roll what it reads in, as we
> wouldn't want to be storing the entire task log in NM memory.
> * Read the task's log files. This would require standardizing the container
> log files or making them configurable. Right now the log files are determined
> in userland, and all that YARN is aware of is the log directory.
>
> Does this present any issues I'm not considering? If so, perhaps this is only
> needed for AMs?
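
To illustrate what "rolling" the container's output could look like for the tee option, here is a minimal sketch of a buffer that keeps only the last few lines it reads, so memory use stays bounded regardless of log size. The class RollingTail and its methods are hypothetical names made up for this example. They are not part of YARN or of the attached patch, and the sketch does not show how the existing shell executor would be changed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not a YARN class): retains only the last N lines read. */
public class RollingTail {
    private final int maxLines;
    private final Deque<String> lines = new ArrayDeque<>();

    public RollingTail(int maxLines) {
        if (maxLines <= 0) {
            throw new IllegalArgumentException("maxLines must be positive");
        }
        this.maxLines = maxLines;
    }

    /** Appends a line, evicting the oldest one once the buffer is full. */
    public void add(String line) {
        if (lines.size() == maxLines) {
            lines.removeFirst();
        }
        lines.addLast(line);
    }

    /** Reads the stream to its end, keeping at most maxLines lines in memory. */
    public void consume(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            add(line);
        }
    }

    /** Returns the retained lines joined by newlines, e.g. for a diagnostics string. */
    public String diagnostics() {
        return String.join("\n", lines);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a container's standard out; real output would come from the process.
        String output = "starting task\nloading input\njava.lang.OutOfMemoryError\n";
        RollingTail tail = new RollingTail(2);
        tail.consume(new StringReader(output));
        System.out.println(tail.diagnostics());
    }
}
```

The same buffer would also serve the second option: instead of reading the process's standard out, the reader could be opened on a log file in the container's log directory, assuming the file name were known or configurable.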