[ https://issues.apache.org/jira/browse/YARN-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218842#comment-17218842 ]
Szilard Nemeth commented on YARN-10323:
---------------------------------------

Thanks [~bteke] for making this clear. I prepared this Jira quickly and completely forgot to mention that the design document was written by [~prabhujoseph]. We had just agreed that I would create the Jira itself. I didn't mean to take credit for it. Apologies for this.

> [Umbrella] YARN Diagnostic collector
> ------------------------------------
>
>                 Key: YARN-10323
>                 URL: https://issues.apache.org/jira/browse/YARN-10323
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Benjamin Teke
>            Priority: Major
>         Attachments: YARN-10323 Design of Diagnostics Collector.pdf
>
>
> Troubleshooting YARN problems can be difficult in a production environment.
> Collecting data before problems occur, or actively collecting data on an
> on-demand basis, could truly help track down issues.
> Some examples:
> 1. If an application is hanging, application logs along with RM / NM logs could
> be collected, plus a jstack of either the YARN daemons or the application
> container.
> 2. Similarly, when an application fails we may collect data.
> 3. Scheduler issues are quite common, so good tooling that helps to spot
> issues would be crucial.
> The design document is in the attachments section.