[ https://issues.apache.org/jira/browse/YARN-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szilard Nemeth reassigned YARN-9035:
------------------------------------

    Assignee:     (was: Szilard Nemeth)

> Allow better troubleshooting of FS container assignments and lack of container assignments
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-9035
>                 URL: https://issues.apache.org/jira/browse/YARN-9035
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9035.001.patch
>
>
> The call chain that starts at {{FairScheduler.attemptScheduling}}, goes
> through {{FSQueue#assignContainer}} (parent / leaf) and ends at
> {{FSAppAttempt#assignContainer}} is long, and it contains many conditions
> under which {{Resources.none()}} is returned, meaning that no container is
> allocated.
> Many of these empty assignments do not come with a debug log statement,
> so it is very hard to tell which condition led the {{FairScheduler}} to
> decide not to allocate a container.
> On top of that, in many places it is just as difficult to tell why a
> container was allocated to an app attempt.
> The goal is to have a common place (i.e. a class) that does all the
> logging, so users can conveniently control all the logs if they are curious
> why (and why not) container assignments happened.
> Also, it would be handy if readers of the log could easily tell which
> {{AppAttempt}} a log record was created for. In other words: every log
> record should include the ID of the application / app attempt, if possible.
>
> Details of implementation:
> As most of the debug messages already in place were protected by a condition
> that checks whether the debug level is enabled on the logger, I followed a
> similar pattern. All the relevant log messages are created with the class
> {{ResourceAssignment}}.
> This class is a wrapper for the assigned {{Resource}} object and has a
> single logger, so clients should use its helper methods to create log
> records. There is a helper method called {{shouldLogReservationActivity}}
> that checks whether the DEBUG or TRACE level is enabled on the logger.
> See the javadoc on this class for further information.
>
> {{ResourceAssignment}} is also responsible for adding the app / app attempt
> ID to every log record (with some exceptions).
> A couple of check classes are introduced: they are responsible for running,
> and storing the results of, the checks that a successful container
> allocation depends on.
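
The wrapper pattern described above can be sketched roughly as follows. This is only an illustration and not the code in YARN-9035.001.patch: the class name {{ResourceAssignmentSketch}}, its methods and the message wording are hypothetical, the resource is reduced to two plain numbers instead of Hadoop's {{Resource}}, and {{java.util.logging}} is used so that the example is self-contained ({{Level.FINE}} and finer stand in for DEBUG / TRACE).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the ResourceAssignment wrapper; not the patch's class. */
final class ResourceAssignmentSketch {
  // One logger for all assignment records, so a single setting controls them.
  private static final Logger LOG =
      Logger.getLogger(ResourceAssignmentSketch.class.getName());

  // Simplified resource: the real wrapper would hold a Hadoop Resource object.
  private final long memoryMb;
  private final int vcores;

  private ResourceAssignmentSketch(long memoryMb, int vcores) {
    this.memoryMb = memoryMb;
    this.vcores = vcores;
  }

  /** True if FINE or a finer level is enabled (assumed analogue of DEBUG / TRACE). */
  static boolean shouldLog() {
    return LOG.isLoggable(Level.FINE);
  }

  /** Prefixes every record with the app attempt ID so log readers can filter on it. */
  static String format(String appAttemptId, String message) {
    return "[" + appAttemptId + "] " + message;
  }

  /** Empty assignment: records why nothing was allocated, if logging is enabled. */
  static ResourceAssignmentSketch none(String appAttemptId, String reason) {
    if (shouldLog()) {
      LOG.fine(format(appAttemptId, "No container assigned: " + reason));
    }
    return new ResourceAssignmentSketch(0L, 0);
  }

  /** Successful assignment: records what was allocated, if logging is enabled. */
  static ResourceAssignmentSketch assigned(String appAttemptId, long memoryMb, int vcores) {
    if (shouldLog()) {
      LOG.fine(format(appAttemptId,
          "Assigned container: <memory:" + memoryMb + ", vCores:" + vcores + ">"));
    }
    return new ResourceAssignmentSketch(memoryMb, vcores);
  }

  boolean isEmpty() {
    return memoryMb == 0L && vcores == 0;
  }

  public static void main(String[] args) {
    // The app attempt ID and the reason text are made-up sample values.
    ResourceAssignmentSketch result =
        ResourceAssignmentSketch.none("appattempt_1_0001_000001", "queue is over its max share");
    System.out.println("empty assignment: " + result.isEmpty());
  }
}
```

With a wrapper of this shape, a call site that previously returned {{Resources.none()}} without any log statement would call something like {{none(appAttemptId, reason)}} instead, so the reason is recorded in one place and only when the logger level allows it.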