[ https://issues.apache.org/jira/browse/YARN-11644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xie YiFan updated YARN-11644:
-----------------------------
    Description: 
LogAggregationService is responsible for uploading logs to HDFS. It uses a thread pool to execute the upload tasks.

The log upload workflow is as follows:
# The NM constructs an Application object when the first container of an application launches, then notifies LogAggregationService to initialize an AppLogAggregationImpl.
# LogAggregationService submits the AppLogAggregationImpl to the task queue.
# An idle worker of the thread pool pulls the AppLogAggregationImpl from the task queue.
# AppLogAggregationImpl runs a while loop that checks the application state and uploads the logs once the application has finished.

Suppose the following scenario:
* LogAggregationService initializes the thread pool with 4 threads.
* 4 long-running applications start on this NM, so all threads are occupied by their aggregators.
* The next, short application starts on this NM and finishes quickly, but there is no idle thread to upload its logs.

As a result, subsequent applications have to wait for the earlier applications to finish before their logs can be uploaded.

!image-2024-01-10-11-03-57-553.png|width=599,height=195!
h4. Solution

Change the spin behavior of AppLogAggregationImpl. If the application has not finished, the task simply returns, yielding the current thread, and resubmits itself to the executor service. This lets LogAggregationService cycle through the task queue, so the logs of finished applications can be uploaded immediately.

> LogAggregationService can't upload log in time when application finished
> ------------------------------------------------------------------------
>
>                 Key: YARN-11644
>                 URL: https://issues.apache.org/jira/browse/YARN-11644
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation
>            Reporter: Xie YiFan
>            Assignee: Xie YiFan
>            Priority: Minor
>         Attachments: image-2024-01-10-11-03-57-553.png
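
The Solution section of the description (return and resubmit instead of spinning) can be illustrated with a small standalone sketch. This is not the actual patch or the real Hadoop code: the class ResubmitSketch, the nested Aggregator class, the application names and the 10 ms resubmit delay are all hypothetical, and a ScheduledExecutorService is assumed here only so that unfinished tasks do not busy-loop through the queue. It reproduces the scenario from the description: 4 worker threads, 4 long-running applications and one short application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class ResubmitSketch {

    /** Hypothetical stand-in for AppLogAggregationImpl (not the real Hadoop class). */
    static final class Aggregator implements Runnable {
        private final String appId;
        private final ScheduledExecutorService pool;
        private final List<String> uploaded;
        private final CountDownLatch done = new CountDownLatch(1);
        private volatile boolean appFinished;

        Aggregator(String appId, ScheduledExecutorService pool, List<String> uploaded) {
            this.appId = appId;
            this.pool = pool;
            this.uploaded = uploaded;
        }

        void markFinished() {
            appFinished = true;
        }

        boolean awaitUpload(long timeout, TimeUnit unit) throws InterruptedException {
            return done.await(timeout, unit);
        }

        @Override
        public void run() {
            if (!appFinished) {
                // Application still running: hand the worker thread back to the pool
                // and re-queue this task instead of spinning. The 10 ms delay is an
                // assumption of this sketch, not something stated in the issue.
                pool.schedule(this, 10, TimeUnit.MILLISECONDS);
                return;
            }
            uploaded.add(appId); // placeholder for the real upload to HDFS
            done.countDown();
        }
    }

    /** Runs the scenario from the description and returns the upload order. */
    static List<String> runScenario() throws InterruptedException {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(4);
        List<String> uploaded = new CopyOnWriteArrayList<>();
        try {
            // Four long-running applications, one aggregator each.
            List<Aggregator> longApps = new ArrayList<>();
            for (int i = 1; i <= 4; i++) {
                Aggregator a = new Aggregator("long-app-" + i, pool, uploaded);
                longApps.add(a);
                pool.execute(a);
            }
            // A short application that finishes right after it starts.
            Aggregator shortApp = new Aggregator("short-app", pool, uploaded);
            pool.execute(shortApp);
            shortApp.markFinished();
            // Its logs are uploaded while the long applications are still running.
            if (!shortApp.awaitUpload(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("short-app was starved");
            }
            // Only now do the long-running applications finish.
            for (Aggregator a : longApps) {
                a.markFinished();
            }
            for (Aggregator a : longApps) {
                if (!a.awaitUpload(5, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("upload timed out");
                }
            }
            return uploaded;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // "short-app" is always first; the order of the long apps may vary.
        System.out.println(runScenario());
    }
}
```

With the original spinning loop, the four long-running aggregators would hold all four threads and the short application's task would sit in the queue until one of them finished. In the sketch each unfinished aggregator gives up its thread on every check, so the short application's upload runs as soon as a worker picks it up. A real change would also have to decide how often to re-check the application state, which the issue does not specify.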