Xie YiFan created YARN-11644:
--------------------------------

             Summary: LogAggregationService can't upload log in time when application finished
                 Key: YARN-11644
                 URL: https://issues.apache.org/jira/browse/YARN-11644
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: log-aggregation
            Reporter: Xie YiFan
            Assignee: Xie YiFan
         Attachments: image-2024-01-10-11-03-57-553.png

LogAggregationService is responsible for uploading logs to HDFS. It uses a thread pool to execute the upload tasks. The upload workflow is as follows:
# The NM constructs an Application object when the first container of an application launches, then notifies LogAggregationService to initialize an AppLogAggregatorImpl.
# LogAggregationService submits the AppLogAggregatorImpl to the task queue.
# An idle worker of the thread pool pulls the AppLogAggregatorImpl from the task queue.
# The AppLogAggregatorImpl runs a while loop that checks the application state and uploads the logs once the application has finished.

Suppose the following scenario:
* LogAggregationService initializes the thread pool with 4 threads.
* 4 long-running applications start on this NM, so all threads are occupied by their aggregators.
* The next short application starts on this NM and quickly finishes, but there is no idle thread to upload its logs.

As a result, subsequent applications have to wait for the previous applications to finish before their logs can be uploaded.

!image-2024-01-10-11-03-57-553.png|width=599,height=195!

h4. Solution

Change the spin behavior of AppLogAggregatorImpl. If the application has not finished, the aggregator simply returns, yielding the current thread, and resubmits itself to the executor service. LogAggregationService can then cycle through the task queue, so the logs of finished applications are uploaded without waiting behind long-running applications.
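
The proposed behavior can be illustrated with a minimal, self-contained sketch. This is not the actual patch and does not use Hadoop classes: {{ResubmitSketch}}, {{Aggregator}}, {{appFinished}} and {{runDemo}} are hypothetical names, the pool has 2 threads instead of 4 to keep the demo small, and the 10 ms resubmit delay is an assumption added to avoid a busy loop (the issue only says the task resubmits itself).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ResubmitSketch {

    /** Hypothetical stand-in for an aggregator task; not the real Hadoop class. */
    static final class Aggregator implements Runnable {
        final String appId;
        final AtomicBoolean appFinished = new AtomicBoolean(false);
        final CountDownLatch uploadDone = new CountDownLatch(1);
        private final ScheduledExecutorService pool;
        private final List<String> uploaded;

        Aggregator(String appId, ScheduledExecutorService pool, List<String> uploaded) {
            this.appId = appId;
            this.pool = pool;
            this.uploaded = uploaded;
        }

        @Override
        public void run() {
            if (!appFinished.get()) {
                // Application still running: release the worker thread and
                // requeue this task instead of spinning in a loop.
                // The 10 ms delay is an assumption, not part of the issue.
                pool.schedule(this, 10, TimeUnit.MILLISECONDS);
                return;
            }
            uploaded.add(appId);      // stands in for the upload to HDFS
            uploadDone.countDown();
        }
    }

    /** Runs two long applications and one short one on a 2-thread pool. */
    static List<String> runDemo() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        List<String> uploaded = new CopyOnWriteArrayList<>();
        Aggregator long1 = new Aggregator("long-1", pool, uploaded);
        Aggregator long2 = new Aggregator("long-2", pool, uploaded);
        Aggregator shortApp = new Aggregator("short", pool, uploaded);
        try {
            pool.execute(long1);
            pool.execute(long2);
            pool.execute(shortApp);

            // The short application finishes while both long ones still run.
            shortApp.appFinished.set(true);
            shortApp.uploadDone.await(5, TimeUnit.SECONDS);

            // Only now do the long applications finish.
            long1.appFinished.set(true);
            long2.appFinished.set(true);
            long1.uploadDone.await(5, TimeUnit.SECONDS);
            long2.uploadDone.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return uploaded;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In this sketch the short application's upload is recorded first even though both worker threads were initially taken by the long applications. With a spinning {{run()}} that loops until {{appFinished}} is set, the short task would never be picked up until one of the long tasks returned.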