Oleksandr Shevchenko created YARN-8364:
------------------------------------------

             Summary: NM aggregation thread should be able to exempt pool
                 Key: YARN-8364
                 URL: https://issues.apache.org/jira/browse/YARN-8364
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: log-aggregation
            Reporter: Oleksandr Shevchenko


Currently, the NM has a limited log aggregation thread pool whose size is 
configured by the property yarn.nodemanager.logaggregation.threadpool-size-max=100. 
When an application starts, it takes one thread from the pool and holds that 
thread until the application finishes. As a result, once the pool is exhausted, 
another application can aggregate its logs only after an earlier one finishes.

For example, with:
yarn.nodemanager.logaggregation.threadpool-size-max=1

1. Long-running application app1 starts
2. Short application app2 starts
3. app2 finishes
4. app1 finishes
5. Logs of app1 are aggregated
6. Logs of app2 are aggregated
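
The ordering in the steps above can be reproduced outside of YARN with a plain 
java.util.concurrent fixed thread pool. The following is only a minimal sketch 
of the blocking pattern, not the NodeManager code: the class 
AggregationPoolDemo, the helper aggregator() and the latch names are 
hypothetical, and "aggregation" is reduced to appending the application name 
to a list. Each task takes a pool thread at submission time and holds it until 
its application is marked as finished.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Hadoop.
class AggregationPoolDemo {

    // A task that occupies a pool thread until its application finishes,
    // then "aggregates" by recording the application name.
    static Runnable aggregator(String app, CountDownLatch finished,
                               List<String> order, CountDownLatch aggregated) {
        return () -> {
            try {
                finished.await();          // thread is held while the app runs
                order.add(app);            // stand-in for uploading the logs
                aggregated.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    // Replays steps 1-4 and returns the order in which logs were aggregated.
    static List<String> run(int poolSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<String> order = new CopyOnWriteArrayList<>();
        CountDownLatch app1Finished = new CountDownLatch(1);
        CountDownLatch app2Finished = new CountDownLatch(1);
        CountDownLatch app1Aggregated = new CountDownLatch(1);
        CountDownLatch app2Aggregated = new CountDownLatch(1);

        // Steps 1 and 2: both applications start.
        pool.submit(aggregator("app1", app1Finished, order, app1Aggregated));
        pool.submit(aggregator("app2", app2Finished, order, app2Aggregated));

        // Step 3: app2 finishes. With a pool of size 1 its task is still
        // queued behind app1, so this wait simply times out.
        app2Finished.countDown();
        app2Aggregated.await(500, TimeUnit.MILLISECONDS);

        // Step 4: app1 finishes and releases its thread.
        app1Finished.countDown();

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size 1: " + run(1));
        System.out.println("pool size 2: " + run(2));
    }
}
```

With a pool size of 1 the sketch aggregates app1 before app2 even though app2 
finished first, which matches steps 5 and 6. With a pool size of 2, app2 is 
aggregated as soon as it finishes.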

In a real cluster, we can have many long-running jobs (for example, Spark 
streaming), so short-running applications may not have their logs aggregated 
for a long time. The problem appears when the average number of jobs exceeds 
the thread pool size: all threads are occupied by running applications, and as 
a result there is a huge delay between an application finishing and its logs 
being uploaded.

It would be good to improve this behavior.



