[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150864#comment-15150864 ]
Haibo Chen commented on YARN-4697:
----------------------------------

Hi Naganarasimha G R,

Thanks very much for your comments. I have addressed the threadPool accessibility issue and also modified yarn-default.xml to match YarnConfiguration. To answer your other comments:

1. Yes, 50 should be safe (the default I set is 100). But even 50 threads dedicated to log aggregation alone may sometimes be too much, while users with powerful machines and many YARN applications may want more than 50. If the limit is configurable, users can decide for themselves.

2. The purpose of the semaphore is to block the threads in the thread pool, because the main thread always acquires the semaphore first. Because I set the thread pool size to 1, once that single thread tries to acquire the semaphore while executing either of the two runnables, it blocks, and the other runnable will not be executed if the thread pool can indeed create only one thread. (If another thread were available in the thread pool, it would also block on the semaphore, and the test would fail.) The immediate release after the acquire in each runnable is just to release the resource safely. I'll try to add comments in the test code.

> NM aggregation thread pool is not bound by limits
> -------------------------------------------------
>
>                 Key: YARN-4697
>                 URL: https://issues.apache.org/jira/browse/YARN-4697
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: yarn4697.001.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from
> the nodemanager to HDFS if log aggregation is turned on. This is a cached
> thread pool which, according to the javadoc, is an unlimited pool of threads.
> If there has been a problem with log aggregation, this could cause
> a problem on restart. The number of threads created at that point could be
> huge and will put a large load on the NameNode, and in the worst case could even
> bring it down due to file descriptor issues.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
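
The semaphore-based check described in point 2 of the comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the test in yarn4697.001.patch: the class name `BoundedPoolSketch` and the method `startedWhileBlocked` are hypothetical, and a plain `Executors.newFixedThreadPool` stands in for whatever pool LogAggregationService actually builds. The caller takes the only permit first, submits two runnables, and counts how many of them managed to start while the permit was held.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the code from the attached patch.
class BoundedPoolSketch {

    /**
     * Submits two runnables to a fixed pool of poolSize threads while the
     * caller holds the only semaphore permit, and returns how many runnables
     * had started before the permit was released.
     */
    static int startedWhileBlocked(int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Semaphore semaphore = new Semaphore(1);
        AtomicInteger started = new AtomicInteger();

        Runnable task = () -> {
            started.incrementAndGet();
            semaphore.acquireUninterruptibly(); // blocks while the caller holds the permit
            semaphore.release();                // hand the permit straight back
        };

        semaphore.acquireUninterruptibly();     // the caller acquires first
        try {
            pool.execute(task);
            pool.execute(task);

            // Wait until as many runnables as the pool can run have started.
            int expected = Math.min(poolSize, 2);
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (started.get() < expected && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            // Grace period: an unexpected extra thread would show up here.
            Thread.sleep(200);
            return started.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            semaphore.release();                // unblock the worker threads
            pool.shutdown();                    // queued runnables still finish
        }
    }

    public static void main(String[] args) {
        System.out.println("pool size 1: started = " + startedWhileBlocked(1));
        System.out.println("pool size 2: started = " + startedWhileBlocked(2));
    }
}
```

With a pool size of 1, the single worker blocks on the semaphore and the second runnable stays queued, so only one runnable starts. With a larger pool, both runnables start and block, which is the situation the test is meant to catch.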