[ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150864#comment-15150864
 ] 

Haibo Chen commented on YARN-4697:
----------------------------------

Hi Naganarasimha G R, 

Thanks very much for your comments. I have addressed the threadPool 
accessibility issue and also modified yarn-default.xml to match 
YarnConfiguration. To answer your other comments:

1. Yes, 50 should be safe (the default I set is 100). But dedicating even 50 
threads to log aggregation alone may sometimes be too much. Conversely, some 
users may want more than 50 if they have powerful machines and many YARN 
applications. If the size is configurable, users can decide for themselves.

2. The purpose of the semaphore is to block the threads in the thread pool, 
because the main thread always acquires the semaphore first. Because I set the 
thread pool size to 1, once that single thread tries to acquire the 
semaphore while executing either of the two runnables, it blocks, and the other 
runnable will not be executed if the thread pool can indeed create only one 
thread. (If another thread were available in the thread pool, it would also 
end up blocking on the semaphore, failing the test.) The immediate release 
after the acquire in each runnable is just to hand the permit back safely. I'll 
try to add comments in the test code.
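
To make the semaphore idea in point 2 easier to follow, here is a minimal 
stand-alone sketch of it. It is not the code in the patch: it uses only 
java.util.concurrent, and the class name BoundedPoolCheck, the method 
startedWhileBlocked, and the 500 ms wait are assumptions made up for 
illustration. The caller takes the only permit first, submits two runnables, 
and counts how many of them started while the permit was still held.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not part of the YARN-4697 patch.
public class BoundedPoolCheck {

    // Returns how many of two submitted runnables started running
    // while the caller still held the semaphore.
    public static int startedWhileBlocked(int poolSize)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Semaphore semaphore = new Semaphore(1);
        AtomicInteger started = new AtomicInteger();

        // The caller always acquires the only permit first.
        semaphore.acquire();

        Runnable task = () -> {
            started.incrementAndGet();
            try {
                semaphore.acquire();  // blocks until the caller releases
                semaphore.release();  // hand the permit back right away
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(task);
        pool.execute(task);

        // Assumed grace period: gives any additional worker thread
        // time to pick up the second runnable.
        Thread.sleep(500);
        int observed = started.get();

        // Unblock the workers and let the pool drain.
        semaphore.release();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size 1: " + startedWhileBlocked(1));
        System.out.println("pool size 2: " + startedWhileBlocked(2));
    }
}
```

With a pool of one thread, only one runnable should have started while the 
permit is held; with two threads, both should have started, which is the 
condition a test along these lines would treat as a failure. The check is 
timing-based, so the wait would need to be generous on a slow machine.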


> NM aggregation thread pool is not bound by limits
> -------------------------------------------------
>
>                 Key: YARN-4697
>                 URL: https://issues.apache.org/jira/browse/YARN-4697
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: yarn4697.001.patch
>
>
> In the LogAggregationService.java we create a threadpool to upload logs from 
> the nodemanager to HDFS if log aggregation is turned on. This is a cached 
> threadpool which, based on the javadoc, is an unlimited pool of threads.
> In the case that we have had a problem with log aggregation this could cause 
> a problem on restart. The number of threads created at that point could be 
> huge and will put a large load on the NameNode and in the worst case could even 
> bring it down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
