[ 
https://issues.apache.org/jira/browse/YARN-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171443#comment-14171443
 ] 

Jason Lowe commented on YARN-2314:
----------------------------------

So Tez will automatically benefit on large clusters because the default is to 
not use the cache.  However, if we've found empirically that Tez needs the 
proxy cache to perform well, then this patch would be a performance hit for 
Tez by default on clusters where the cache issues aren't a problem.  I wasn't 
sure which default benefit you were referring to above (running faster because 
the cache is enabled, or working on a large cluster because the cache is 
disabled).

If Tez shows significant improvements with this cache turned on, then I could 
see an argument for having the cache on by default, since small clusters are 
common and large clusters are rare.

> ContainerManagementProtocolProxy can create thousands of threads for a large 
> cluster
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-2314
>                 URL: https://issues.apache.org/jira/browse/YARN-2314
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2314.patch, disable-cm-proxy-cache.patch, 
> nmproxycachefix.prototype.patch
>
>
> ContainerManagementProtocolProxy has a cache of NM proxies, and the size of 
> this cache is configurable.  However, the cache can grow far beyond the 
> configured size when running on a large cluster and exceed the AM's 
> address/container limits.  More details in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
