[ https://issues.apache.org/jira/browse/HDFS-13234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390739#comment-16390739 ]

He Xiaoqiao commented on HDFS-13234:
------------------------------------

Thanks [~kihwal] and [~elgoiri] for your comments.
{quote}How big is a single instance in your use case? Bloated conf in dfs 
client is obviously a serious issue, but it can create bigger issues in 
apps/jobs.{quote}
Actually this is the YARN log upload service. The size of a single 
{{Configuration}} instance on the NodeManager is about 120KB, but across all 
{{Configuration}} instances it balloons to ~600MB due to two factors:
a. HDFS Federation + HA with QJM with dozens of nameservices (~20): the client 
creates a {{ConfiguredFailoverProxyProvider}} instance for each nameservice, 
which doubles the number of {{Configuration}} instances;
b. up to 150 threads upload YARN logs to HDFS concurrently;
So, in the extreme case, the {{Configuration}} instances occupy roughly 
~20 * 2 * 150 * 120KB (a rough estimate is sketched below).
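Written out as a back-of-the-envelope worst-case estimate (the 120KB figure is 
the per-instance size we measured on the NodeManager, the other numbers are the 
ones above):
{code:java}
// Back-of-the-envelope worst-case estimate of Configuration memory footprint.
public class ConfFootprintEstimate {
  public static void main(String[] args) {
    int nameServices = 20;       // nameservices under Federation + HA with QJM
    int confsPerNameService = 2; // ConfiguredFailoverProxyProvider clones the conf
    int uploadThreads = 150;     // concurrent yarn log upload threads
    int confSizeKB = 120;        // measured size of one Configuration instance

    long totalKB = (long) nameServices * confsPerNameService
        * uploadThreads * confSizeKB;
    System.out.printf("~%d KB (~%.0f MB)%n", totalKB, totalKB / 1024.0);
    // prints ~720000 KB (~703 MB), in line with the ~600MB we observed
  }
}
{code}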

{quote}New conf objects are created to prevent unintended conf update 
propagation. {quote}
It is true that the clone prevents unintended conf update propagation, but I 
think there are other ways to avoid cloning the whole conf just for the two 
parameters that {{ConfiguredFailoverProxyProvider}} and 
{{IPFailoverProxyProvider}} override, which, as you mentioned, probably wastes 
a huge amount of memory. Do you have any suggestions, [~kihwal]? One idea is 
sketched below.
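For example (only a rough sketch to illustrate the direction, not the attached 
patch; it assumes the two overridden values are the failover connection-retry 
settings the current constructor writes back into the cloned conf), the 
provider could share the caller's conf and keep the two derived values as 
plain fields:
{code:java}
// Sketch only: share the caller's Configuration and keep the two derived
// retry settings as fields, instead of cloning the whole conf just to
// override two keys. Key names are based on the current code and are
// shown for illustration, not as the final patch.
public ConfiguredFailoverProxyProvider(Configuration conf, URI uri,
    Class<T> xface, HAProxyFactory<T> factory) {
  this.xface = xface;
  this.conf = conf;  // shared, not cloned
  this.maxRetries = conf.getInt(
      HdfsClientConfigKeys.Failover.CONNECTION_RETRIES_KEY,
      HdfsClientConfigKeys.Failover.CONNECTION_RETRIES_DEFAULT);
  this.maxRetriesOnSocketTimeouts = conf.getInt(
      HdfsClientConfigKeys.Failover.CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_KEY,
      HdfsClientConfigKeys.Failover.CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_DEFAULT);
  // ... the two values would then be applied when the RPC proxy is created,
  // rather than by mutating a private copy of the whole Configuration.
}
{code}
This way each provider only pays for two ints instead of a full copy of the 
conf, though the shared conf would have to be treated as read-only inside the 
provider.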

Thanks again.

> Remove renew configuration instance in ConfiguredFailoverProxyProvider and 
> reduce memory footprint for client
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13234
>                 URL: https://issues.apache.org/jira/browse/HDFS-13234
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: fs, ha, hdfs-client
>            Reporter: He Xiaoqiao
>            Priority: Major
>         Attachments: HDFS-13234.001.patch
>
>
> The memory footprint of #DFSClient can be considerable in some scenarios, 
> since many #Configuration instances are created and occupy a lot of memory 
> (in an extreme case we saw org.apache.hadoop.conf.Configuration occupy over 
> 600MB under HDFS Federation and HA with QJM with dozens of NameNodes). I 
> think some of these new Configuration instances are not necessary, for 
> example the one created during #ConfiguredFailoverProxyProvider initialization.
> {code:java}
>   public ConfiguredFailoverProxyProvider(Configuration conf, URI uri,
>       Class<T> xface, HAProxyFactory<T> factory) {
>     this.xface = xface;
>     this.conf = new Configuration(conf);
>     // ...
>   }
> {code}


