[ 
https://issues.apache.org/jira/browse/YARN-9634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated YARN-9634:
------------------------
    Description: On a large cluster, the directories to which users submit 
jobs, the directories into which container logs are aggregated, and other 
per-application data can fill up an HDFS directory, because an HDFS directory 
has a default storage limit. Tuning "yarn.log-aggregation.retain-seconds" can 
address this. However, the FSNamesystemLock#writeLock acquisitions and RPC 
operations triggered by operations on these directories still affect the 
namespace in which the directories are located. To reduce that impact we have 
placed these directories in a single HDFS federation namespace, but as the 
cluster grows very large, that single namespace also degrades RPC performance. 
In response to this situation, we can distribute these directories across 
directories in multiple namespaces, using a selection policy such as a hash 
policy or a round-robin policy.  (was: On a large cluster, the directories to 
which users submit jobs, the directories into which container logs are 
aggregated, and other per-application data can fill up an HDFS directory, 
because an HDFS directory has a default storage limit. In response to this 
situation, we can distribute these directories more widely, using a selection 
policy such as a hash policy or a round-robin policy.)

> Make yarn submit dir and log aggregation dir more evenly distributed
> --------------------------------------------------------------------
>
>                 Key: YARN-9634
>                 URL: https://issues.apache.org/jira/browse/YARN-9634
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.2.0
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>
> On a large cluster, the directories to which users submit jobs, the 
> directories into which container logs are aggregated, and other 
> per-application data can fill up an HDFS directory, because an HDFS 
> directory has a default storage limit. Tuning 
> "yarn.log-aggregation.retain-seconds" can address this. However, the 
> FSNamesystemLock#writeLock acquisitions and RPC operations triggered by 
> operations on these directories still affect the namespace in which the 
> directories are located. To reduce that impact we have placed these 
> directories in a single HDFS federation namespace, but as the cluster grows 
> very large, that single namespace also degrades RPC performance. In response 
> to this situation, we can distribute these directories across directories in 
> multiple namespaces, using a selection policy such as a hash policy or a 
> round-robin policy.
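
The two policies named in the description could look roughly like the sketch 
below. This is only an illustration, not the patch for this issue: the 
DirSelectionPolicy interface, the HashPolicy and RoundRobinPolicy classes, and 
the idea of passing in a list of candidate directories are all hypothetical 
names introduced here. It assumes the candidate list is non-empty and that 
each entry is a base directory in a different federation namespace.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical interface: picks a base directory for one application. */
interface DirSelectionPolicy {
    /** candidateDirs is assumed to be non-empty. */
    String choose(String applicationId, List<String> candidateDirs);
}

/** Hash policy: the same application ID always maps to the same directory. */
final class HashPolicy implements DirSelectionPolicy {
    @Override
    public String choose(String applicationId, List<String> candidateDirs) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(applicationId.hashCode(), candidateDirs.size());
        return candidateDirs.get(index);
    }
}

/** Round-robin policy: successive calls cycle through the directories in order. */
final class RoundRobinPolicy implements DirSelectionPolicy {
    private final AtomicLong counter = new AtomicLong();

    @Override
    public String choose(String applicationId, List<String> candidateDirs) {
        // The application ID is ignored; only the call order matters.
        long n = counter.getAndIncrement();
        int index = (int) Math.floorMod(n, (long) candidateDirs.size());
        return candidateDirs.get(index);
    }
}
```

One trade-off worth noting: with a hash policy, any component that knows the 
application ID and the same candidate list can recompute the directory later, 
whereas a round-robin choice has to be recorded somewhere so that readers of 
the logs can find it again.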



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
