[ https://issues.apache.org/jira/browse/YARN-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anand Srinivasan updated YARN-10967:
------------------------------------
    Summary: setPermission() call floods HDFS NN RPC queue  (was: CDP : setPermission() call floods HDFS NN RPC queue)

> setPermission() call floods HDFS NN RPC queue
> ---------------------------------------------
>
>                 Key: YARN-10967
>                 URL: https://issues.apache.org/jira/browse/YARN-10967
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Anand Srinivasan
>            Priority: Major
>              Labels: performance
>
> Checking the code changes in CDP for the log aggregation feature, we could
> see that when the log aggregator is initialized for each app, we verify and
> create the remote dir, and in doing so make an additional call to
> setPermission() even though the remote dir exists and its permissions are
> already set as expected.
> This code path was introduced in CDP to cater to cloud storage, where we
> had to make this additional check to ensure that the remote file system and
> the corresponding cloud storage support setting permissions.
> Upstream Jira that introduced this call:
> https://issues.apache.org/jira/browse/YARN-9030
> This additional setPermission() call for each app/job floods the HDFS NN and
> its RPC queue, which affects overall performance.
> The ask here is to see if it's feasible to do one of the following, given
> that the code introduced in YARN-9030 is mainly there for cloud storage
> providers:
> (a) put the code introduced via YARN-9030 behind a configuration option
> (perhaps set to false by default, assuming the storage used is HDFS, so
> that this code is bypassed)
> (b) check internally in the code whether the customer is using HDFS storage
> (by checking yarn.nodemanager.remote-app-log-dir) and bypass this code if
> the storage is indeed HDFS.
> Thanks
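
To make options (a) and (b) from the description concrete, here is a minimal Java sketch of the decision they describe. It is not the YARN-9030 code and does not use the Hadoop API: the class name, the method names, and the `probeEnabled` flag (standing in for a configuration option that does not exist yet) are all hypothetical. It also assumes the value of yarn.nodemanager.remote-app-log-dir has already been qualified against the default file system, so that its URI carries a scheme.

```java
import java.net.URI;

/** Sketch only: every name in this class is hypothetical and not part of YARN. */
final class RemoteLogDirPermissionGuard {

    private RemoteLogDirPermissionGuard() {}

    /**
     * Option (b): treat the remote log dir as HDFS when its URI scheme is "hdfs".
     * Assumes the path was already qualified, so an unqualified path (no scheme)
     * is conservatively reported as "not HDFS".
     */
    static boolean isHdfs(URI remoteLogDir) {
        String scheme = remoteLogDir.getScheme();
        return scheme != null && scheme.equalsIgnoreCase("hdfs");
    }

    /**
     * Decides whether the extra setPermission() probe should be issued for an
     * existing remote log dir.
     *
     * @param probeEnabled value of a hypothetical boolean configuration option (option (a))
     * @param remoteLogDir qualified URI of the remote app log dir
     * @return true only if the probe is enabled and the storage is not HDFS
     */
    static boolean shouldProbe(boolean probeEnabled, URI remoteLogDir) {
        return probeEnabled && !isHdfs(remoteLogDir);
    }
}
```

With this shape, a caller would issue the setPermission() probe only when `shouldProbe(...)` returns true, so an HDFS-backed remote log dir never triggers the extra NameNode RPC. The sketch combines both options; either check could also be used on its own.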