[ https://issues.apache.org/jira/browse/HDFS-15014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao Sun resolved HDFS-15014.
-----------------------------
    Resolution: Duplicate

> RBF: WebHdfs chooseDatanode shouldn't call getDatanodeReport
> -------------------------------------------------------------
>
>                 Key: HDFS-15014
>                 URL: https://issues.apache.org/jira/browse/HDFS-15014
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rbf
>            Reporter: Chao Sun
>            Priority: Major
>
> Currently the {{chooseDatanode}} call (which is shared by {{open}},
> {{create}}, {{append}} and {{getFileChecksum}}) in RBF WebHDFS calls
> {{getDatanodeReport}} on ALL downstream namenodes:
> {code}
>   private DatanodeInfo chooseDatanode(final Router router,
>       final String path, final HttpOpParam.Op op, final long openOffset,
>       final String excludeDatanodes) throws IOException {
>     // We need to get the DNs as a privileged user
>     final RouterRpcServer rpcServer = getRPCServer(router);
>     UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
>     RouterRpcServer.setCurrentUser(loginUser);
>
>     DatanodeInfo[] dns = null;
>     try {
>       dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
>     } catch (IOException e) {
>       LOG.error("Cannot get the datanodes from the RPC server", e);
>     } finally {
>       // Reset ugi to remote user for remaining operations.
>       RouterRpcServer.resetCurrentUser();
>     }
>
>     HashSet<Node> excludes = new HashSet<Node>();
>     if (excludeDatanodes != null) {
>       Collection<String> collection =
>           getTrimmedStringCollection(excludeDatanodes);
>       for (DatanodeInfo dn : dns) {
>         if (collection.contains(dn.getName())) {
>           excludes.add(dn);
>         }
>       }
>     }
>     ...
> {code}
> {{getDatanodeReport}} is very expensive, particularly in a large cluster,
> as it needs to lock the {{DatanodeManager}}, which is also shared by
> operations such as heartbeat processing. See HDFS-14366 for a similar
> issue.
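> One way to take the fan-out off the per-request path, sketched below
> purely for illustration (this issue was closed as a duplicate, so the
> committed fix lives elsewhere), is to memoize the live datanode report
> in the Router and refresh it on a short TTL, so concurrent WebHDFS
> requests share a single {{getDatanodeReport}} call. The field name
> {{dnReportSupplier}}, the 10-second TTL, and the availability of a
> {{router}} reference at construction time are assumptions, and the
> privileged-user setup from the original method is omitted for brevity:
> {code}
> import java.io.IOException;
> import java.util.concurrent.TimeUnit;
>
> import com.google.common.base.Supplier;
> import com.google.common.base.Suppliers;
>
> import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
> import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;
>
> // Hypothetical field in the Router-side WebHDFS handler: the expensive
> // report is fetched at most once per TTL, so concurrent open/create/
> // append/getFileChecksum requests share one getDatanodeReport() call.
> private final Supplier<DatanodeInfo[]> dnReportSupplier =
>     Suppliers.memoizeWithExpiration(() -> {
>       try {
>         return getRPCServer(router)
>             .getDatanodeReport(DatanodeReportType.LIVE);
>       } catch (IOException e) {
>         LOG.error("Cannot get the datanodes from the RPC server", e);
>         // Unlike the current code (which leaves dns null), fall back
>         // to an empty report so callers never hit an NPE.
>         return new DatanodeInfo[0];
>       }
>     }, 10, TimeUnit.SECONDS);  // illustrative TTL, not a tuned value
>
> // chooseDatanode would then read the cached report:
> //   DatanodeInfo[] dns = dnReportSupplier.get();
> {code}
> A TTL cache trades a bounded staleness window for keeping the
> {{DatanodeManager}} lock out of the request path; staleness seems
> tolerable here since the chosen datanode is only a placement hint and
> {{excludeDatanodes}} is still filtered per request.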