sodonnel commented on pull request #3928:
URL: https://github.com/apache/hadoop/pull/3928#issuecomment-1023240381
Yeah, it looks like the isBlockReplicatedOk calls are what is making it slow. What about a change like this, which moves the replication check outside of the lock:

```
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminBackoffMonitor.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminBackoffMonitor.java
@@ -652,24 +652,27 @@ private void scanDatanodeStorage(DatanodeDescriptor dn,
         Iterator<BlockInfo> it = s.getBlockIterator();
         while (it.hasNext()) {
           BlockInfo b = it.next();
-          if (!initialScan || dn.isEnteringMaintenance()) {
-            // this is a rescan, so most blocks should be replicated now,
-            // or this node is going into maintenance. On a healthy
-            // cluster using racks or upgrade domain, a node should be
-            // able to go into maintenance without replicating many blocks
-            // so we will check them immediately.
-            if (!isBlockReplicatedOk(dn, b, false, null)) {
-              blockList.put(b, null);
-            }
-          } else {
-            blockList.put(b, null);
-          }
+          blockList.put(b, null);
           numBlocksChecked++;
         }
       } finally {
         namesystem.readUnlock();
       }
     }
+    if (!initialScan || dn.isEnteringMaintenance()) {
+      // this is a rescan, so most blocks should be replicated now,
+      // or this node is going into maintenance. On a healthy
+      // cluster using racks or upgrade domain, a node should be
+      // able to go into maintenance without replicating many blocks
+      // so we will check them immediately.
+      Iterator<BlockInfo> iterator = blockList.keySet().iterator();
+      while (iterator.hasNext()) {
+        BlockInfo b = iterator.next();
+        if (isBlockReplicatedOk(dn, b, false, null)) {
+          iterator.remove();
+        }
+      }
+    }
   }
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org