[GitHub] [hadoop] sodonnel commented on pull request #3928: HDFS-16438. Avoid holding read locks for a long time when scanDatanodeStorage

GitBox Thu, 27 Jan 2022 06:01:03 -0800


sodonnel commented on pull request #3928:
URL: https://github.com/apache/hadoop/pull/3928#issuecomment-1023240381



   Yes, it looks like the isBlockReplicatedOk calls are what make it slow. What 
about a change like this, which moves the replication check outside of the lock:
   
   ```
   --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminBackoffMonitor.java
   +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminBackoffMonitor.java
   @@ -652,24 +652,27 @@ private void scanDatanodeStorage(DatanodeDescriptor dn,
            Iterator<BlockInfo> it = s.getBlockIterator();
            while (it.hasNext()) {
              BlockInfo b = it.next();
   -          if (!initialScan || dn.isEnteringMaintenance()) {
   -            // this is a rescan, so most blocks should be replicated now,
   -            // or this node is going into maintenance. On a healthy
   -            // cluster using racks or upgrade domain, a node should be
   -            // able to go into maintenance without replicating many blocks
   -            // so we will check them immediately.
   -            if (!isBlockReplicatedOk(dn, b, false, null)) {
   -              blockList.put(b, null);
   -            }
   -          } else {
   -            blockList.put(b, null);
   -          }
   +          blockList.put(b, null);
              numBlocksChecked++;
            }
          } finally {
            namesystem.readUnlock();
          }
        }
   +    if (!initialScan || dn.isEnteringMaintenance()) {
   +      // this is a rescan, so most blocks should be replicated now,
   +      // or this node is going into maintenance. On a healthy
   +      // cluster using racks or upgrade domain, a node should be
   +      // able to go into maintenance without replicating many blocks
   +      // so we will check them immediately.
   +      Iterator<BlockInfo> iterator = blockList.keySet().iterator();
   +      while (iterator.hasNext()) {
   +        BlockInfo b = iterator.next();
   +        if (isBlockReplicatedOk(dn, b, false, null)) {
   +          iterator.remove();
   +        }
   +      }
   +    }
      }
   ```   
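
For illustration, here is the same two-phase idea in isolation: collect everything while holding the read lock, then run the expensive check after releasing it and drop the entries that pass. This is only a minimal sketch. `TwoPhaseScan`, `scan`, and the `isOk` predicate are hypothetical names standing in for `scanDatanodeStorage` and `isBlockReplicatedOk`, not the actual Hadoop code. It also assumes the check is safe to run without the lock, which would need to be confirmed for `isBlockReplicatedOk`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

public class TwoPhaseScan {

  /**
   * Hypothetical stand-in for the scan: phase 1 copies the items under the
   * read lock, phase 2 filters them with the expensive check after unlocking.
   */
  static <T> Map<T, Object> scan(Iterable<T> source,
                                 ReentrantReadWriteLock lock,
                                 boolean checkNow,
                                 Predicate<T> isOk) {
    Map<T, Object> pending = new HashMap<>();

    // Phase 1: only the cheap copy happens while the read lock is held.
    lock.readLock().lock();
    try {
      for (T item : source) {
        pending.put(item, null);
      }
    } finally {
      lock.readLock().unlock();
    }

    // Phase 2: the expensive check runs without the lock. Items that are
    // already OK are removed, so only the ones needing work remain.
    if (checkNow) {
      pending.keySet().removeIf(isOk);
    }
    return pending;
  }

  public static void main(String[] args) {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumption for the demo: even numbers count as "replicated OK".
    Map<Integer, Object> pending =
        scan(List.of(1, 2, 3, 4), lock, true, x -> x % 2 == 0);
    System.out.println(pending.keySet());
  }
}
```

With `checkNow` set to false, every item stays in the map, which mirrors the initial-scan path where all blocks are queued without checking.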

