KevinWikant commented on code in PR #7179:
URL: https://github.com/apache/hadoop/pull/7179#discussion_r1894049598


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderConstructionBlocks.java:
##########
@@ -0,0 +1,331 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdfs.server.blockmanagement;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hdfs.DFSConfigKeys;
+import org.apache.hadoop.hdfs.protocol.Block;
+import org.apache.hadoop.thirdparty.com.google.common.collect.Maps;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.time.Instant;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+/**
+ * The BlockManager will not add an Under Construction
+ * block to the DatanodeDescriptor StorageInfos until
+ * the block is fully committed and finalized.
+ * The UC block replicas are instead tracked here
+ * for the DatanodeAdminManager to use.
+ * Note that this is tracked in-memory only, as such
+ * some Under Construction blocks may be missed under
+ * scenarios where Namenode is restarted.
+ **/
+public class UnderConstructionBlocks {
+  private static final Logger LOG =
+          LoggerFactory.getLogger(UnderConstructionBlocks.class);
+
+  // Amount of time to wait in between checking all block replicas

Review Comment:
   > Can we please convert 5, 2 and 30 into symbolic constants that would help 
understand how the default values were decided ?
   
   These 3 variables are already named constants by virtue of being `private 
static final`. Can you clarify what you mean by "symbolic constants"? If the 
intention is to document how these values were decided, I can add code 
comments explaining that.
   
   > Would they need to be made configurable ?
   
   We could make these configurable, but I think exposing these values to 
users is arguably over-engineering. I can think of only a few scenarios where 
users would need to tune them.
   
   The intention of these constants is as follows:
   
   - LONG_UNDER_CONSTRUCTION_BLOCK_WARN_THRESHOLD: determines when a block 
replica is considered under construction (i.e. held open) for a long time. 
The decision to use 2 hours is somewhat arbitrary. I chose 2 hours because I 
think this is a long time for datanode decommissioning to be blocked on an 
in-progress HDFS write operation. If the value is configured to be too small, 
then users may get a lot of false positive warning logs. If the value is 
configured to be too large, then users may not see this warning log, but they 
will still see the log 
`"Cannot decommission datanode {} with {} UC blocks: [{}]"` printed by the 
DatanodeAdminMonitor. I think 2 hours should not result in excessive warning 
logs, and if the value is too large for some customers then they can fall back 
on the DatanodeAdminMonitor logs for debugging slow decommissioning.
   
   - LONG_UNDER_CONSTRUCTION_BLOCK_WARN_INTERVAL: determines how frequently to 
print the warning log for each block replica. The intention is to rate limit 
how often the log is printed to avoid excessive logging in the Namenode. 30 
minutes means that each hourly rotated Namenode log file will contain the 
warning logs, so it's not possible to miss them based on which hourly log the 
sysadmin is checking (provided they check during the time period the block was 
under construction).
   
   - LONG_UNDER_CONSTRUCTION_BLOCK_CHECK_INTERVAL: this is an optimization to 
limit how often the Namenode iterates through all the block replicas in this 
data structure. Without this constant, the DatanodeAdminMonitor will call this 
method every 30 seconds, resulting in all UC block replicas being iterated 
through every 30 seconds. 5 minutes is also a somewhat arbitrary decision, and 
this could be any value less than or equal to 
LONG_UNDER_CONSTRUCTION_BLOCK_WARN_INTERVAL. In fact, it might make sense to 
make this value closer to 30 minutes since that's the minimum period between 
warning logs for any given replica.
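
   To make the interaction between the three values concrete, here is a 
minimal, self-contained sketch of the gating logic described above. It is not 
the code in this PR: the class name `UcWarnSketch`, the `String` replica key, 
the two maps, and the `track`/`check` methods are all hypothetical, and only 
the three durations (5 minutes, 2 hours, 30 minutes) come from the discussion.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; names and structure are assumptions. */
class UcWarnSketch {
  // Minimum time between full iterations over the tracked replicas.
  static final Duration CHECK_INTERVAL = Duration.ofMinutes(5);
  // A replica held open for at least this long is reported.
  static final Duration WARN_THRESHOLD = Duration.ofHours(2);
  // Minimum time between two warnings for the same replica.
  static final Duration WARN_INTERVAL = Duration.ofMinutes(30);

  // Replica id -> time it was first reported as under construction.
  private final Map<String, Instant> firstSeen = new HashMap<>();
  // Replica id -> time the last warning was emitted for it.
  private final Map<String, Instant> lastWarned = new HashMap<>();
  // Time of the last full iteration; null until the first one runs.
  private Instant lastScan = null;

  void track(String replica, Instant now) {
    firstSeen.putIfAbsent(replica, now);
  }

  /** Returns the replicas a warning would be logged for at time {@code now}. */
  List<String> check(Instant now) {
    List<String> warned = new ArrayList<>();
    // Gate 1: the caller may invoke this every 30 seconds, but the full
    // iteration runs at most once per CHECK_INTERVAL.
    if (lastScan != null && now.isBefore(lastScan.plus(CHECK_INTERVAL))) {
      return warned;
    }
    lastScan = now;
    for (Map.Entry<String, Instant> e : firstSeen.entrySet()) {
      // Gate 2: only replicas open for at least WARN_THRESHOLD qualify.
      boolean openTooLong = !now.isBefore(e.getValue().plus(WARN_THRESHOLD));
      // Gate 3: per-replica rate limit of one warning per WARN_INTERVAL.
      Instant last = lastWarned.get(e.getKey());
      boolean mayWarn = last == null || !now.isBefore(last.plus(WARN_INTERVAL));
      if (openTooLong && mayWarn) {
        lastWarned.put(e.getKey(), now);
        warned.add(e.getKey());
      }
    }
    return warned;
  }

  public static void main(String[] args) {
    UcWarnSketch sketch = new UcWarnSketch();
    Instant start = Instant.parse("2024-01-01T00:00:00Z");
    sketch.track("blk_1", start);
    System.out.println(sketch.check(start.plus(Duration.ofHours(2))));
  }
}
```

   Under these assumptions, a check interval larger than the warn interval 
would delay warnings beyond the intended 30 minute cadence, which is why the 
check interval should stay less than or equal to the warn interval.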



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

