[GitHub] [ozone] sodonnel commented on a diff in pull request #4415: HDDS-8168. Make deadlines inside MoveManager for move commands configurable

via GitHub Tue, 21 Mar 2023 04:06:59 -0700


sodonnel commented on code in PR #4415:
URL: https://github.com/apache/ozone/pull/4415#discussion_r1143211308



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/MoveManager.java:
##########
@@ -109,8 +109,11 @@ public enum MoveResult {
   // TODO - Should pending ops notify under lock to allow MM to schedule a
   //        delete after the move, but before anything else can, eg RM?
 
-  // TODO - these need to be config defined somewhere, probably in the balancer
-  private static final long MOVE_DEADLINE = 1000 * 60 * 60; // 1 hour
+  /*
+  moveTimeout and replicationTimeout are set by ContainerBalancer.
+   */
+  private long moveTimeout = 1000 * 65 * 60;
+  private long replicationTimeout = 1000 * 50 * 60;
   private static final double MOVE_DEADLINE_FACTOR = 0.95;

Review Comment:
   I wonder whether using a factor like this makes sense for these 
longer-duration timeouts.
   
   The idea I had is that the datanode should time out before SCM does, so that 
when SCM abandons the command, we know the DN has given up on it too.
   
   If we have a replication timeout of 50 mins, then 95% of that is 47.5 mins, 
so the DN will give up 2.5 mins before SCM does. That feels like the DN is 
giving up too early. Whether we have a 10 minute timeout or a 60 minute 
timeout, the DN doesn't need a bigger head start just because the timeout is 
longer.
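
   To make the arithmetic concrete, here is a minimal sketch of how the gap 
between the SCM and DN deadlines grows with the timeout when a 0.95 factor is 
applied. The class and method names are made up for illustration and are not 
part of MoveManager; only the 0.95 value comes from the diff.

```java
import java.time.Duration;

// Illustrative only: FactorGap and gap() are hypothetical names.
class FactorGap {
  // Same value as MOVE_DEADLINE_FACTOR in the diff.
  static final double FACTOR = 0.95;

  // How long before the SCM deadline the DN gives up when the DN deadline
  // is FACTOR times the SCM timeout.
  static Duration gap(Duration scmTimeout) {
    long dnMillis = Math.round(scmTimeout.toMillis() * FACTOR);
    return scmTimeout.minusMillis(dnMillis);
  }

  public static void main(String[] args) {
    for (long minutes : new long[] {10, 50, 60}) {
      Duration g = gap(Duration.ofMinutes(minutes));
      System.out.println(minutes + " min timeout: DN gives up "
          + g.getSeconds() + "s before SCM");
    }
  }
}
```

   With these inputs the gap is 30s, 150s and 180s respectively, so the head 
start scales with the timeout even though the need for it does not.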
   
   Perhaps we could take the factor away from here completely and let RM 
decide what the DN timeout should be. That would simplify the API into RM 
slightly, as we would only need to pass a single timeout.
   
   Perhaps the DN timeout could be 30 seconds less than the SCM timeout for 
all values, rather than a percentage like we have now. It could then be a 
single configuration in RM, rather than having the factor defined both here 
and in RM.
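
   A rough sketch of the fixed-offset idea, assuming RM owns a single offset 
setting. The class name, the method name and the 30 second default are all 
assumptions for illustration, not existing RM code or configuration.

```java
import java.time.Duration;

// Illustrative only: DatanodeDeadline is a hypothetical helper, not RM code.
class DatanodeDeadline {
  // Assumed single RM-level setting: how much earlier than SCM the DN
  // should give up, regardless of how long the SCM timeout is.
  static final Duration DN_OFFSET = Duration.ofSeconds(30);

  // Callers pass only the SCM timeout; the DN timeout is derived from it.
  static Duration datanodeTimeout(Duration scmTimeout) {
    if (scmTimeout.compareTo(DN_OFFSET) <= 0) {
      throw new IllegalArgumentException(
          "SCM timeout must be longer than the DN offset of " + DN_OFFSET);
    }
    return scmTimeout.minus(DN_OFFSET);
  }

  public static void main(String[] args) {
    for (long minutes : new long[] {10, 50, 60}) {
      Duration dn = datanodeTimeout(Duration.ofMinutes(minutes));
      System.out.println(minutes + " min SCM timeout: DN timeout "
          + dn.getSeconds() + "s");
    }
  }
}
```

   The guard matters for short timeouts: a fixed offset only works if the SCM 
timeout is longer than the offset, which a percentage gets for free.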



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

