FrankChen021 commented on code in PR #19422:
URL: https://github.com/apache/druid/pull/19422#discussion_r3201245023


##########
server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java:
##########
@@ -85,17 +85,16 @@ protected double computePlacementCost(
       return cost;
     }
 
-    // Guard against NaN propagation in the cost comparator if a server
-    // somehow reports a non-positive maxSize. Such a server cannot hold
-    // anything and will be rejected by canLoadSegment, so returning the
-    // raw cost is safe.
+    // A server with non-positive maxSize cannot hold anything and will be
+    // rejected by canLoadSegment; return the raw cost to avoid NaN 
propagation.
     final long maxSize = server.getMaxSize();
     if (maxSize <= 0) {
       return cost;
     }
 
-    double usageRatio = (double) server.getSizeUsed() / maxSize;
-    double normalizedCost = cost * usageRatio;
+    final double usageRatio = (double) server.getSizeUsed() / maxSize;
+    final double headroom = Math.max(EPSILON, 1.0 - usageRatio);
+    double normalizedCost = cost / headroom;

Review Comment:
   [P1] Existing threshold test now fails
   
   Changing normalization to `cost / headroom` makes the existing 
`testThresholdBlocksMarginalMove` scenario choose DEST: source is roughly `38K 
/ 0.20 * 0.95 = 180.5K`, while dest is `40K / 0.26 = 153.8K`. The test still 
asserts null, so the server test suite should fail unless the threshold 
scenario or algorithm is adjusted.



##########
server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java:
##########
@@ -26,40 +26,40 @@
 
 /**
  * A {@link BalancerStrategy} which normalizes the cost of placing a segment 
on a
- * server as calculated by {@link CostBalancerStrategy} by multiplying it by 
the
- * server's disk usage ratio.
+ * server as calculated by {@link CostBalancerStrategy} by dividing by the

Review Comment:
   [P2] Public docs still describe the old formula
   
   The implementation and Javadoc now divide by available headroom, but 
`docs/design/coordinator.md` and `docs/configuration/index.md` still say 
`diskNormalized` multiplies cost by `diskUsed / maxSize`. That leaves 
user-facing behavior documentation incorrect for this config option.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to