jtuglu1 commented on PR #19422:
URL: https://github.com/apache/druid/pull/19422#issuecomment-4580987882

   Follow-up on the broader review concerns:
   
   * I tested this against the production `servers?full` snapshot saved locally 
in `servers.json`. The skew is byte-driven rather than count-driven: 44 
same-capacity Historicals ranged from about 69% to 99.8% disk used while 
segment counts were comparatively flat. The most-full nodes had much larger 
average segment sizes, so the old size-blind/locality-heavy scoring could 
preserve disproportionate placement.
   * The implementation now uses projected post-placement headroom, `cost / 
max(EPSILON, 1 - projectedUsageRatio)`, where the projected ratio includes the 
segment size if the server does not already project that segment. This 
specifically prevents moving/loading a large segment onto a server that would 
become nearly full after placement.
   * I did not add a legacy-behavior toggle. Since `diskNormalized` is already 
an opt-in strategy and the old behavior is the behavior this PR is fixing, 
keeping the PR focused avoids carrying an extra compatibility mode.
   * Docs and tests have been updated. The focused test passes, and as a 
negative control the diskNormalized-specific assertions fail if the test helper 
is temporarily changed to use `CostBalancerStrategy`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to