jtuglu1 commented on PR #19422: URL: https://github.com/apache/druid/pull/19422#issuecomment-4580987882
Follow-up on the broader review concerns: * I tested this against the production `servers?full` snapshot saved locally in `servers.json`. The skew is byte-driven rather than count-driven: 44 same-capacity Historicals ranged from about 69% to 99.8% disk used while segment counts were comparatively flat. The most-full nodes had much larger average segment sizes, so the old size-blind/locality-heavy scoring could preserve disproportionate placement. * The implementation now uses projected post-placement headroom, `cost / max(EPSILON, 1 - projectedUsageRatio)`, where the projected ratio includes the segment size if the server does not already project that segment. This specifically prevents moving/loading a large segment onto a server that would become nearly full after placement. * I did not add a legacy-behavior toggle. Since `diskNormalized` is already an opt-in strategy and the old behavior is the behavior this PR is fixing, keeping the PR focused avoids carrying an extra compatibility mode. * Docs and tests have been updated. The focused test passes, and as a negative control the diskNormalized-specific assertions fail if the test helper is temporarily changed to use `CostBalancerStrategy`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
