ChenSammi commented on code in PR #9505:
URL: https://github.com/apache/ozone/pull/9505#discussion_r2629666258
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/policy/DefaultVolumeChoosingPolicy.java:
##########
@@ -50,67 +52,43 @@ public DefaultVolumeChoosingPolicy(ReentrantLock globalLock) {
@Override
public Pair<HddsVolume, HddsVolume> chooseVolume(MutableVolumeSet volumeSet,
- double threshold, Map<HddsVolume, Long> deltaMap, long containerSize) {
+ double thresholdPercentage, Map<HddsVolume, Long> deltaMap, long containerSize) {
lock.lock();
try {
// Create truly immutable snapshot of volumes to ensure consistency
- ImmutableList<HddsVolume> allVolumes = DiskBalancerVolumeCalculation.getImmutableVolumeSet(volumeSet);
-
+ final List<StorageVolume> allVolumes = volumeSet.getVolumesList();
if (allVolumes.size() < 2) {
return null; // Can't balance with less than 2 volumes.
}
-
- // Calculate ideal usage using the same immutable volume
- double idealUsage = DiskBalancerVolumeCalculation.getIdealUsage(allVolumes, deltaMap);
- // Threshold is given as a percentage
- double normalizedThreshold = threshold / 100;
- List<HddsVolume> volumes = allVolumes
- .stream()
- .filter(volume -> {
- SpaceUsageSource usage = volume.getCurrentUsage();
Review Comment:
@szetszwo, I got the point after a second thought. I had understood the
filtering out of volumes under the threshold as an efficient way of selecting
destVolume. The straightforward reasoning is that if one volume is beyond the
utilization threshold, there is likely another volume under it. But I realized
that there are other cases: one volume beyond the threshold and no volumes
under it, or one volume under the threshold and no volumes beyond it:
(1) one volume beyond threshold and no volumes under threshold
```
Disk1, 30, 100
Disk2, 30, 100
Disk3, 40, 100
100 / 300 = 33.3%
Disk1: 30%
Disk2: 30%
Disk3: 40%
Threshold: 10
Disk utilization range (23.3, 43.3)
Out range volume list: NULL
Threshold: 5
Disk utilization range (28.3, 38.3)
Out range volume list: Disk3
```
(2) one volume under threshold and no volumes beyond threshold
```
Disk1, 30, 100
Disk2, 30, 100
Disk3, 20, 100
80 / 300 = 26.7%
Disk1: 30%
Disk2: 30%
Disk3: 20%
Threshold: 10
Disk utilization range (16.7, 36.7)
Out range volume list: NULL
Threshold: 5
Disk utilization range (21.7, 31.7)
Out range volume list: Disk3
```
So the two cases above are typical cases that are not covered by the existing
logic. And it looks like case (2) is not covered by the new logic either.
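
To make the arithmetic in both cases easy to re-check, here is a minimal standalone sketch. It is not the code in this PR: `ThresholdRangeSketch` and `outOfRange` are hypothetical names, and it assumes the ideal usage is simply total used / total capacity, ignoring `deltaMap` and any reserved space that the real calculation may account for.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdRangeSketch {

  /**
   * Hypothetical helper, not part of Ozone: lists the disks whose utilization
   * falls outside (ideal - threshold, ideal + threshold), all in percent.
   */
  static List<String> outOfRange(long[] used, long[] capacity,
      double thresholdPercentage) {
    long totalUsed = 0;
    long totalCapacity = 0;
    for (int i = 0; i < used.length; i++) {
      totalUsed += used[i];
      totalCapacity += capacity[i];
    }
    // Assumption: ideal usage = total used / total capacity.
    double ideal = 100.0 * totalUsed / totalCapacity;
    double lower = ideal - thresholdPercentage;
    double upper = ideal + thresholdPercentage;

    List<String> result = new ArrayList<>();
    for (int i = 0; i < used.length; i++) {
      double utilization = 100.0 * used[i] / capacity[i];
      if (utilization > upper) {
        result.add("Disk" + (i + 1) + ":OVER");
      } else if (utilization < lower) {
        result.add("Disk" + (i + 1) + ":UNDER");
      }
    }
    return result;
  }

  public static void main(String[] args) {
    long[] capacity = {100, 100, 100};
    long[] case1 = {30, 30, 40}; // case (1): ideal usage 33.3%
    long[] case2 = {30, 30, 20}; // case (2): ideal usage 26.7%

    System.out.println(outOfRange(case1, capacity, 10)); // []
    System.out.println(outOfRange(case1, capacity, 5));  // [Disk3:OVER]
    System.out.println(outOfRange(case2, capacity, 10)); // []
    System.out.println(outOfRange(case2, capacity, 5));  // [Disk3:UNDER]
  }
}
```

With threshold 5, case (1) yields only an over-utilized volume and case (2) only an under-utilized one, so a policy that requires both kinds of volume to be present would pick nothing in either case.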