somandal commented on code in PR #15175:
URL: https://github.com/apache/pinot/pull/15175#discussion_r1978375341
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -963,6 +974,15 @@ private Pair<InstancePartitions, Boolean> getInstancePartitionsForTier(TableConf
}
}
+ private DiskUsageInfo getDiskUsageInfoOfInstance(String instanceId) {
+ return ResourceUtilizationInfo.getDiskUsageInfo(instanceId);
+ }
+
+ private long getSegmentsSizeInBytes(Set<String> segments) {
+ // TODO: replace with table API to get total segment size
+ return 1000L * segments.size();
Review Comment:
I would calculate the average segment size based on the TableSize API. Can you use that to calculate this?
It is okay for this to be an estimate for now (we can enhance this later to use exact segment sizes if we need the accuracy; the TableSize API does provide per-segment information):
```
long tableSizePerReplicaInBytes = calculateTableSizePerReplicaInBytes(tableNameWithType);
long averageSegmentSizeInBytes = tableSizePerReplicaInBytes <= 0 ? tableSizePerReplicaInBytes
    : tableSizePerReplicaInBytes / ((long) currentAssignment.size());
```
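The estimate above could be sketched as a small helper. Everything here (the class, its constructor arguments, the guard for an unknown table size) is an illustrative assumption, not the actual Pinot code:

```java
import java.util.Set;

// Illustrative sketch: derive an average segment size from the per-replica
// table size (as reported by the TableSize API) and use it to estimate the
// total size of any set of segments. Not the actual Pinot implementation.
public class SegmentSizeEstimator {
  private final long _averageSegmentSizeInBytes;

  public SegmentSizeEstimator(long tableSizePerReplicaInBytes, int totalSegments) {
    // Guard against an unavailable table size (<= 0) or an empty assignment
    _averageSegmentSizeInBytes = (tableSizePerReplicaInBytes <= 0 || totalSegments == 0)
        ? 0L : tableSizePerReplicaInBytes / totalSegments;
  }

  // Estimated total size: average segment size times the number of segments
  public long getSegmentsSizeInBytes(Set<String> segments) {
    return _averageSegmentSizeInBytes * segments.size();
  }
}
```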
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -677,25 +680,33 @@ private RebalanceSummaryResult calculateDryRunSummary(Map<String, Map<String, St
} else {
serversAdded.add(server);
}
- newSegmentSet.removeAll(existingSegmentSet);
+ newSegmentSet.removeAll(intersection);
+ Set<String> removedSegmentSet = new HashSet<>(existingSegmentSet);
+ removedSegmentSet.removeAll(intersection);
int segmentsAdded = newSegmentSet.size();
if (segmentsAdded > 0) {
serversGettingNewSegments.add(server);
}
maxSegmentsAddedToServer = Math.max(maxSegmentsAddedToServer,
segmentsAdded);
- int segmentsDeleted = existingSegmentSet.size() - segmentsUnchanged;
+ int segmentsDeleted = removedSegmentSet.size();
+
+ long diskSizeChange = getSegmentsSizeInBytes(newSegmentSet) - getSegmentsSizeInBytes(removedSegmentSet);
Review Comment:
Please see the discussion with @Jackie-Jiang on how this should be calculated. We want a worst-case estimate as well (maybe you can add an additional parameter for the worst case).
In the worst case the usage will be `currentDiskUsageInBytes + segmentSizeInBytesToBeAdded`, if the new segments are added before the segments to be removed are deleted from the server.
If `lowDiskMode` is enabled, we should be able to use a more exact size, since the segments to be removed are removed first. In that case we can assume the worst case to be `currentDiskUsageInBytes + segmentSizeInBytesToBeAdded - segmentSizeInBytesToBeRemoved`.
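A minimal sketch of the two worst-case formulas above; the class, method, and parameter names are illustrative, not the actual Pinot fields:

```java
// Illustrative sketch of the worst-case disk estimate described above.
// Without lowDiskMode, new segments may land before old ones are deleted,
// so the peak is current usage plus everything added. With lowDiskMode,
// segments to be removed are deleted first, so removals can be netted out.
public class WorstCaseDiskUsage {
  public static long estimatePeakBytes(long currentDiskUsageInBytes, long segmentSizeInBytesToBeAdded,
      long segmentSizeInBytesToBeRemoved, boolean lowDiskMode) {
    return lowDiskMode
        ? currentDiskUsageInBytes + segmentSizeInBytesToBeAdded - segmentSizeInBytesToBeRemoved
        : currentDiskUsageInBytes + segmentSizeInBytesToBeAdded;
  }
}
```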
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/RebalanceSummaryResult.java:
##########
@@ -86,12 +89,18 @@ public static class ServerSegmentChangeInfo {
    public ServerSegmentChangeInfo(@JsonProperty("serverStatus") ServerStatus serverStatus,
        @JsonProperty("totalSegmentsAfterRebalance") int totalSegmentsAfterRebalance,
        @JsonProperty("totalSegmentsBeforeRebalance") int totalSegmentsBeforeRebalance,
+ @JsonProperty("diskUsedAfterRebalance") long diskUsedAfterRebalance,
Review Comment:
- We should also have a field tracking the worst-case size increase that we can calculate. I have left a comment on how to do this where you do the calculations.
- It may also be a good idea to report the % disk utilization after rebalance, both for the final state and for the worst-case scenario.
One more option may be to add a flag to `ServerInfo` for a quick check, something like:
- `boolean isAnyServerWorstCaseUtilizationAbove80pctDuringRebalance`
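The % utilization and the quick-check flag could be computed along these lines; the 80% threshold and all names here are illustrative assumptions, not the actual `ServerInfo` fields:

```java
// Illustrative sketch of the suggested utilization fields. utilizationPct
// returns -1 when the server's disk capacity is unknown (<= 0).
public class DiskUtilization {
  public static int utilizationPct(long usedBytes, long capacityBytes) {
    return capacityBytes <= 0 ? -1 : (int) (usedBytes * 100 / capacityBytes);
  }

  // Quick check: does any server's worst-case usage exceed the threshold?
  public static boolean anyAboveThreshold(long[] worstCaseUsedBytes, long capacityBytes, int thresholdPct) {
    for (long used : worstCaseUsedBytes) {
      if (utilizationPct(used, capacityBytes) > thresholdPct) {
        return true;
      }
    }
    return false;
  }
}
```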
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -963,6 +974,15 @@ private Pair<InstancePartitions, Boolean> getInstancePartitionsForTier(TableConf
}
}
+ private DiskUsageInfo getDiskUsageInfoOfInstance(String instanceId) {
+ return ResourceUtilizationInfo.getDiskUsageInfo(instanceId);
Review Comment:
How often is this updated? It may be a good idea to compute it on the spot to give a more accurate value, if that is possible.
As the rebalance runs, we can still run out of disk if a large number of segments get added during the rebalance, so this is indeed just an estimate. If it is too hard to keep the value up to date, let's add a comment indicating that limitation and some detail on how stale the value can be. We can discuss this some more.
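If the cached value carried a last-updated timestamp, callers could at least bound its staleness. This sketch assumes such a timestamp exists, which may not match the actual `DiskUsageInfo` API:

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative staleness check for a cached disk-usage reading. Assumes the
// cached value exposes a last-updated Instant; the actual DiskUsageInfo
// class may not provide one.
public class StalenessCheck {
  public static boolean isStale(Instant lastUpdated, Instant now, Duration maxAge) {
    return Duration.between(lastUpdated, now).compareTo(maxAge) > 0;
  }
}
```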
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]