J-HowHuang commented on code in PR #16341:
URL: https://github.com/apache/pinot/pull/16341#discussion_r2229740224
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -1813,9 +1878,99 @@ public int fetch(String segmentName) {
}
}
+ @VisibleForTesting
+ interface DataLossRiskAssessor {
+ boolean hasDataLossRisk(String segmentName);
+
+ String generateDataLossRiskMessage(String segmentName);
+ }
+
+ /**
+ * To be used for non-peer download enabled tables or peer-download enabled
tables rebalanced with
+ * minAvailableReplicas > 0
+ */
+ @VisibleForTesting
+ static class NoOpRiskAssessor implements DataLossRiskAssessor {
+ NoOpRiskAssessor() {
+ }
+
+ @Override
+ public boolean hasDataLossRisk(String segmentName) {
+ return false;
+ }
+
+ @Override
+ public String generateDataLossRiskMessage(String segmentName) {
+ throw new UnsupportedOperationException("No data loss risk message
should be generated for NoOpRiskAssessor");
+ }
+ }
+
+ /**
+ * To be used for peer-download enabled tables with downtime=true or
minAvailableReplicas=0
+ */
+ @VisibleForTesting
+ static class PeerDownloadTableDataLossRiskAssessor implements
DataLossRiskAssessor {
+ private final String _tableNameWithType;
+ private final TableConfig _tableConfig;
+ private final HelixManager _helixManager;
+ private final PinotLLCRealtimeSegmentManager
_pinotLLCRealtimeSegmentManager;
+ private final boolean _isPauselessEnabled;
+
+ @VisibleForTesting
+ PeerDownloadTableDataLossRiskAssessor(String tableNameWithType,
TableConfig tableConfig,
+ int minAvailableReplicas, HelixManager helixManager,
+ PinotLLCRealtimeSegmentManager pinotLLCRealtimeSegmentManager) {
+ // Should only be created for peer-download enabled tables with
minAvailableReplicas = 0
+
Preconditions.checkState(tableConfig.getValidationConfig().getPeerSegmentDownloadScheme()
!= null
+ && minAvailableReplicas == 0);
+ _tableNameWithType = tableNameWithType;
+ _tableConfig = tableConfig;
+ _helixManager = helixManager;
+ _pinotLLCRealtimeSegmentManager = pinotLLCRealtimeSegmentManager;
+ _isPauselessEnabled =
PauselessConsumptionUtils.isPauselessEnabled(tableConfig);
+ }
+
+ @Override
+ public boolean hasDataLossRisk(String segmentName) {
+ SegmentZKMetadata segmentZKMetadata = ZKMetadataProvider
+ .getSegmentZKMetadata(_helixManager.getHelixPropertyStore(),
_tableNameWithType, segmentName);
+ if (segmentZKMetadata == null) {
+ return false;
+ }
+
+ // If the segment state is COMPLETED and the peer download URL is empty,
there is a data loss risk
+ if (segmentZKMetadata.getStatus().isCompleted()) {
+ return
CommonConstants.Segment.METADATA_URI_FOR_PEER_DOWNLOAD.equals(segmentZKMetadata.getDownloadUrl());
+ }
+
+ // If the segment is not yet completed, then the following scenarios are
possible:
+ // - Non-upsert / non-dedup table:
+ // - data loss scenarios are not possible. Either the segment will
restart consumption or the
+ // RealtimeSegmentValidationManager will kick in to fix up the
segment if pauseless is enabled
+ // - Upsert / dedup table:
+ // - For non-pauseless tables, it is safe to move the segment
without data loss concerns
+ // - For pauseless tables, if the segment is still in CONSUMING
state, moving it is safe, but if it is in
+ // COMMITTING state then there is a risk of data loss on segment
build failures as well since the
+ // RealtimeSegmentValidationManager does not automatically try to
fix up these segments. To be safe it is
+ // best to return that there is a risk of data loss for pauseless
enabled tables for segments in COMMITTING
+ // state
Review Comment:
Actually I think this check may be covering too many false positive here.
https://github.com/apache/pinot/blob/3f13371fa4bd4a85921a2c6343aad9e7cd4d461f/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java#L1961-L1964
The data loss only occurs when a segment fails to build, plus it's partial
upsert or dedup so that the data can't be recovered. But we deny any segment
who is building here, assuming they are all to fail the build.
Additionally, if the build failure actually happens, the rebalance will fail
too unless `bestEffort=true`. It's an exception that needs explicit handling,
it should be fine we don't consider it fails here.
Let's not include this check in this PR and we can discuss how to check this
in the future.
cc @somandal
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]