swaminathanmanish commented on code in PR #15048:
URL: https://github.com/apache/pinot/pull/15048#discussion_r1954012191
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/SegmentDeletionManager.java:
##########
@@ -282,6 +284,23 @@ protected void removeSegmentFromStore(String tableNameWithType, String segmentId
}
}
+ /**
+ * Gets URI for segment deletion by:
+ * 1. Fetching download URL from ZK metadata if available
+ * 2. Otherwise, constructing URI from data dir, table name and segment ID
+ */
+ private URI getFileToDeleteURI(String tableNameWithType, String segmentId) {
+ String segmentDownloadUrl =
Review Comment:
A failure between segment metadata cleanup and deep store (DS) cleanup can
still happen, and at that point we will no longer have the download URL.
Standardizing the URL format will help there.
We need two things:
1. Standardization of the download URL (in
BaseMultipleSegmentsConversionExecutor and any other places). That will fix
cleanup going forward for new segments. Validating the downloadUrl format
would probably catch problems going forward.
2. A full scan of the DS to clean up old segments with the .tar.gz extension.
This will be a one-time operation.
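The two-step lookup and the suggested format check could be sketched roughly as
follows. This is a minimal, self-contained illustration, not the PR's code: the
class and method names are hypothetical, and it assumes that the "standard"
location is `<dataDir>/<rawTableName>/<encodedSegmentId>` with no file
extension.

```java
import java.net.URI;

// Hypothetical sketch; none of these names are Pinot APIs.
final class DeletionUriSketch {

  // Step 1: use the download URL recorded in metadata when it is present.
  // Step 2: otherwise fall back to the standard location under the data dir.
  static URI resolveFileToDelete(String downloadUrl, String dataDir,
      String rawTableName, String encodedSegmentId) {
    if (downloadUrl != null && !downloadUrl.isEmpty()) {
      return URI.create(downloadUrl);
    }
    return standardUri(dataDir, rawTableName, encodedSegmentId);
  }

  // ASSUMPTION: the standard layout is <dataDir>/<rawTableName>/<segmentId>.
  static URI standardUri(String dataDir, String rawTableName,
      String encodedSegmentId) {
    String base = dataDir.endsWith("/")
        ? dataDir.substring(0, dataDir.length() - 1) : dataDir;
    return URI.create(base + "/" + rawTableName + "/" + encodedSegmentId);
  }

  // Validation hook: true only if the recorded URL matches the standard layout.
  static boolean isStandardDownloadUrl(String downloadUrl, String dataDir,
      String rawTableName, String encodedSegmentId) {
    if (downloadUrl == null) {
      return false;
    }
    try {
      return URI.create(downloadUrl)
          .equals(standardUri(dataDir, rawTableName, encodedSegmentId));
    } catch (IllegalArgumentException e) {
      return false; // malformed URL
    }
  }

  public static void main(String[] args) {
    String dataDir = "s3://bucket/data"; // made-up example location
    System.out.println(resolveFileToDelete(null, dataDir, "myTable", "seg_1"));
    System.out.println(isStandardDownloadUrl(
        "s3://bucket/data/myTable/seg_1.tar.gz", dataDir, "myTable", "seg_1"));
  }
}
```

If every writer of downloadUrl passed a check like `isStandardDownloadUrl`, the
fallback path would point at the same file even after the metadata is gone,
which is the case the first point is meant to cover.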
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/SegmentDeletionManager.java:
##########
@@ -235,7 +237,7 @@ protected void removeSegmentFromStore(String tableNameWithType, String segmentId
long retentionMs = deletedSegmentsRetentionMs == null
? _defaultDeletedSegmentsRetentionMs : deletedSegmentsRetentionMs;
String rawTableName = TableNameBuilder.extractRawTableName(tableNameWithType);
- URI fileToDeleteURI = URIUtils.getUri(_dataDir, rawTableName, URIUtils.encode(segmentId));
+ URI fileToDeleteURI = getFileToDeleteURI(tableNameWithType, segmentId);
Review Comment:
Since we observed that pinotFS behaves differently when forceDelete is
supplied, can we verify in this method that the file has actually been
deleted? The result of the deletion should reflect whether the deletion
actually happened.
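One way to make the result reflect reality is to check existence before and
after the delete call instead of trusting the returned flag alone. The sketch
below is only an illustration: `FileStore` is a hypothetical stand-in that
mirrors the shape of the `delete(URI, boolean)` and `exists(URI)` calls, and
`InMemoryStore` exists purely to make the example runnable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; FileStore is a stand-in, not the real pinotFS type.
final class VerifiedDeleteSketch {

  interface FileStore {
    boolean delete(URI uri, boolean forceDelete) throws IOException;
    boolean exists(URI uri) throws IOException;
  }

  // Returns true only if the file existed before the call and is gone after.
  static boolean deleteAndVerify(FileStore fs, URI uri) {
    try {
      if (!fs.exists(uri)) {
        return false; // nothing to delete, so no deletion happened
      }
      fs.delete(uri, true);
      // Do not rely on the flag returned by delete(); re-check the store.
      return !fs.exists(uri);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Test double: when 'honest' is false, delete() reports success
  // without removing anything.
  static final class InMemoryStore implements FileStore {
    private final Set<URI> _files = new HashSet<>();
    private final boolean _honest;

    InMemoryStore(boolean honest) {
      _honest = honest;
    }

    void add(URI uri) {
      _files.add(uri);
    }

    @Override
    public boolean delete(URI uri, boolean forceDelete) {
      if (_honest) {
        _files.remove(uri);
      }
      return true;
    }

    @Override
    public boolean exists(URI uri) {
      return _files.contains(uri);
    }
  }

  public static void main(String[] args) {
    URI segment = URI.create("file:///data/myTable/seg_1"); // made-up path
    InMemoryStore store = new InMemoryStore(false);
    store.add(segment);
    System.out.println(deleteAndVerify(store, segment));
  }
}
```

The extra `exists` calls cost additional round trips to the store, so whether
this belongs on every deletion or only behind a flag is a design choice for
the PR author.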
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]