noob-se7en commented on code in PR #18549:
URL: https://github.com/apache/pinot/pull/18549#discussion_r3290888037


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/SegmentValidationUtils.java:
##########
@@ -57,6 +70,66 @@ public static void validateTimeInterval(SegmentMetadata 
segmentMetadata, TableCo
     }
   }
 
+  public static void validateUpsertSegmentPartitionMetadata(SegmentMetadata 
segmentMetadata, TableConfig tableConfig) {
+    if (tableConfig.getTableType() != TableType.OFFLINE || 
!tableConfig.isUpsertEnabled()) {
+      return;
+    }
+
+    String partitionColumn = getPartitionColumn(tableConfig);
+    if (StringUtils.isEmpty(partitionColumn)) {
+      return;
+    }
+
+    ColumnMetadata columnMetadata = 
segmentMetadata.getColumnMetadataFor(partitionColumn);
+    Set<Integer> partitions = columnMetadata != null ? 
columnMetadata.getPartitions() : null;
+    if (partitions == null || partitions.size() != 1) {
+      throw new ControllerApplicationException(LOGGER,
+          String.format("Uploaded segment: %s for offline upsert table: %s 
must contain exactly one partition id for "

Review Comment:
   Pinot convention #14404 avoids `String.format` in exception messages even 
outside hot paths, since the pattern gets adopted by callers later. consider 
plain concatenation: `"Uploaded segment: " + segmentMetadata.getName() + " for 
offline upsert table: " + tableConfig.getTableName() + " must contain exactly 
one partition id for column: " + partitionColumn + ", got: " + partitions`.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/upload/SegmentValidationUtils.java:
##########
@@ -57,6 +70,66 @@ public static void validateTimeInterval(SegmentMetadata 
segmentMetadata, TableCo
     }
   }
 
+  public static void validateUpsertSegmentPartitionMetadata(SegmentMetadata 
segmentMetadata, TableConfig tableConfig) {
+    if (tableConfig.getTableType() != TableType.OFFLINE || 
!tableConfig.isUpsertEnabled()) {

Review Comment:
   the table-type guard skips realtime upsert, but realtime upsert tables 
receive uploads via backfill / segment-replace / minion refresh, and a 
multi-partition segment in those flows breaks upsert the same way. is realtime 
exclusion intentional? extending coverage isn't quite a one-line drop of the 
guard since `getPartitionColumn` only reads 
`instanceAssignmentConfigMap[OFFLINE]`; realtime resolution needs `[CONSUMING]` 
(and likely `[COMPLETED]`) too. worth either widening scope or adding a code 
comment justifying the offline-only restriction.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to