Copilot commented on code in PR #18724:
URL: https://github.com/apache/pinot/pull/18724#discussion_r3417580256


##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java:
##########
@@ -510,6 +511,24 @@ public void validateTaskConfigs(TableConfig tableConfig, 
Schema schema, Map<Stri
       Preconditions.checkState(columnNames.contains(dimension), "Column 
dimension to erase \"" + dimension
           + "\" not found in schema!");
     }
+    // check no mis-configured aggregation types
+    for (Map.Entry<String, String> entry : taskConfigs.entrySet()) {
+      if (entry.getKey().endsWith(MergeTask.AGGREGATION_TYPE_KEY_SUFFIX)) {
+        String column = StringUtils.removeEnd(entry.getKey(), 
MergeTask.AGGREGATION_TYPE_KEY_SUFFIX);
+        try {

Review Comment:
   `validateTaskConfigs` validates aggregation types but does not verify that 
the configured column exists in the schema. A typo like 
`missingCol.aggregationType=sum` would currently pass validation (and then be 
silently ignored at runtime for non-order-sensitive types). Add a schema 
existence check for the derived `column` before parsing/allowlisting the 
aggregation type.



##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/MergeTaskUtils.java:
##########
@@ -137,6 +139,31 @@ public static Map<String, AggregationFunctionType> 
getAggregationTypes(Map<Strin
     return aggregationTypes;
   }
 
+  /// Validates the prerequisites for an order sensitive aggregation 
(FIRSTWITHTIME/LASTWITHTIME) on the given
+  /// column: the column must be a metric column in the schema, and the table 
must have a time column with a DateTime
+  /// spec in the schema so that the reducer can order the values by the 
original time. No-op for other aggregation
+  /// types. The given aggregation type must be parseable (see
+  /// [AggregationFunctionType#getAggregationFunctionType(String)]).
+  public static void validateOrderSensitiveAggregation(TableConfig 
tableConfig, Schema schema, String column,
+      String aggregationType) {
+    AggregationFunctionType aggregationFunctionType = 
AggregationFunctionType.getAggregationFunctionType(
+        aggregationType);
+    if (!ValueAggregatorFactory.isOrderSensitive(aggregationFunctionType)) {
+      return;
+    }
+    FieldSpec fieldSpec = schema.getFieldSpecFor(column);
+    Preconditions.checkState(fieldSpec != null && fieldSpec.getFieldType() == 
FieldSpec.FieldType.METRIC,
+        "Aggregation type: %s on column: %s requires the column to be a metric 
column in schema!", aggregationType,
+        column);

Review Comment:
   `validateOrderSensitiveAggregation` uses a combined check (`fieldSpec != 
null && fieldSpec.getFieldType() == METRIC`) but the error message only 
mentions “metric column”. When the column is missing from the schema, this 
message is misleading and makes config debugging harder. Split this into two 
`checkState` calls so missing-column and wrong-field-type are reported 
accurately.



##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/genericrow/GenericRowFileRecordReader.java:
##########
@@ -94,6 +94,12 @@ public int compare(int rowId1, int rowId2) {
     return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2]);
   }
 
+  /// Compares the records at the given row ids. Only compare the values for 
the first `numFieldsToCompare` fields.
+  public int compare(int rowId1, int rowId2, int numFieldsToCompare) {
+    assert _sortedRowIds != null;
+    return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2], 
numFieldsToCompare);
+  }

Review Comment:
   The new `compare(rowId1, rowId2, numFieldsToCompare)` relies on `assert 
_sortedRowIds != null;`. Assertions are typically disabled in production, so 
this can devolve into a confusing NPE if the reader is used before sorting. 
Prefer an explicit runtime check that throws a clear exception (and consider 
aligning `compare(int,int)` similarly).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to