xiangfu0 commented on code in PR #18724:
URL: https://github.com/apache/pinot/pull/18724#discussion_r3423298927
##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java:
##########
@@ -510,6 +511,24 @@ public void validateTaskConfigs(TableConfig tableConfig,
Schema schema, Map<Stri
Preconditions.checkState(columnNames.contains(dimension), "Column
dimension to erase \"" + dimension
+ "\" not found in schema!");
}
+ // check no mis-configured aggregation types
+ for (Map.Entry<String, String> entry : taskConfigs.entrySet()) {
+ if (entry.getKey().endsWith(MergeTask.AGGREGATION_TYPE_KEY_SUFFIX)) {
+ String column = StringUtils.removeEnd(entry.getKey(),
MergeTask.AGGREGATION_TYPE_KEY_SUFFIX);
+ try {
Review Comment:
Fixed in 22c20c711e — `validateTaskConfigs` now rejects any
`*.aggregationType` whose derived column is not present in the schema (added
`Preconditions.checkState(columnNames.contains(column), ...)` before
parsing/allowlisting), so a typo like `missingCol.aggregationType=sum` now
fails validation fast instead of being silently ignored at runtime. Added a
regression test in `MergeRollupTaskGeneratorTest`.
##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/MergeTaskUtils.java:
##########
@@ -137,6 +139,31 @@ public static Map<String, AggregationFunctionType>
getAggregationTypes(Map<Strin
return aggregationTypes;
}
+ /// Validates the prerequisites for an order sensitive aggregation
(FIRSTWITHTIME/LASTWITHTIME) on the given
+ /// column: the column must be a metric column in the schema, and the table
must have a time column with a DateTime
+ /// spec in the schema so that the reducer can order the values by the
original time. No-op for other aggregation
+ /// types. The given aggregation type must be parseable (see
+ /// [AggregationFunctionType#getAggregationFunctionType(String)]).
+ public static void validateOrderSensitiveAggregation(TableConfig
tableConfig, Schema schema, String column,
+ String aggregationType) {
+ AggregationFunctionType aggregationFunctionType =
AggregationFunctionType.getAggregationFunctionType(
+ aggregationType);
+ if (!ValueAggregatorFactory.isOrderSensitive(aggregationFunctionType)) {
+ return;
+ }
+ FieldSpec fieldSpec = schema.getFieldSpecFor(column);
+ Preconditions.checkState(fieldSpec != null && fieldSpec.getFieldType() ==
FieldSpec.FieldType.METRIC,
+ "Aggregation type: %s on column: %s requires the column to be a metric
column in schema!", aggregationType,
+ column);
Review Comment:
Fixed in 22c20c711e — split the combined precondition into two `checkState`
calls: a missing column now reports "requires the column to exist in schema!"
and a non-metric column reports "requires the column to be a metric column in
schema!", so the two failure modes are no longer conflated.
##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/genericrow/GenericRowFileRecordReader.java:
##########
@@ -94,6 +94,12 @@ public int compare(int rowId1, int rowId2) {
return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2]);
}
+ /// Compares the records at the given row ids. Only compare the values for
the first `numFieldsToCompare` fields.
+ public int compare(int rowId1, int rowId2, int numFieldsToCompare) {
+ assert _sortedRowIds != null;
+ return _fileReader.compare(_sortedRowIds[rowId1], _sortedRowIds[rowId2],
numFieldsToCompare);
+ }
Review Comment:
Fixed in 22c20c711e — both `compare(int,int)` and `compare(int,int,int)` now
use `Preconditions.checkState(_sortedRowIds != null, "Cannot compare rows on an
unsorted reader")` instead of `assert`, so the invariant is enforced in
production (assertions are disabled by default) with a clear message instead of
a confusing NPE.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]