Copilot commented on code in PR #17368:
URL: https://github.com/apache/iotdb/pull/17368#discussion_r2993494409
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -32,6 +34,12 @@ public void addIntInput(Column[] arguments, AggregationMask
mask) {
if (mask.isSelectAll()) {
for (int i = 0; i < valueColumn.getPositionCount(); i++) {
+ if (weightColumn.isNull(i)) {
+ continue;
+ }
Review Comment:
Null weights are currently silently skipped (`continue`), which contradicts
the PR description that says weights are validated to be non-null. If the
intended behavior is to reject NULL weights (at least when the value is
non-null), this should throw a SemanticException instead of skipping; otherwise
the PR description/tests should be aligned to explicitly define the semantics
for NULL weights.
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -41,6 +49,12 @@ public void addIntInput(Column[] arguments, AggregationMask
mask) {
int position;
for (int i = 0; i < positionCount; i++) {
position = selectedPositions[i];
+ if (weightColumn.isNull(position)) {
+ continue;
+ }
+ if (weightColumn.getInt(position) < 1) {
+ throw new SemanticException("weight must be >= 1, was " +
weightColumn.getInt(position));
+ }
if (!valueColumn.isNull(position)) {
tDigest.add(valueColumn.getInt(position),
weightColumn.getInt(position));
}
Review Comment:
Only the non-grouped accumulator was updated to validate/handle weights.
Group-by queries use GroupedApproxPercentileWithWeightAccumulator (via
AccumulatorFactory) which still calls `weightColumn.getInt(...)` without
null/>=1 checks, so group-by approx_percentile(..., weight, ...) can still
accept invalid weights or fail on NULL weights. The same validation logic
should be applied to the grouped accumulator to keep behavior consistent.
##########
integration-test/src/test/java/org/apache/iotdb/relational/it/query/recent/IoTDBTableAggregationIT.java:
##########
@@ -4352,6 +4352,12 @@ public void approxPercentileTest() {
"2024-09-24T06:15:55.000Z,shanghai,55,null,",
},
DATABASE_NAME);
+
+ tableResultSetEqualTest(
+ "select approx_percentile(s1,null,0.5) from table1",
+ new String[] {"_col0"},
+ new String[] {"null,"},
+ DATABASE_NAME);
Review Comment:
The new NULL-weight test only covers the non-grouped execution path. Since
group-by uses a different (grouped) accumulator, it would be good to add an IT
that exercises `approx_percentile(..., null, ...)` (and an invalid weight like
-1) under GROUP BY to ensure the grouped implementation matches the intended
semantics and doesn’t regress.
```suggestion
DATABASE_NAME);
tableResultSetEqualTest(
"select 1 as g, approx_percentile(s1,null,0.5) from table1 group by
1",
new String[] {"g", "_col1"},
new String[] {"1,null,"},
DATABASE_NAME);
```
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -32,6 +34,12 @@ public void addIntInput(Column[] arguments, AggregationMask
mask) {
if (mask.isSelectAll()) {
for (int i = 0; i < valueColumn.getPositionCount(); i++) {
+ if (weightColumn.isNull(i)) {
+ continue;
+ }
+ if (weightColumn.getInt(i) < 1) {
+ throw new SemanticException("weight must be >= 1, was " +
weightColumn.getInt(i));
+ }
if (!valueColumn.isNull(i)) {
tDigest.add(valueColumn.getInt(i), weightColumn.getInt(i));
}
Review Comment:
Weight validation happens before checking whether the corresponding value is
null. This means rows with NULL values can still throw on an invalid/negative
weight even though that row would otherwise be ignored by the aggregation.
Consider only reading/validating the weight inside the
`!valueColumn.isNull(...)` branch (and then validating null/<=0 weight there).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]