Copilot commented on code in PR #17368:
URL: https://github.com/apache/iotdb/pull/17368#discussion_r2993494409


##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -32,6 +34,12 @@ public void addIntInput(Column[] arguments, AggregationMask 
mask) {
 
     if (mask.isSelectAll()) {
       for (int i = 0; i < valueColumn.getPositionCount(); i++) {
+        if (weightColumn.isNull(i)) {
+          continue;
+        }

Review Comment:
   Null weights are currently silently skipped (`continue`), which contradicts 
the PR description that says weights are validated to be non-null. If the 
intended behavior is to reject NULL weights (at least when the value is 
non-null), this should throw a SemanticException instead of skipping; otherwise 
the PR description/tests should be aligned to explicitly define the semantics 
for NULL weights.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -41,6 +49,12 @@ public void addIntInput(Column[] arguments, AggregationMask 
mask) {
       int position;
       for (int i = 0; i < positionCount; i++) {
         position = selectedPositions[i];
+        if (weightColumn.isNull(position)) {
+          continue;
+        }
+        if (weightColumn.getInt(position) < 1) {
+          throw new SemanticException("weight must be >= 1, was " + 
weightColumn.getInt(position));
+        }
         if (!valueColumn.isNull(position)) {
           tDigest.add(valueColumn.getInt(position), 
weightColumn.getInt(position));
         }

Review Comment:
   Only the non-grouped accumulator was updated to validate/handle weights. 
Group-by queries use GroupedApproxPercentileWithWeightAccumulator (via 
AccumulatorFactory) which still calls `weightColumn.getInt(...)` without 
null/>=1 checks, so group-by approx_percentile(..., weight, ...) can still 
accept invalid weights or fail on NULL weights. The same validation logic 
should be applied to the grouped accumulator to keep behavior consistent.



##########
integration-test/src/test/java/org/apache/iotdb/relational/it/query/recent/IoTDBTableAggregationIT.java:
##########
@@ -4352,6 +4352,12 @@ public void approxPercentileTest() {
           "2024-09-24T06:15:55.000Z,shanghai,55,null,",
         },
         DATABASE_NAME);
+
+    tableResultSetEqualTest(
+        "select approx_percentile(s1,null,0.5) from table1",
+        new String[] {"_col0"},
+        new String[] {"null,"},
+        DATABASE_NAME);

Review Comment:
   The new NULL-weight test only covers the non-grouped execution path. Since 
group-by uses a different (grouped) accumulator, it would be good to add an IT 
that exercises `approx_percentile(..., null, ...)` (and an invalid weight like 
-1) under GROUP BY to ensure the grouped implementation matches the intended 
semantics and doesn’t regress.
   ```suggestion
           DATABASE_NAME);
   
       tableResultSetEqualTest(
           "select 1 as g, approx_percentile(s1,null,0.5) from table1 group by 
1",
           new String[] {"g", "_col1"},
           new String[] {"1,null,"},
           DATABASE_NAME);
   ```



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/operator/source/relational/aggregation/ApproxPercentileWithWeightAccumulator.java:
##########
@@ -32,6 +34,12 @@ public void addIntInput(Column[] arguments, AggregationMask 
mask) {
 
     if (mask.isSelectAll()) {
       for (int i = 0; i < valueColumn.getPositionCount(); i++) {
+        if (weightColumn.isNull(i)) {
+          continue;
+        }
+        if (weightColumn.getInt(i) < 1) {
+          throw new SemanticException("weight must be >= 1, was " + 
weightColumn.getInt(i));
+        }
         if (!valueColumn.isNull(i)) {
           tDigest.add(valueColumn.getInt(i), weightColumn.getInt(i));
         }

Review Comment:
   Weight validation happens before checking whether the corresponding value is 
null. This means rows with NULL values can still throw on an invalid/negative 
weight even though that row would otherwise be ignored by the aggregation. 
Consider only reading/validating the weight inside the 
`!valueColumn.isNull(...)` branch (and then validating null/<=0 weight there).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to