alamb commented on code in PR #10526:
URL: https://github.com/apache/datafusion/pull/10526#discussion_r1602085179


##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -198,6 +198,73 @@ statement error This feature is not implemented: LIMIT not 
supported in ARRAY_AG
 SELECT array_agg(c13 LIMIT 1) FROM aggregate_test_100
 
 
+# Test distinct aggregate function with merge batch

Review Comment:
   I ran the test without the changes in this PR and it panic'd as expected:
   
   ```shell
   cargo test --test sqllogictests
   ...
   thread 'tokio-runtime-worker' panicked at 
datafusion/physical-expr/src/aggregate/array_agg_distinct.rs:158:9:
   assertion `left == right` failed: state array should only include 1 row!
     left: 5
    right: 1
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   ```



##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -198,6 +198,73 @@ statement error This feature is not implemented: LIMIT not 
supported in ARRAY_AG
 SELECT array_agg(c13 LIMIT 1) FROM aggregate_test_100
 
 
+# Test distinct aggregate function with merge batch
+query II
+with A as (
+    select 1 as id, 2 as foo
+    UNION ALL
+    select 1, null
+    UNION ALL
+    select 1, null
+    UNION ALL
+    select 1, 3
+    UNION ALL
+    select 1, 2
+    ---- The order is non-deterministic, verify with length
+) select array_length(array_agg(distinct a.foo)), sum(distinct 1) from A a 
group by a.id;
+----
+3 1
+
+# It has only AggregateExec with FinalPartitioned mode, so `merge_batch` is 
used
+# If the plan is changed, whether the `merge_batch` is used should be verified 
to ensure the test coverage

Review Comment:
   💯  for the explanatory comments about why this explain is here



##########
datafusion/physical-expr/src/aggregate/array_agg_distinct.rs:
##########
@@ -153,12 +153,11 @@ impl Accumulator for DistinctArrayAggAccumulator {
             return Ok(());
         }
 
-        let array = &states[0];
-
-        assert_eq!(array.len(), 1, "state array should only include 1 row!");
-        // Unwrap outer ListArray then do update batch
-        let inner_array = array.as_list::<i32>().value(0);
-        self.update_batch(&[inner_array])
+        states[0]
+            .as_list::<i32>()
+            .iter()
+            .flatten()

Review Comment:
   what does the flatten do? Discard `Option`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to