Jefffrey commented on code in PR #17536:
URL: https://github.com/apache/datafusion/pull/17536#discussion_r2343365209
##########
datafusion/functions-aggregate/src/average.rs:
##########
@@ -62,6 +62,17 @@ make_udaf_expr_and_func!(
avg_udaf
);
+pub fn avg_distinct(expr: Expr) -> Expr {
+ Expr::AggregateFunction(datafusion_expr::expr::AggregateFunction::new_udf(
+ avg_udaf(),
+ vec![expr],
+ true,
+ None,
+ vec![],
+ None,
+ ))
+}
Review Comment:
This is the same way `count` handles it:
https://github.com/apache/datafusion/blob/bfc5067718a3ddcb87531b5a9633605792078546/datafusion/functions-aggregate/src/count.rs#L71-L80
##########
datafusion/core/tests/dataframe/mod.rs:
##########
@@ -496,32 +497,35 @@ async fn drop_with_periods() -> Result<()> {
#[tokio::test]
async fn aggregate() -> Result<()> {
// build plan using DataFrame API
- let df = test_table().await?;
+ // union so some of the distincts have a clearly distinct result
+ let df = test_table().await?.union(test_table().await?)?;
let group_expr = vec![col("c1")];
let aggr_expr = vec![
- min(col("c12")),
- max(col("c12")),
- avg(col("c12")),
- sum(col("c12")),
- count(col("c12")),
- count_distinct(col("c12")),
+ min(col("c4")).alias("min(c4)"),
+ max(col("c4")).alias("max(c4)"),
+ avg(col("c4")).alias("avg(c4)"),
+ avg_distinct(col("c4")).alias("avg_distinct(c4)"),
+ sum(col("c4")).alias("sum(c4)"),
+ sum_distinct(col("c4")).alias("sum_distinct(c4)"),
+ count(col("c4")).alias("count(c4)"),
+ count_distinct(col("c4")).alias("count_distinct(c4)"),
Review Comment:
I switched from `c12` to `c4` because `c12` showed precision variations under
`avg_distinct`, which led to inconsistent test results. Switching columns
seemed easier than wrapping the outputs in `round`.
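
For readers following along, here is a minimal std-only sketch (not DataFusion code) of the two points above: unioning a table with itself changes `sum` and `count` but not their distinct variants, and floating-point sums can differ with accumulation order. The helper names and sample values are hypothetical, and the assumption that accumulation order is what caused the `c12` variation is mine, not something confirmed in this PR.

```rust
use std::collections::BTreeSet;

/// Plain average over all rows, duplicates included. `None` for empty input.
fn avg(values: &[i64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let sum: i64 = values.iter().sum();
    Some(sum as f64 / values.len() as f64)
}

/// Sum over the distinct values only.
fn sum_distinct(values: &[i64]) -> i64 {
    let distinct: BTreeSet<i64> = values.iter().copied().collect();
    distinct.iter().sum()
}

/// Average over the distinct values only. `None` for empty input.
fn avg_distinct(values: &[i64]) -> Option<f64> {
    let distinct: BTreeSet<i64> = values.iter().copied().collect();
    if distinct.is_empty() {
        return None;
    }
    Some(sum_distinct(values) as f64 / distinct.len() as f64)
}

fn main() {
    // Hypothetical integer column, unioned with itself so every row appears twice.
    let table = vec![1_i64, 2, 6];
    let unioned: Vec<i64> = table.iter().chain(table.iter()).copied().collect();

    // sum doubles after the union, sum_distinct does not.
    println!("sum          = {}", unioned.iter().sum::<i64>());
    println!("sum_distinct = {}", sum_distinct(&unioned));

    // avg only differs from avg_distinct when duplicates are unevenly spread.
    println!("avg          = {:?}", avg(&unioned));
    println!("avg_distinct = {:?}", avg_distinct(&unioned));
    println!("avg_distinct([1, 1, 4]) = {:?}", avg_distinct(&[1, 1, 4]));

    // Floating-point addition is not associative, so a float column can
    // produce slightly different sums depending on accumulation order.
    let left = (0.1_f64 + 0.2) + 0.3;
    let right = 0.1_f64 + (0.2 + 0.3);
    println!("order-dependent float sum: {}", left != right);
}
```

With an integer column the sums are exact regardless of order, which is consistent with the integer column `c4` giving stable results.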
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]