wiedld commented on code in PR #14637: URL: https://github.com/apache/datafusion/pull/14637#discussion_r1958868756
##########
datafusion/core/tests/physical_optimizer/enforce_sorting.rs:
##########
@@ -2280,3 +2283,49 @@ async fn test_not_replaced_with_partial_sort_for_unbounded_input() -> Result<()>
     assert_optimized!(expected_input, expected_no_change, physical_plan, true);
     Ok(())
 }
+
+#[tokio::test]
+async fn test_preserve_needed_coalesce() -> Result<()> {
+    // Input to EnforceSorting, from our test case.
+    let plan = projection_exec_with_alias(
+        union_exec(vec![parquet_exec_with_stats(); 2]),
+        vec![
+            ("a".to_string(), "a".to_string()),
+            ("b".to_string(), "value".to_string()),
+        ],
+    );
+    let plan = Arc::new(CoalescePartitionsExec::new(plan));
+    let schema = schema();
+    let sort_key = LexOrdering::new(vec![PhysicalSortExpr {
+        expr: col("a", &schema).unwrap(),
+        options: SortOptions::default(),
+    }]);
+    let plan: Arc<dyn ExecutionPlan> =
+        single_partitioned_aggregate(plan, vec![("a".to_string(), "a1".to_string())]);
+    let plan = sort_exec(sort_key, plan);
+
+    // Starting plan: as in our test case.
+    let starting_plan = vec![
+        "SortExec: expr=[a@0 ASC], preserve_partitioning=[false]",
+        "  AggregateExec: mode=SinglePartitioned, gby=[a@0 as a1], aggr=[]",
+        "    CoalescePartitionsExec",

Review Comment:
   For two reasons:
   * (1) When using the default physical planner, adjacent partial and final aggregates are combined into a single aggregate before EnforceSorting runs ([see the rule ordering here](https://github.com/apache/datafusion/blob/e4b78c7ed40c248cfc9596d53f1813b62c668249/datafusion/physical-optimizer/src/optimizer.rs#L103)). This meant that in [our reproducer](https://github.com/apache/datafusion/issues/14691#issue-2855923235) the partial and final aggregates were already combined.
   * (2) The [bug found](https://github.com/apache/datafusion/issues/14691#issue-2855923235) occurs when the distribution requirement fails for a single-partitioned aggregate, since both `AggregateMode::Single` and `AggregateMode::SinglePartitioned` require the partitions to be coalesced first (see [docs here](https://github.com/apache/datafusion/blob/e4b78c7ed40c248cfc9596d53f1813b62c668249/datafusion/physical-plan/src/aggregates/mod.rs#L83-L95)).
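
   To illustrate point (2), here is a toy model of why the coalesce has to stay in this plan. Every type and function below (`Requirement`, `Layout`, `Mode`, `requirement`, `coalesce`) is a hypothetical stand-in invented for this sketch, not DataFusion's API. The mapping from mode to requirement follows my reading of the linked docs and may not match the linked commit exactly. The one rule it relies on is that an input with exactly one partition meets either requirement.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's actual types.

/// How an operator needs its input to be distributed.
#[derive(Debug, Clone, PartialEq)]
enum Requirement {
    /// All rows must arrive in exactly one partition.
    SinglePartition,
    /// Rows with equal key values must land in the same partition.
    HashPartitioned(Vec<String>),
}

/// How a plan's output is actually laid out.
#[derive(Debug, Clone, PartialEq)]
enum Layout {
    /// `n` partitions with no known relationship to any key.
    Unknown(usize),
    /// `n` partitions hashed on the given keys.
    Hash(Vec<String>, usize),
}

impl Layout {
    fn partition_count(&self) -> usize {
        match self {
            Layout::Unknown(n) | Layout::Hash(_, n) => *n,
        }
    }

    /// Assumed rule: a single partition trivially meets either requirement;
    /// otherwise only a matching hash layout meets a hash requirement.
    fn satisfies(&self, req: &Requirement) -> bool {
        if self.partition_count() == 1 {
            return true;
        }
        match (self, req) {
            (Layout::Hash(have, _), Requirement::HashPartitioned(want)) => have == want,
            _ => false,
        }
    }
}

#[derive(Debug, Clone, Copy)]
enum Mode {
    Single,
    SinglePartitioned,
}

/// Assumed mapping from aggregate mode to input requirement.
fn requirement(mode: Mode, group_keys: &[&str]) -> Requirement {
    match mode {
        Mode::Single => Requirement::SinglePartition,
        Mode::SinglePartitioned => {
            Requirement::HashPartitioned(group_keys.iter().map(|k| k.to_string()).collect())
        }
    }
}

/// Stand-in for a coalesce step: merges any input into one partition.
fn coalesce(_input: &Layout) -> Layout {
    Layout::Unknown(1)
}

fn main() {
    // A union of two scans: two partitions, no known partitioning on `a`.
    let union_output = Layout::Unknown(2);
    for mode in [Mode::Single, Mode::SinglePartitioned] {
        let req = requirement(mode, &["a"]);
        println!(
            "{:?}: without coalesce = {}, with coalesce = {}",
            mode,
            union_output.satisfies(&req),
            coalesce(&union_output).satisfies(&req)
        );
    }
    // For contrast: an input already hashed on `a` would not need the coalesce.
    let hashed = Layout::Hash(vec!["a".to_string()], 4);
    println!(
        "hashed input satisfies SinglePartitioned: {}",
        hashed.satisfies(&requirement(Mode::SinglePartitioned, &["a"]))
    );
}
```

   In this model the two-partition union output fails the requirement for both modes, and the coalesced output meets both. That is the situation the starting plan in the test sets up.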