thinkharderdev commented on code in PR #16983: URL: https://github.com/apache/datafusion/pull/16983#discussion_r2327178463
##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -7390,6 +7392,41 @@ query error Error during planning: ORDER BY and WITHIN GROUP clauses cannot be u
 SELECT array_agg(a_varchar order by a_varchar) WITHIN GROUP (ORDER BY a_varchar) FROM (VALUES ('a'), ('d'), ('c'), ('a')) t(a_varchar);
+statement ok
+SET datafusion.execution.target_partitions = 1;
+
+query TT
+EXPLAIN select * from (select 'id' as id union all select 'id' as id order by id) group by grouping sets ((id), ());
+----
+logical_plan
+01)Projection: id
+02)--Aggregate: groupBy=[[GROUPING SETS ((id), ())]], aggr=[[]]
+03)----Union
+04)------Projection: Utf8("id") AS id
+05)--------EmptyRelation: rows=1
+06)------Projection: Utf8("id") AS id
+07)--------EmptyRelation: rows=1
+physical_plan
+01)ProjectionExec: expr=[id@0 as id]
+02)--AggregateExec: mode=FinalPartitioned, gby=[id@0 as id, __grouping_id@1 as __grouping_id], aggr=[], ordering_mode=PartiallySorted([0])
+03)----CoalesceBatchesExec: target_batch_size=8192
+04)------RepartitionExec: partitioning=Hash([id@0, __grouping_id@1], 1), input_partitions=2

Review Comment:
Sorry, I've been busy for the past few days, so I'm just getting back to this.

I think I understand the underlying issue now: since `id` is a constant, we infer it as a singleton, which is why we get the issue. Still, I'm concerned that we are solving this with a pretty blunt instrument. Adding a repartition to every aggregation with a grouping set can have a non-trivial cost, especially in a distributed query.

Looking into it a bit more, it seems that in this case we infer `SortProperties::Singleton` for the `id` expr in the final aggregation, which I think is incorrect.
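
To illustrate why a `Singleton` inference for `id` looks suspect here, the following is a minimal standalone sketch of SQL `GROUPING SETS` semantics. It is not DataFusion's code: `GroupingSet`, `ExpandedRow`, `expand_row`, and `distinct_values` are hypothetical names, and the bit layout of the grouping id is an assumption made for the example. It shows that even when every input row carries the same literal `'id'`, the `()` grouping set contributes a row where `id` is NULL, so the column is no longer constant after the grouping-set expansion.

```rust
use std::collections::BTreeSet;

/// One grouping set: entry `i` is `true` if group-by column `i` is part of the set.
/// (Hypothetical representation, chosen for this sketch.)
type GroupingSet = Vec<bool>;

/// An expanded row: the group-by values plus a grouping id.
type ExpandedRow = (Vec<Option<String>>, u32);

/// Expand one input row into one row per grouping set. Columns that are not
/// part of a grouping set become NULL (`None`), and bit `i` of the grouping id
/// is set when column `i` was nulled out. The bit layout is an assumption.
fn expand_row(row: &[&str], sets: &[GroupingSet]) -> Vec<ExpandedRow> {
    let mut out = Vec::with_capacity(sets.len());
    for set in sets {
        let mut grouping_id = 0u32;
        let mut values = Vec::with_capacity(row.len());
        for (i, (value, in_set)) in row.iter().zip(set.iter()).enumerate() {
            if *in_set {
                values.push(Some(value.to_string()));
            } else {
                grouping_id |= 1u32 << i;
                values.push(None);
            }
        }
        out.push((values, grouping_id));
    }
    out
}

/// Collect the distinct values that appear in column `col` of the expanded rows.
fn distinct_values(rows: &[ExpandedRow], col: usize) -> BTreeSet<Option<String>> {
    rows.iter().map(|(values, _)| values[col].clone()).collect()
}

fn main() {
    // Both UNION ALL branches produce the literal 'id', so the input column is constant.
    let input = [["id"], ["id"]];
    // GROUPING SETS ((id), ())
    let sets: Vec<GroupingSet> = vec![vec![true], vec![false]];

    let expanded: Vec<ExpandedRow> = input
        .iter()
        .flat_map(|row| expand_row(row, &sets))
        .collect();

    // After expansion the column holds both 'id' and NULL, so it is not a single value.
    let distinct = distinct_values(&expanded, 0);
    assert_eq!(distinct.len(), 2);
    println!("distinct values of id after expansion: {distinct:?}");
}
```

If this reading is right, a narrower fix would be to stop treating group-by expressions as constants once grouping sets can null them out, rather than adding a repartition to every such aggregation. Whether that is practical depends on where the inference happens in the planner, which this sketch does not model.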