Yes, that is correct.

What Substrait calls "groupings" is what is often referred to in SQL as
"grouping sets".  These allow you to compute the same aggregates but group
by different criteria.  Two very common ways of creating grouping sets are
"group by cube" and "group by rollup".  Snowflake's documentation for
rollup[1] describes the motivation quite well:

> You can think of rollup as generating multiple result sets, each
> of which (after the first) is the aggregate of the previous result
> set. So, for example, if you own a chain of retail stores, you
> might want to see the profit for:
>  * Each store.
>  * Each city (large cities might have multiple stores).
>  * Each state.
>  * Everything (all stores in all states).
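
To make the rollup idea concrete, here is a minimal sketch in plain Python
(not Substrait or Acero API). The sample data and the helper names
`rollup_grouping_sets` and `sum_profit` are made up for illustration. It
expands a rollup over (state, city, store) into its four grouping sets and
computes the same aggregate once per set:

```python
from collections import defaultdict

# Hypothetical sample data: (state, city, store, profit)
rows = [
    ("CA", "San Francisco", "store_1", 10),
    ("CA", "San Francisco", "store_2", 20),
    ("CA", "San Jose", "store_3", 30),
    ("WA", "Seattle", "store_4", 40),
]
COLUMNS = ("state", "city", "store")


def rollup_grouping_sets(columns):
    """ROLLUP(a, b, c) expands to [(a, b, c), (a, b), (a,), ()]."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def sum_profit(rows, grouping_set):
    """Sum the profit column per distinct key of one grouping set."""
    idx = [COLUMNS.index(c) for c in grouping_set]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


# One result set per grouping set; the aggregate is the same in each.
results = {gs: sum_profit(rows, gs) for gs in rollup_grouping_sets(COLUMNS)}
```

Each entry of `results` corresponds to one bullet in the quote above: per
store, per city, per state, and the grand total (the empty grouping set).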

Acero does not currently handle more than one grouping set.
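
For the single-grouping case, "sum and mean on column A group by column B,
C, D" amounts to one grouping whose key is the tuple (B, C, D). The
following is only a sketch of those semantics in plain Python, with
made-up rows and a hypothetical `aggregate` helper. It is not how Acero
implements it and does not use any Substrait API:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows with columns A, B, C, D
rows = [
    {"A": 1.0, "B": "x", "C": 1, "D": True},
    {"A": 3.0, "B": "x", "C": 1, "D": True},
    {"A": 5.0, "B": "y", "C": 2, "D": False},
]

# One grouping made of three expressions (here, plain column references).
grouping = ("B", "C", "D")


def aggregate(rows, grouping):
    """Group rows by the grouping key, then compute sum and mean of A."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in grouping)
        groups[key].append(row["A"])
    return {key: {"sum": sum(v), "mean": mean(v)} for key, v in groups.items()}


result = aggregate(rows, grouping)
```

Here every distinct (B, C, D) combination yields exactly one output row
carrying both aggregates.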


[1] https://docs.snowflake.com/en/sql-reference/constructs/group-by-rollup

On Mon, Jul 10, 2023 at 2:22 PM Li Jin <ice.xell...@gmail.com> wrote:

> Hi,
>
> I am looking at the Substrait protobuf for AggregateRel as well as the Acero
> Substrait consumer code:
>
>
> https://github.com/apache/arrow/blob/main/cpp/src/arrow/engine/substrait/relation_internal.cc#L851
>
> https://github.com/substrait-io/substrait/blob/main/proto/substrait/algebra.proto#L209
>
> Looks like in Substrait, AggregateRel can have multiple groupings and each
> grouping can have multiple expressions. Let's say now I want to "compute
> sum and mean on column A group by column B, C, D" (for Acero to execute).
> Is the right way to create one grouping with 3 expressions (direct
> reference) for "column B, C, D"?
>
> Thanks,
> Li
>
