xiangfu0 opened a new pull request, #18817:
URL: https://github.com/apache/pinot/pull/18817
## Summary
Adds native execution of **`GROUP BY GROUPING SETS (...)` / `ROLLUP(...)` /
`CUBE(...)`** and the
**`GROUPING()` / `GROUPING_ID()`** functions to the **multi-stage query
engine (MSE)**, building on the
single-stage support in #18662.
The implementation **pushes the per-set row expansion down to the
single-stage (leaf) engine** — the same
backend-expansion model used by Doris/StarRocks — so it reuses the
single-stage GROUPING SETS engine
wholesale instead of reimplementing it.
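
For readers less familiar with the syntax, the sketch below shows how `ROLLUP` and `CUBE` reduce to explicit grouping sets under standard SQL semantics. It is illustrative Python, not code from this PR; the helper names `rollup` and `cube` are hypothetical.

```python
from itertools import combinations

def rollup(cols):
    """ROLLUP(c1..cn): every prefix, from the full list down to the empty (grand total) set."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube(cols):
    """CUBE(c1..cn): every subset of the columns, 2**n grouping sets in total."""
    return [subset
            for size in range(len(cols), -1, -1)
            for subset in combinations(cols, size)]

print(rollup(["a", "b"]))  # [('a', 'b'), ('a',), ()]
print(cube(["a", "b"]))    # [('a', 'b'), ('a',), ('b',), ()]
```

So `GROUP BY ROLLUP(a, b)` is equivalent to `GROUP BY GROUPING SETS ((a, b), (a), ())`, and each input row contributes once to every set.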
## How it works
A grouping-set aggregate is split into `LEAF → EXCHANGE → FINAL → PROJECT`:
- **LEAF** carries the grouping sets and is converted to a single-stage
query (`PinotQuery.groupingSetMasks`),
so the single-stage engine expands each row across the sets and appends
the synthetic `$groupingId`
discriminator column. Leaf output: `[union keys…, $groupingId, leaf
aggregates…]`.
- **EXCHANGE** hash-partitions by the union keys **and** `$groupingId`, so
all rows for a given `(set, key)` pair
land on the same worker.
- **FINAL** is a plain (SIMPLE) aggregate grouping by `[union keys…,
$groupingId]` — rows from different
grouping sets stay distinct with **no grouping-set-specific merge logic**.
- **PROJECT** computes `GROUPING()` / `GROUPING_ID()` from `$groupingId`
(bit extraction over the union
column indexes, mirroring the single-stage post-aggregation handler) and
drops `$groupingId` to restore
the original row type.
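
The four stages above can be modeled in a few lines of Python. This is a toy sketch, not the Pinot implementation: the table columns (`country`, `city`, `sales`), the function names, and the bit layout of the grouping id (bit `i` set when union key `i` is rolled up) are all assumptions made for illustration. The only grounded parts are the stage order, the row shape `[union keys…, $groupingId, aggregates…]`, and the standard SQL rule that the leftmost `GROUPING_ID()` argument is the most significant bit.

```python
from collections import defaultdict

UNION_KEYS = ["country", "city"]  # hypothetical union of all grouping columns

def mask_for(grouping_set):
    """Assumed layout: bit i is set when union key i is NOT in the grouping set."""
    return sum(1 << i for i, k in enumerate(UNION_KEYS) if k not in grouping_set)

def leaf(rows, grouping_sets):
    """LEAF: expand each row once per set, null out rolled-up keys, pre-aggregate."""
    partial = defaultdict(int)
    for row in rows:
        for gs in grouping_sets:
            keys = tuple(row[k] if k in gs else None for k in UNION_KEYS)
            partial[keys + (mask_for(gs),)] += row["sales"]
    return dict(partial)

def exchange(leaf_outputs, num_workers):
    """EXCHANGE: hash-partition on (union keys..., groupingId)."""
    parts = [[] for _ in range(num_workers)]
    for out in leaf_outputs:
        for key, val in out.items():
            parts[hash(key) % num_workers].append((key, val))
    return parts

def final(partition):
    """FINAL: plain aggregate keyed by (union keys..., groupingId)."""
    merged = defaultdict(int)
    for key, val in partition:
        merged[key] += val
    return dict(merged)

def project(merged, grouping_args):
    """PROJECT: derive GROUPING_ID(args) by bit extraction, then drop groupingId."""
    out = []
    for (*keys, gid), total in merged.items():
        g = 0
        for arg in grouping_args:  # leftmost argument is the most significant bit
            g = (g << 1) | ((gid >> UNION_KEYS.index(arg)) & 1)
        out.append((*keys, total, g))
    return out

grouping_sets = [("country", "city"), ("country",), ()]  # ROLLUP(country, city)
server1 = [{"country": "US", "city": "NYC", "sales": 10},
           {"country": "US", "city": "SF", "sales": 5}]
server2 = [{"country": "US", "city": "NYC", "sales": 7}]

parts = exchange([leaf(server1, grouping_sets), leaf(server2, grouping_sets)], 2)
results = [row for p in parts for row in project(final(p), ["country", "city"])]
for row in results:
    print(row)  # (country, city, SUM(sales), GROUPING_ID(country, city))
```

Because the grouping id is part of both the partition key and the final group-by key, `final` needs no knowledge of grouping sets at all; it is an ordinary keyed merge.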
`GROUPING` / `GROUPING_ID` are registered in `PinotOperatorTable`, and the
explicit
`GROUP BY GROUPING SETS ((a, b), …)` tuple syntax is accepted by the
validator (the tuples are grouping
sets, not ROW constructors).
## Tests
`GroupingSetsQueriesTest#testMultiStage*` — 10 multi-stage-vs-single-stage
parity tests, all green:
ROLLUP, CUBE, explicit GROUPING SETS, composite levels, mixed plain+ROLLUP,
single-column ROLLUP,
HAVING, filtered aggregation, and `GROUPING()` / `GROUPING_ID()` in SELECT
and HAVING (including the
genuine-vs-rolled-up NULL discrimination that the `$groupingId` column
exists to handle).
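
The NULL discrimination case can be illustrated with a small Python sketch (not the actual test code; the column names, the `merge` helper, and the bit layout of the grouping id are assumptions). A row whose `city` is genuinely NULL and the rolled-up subtotal row share the same key values, so only the grouping id keeps them apart during the merge.

```python
from collections import defaultdict

# Rows shaped as (country, city, groupingId, partial_sum).
# Assumed layout: bit 1 of groupingId is set when city is rolled up.
leaf_rows = [
    ("US", None, 0, 5),   # city is a genuine NULL in the data
    ("US", None, 2, 22),  # city is NULL because it was rolled up (US subtotal)
]

def merge(rows, key_len):
    """Sum the last field, grouping by the first key_len fields."""
    merged = defaultdict(int)
    for row in rows:
        merged[row[:key_len]] += row[-1]
    return dict(merged)

with_id = merge(leaf_rows, 3)     # key includes groupingId
without_id = merge(leaf_rows, 2)  # key omits groupingId

print(with_id)     # {('US', None, 0): 5, ('US', None, 2): 22}
print(without_id)  # {('US', None): 27}
```

Without the grouping id in the key the two rows collapse into one group, which is the wrong result that the discriminator column prevents.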
## Known limitations (fall back to the single-stage engine with a clear error; never wrong results)
- **v2 physical planner** (`usePhysicalOptimizer=true`, opt-in, default off)
still rejects grouping sets;
the default v1 logical planner is fully supported. (Follow-up.)
- More than **31** distinct grouping columns (the `$groupingId` bitmask is a
32-bit int) — same cap as the
single-stage engine.
- Leaf group-trim (`ORDER BY … LIMIT` pushdown) is not applied to
grouping-set queries; results are still
correct (the broker applies the final `ORDER BY` + `LIMIT`).
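
The arithmetic behind the 31-column cap can be sketched as follows. This assumes the bitmask is held in a signed 32-bit integer (as a Java `int` is) and must stay non-negative; the PR text only states that it is a 32-bit int, so treat the rationale as a likely reading rather than a confirmed one. `grouping_id_fits` is a hypothetical helper.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

def grouping_id_fits(num_columns):
    """True when a mask with all num_columns bits set fits in a signed 32-bit int."""
    return (1 << num_columns) - 1 <= INT32_MAX

print(grouping_id_fits(31))  # True
print(grouping_id_fits(32))  # False
```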