yashmayya opened a new pull request, #18482: URL: https://github.com/apache/pinot/pull/18482
## Summary

Adds support for the SQL standard `EXCLUDE` clause on window functions, covering all four options:

- `EXCLUDE NO OTHERS` (default; existing behavior preserved)
- `EXCLUDE CURRENT ROW`
- `EXCLUDE GROUP`
- `EXCLUDE TIES`

Supported for the window functions where it is semantically meaningful (`SUM`, `COUNT`, `AVG`, `MIN`, `MAX`, `BOOL_AND`, `BOOL_OR`, `FIRST_VALUE`, `LAST_VALUE`) across both `ROWS` and `RANGE` frames. Ranking functions and `LAG`/`LEAD` continue to be framed implicitly per the SQL standard (Calcite rejects `EXCLUDE` on these at parse time).

## Implementation

**Plan side**:

- New `WindowExclusion` proto enum on `WindowNode` (field 8, default 0 = `EXCLUDE_NO_OTHERS` so old serialized plans round-trip safely).
- `RelToPlanNodeConverter` / `PRelToPlanNodeConverter` propagate the exclusion through; the previous `Preconditions.checkState` rejecting non-default exclusions is removed.
- `PlanNodeToRelConverter`, both serde sides, and `PlanNodeMerger` round-trip the new field.

**Runtime side**:

- `WindowFrame` carries the exclusion; `WindowFunction` base gains O(n) `computePeerBoundaries` + O(1) `firstNonExcluded` / `lastNonExcluded` helpers. The default `EXCLUDE NO OTHERS` path branches out early so the hot path is unchanged.
- `AggregateWindowFunction` handles ROWS and all four supported RANGE shapes (UU / UC / CU / CC) using a sliding aggregator with per-row apply / unapply correction. Peer bounds are skipped for `EXCLUDE CURRENT ROW` when frame shape allows.
- `FirstValueWindowFunction` / `LastValueWindowFunction` compute the effective first / last index in O(1) per row from peer bounds; `IGNORE NULLS` continues to work.
- Existing monotonic-deque MIN/MAX aggregators don't support arbitrary removal, so a new `SortedMultisetMinMaxWindowValueAggregator` (TreeMap-backed, O(log K) per op) is selected when EXCLUDE forces per-row corrections. SUM / COUNT / AVG / BOOL_AND / BOOL_OR are commutative under add / remove and reuse the existing aggregators.
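
To illustrate the peer-bound approach described above, here is a minimal Java sketch; it is not Pinot's code. It assumes a partition already sorted by a single `long` order key and a frame that spans the whole partition (the UU shape). The class name `ExcludeSketch`, the method signatures, and the prefix-sum correction are assumptions made for this example; only the name `computePeerBoundaries` comes from the PR.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and signatures are not Pinot's.
class ExcludeSketch {
  enum Exclusion { NO_OTHERS, CURRENT_ROW, GROUP, TIES }

  /** For rows sorted by order key, returns {peerStart, peerEnd} (inclusive) per row in O(n). */
  static int[][] computePeerBoundaries(long[] orderKeys) {
    int n = orderKeys.length;
    int[] start = new int[n];
    int[] end = new int[n];
    int i = 0;
    while (i < n) {
      int j = i;
      // Extend j to the last row sharing row i's order key.
      while (j + 1 < n && orderKeys[j + 1] == orderKeys[i]) {
        j++;
      }
      for (int k = i; k <= j; k++) {
        start[k] = i;
        end[k] = j;
      }
      i = j + 1;
    }
    return new int[][] {start, end};
  }

  /** SUM over the whole partition minus the excluded rows; null when no rows remain. */
  static Long[] sumOverPartition(long[] orderKeys, long[] values, Exclusion exclusion) {
    int n = values.length;
    long[] prefix = new long[n + 1];
    for (int i = 0; i < n; i++) {
      prefix[i + 1] = prefix[i] + values[i];
    }
    int[][] peers = computePeerBoundaries(orderKeys);
    Long[] result = new Long[n];
    for (int i = 0; i < n; i++) {
      int peerStart = peers[0][i];
      int peerEnd = peers[1][i];
      long groupSum = prefix[peerEnd + 1] - prefix[peerStart];
      int groupSize = peerEnd - peerStart + 1;
      long excludedSum = 0;
      int excludedCount = 0;
      switch (exclusion) {
        case CURRENT_ROW:
          excludedSum = values[i];
          excludedCount = 1;
          break;
        case GROUP: // current row and all of its peers
          excludedSum = groupSum;
          excludedCount = groupSize;
          break;
        case TIES: // peers of the current row, but not the row itself
          excludedSum = groupSum - values[i];
          excludedCount = groupSize - 1;
          break;
        default: // NO_OTHERS: nothing removed
          break;
      }
      if (n - excludedCount > 0) {
        result[i] = prefix[n] - excludedSum;
      } // else leave null: SUM over an empty frame is NULL in SQL
    }
    return result;
  }

  public static void main(String[] args) {
    long[] keys = {1, 2, 2, 3};
    long[] vals = {10, 20, 30, 40};
    // prints [100, 70, 80, 100]
    System.out.println(Arrays.toString(sumOverPartition(keys, vals, Exclusion.TIES)));
  }
}
```

In the example input, rows 1 and 2 share order key 2 and are therefore peers, so `EXCLUDE TIES` removes 30 from row 1's sum and 20 from row 2's sum, while rows 0 and 3 have no ties and keep the full total.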
Semantics were cross-verified against PostgreSQL.

## Test plan

- [x] 11 new EXCLUDE cases in `pinot-query-runtime/src/test/resources/queries/WindowFunctions.json` exercising each of the four EXCLUDE options across `SUM` / `COUNT` / `AVG` / `MIN` / `FIRST_VALUE` / `LAST_VALUE`, plus ROWS / all four RANGE shapes / no-`ORDER BY`. Each expected output was generated from PostgreSQL.
- [x] Unit tests for `SortedMultisetMinMaxWindowValueAggregator` (min / max with duplicates, out-of-order removal, no-op removal of an unknown value, null handling, `BigDecimal`).
- [x] Full `ResourceBasedQueriesTest`, `WindowAggregateOperatorTest`, and `WindowValueAggregatorTest` suites pass.
- [x] `spotless:apply` / `checkstyle:check` / `license:check` clean.

## Backwards / rolling-upgrade notes

The proto field is additive with the standard proto3 zero-default (`EXCLUDE_NO_OTHERS`). New brokers will continue to plan queries without `EXCLUDE` to the same wire shape as today. A new broker that plans a non-default `EXCLUDE` and dispatches to an old server will see the server silently default the field to `EXCLUDE_NO_OTHERS`; servers should be upgraded before brokers if operators expect the new SQL syntax to take effect.
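
The TreeMap-backed multiset mentioned in the implementation notes, together with the behaviors its unit tests cover (duplicates, out-of-order removal, no-op removal of an unknown value, null handling), could look roughly like the following Java sketch. The class `SortedMultisetMinMaxSketch` and its `add` / `remove` / `result` methods are hypothetical names for this example, and treating nulls as ignored is an assumption; the actual `SortedMultisetMinMaxWindowValueAggregator` may differ.

```java
import java.util.TreeMap;

// Hypothetical illustration only; not the aggregator added in the PR.
class SortedMultisetMinMaxSketch<T extends Comparable<T>> {
  // Value -> number of occurrences currently inside the frame.
  private final TreeMap<T, Integer> counts = new TreeMap<>();
  private final boolean max;

  SortedMultisetMinMaxSketch(boolean max) {
    this.max = max;
  }

  /** Adds one occurrence in O(log K), K = distinct values held. Nulls are ignored (assumption). */
  void add(T value) {
    if (value == null) {
      return;
    }
    counts.merge(value, 1, Integer::sum);
  }

  /** Removes one occurrence in any order; removing an unknown value is a no-op. */
  void remove(T value) {
    if (value == null) {
      return;
    }
    // Returning null from the remapping function drops the entry.
    counts.computeIfPresent(value, (k, c) -> c == 1 ? null : c - 1);
  }

  /** Current min or max, or null when the multiset is empty. */
  T result() {
    if (counts.isEmpty()) {
      return null;
    }
    return max ? counts.lastKey() : counts.firstKey();
  }
}
```

A monotonic deque only supports removal from its ends, which is why a structure like this is needed once an exclusion removes rows from the middle of the frame. Note that a `TreeMap` keyed on `BigDecimal` orders by `compareTo`, so values that differ only in scale share one entry in this sketch.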
