yashmayya opened a new pull request, #18620: URL: https://github.com/apache/pinot/pull/18620
## Summary

The per-segment group-by scan loop in `DefaultGroupByExecutor#process(ValueBlock)` had no resource-usage sampling or termination check. As a result, a heavy `GROUP BY` could scan hundreds of millions of rows and grow a large group-by hash table while:

- its memory footprint was not freshly attributed to the query accountant (sampling only resumed later, when the `DataTable` was built), so per-query memory tracking lagged behind actual allocation, and
- it could not respond to cancellation / query timeout mid-scan, since the only termination check on this path fired once before the loop started.

This change adds a single per-block `QueryThreadContext.checkTerminationAndSampleUsage(...)` call at the start of `process()`. It samples usage and checks for termination once per block (`MAX_DOC_PER_CALL` rows), which:

- keeps the query's tracked memory footprint fresh as the hash table grows across the scan, improving accounting accuracy for the OOM-protection framework, and
- lets a cancelled or timed-out query bail out of a long-running aggregation instead of running to completion.

The call sits in `process()`, so it covers `GroupByOperator`, `FilteredGroupByOperator`, and `StarTreeGroupByExecutor` (which inherits `process()`). It is invoked once per block (not per row), so the overhead is negligible and matches the existing periodic-sampling pattern used elsewhere on the query path.

## Testing

New `DefaultGroupByExecutorTest` builds a real multi-block segment and verifies that `process()`:

- samples usage exactly once per block,
- throws `TerminationException` when the query is explicitly cancelled, and
- throws an `EXECUTION_TIMEOUT` `QueryException` when the query deadline has passed.
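
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a once-per-block termination check and usage sample as described in the Summary. It is not the Pinot implementation: `PerBlockCheckSketch`, its nested `QueryContext` and `GroupByExecutor`, the `BLOCK_SIZE` value, the byte estimate, and the use of `IllegalStateException` are all hypothetical stand-ins chosen for illustration. Only the shape (one check at the start of `process()`, none inside the row loop) is taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these are Pinot classes.
final class PerBlockCheckSketch {
  // Plays the role of MAX_DOC_PER_CALL; the actual value here is an assumption.
  static final int BLOCK_SIZE = 10_000;

  /** Simplified per-query context: tracks cancellation, a deadline, and usage samples. */
  static final class QueryContext {
    private final long deadlineMillis;
    private volatile boolean cancelled;
    private int samples;
    private long lastSampledBytes;

    QueryContext(long deadlineMillis) {
      this.deadlineMillis = deadlineMillis;
    }

    void cancel() {
      cancelled = true;
    }

    int samples() {
      return samples;
    }

    long lastSampledBytes() {
      return lastSampledBytes;
    }

    /** Checks for termination first, then records one usage sample. */
    void checkTerminationAndSampleUsage(long estimatedBytes) {
      if (cancelled) {
        throw new IllegalStateException("query cancelled");
      }
      if (System.currentTimeMillis() > deadlineMillis) {
        throw new IllegalStateException("query timed out");
      }
      samples++;
      lastSampledBytes = estimatedBytes;
    }
  }

  /** Simplified group-by executor that counts rows per integer key. */
  static final class GroupByExecutor {
    private final Map<Integer, Long> counts = new HashMap<>();
    private final QueryContext ctx;

    GroupByExecutor(QueryContext ctx) {
      this.ctx = ctx;
    }

    /** Processes one block of keys; the check runs once per block, not per row. */
    void process(int[] block) {
      // Rough, made-up size estimate: 32 bytes per hash-table entry.
      ctx.checkTerminationAndSampleUsage(counts.size() * 32L);
      for (int key : block) {
        counts.merge(key, 1L, Long::sum);
      }
    }

    int groupCount() {
      return counts.size();
    }
  }
}
```

In this sketch, calling `process()` on three blocks yields exactly three samples, and calling `cancel()` on the context makes the next `process()` call throw before any rows of that block are scanned. A context constructed with a deadline already in the past behaves the same way on the first block.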
