shauryachats opened a new pull request, #14981:
URL: https://github.com/apache/pinot/pull/14981
A new configuration to control the size of result holders in the
multi-stage engine (MSE) is needed to avoid repeated resizing and
rehashing when grouping on high-cardinality columns (e.g., UUIDs).
A simple query that needs it is:
```
SELECT
  count(*)
FROM
  table_A
WHERE
  (
    user_uuid NOT IN (
      SELECT
        user_uuid
      FROM
        table_B
    )
  )
LIMIT
  100 option(useMultistageEngine=true, timeoutMs=120000, useColocatedJoin=true, maxRowsInJoin=40000000)
```
Here, a group-by step runs on `user_uuid` for `table_B` before the
colocated join with `table_A`, and `user_uuid` has high cardinality.
More details in the following issue:
https://github.com/apache/pinot/issues/14685
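
To illustrate why the starting size matters, here is a rough back-of-the-envelope model, not Pinot's actual result-holder code. It assumes a hash table that doubles its capacity whenever the number of keys exceeds `capacity * load_factor`. The function name `count_resizes`, the 0.75 load factor, and the key counts and capacities used are all hypothetical and chosen only for illustration.

```python
def count_resizes(num_keys: int, initial_capacity: int, load_factor: float = 0.75) -> int:
    """Count how often a doubling hash table must grow to hold num_keys distinct keys.

    Assumption: the table doubles (and rehashes its entries) each time the key
    count exceeds capacity * load_factor. This is a simplified model, not Pinot code.
    """
    capacity = initial_capacity
    resizes = 0
    while num_keys > capacity * load_factor:
        capacity *= 2  # each doubling implies a full rehash of the stored keys
        resizes += 1
    return resizes


if __name__ == "__main__":
    # Hypothetical numbers: 10 million distinct user_uuid values.
    print(count_resizes(10_000_000, 10_000))      # small starting size
    print(count_resizes(10_000_000, 16_000_000))  # sized for the expected cardinality
```

Under these assumptions, a small starting capacity forces 11 doublings for 10 million distinct keys, while a capacity sized for the expected cardinality needs none.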