sparklxb opened a new issue, #41783: URL: https://github.com/apache/doris/issues/41783
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Description

FE limits the maximum recursion depth of the hash distribution pruner through the config `max_distribution_pruner_recursion_depth`, but the check is crude and can lead to a scan of all tablets even when only a few are needed.

Example: `userid` is the only distribution column, and the SQL is: select * from where userid in (100+ elements). Even if the number of hash values for the 100+ elements is far less than 100, FE still considers the recursion depth to be larger than 100, so the distribution pruner does not work and simply returns all tablets.

<img width="841" alt="image" src="https://github.com/user-attachments/assets/edba72e8-b7a6-429e-b132-3eb832f8e99a">

The limit on the maximum recursion depth is, I think, reasonable for efficiency. However, in the case above, where the table has only one distribution column and no recursion is needed, it is more effective to prune tablets and avoid scanning all of them.

### Solution

When a table has only one distribution column, the distribution pruner does not do any recursion; it computes the hash key and returns only the tablets that need to be scanned.

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
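
The single-column case proposed under "Solution" can be illustrated with a minimal sketch. This is not the Doris FE implementation: the class `SingleColumnPrunerSketch`, the methods `bucketOf` and `prune`, and the use of CRC32 over the raw `long` value as the hash are all assumptions made for illustration. The point it shows is that with one distribution column each IN-list value maps to exactly one bucket, so the work is linear in the number of values and no recursion depth limit is involved.

```java
import java.nio.ByteBuffer;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.CRC32;

// Hypothetical sketch, not the actual Doris pruner.
class SingleColumnPrunerSketch {

    // Stand-in hash (assumption): CRC32 of the 8-byte value, modulo the bucket count.
    static int bucketOf(long value, int bucketCount) {
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(Long.BYTES).putLong(value).array());
        return (int) (crc.getValue() % bucketCount);
    }

    // Returns the tablets hit by the IN-list values; tabletIds.get(i) is the tablet of bucket i.
    static Set<Long> prune(Collection<Long> inValues, List<Long> tabletIds) {
        int bucketCount = tabletIds.size();
        Set<Long> selected = new TreeSet<>();
        for (long value : inValues) {
            selected.add(tabletIds.get(bucketOf(value, bucketCount)));
            if (selected.size() == bucketCount) {
                break; // every tablet is already selected, nothing left to prune
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Long> tabletIds = List.of(1001L, 1002L, 1003L, 1004L, 1005L, 1006L, 1007L, 1008L);
        // 150 copies of the same userid: far more than 100 elements, but a single hash value.
        List<Long> inValues = java.util.Collections.nCopies(150, 42L);
        System.out.println("tablets to scan: " + prune(inValues, tabletIds));
    }
}
```

In this sketch an IN list of 150 identical values selects a single tablet, whereas a check based only on the element count would give up and return all eight.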
