Dandandan commented on PR #19639: URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3722492826
So I would suggest to take the following steps

1. Disable predicate pushdown for join filters (keep pruning only)
2. Run some further tests on other ideas, benchmark them in isolation:
   - Adaptive filter selectivity
   - Parallelization of small files
   - Coalesce small batches
   - Improve join pushdown performance / make it more adaptive
   - Do not pushdown predicates when IO benefit is small (no large columns besides predicate cols)
   - Do some profiling for regressions
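
To make the "IO benefit is small" idea concrete, here is a minimal sketch in Rust of one possible heuristic. It is not DataFusion code: `ColumnStats`, `should_push_down`, and the `min_saving_ratio` threshold are hypothetical names invented for illustration, and the per-column byte counts are assumed to come from file metadata. The idea is to push a predicate down only when the non-predicate columns, which are the ones that could be skipped for filtered-out rows, make up a large enough share of the projected bytes.

```rust
/// Hypothetical per-column size estimate (not a DataFusion type).
struct ColumnStats {
    name: &'static str,
    compressed_bytes: u64,
}

/// Returns true when filtering before decoding could plausibly skip enough IO
/// to justify row-level predicate pushdown.
///
/// `min_saving_ratio` is an illustrative threshold: the minimum fraction of
/// projected bytes that must belong to non-predicate columns.
fn should_push_down(
    projected: &[ColumnStats],
    predicate_cols: &[&str],
    min_saving_ratio: f64,
) -> bool {
    let total: u64 = projected.iter().map(|c| c.compressed_bytes).sum();
    if total == 0 {
        // Nothing to read, so there is nothing to save.
        return false;
    }
    // Bytes in columns that are not needed to evaluate the predicate.
    let skippable: u64 = projected
        .iter()
        .filter(|c| !predicate_cols.contains(&c.name))
        .map(|c| c.compressed_bytes)
        .sum();
    (skippable as f64 / total as f64) >= min_saving_ratio
}
```

With a small `id` column and a large `payload` column, a predicate on `id` would pass a threshold of 0.5, while a predicate on `payload` would not, since almost all bytes must be read to evaluate it anyway. A real implementation would probably also need to account for predicate selectivity, which this sketch ignores.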
