alamb commented on issue #3463: URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3696448062
An update here: I have been using some free time over the holidays to try and fix this # Current Status The story so far is that thanks to many efforts, including - @hhhizzz's bitmask implementation in https://github.com/apache/arrow-rs/pull/8733 (described more in [Late Materialization in arrow-rs Parquet Reads](https://arrow.apache.org/blog/2025/12/11/parquet-late-materialization-deep-dive/)) - @XiangpengHao work to cache results in https://github.com/apache/arrow-rs/pull/7850 The performance when pushdown is enabled is faster for most queries. I recently ran benchmarks as par tof https://github.com/apache/datafusion/pull/19477 and here are some highlights ``` -------------------- Benchmark clickbench_extended.json -------------------- ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ Query ┃ HEAD ┃ alamb_enable_pushdown ┃ Change ┃ ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ QQuery 6 │ 4022.77 ms │ 231.17 ms │ +17.40x faster │ └──────────────┴─────────────┴───────────────────────┴────────────────┘ ``` ``` -------------------- Benchmark clickbench_partitioned.json -------------------- ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ Query ┃ HEAD ┃ alamb_enable_pushdown ┃ Change ┃ ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ QQuery 18 │ 5918.57 ms │ 4875.16 ms │ +1.21x faster │ │ QQuery 23 │ 20156.62 ms │ 1283.72 ms │ +15.70x faster │ │ QQuery 32 │ 5652.98 ms │ 4588.56 ms │ +1.23x faster │ │ QQuery 33 │ 6275.28 ms │ 5666.71 ms │ +1.11x faster │ │ QQuery 34 │ 6526.66 ms │ 5968.31 ms │ +1.09x faster │ │ QQuery 36 │ 66.42 ms │ 15.94 ms │ +4.17x faster │ │ QQuery 37 │ 45.91 ms │ 16.16 ms │ +2.84x faster │ │ QQuery 38 │ 65.21 ms │ 15.26 ms │ +4.27x faster │ │ QQuery 39 │ 105.50 ms │ 14.18 ms │ +7.44x faster │ │ QQuery 40 │ 27.96 ms │ 16.61 ms │ +1.68x faster │ │ QQuery 41 │ 22.42 ms │ 14.93 ms │ +1.50x faster │ │ QQuery 42 │ 19.97 ms │ 14.33 ms │ +1.39x faster │ └──────────────┴─────────────┴───────────────────────┴────────────────┘ ``` <details><summary>Full Results</summary> <p> ``` Comparing HEAD and alamb_enable_pushdown -------------------- Benchmark clickbench_extended.json -------------------- ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ Query ┃ HEAD ┃ alamb_enable_pushdown ┃ Change ┃ ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ QQuery 0 │ 2685.93 ms │ 2651.31 ms │ no change │ │ QQuery 1 │ 1073.38 ms │ 1053.67 ms │ no change │ │ QQuery 2 │ 2088.87 ms │ 2076.62 ms │ no change │ │ QQuery 3 │ 1135.31 ms │ 1169.33 ms │ no change │ │ QQuery 4 │ 2311.12 ms │ 2333.31 ms │ no change │ │ QQuery 5 │ 28540.84 ms │ 29180.44 ms │ no change │ │ QQuery 6 │ 4022.77 ms │ 231.17 ms │ +17.40x faster │ │ QQuery 7 │ 3930.66 ms │ 4171.90 ms │ 1.06x slower │ └──────────────┴─────────────┴───────────────────────┴────────────────┘ -------------------- Benchmark clickbench_partitioned.json -------------------- ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ Query ┃ HEAD ┃ alamb_enable_pushdown ┃ Change ┃ ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ QQuery 0 │ 2.37 ms │ 2.13 ms │ +1.12x faster │ │ QQuery 1 │ 50.16 ms │ 52.79 ms │ 1.05x slower │ │ QQuery 2 │ 134.81 ms │ 131.30 ms │ no change │ │ QQuery 3 │ 154.45 ms │ 154.20 ms │ no change │ │ QQuery 4 │ 1137.52 ms │ 1117.24 ms │ no change │ │ QQuery 5 │ 1511.93 ms │ 1476.79 ms │ no change │ │ QQuery 6 │ 2.04 ms │ 2.02 ms │ no change │ │ QQuery 7 │ 59.58 ms │ 68.59 ms │ 1.15x slower │ │ QQuery 8 │ 1474.01 ms │ 1456.40 ms │ no change │ │ QQuery 9 │ 1854.79 ms │ 1834.23 ms │ no change │ │ QQuery 10 │ 353.18 ms │ 492.42 ms │ 1.39x slower │ │ QQuery 11 │ 402.12 ms │ 525.34 ms │ 1.31x slower │ │ QQuery 12 │ 1356.90 ms │ 1555.24 ms │ 1.15x slower │ │ QQuery 13 │ 2021.01 ms │ 2244.56 ms │ 1.11x slower │ │ QQuery 14 │ 1244.62 ms │ 1454.12 ms │ 1.17x slower │ │ QQuery 15 │ 1274.32 ms │ 1277.69 ms │ no change │ │ QQuery 16 │ 2636.20 ms │ 2571.86 ms │ no change │ │ QQuery 17 │ 2601.62 ms │ 2559.34 ms │ no change │ │ QQuery 18 │ 5918.57 ms │ 4875.16 ms │ +1.21x faster │ │ QQuery 19 │ 130.29 ms │ 141.55 ms │ 1.09x slower │ │ QQuery 20 │ 1929.33 ms │ 1857.46 ms │ no change │ │ QQuery 21 │ 2200.83 ms │ 2330.91 ms │ 1.06x slower │ │ QQuery 22 │ 3857.11 ms │ 3729.31 ms │ no change │ │ QQuery 23 │ 20156.62 ms │ 1283.72 ms │ +15.70x faster │ │ QQuery 24 │ 221.23 ms │ 283.74 ms │ 1.28x slower │ │ QQuery 25 │ 474.00 ms │ 628.50 ms │ 1.33x slower │ │ QQuery 26 │ 216.46 ms │ 340.81 ms │ 1.57x slower │ │ QQuery 27 │ 2762.90 ms │ 2961.52 ms │ 1.07x slower │ │ QQuery 28 │ 23671.71 ms │ 23963.73 ms │ no change │ │ QQuery 29 │ 970.01 ms │ 952.03 ms │ no change │ │ QQuery 30 │ 1327.99 ms │ 1342.13 ms │ no change │ │ QQuery 31 │ 1345.31 ms │ 1348.15 ms │ no change │ │ QQuery 32 │ 5652.98 ms │ 4588.56 ms │ +1.23x faster │ │ QQuery 33 │ 6275.28 ms │ 5666.71 ms │ +1.11x faster │ │ QQuery 34 │ 6526.66 ms │ 5968.31 ms │ +1.09x faster │ │ QQuery 35 │ 1983.97 ms │ 1908.53 ms │ no change │ │ QQuery 36 │ 66.42 ms │ 15.94 ms │ +4.17x faster │ │ QQuery 37 │ 45.91 ms │ 16.16 ms │ +2.84x faster │ │ QQuery 38 │ 65.21 ms │ 15.26 ms │ +4.27x faster │ │ QQuery 39 │ 105.50 ms │ 14.18 ms │ +7.44x faster │ │ QQuery 40 │ 27.96 ms │ 16.61 ms │ +1.68x faster │ │ QQuery 41 │ 22.42 ms │ 14.93 ms │ +1.50x faster │ │ QQuery 42 │ 19.97 ms │ 14.33 ms │ +1.39x faster │ └──────────────┴─────────────┴───────────────────────┴────────────────┘ ``` </p> </details> You can see Q23 benefits from topk pushdown, as do predicate heavy queries such as extended q6. However there are several queries that still get slower, notably: ``` │ QQuery 7 │ 59.58 ms │ 68.59 ms │ 1.15x slower │ │ QQuery 10 │ 353.18 ms │ 492.42 ms │ 1.39x slower │ │ QQuery 11 │ 402.12 ms │ 525.34 ms │ 1.31x slower │ │ QQuery 12 │ 1356.90 ms │ 1555.24 ms │ 1.15x slower │ │ QQuery 13 │ 2021.01 ms │ 2244.56 ms │ 1.11x slower │ │ QQuery 14 │ 1244.62 ms │ 1454.12 ms │ 1.17x slower │ │ QQuery 19 │ 130.29 ms │ 141.55 ms │ 1.09x slower │ │ QQuery 21 │ 2200.83 ms │ 2330.91 ms │ 1.06x slower │ │ QQuery 24 │ 221.23 ms │ 283.74 ms │ 1.28x slower │ │ QQuery 25 │ 474.00 ms │ 628.50 ms │ 1.33x slower │ │ QQuery 26 │ 216.46 ms │ 340.81 ms │ 1.57x slower │ │ QQuery 27 │ 2762.90 ms │ 2961.52 ms │ 1.07x slower │ ``` # Plan I have been profiling the queries that show the largest slowdowns (e.g. q10 and q26) trying to find what is taking longer so I can fix it. There are two potential ways to do this: 1. Optimize code paths that are only invoked when predicate pushdown is enabled (for example https://github.com/apache/arrow-rs/pull/9056) 2. Optimize general code paths (e.g. https://github.com/apache/arrow-rs/pull/9048) My thinking is that even if turning on pushdown slows down some queries relative to not turning it on, if pushdown + other optimizations in arrow 57.2.0 is faster than non-pushdown and arrow 57.1.0 we can still turn it on -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
