alamb opened a new pull request, #7120: URL: https://github.com/apache/arrow-datafusion/pull/7120
Draft while I am finishing testing.

Note: this looks like a large change, but it is mostly moving code around rather than any logic changes.

# Which issue does this PR close?

Part of https://github.com/apache/arrow-datafusion/issues/7052

# Rationale for this change

See https://github.com/apache/arrow-datafusion/issues/7052

TLDR: making benchmarks easier to run means more people will find them and run them :)

# What changes are included in this PR?

1. Combine / consolidate the parquet filter pushdown and sort benchmarks

Like https://github.com/apache/arrow-datafusion/pull/7054, this PR maintains the old entrypoint (`parquet`) as well, so these two commands do the same thing (run the filter pushdown benchmark):

```shell
# New
cargo run --bin dfbench -- parquet-filter --iterations=5 --partitions=1 --scale-factor=0.01 --path=/tmp
# Old
cargo run --bin parquet filter --iterations=5 --partitions=1 --scale-factor=0.01 --path=/tmp
```

Likewise for the sort benchmark:

```shell
# New
cargo run --bin dfbench sort --iterations=5 --partitions=1 --scale-factor=0.01 --path=/tmp
# Old
cargo run --bin parquet sort --iterations=5 --partitions=1 --scale-factor=0.01 --path=/tmp
```

# Are these changes tested?

I tested them manually, both alone and with `bench.sh`.

# Are there any user-facing changes?

No, this is a development tool.
