alamb opened a new pull request, #16731:
URL: https://github.com/apache/datafusion/pull/16731

   ## Which issue does this PR close?
   
   
   - Related to https://github.com/apache/datafusion/issues/3463
   - Closes https://github.com/apache/datafusion/issues/16729
   
   ## Rationale for this change
   
   In order to enable `filter_pushdown` by default, we need to ensure it 
doesn't regress existing performance.
   
   However, it has been very hard to make forward progress on improving filter 
pushdown because all our benchmarks compare filter pushdown enabled against 
filter pushdown disabled, so the bar for any change is quite high.
   Here is the most recent example:
   - https://github.com/apache/datafusion/pull/16711
   
   It seems obvious, but the right way to measure improvements to filter 
pushdown is to compare against a baseline where filter pushdown is already on. 
However, we don't have any such benchmark (see 
https://github.com/apache/datafusion/issues/16729 and 
https://github.com/apache/datafusion/pull/16730 for why the existing 
benchmarks are not good enough).
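   
   To make the point about baselines concrete, here is a minimal sketch of the arithmetic. The timings are hypothetical placeholders, not measurements, and `percent_change` is an illustrative helper that is not part of DataFusion. It shows how the same change can look like a regression against a pushdown-off baseline while being an improvement against a pushdown-on baseline.
   
   ```rust
   /// Relative change of `candidate_ms` versus `baseline_ms`, in percent.
   /// Negative values mean the candidate is faster than the baseline.
   fn percent_change(baseline_ms: f64, candidate_ms: f64) -> f64 {
       (candidate_ms - baseline_ms) / baseline_ms * 100.0
   }
   
   fn main() {
       // Hypothetical timings (ms) for a single query; placeholders only.
       let pushdown_off = 100.0; // filter pushdown disabled
       let pushdown_on_before = 150.0; // filter pushdown enabled, before a change
       let pushdown_on_after = 120.0; // filter pushdown enabled, after the change
   
       // Against the pushdown-off baseline the change still looks like a regression.
       println!(
           "vs pushdown off: {:+.1}%",
           percent_change(pushdown_off, pushdown_on_after)
       );
       // Against the pushdown-on baseline the same change is an improvement.
       println!(
           "vs pushdown on:  {:+.1}%",
           percent_change(pushdown_on_before, pushdown_on_after)
       );
   }
   ```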
   
   
   ## What changes are included in this PR?
   
   Add a benchmark (`clickbench_pushdown`) that turns on `filter_pushdown` and 
`reorder_filters`.
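   
   For readers who want the same settings outside the benchmark harness, the following is a rough sketch, not the code in this PR. It assumes the two settings named above correspond to the DataFusion configuration options `datafusion.execution.parquet.pushdown_filters` and `datafusion.execution.parquet.reorder_filters`, and that the `datafusion` crate is available as a dependency. The function name `pushdown_config` is made up for illustration.
   
   ```rust
   use datafusion::prelude::*;
   
   /// Build a session config with Parquet filter pushdown and filter
   /// reordering enabled (hypothetical helper, not part of DataFusion).
   fn pushdown_config() -> SessionConfig {
       let mut config = SessionConfig::new();
       let parquet = &mut config.options_mut().execution.parquet;
       // Assumed to match what the benchmark calls `filter_pushdown`.
       parquet.pushdown_filters = true;
       // Assumed to match what the benchmark calls `reorder_filters`.
       parquet.reorder_filters = true;
       config
   }
   
   fn main() {
       let config = pushdown_config();
       let parquet = &config.options().execution.parquet;
       println!("pushdown_filters = {}", parquet.pushdown_filters);
       println!("reorder_filters  = {}", parquet.reorder_filters);
   
       // Queries run through this context use the settings above.
       let _ctx = SessionContext::new_with_config(config);
   }
   ```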
   
   You can run it like this:
   ```shell
   ./benchmarks/bench.sh run clickbench_pushdown
   ```
   
   Which then invokes
   
   ```shell
   + cargo run --release --bin dfbench -- clickbench --pushdown --iterations 5 --path /Users/andrewlamb/Software/datafusion/benchmarks/data/hits_partitioned --queries-path /Users/andrewlamb/Software/datafusion/benchmarks/queries/clickbench/queries -o /Users/andrewlamb/Software/datafusion/benchmarks/results/alamb_new_filter_pushdown/clickbench_partitioned.json
   ```
   
   ## Are these changes tested?
   
   I tested it manually and also did some profiling on Q30 to verify that 
filter pushdown is indeed being invoked.
   
   (TODO picture)
   
   ## Are there any user-facing changes?
   
   No, this is a development process change only.

