[GitHub] [arrow] wesm opened a new pull request #7358: ARROW-9045: [C++] Expand / improve Take and Filter benchmarks for enhanced baseline

GitBox Fri, 05 Jun 2020 10:56:12 -0700


wesm opened a new pull request #7358:
URL: https://github.com/apache/arrow/pull/7358



   The idea of this patch is to provide a more comprehensive baseline for the 
optimization work I'm undertaking.
   
   Summary:
   
   * Benchmark take when indices are monotonic and contain no nulls. Monotonic 
takes run much faster because they access memory in order rather than 
at random
   * Test null percentages down to 0.01% (even 1% is a lot of nulls, and it 
obscures behavior between 0% and 1%)
   * Benchmark indices/filter-mask with and without nulls, because there may be 
faster code paths for the no-nulls case
   * Benchmark the case where the values being taken/filtered contain no nulls
   * Benchmark filtering/taking smaller strings. The benchmarks were using 
strings of length 0 to 128, but realistic workloads generally work with 
smaller strings, so I set the range to 0 to 32 instead, with an average of 16
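
   For readers unfamiliar with the cases listed above, here is a minimal 
sketch of the kinds of inputs they describe. It is written in plain Python for 
brevity and is not the C++ benchmark code from this patch; the helper names 
(`take`, `make_indices`, `make_strings`) and their parameters are hypothetical, 
and `None` stands in for a null.

```python
import random


def take(values, indices):
    """Gather values[i] for each index; a null (None) index yields a null output."""
    return [None if i is None else values[i] for i in indices]


def make_indices(n_values, n_indices, null_fraction=0.0, monotonic=False, seed=0):
    """Build take indices into an array of length n_values (hypothetical helper).

    monotonic=True sorts the indices so the gather walks memory in order.
    null_fraction is the probability that any given index is replaced by None.
    """
    rng = random.Random(seed)
    indices = [rng.randrange(n_values) for _ in range(n_indices)]
    if monotonic:
        indices.sort()
    return [None if rng.random() < null_fraction else i for i in indices]


def make_strings(n, min_len=0, max_len=32, seed=0):
    """Build n strings with lengths drawn uniformly from [min_len, max_len]."""
    rng = random.Random(seed)
    return ["x" * rng.randint(min_len, max_len) for _ in range(n)]


# Monotonic, no-null indices: the ordered-access case.
monotonic_indices = make_indices(1000, 500, monotonic=True)

# Random indices where roughly 0.01% of entries are null.
sparse_null_indices = make_indices(1000, 500, null_fraction=0.0001)

# Short strings: lengths 0 to 32, averaging about 16.
short_strings = make_strings(1000)

taken = take(short_strings, monotonic_indices)
```

   A uniform length distribution over 0 to 32 is an assumption made here to 
get an average of 16; the patch does not say how the lengths are drawn.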


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
