ClSlaid commented on PR #9755: URL: https://github.com/apache/arrow-rs/pull/9755#issuecomment-4615911949
Benchmark update after the latest schema-level fast-path change.

Command used for both revisions:

```text
CARGO_TARGET_DIR=target/bench-compare-fix9143 cargo bench -p arrow --features test_utils --bench coalesce_kernels -- 'filter:' --sample-size 10 --warm-up-time 1 --measurement-time 2 --save-baseline <baseline>
```

- Baseline: `apache/main` at `f03e1bc028630561af7d4adfc000916337ec6ce5`
- Current: this branch working tree after the schema-capability fast path update
- Scope: filter benchmarks only; take benchmarks intentionally not rerun.

Summary:

| Group | Cases | Median delta vs apache/main |
| --- | ---: | ---: |
| All filter cases | 104 | +0.4% |
| Primitive / Utf8View / BinaryView schemas | 88 | +0.3% |
| Unsupported schemas using materialized fallback | 16 | +1.4% |

Negative delta means current is faster.

Largest improvements:

| Case | apache/main | current | delta | speedup |
| --- | ---: | ---: | ---: | ---: |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.8 | 2.17 ms | 1.43 ms | -34.0% | 1.51x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8 | 4.41 ms | 3.25 ms | -26.3% | 1.36x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8 | 1.71 ms | 1.37 ms | -19.8% | 1.25x |
| filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8 | 2.30 ms | 2.08 ms | -9.7% | 1.11x |
| filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01 | 3.21 ms | 2.99 ms | -6.8% | 1.07x |

Largest regressions/noise candidates:

| Case | apache/main | current | delta | speedup |
| --- | ---: | ---: | ---: | ---: |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 | 1.06 ms | 1.19 ms | +12.1% | 0.89x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001 | 27.0 ms | 30.2 ms | +11.7% | 0.90x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.1 | 1.43 ms | 1.58 ms | +10.6% | 0.90x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01 | 5.15 ms | 5.68 ms | +10.2% | 0.91x |
| filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.8 | 1.54 ms | 1.68 ms | +8.9% | 0.92x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.01 | 3.95 ms | 4.28 ms | +8.5% | 0.92x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 617.7 us | 665.1 us | +7.7% | 0.93x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1 | 3.01 ms | 3.22 ms | +6.9% | 0.94x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.01 | 4.54 ms | 4.84 ms | +6.6% | 0.94x |
| filter: single_binaryview, 8192, nulls: 0, selectivity: 0.001 | 32.2 ms | 34.3 ms | +6.4% | 0.94x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01 | 5.24 ms | 5.54 ms | +5.6% | 0.95x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 760.0 us | 799.0 us | +5.1% | 0.95x |

Notes:

- The schema-level guard means schemas with unsupported columns such as `Utf8` and dictionary use the materialized filter path instead of the per-column sparse-copy path.
- This keeps the fast path focused on primitive, `Utf8View`, and `BinaryView` columns, where the branch has specialized copying.
- These are short local Criterion runs, so small deltas should be read as noise; the large wins are concentrated around view-heavy filter cases.
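The schema-level guard described in the notes can be pictured roughly as follows. This is only a sketch under stated assumptions: `ColumnKind`, `FilterPath`, `supports_sparse_copy`, and `choose_path` are hypothetical names made up for illustration, not identifiers from this PR or from arrow-rs, and the real check presumably inspects the schema's actual data types.

```rust
/// Hypothetical stand-in for a column's data type (not arrow's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnKind {
    Primitive,
    Utf8View,
    BinaryView,
    Utf8,
    Dictionary,
}

/// Hypothetical names for the two filter paths mentioned in the notes.
#[derive(Debug, PartialEq)]
enum FilterPath {
    SparseCopy,
    Materialized,
}

/// Assumption: only these column kinds have specialized copying.
fn supports_sparse_copy(kind: ColumnKind) -> bool {
    matches!(
        kind,
        ColumnKind::Primitive | ColumnKind::Utf8View | ColumnKind::BinaryView
    )
}

/// Schema-level decision: a single unsupported column sends the whole
/// batch through the materialized fallback.
fn choose_path(schema: &[ColumnKind]) -> FilterPath {
    if schema.iter().copied().all(supports_sparse_copy) {
        FilterPath::SparseCopy
    } else {
        FilterPath::Materialized
    }
}

fn main() {
    let view_heavy = [ColumnKind::Primitive, ColumnKind::Utf8View, ColumnKind::BinaryView];
    let with_utf8 = [ColumnKind::Primitive, ColumnKind::Utf8];
    let with_dict = [ColumnKind::Primitive, ColumnKind::Dictionary];
    println!("{:?}", choose_path(&view_heavy));
    println!("{:?}", choose_path(&with_utf8));
    println!("{:?}", choose_path(&with_dict));
}
```

Deciding once per schema rather than per column keeps the check out of the per-batch hot loop, which is one plausible reason for a schema-level guard.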
<details>
<summary>Full filter benchmark table</summary>

| Case | apache/main | current | delta | speedup |
| --- | ---: | ---: | ---: | ---: |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0, selectivity: 0.001 | 21.7 ms | 22.0 ms | +1.7% | 0.98x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0, selectivity: 0.01 | 2.65 ms | 2.78 ms | +4.9% | 0.95x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0, selectivity: 0.1 | 1.32 ms | 1.32 ms | -0.3% | 1.00x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0, selectivity: 0.8 | 1.85 ms | 1.90 ms | +2.5% | 0.98x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001 | 30.6 ms | 30.4 ms | -0.7% | 1.01x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01 | 4.13 ms | 4.08 ms | -1.3% | 1.01x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1 | 2.23 ms | 2.25 ms | +1.3% | 0.99x |
| filter: mixed_binaryview (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8 | 1.71 ms | 1.37 ms | -19.8% | 1.25x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.001 | 21.3 ms | 21.7 ms | +2.0% | 0.98x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.01 | 2.35 ms | 2.37 ms | +0.7% | 0.99x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.1 | 962.4 us | 939.9 us | -2.3% | 1.02x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 | 1.06 ms | 1.19 ms | +12.1% | 0.89x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.001 | 29.8 ms | 29.7 ms | -0.4% | 1.00x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.01 | 3.81 ms | 3.83 ms | +0.5% | 0.99x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.1 | 1.87 ms | 1.86 ms | -0.5% | 1.01x |
| filter: mixed_binaryview (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.8 | 2.17 ms | 1.43 ms | -34.0% | 1.51x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.001 | 19.2 ms | 20.0 ms | +3.8% | 0.96x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.01 | 1.81 ms | 1.82 ms | +0.8% | 0.99x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.1 | 587.1 us | 592.2 us | +0.9% | 0.99x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 616.3 us | 617.6 us | +0.2% | 1.00x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.001 | 28.4 ms | 28.2 ms | -0.7% | 1.01x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.01 | 3.38 ms | 3.28 ms | -3.0% | 1.03x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.1 | 1.55 ms | 1.56 ms | +1.0% | 0.99x |
| filter: mixed_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.8 | 1.38 ms | 1.35 ms | -1.8% | 1.02x |
| filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001 | 58.5 ms | 61.3 ms | +4.7% | 0.95x |
| filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01 | 3.21 ms | 2.99 ms | -6.8% | 1.07x |
| filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1 | 1.57 ms | 1.59 ms | +1.7% | 0.98x |
| filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8 | 1.88 ms | 1.91 ms | +1.4% | 0.99x |
| filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001 | 55.0 ms | 55.4 ms | +0.7% | 0.99x |
| filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01 | 3.95 ms | 3.95 ms | +0.0% | 1.00x |
| filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1 | 2.11 ms | 2.01 ms | -4.5% | 1.05x |
| filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8 | 2.30 ms | 2.08 ms | -9.7% | 1.11x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001 | 27.0 ms | 30.2 ms | +11.7% | 0.90x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01 | 5.24 ms | 5.54 ms | +5.6% | 0.95x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1 | 3.01 ms | 3.22 ms | +6.9% | 0.94x |
| filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8 | 3.53 ms | 3.64 ms | +3.3% | 0.97x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001 | 34.9 ms | 35.1 ms | +0.6% | 0.99x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01 | 5.15 ms | 5.68 ms | +10.2% | 0.91x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1 | 3.33 ms | 3.41 ms | +2.4% | 0.98x |
| filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8 | 4.41 ms | 3.25 ms | -26.3% | 1.36x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001 | 21.3 ms | 22.0 ms | +3.4% | 0.97x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01 | 2.68 ms | 2.70 ms | +0.7% | 0.99x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1 | 1.27 ms | 1.29 ms | +1.3% | 0.99x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8 | 1.33 ms | 1.38 ms | +3.3% | 0.97x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001 | 30.0 ms | 29.7 ms | -1.2% | 1.01x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01 | 4.01 ms | 4.00 ms | -0.4% | 1.00x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1 | 2.17 ms | 2.15 ms | -0.7% | 1.01x |
| filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8 | 1.45 ms | 1.45 ms | +0.6% | 0.99x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001 | 20.9 ms | 21.2 ms | +1.3% | 0.99x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01 | 2.29 ms | 2.33 ms | +1.8% | 0.98x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1 | 921.9 us | 930.6 us | +1.0% | 0.99x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 | 716.9 us | 714.6 us | -0.3% | 1.00x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.001 | 28.7 ms | 28.4 ms | -0.9% | 1.01x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.01 | 3.74 ms | 3.76 ms | +0.3% | 1.00x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.1 | 1.86 ms | 1.84 ms | -1.1% | 1.01x |
| filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0.1, selectivity: 0.8 | 1.43 ms | 1.42 ms | -0.5% | 1.01x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.001 | 19.9 ms | 19.6 ms | -1.3% | 1.01x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.01 | 1.80 ms | 1.84 ms | +2.1% | 0.98x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.1 | 581.8 us | 605.3 us | +4.0% | 0.96x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 617.7 us | 665.1 us | +7.7% | 0.93x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.001 | 28.8 ms | 28.2 ms | -2.0% | 1.02x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.01 | 3.32 ms | 3.29 ms | -0.9% | 1.01x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.1 | 1.54 ms | 1.54 ms | -0.2% | 1.00x |
| filter: mixed_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.8 | 1.37 ms | 1.36 ms | -1.2% | 1.01x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.001 | 46.8 ms | 47.8 ms | +2.3% | 0.98x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.01 | 4.54 ms | 4.84 ms | +6.6% | 0.94x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.1 | 1.43 ms | 1.58 ms | +10.6% | 0.90x |
| filter: primitive, 8192, nulls: 0, selectivity: 0.8 | 1.57 ms | 1.58 ms | +0.8% | 0.99x |
| filter: primitive, 8192, nulls: 0.1, selectivity: 0.001 | 68.2 ms | 68.1 ms | -0.2% | 1.00x |
| filter: primitive, 8192, nulls: 0.1, selectivity: 0.01 | 8.51 ms | 8.48 ms | -0.4% | 1.00x |
| filter: primitive, 8192, nulls: 0.1, selectivity: 0.1 | 3.97 ms | 3.97 ms | -0.0% | 1.00x |
| filter: primitive, 8192, nulls: 0.1, selectivity: 0.8 | 3.49 ms | 3.38 ms | -3.4% | 1.04x |
| filter: single_binaryview, 8192, nulls: 0, selectivity: 0.001 | 32.2 ms | 34.3 ms | +6.4% | 0.94x |
| filter: single_binaryview, 8192, nulls: 0, selectivity: 0.01 | 4.06 ms | 3.97 ms | -2.3% | 1.02x |
| filter: single_binaryview, 8192, nulls: 0, selectivity: 0.1 | 1.85 ms | 1.88 ms | +1.7% | 0.98x |
| filter: single_binaryview, 8192, nulls: 0, selectivity: 0.8 | 952.9 us | 949.3 us | -0.4% | 1.00x |
| filter: single_binaryview, 8192, nulls: 0.1, selectivity: 0.001 | 46.3 ms | 46.7 ms | +1.0% | 0.99x |
| filter: single_binaryview, 8192, nulls: 0.1, selectivity: 0.01 | 5.48 ms | 5.50 ms | +0.3% | 1.00x |
| filter: single_binaryview, 8192, nulls: 0.1, selectivity: 0.1 | 2.25 ms | 2.26 ms | +0.5% | 1.00x |
| filter: single_binaryview, 8192, nulls: 0.1, selectivity: 0.8 | 1.71 ms | 1.68 ms | -1.7% | 1.02x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.001 | 27.0 ms | 27.7 ms | +2.7% | 0.97x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.01 | 2.34 ms | 2.28 ms | -2.5% | 1.03x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.1 | 595.7 us | 590.3 us | -0.9% | 1.01x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 760.0 us | 799.0 us | +5.1% | 0.95x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.001 | 40.1 ms | 40.0 ms | -0.1% | 1.00x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.01 | 3.98 ms | 3.99 ms | +0.4% | 1.00x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.1 | 1.04 ms | 1.05 ms | +1.3% | 0.99x |
| filter: single_binaryview (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.8 | 1.55 ms | 1.52 ms | -1.4% | 1.01x |
| filter: single_utf8view, 8192, nulls: 0, selectivity: 0.001 | 32.0 ms | 31.4 ms | -1.9% | 1.02x |
| filter: single_utf8view, 8192, nulls: 0, selectivity: 0.01 | 3.76 ms | 3.68 ms | -2.0% | 1.02x |
| filter: single_utf8view, 8192, nulls: 0, selectivity: 0.1 | 1.81 ms | 1.83 ms | +1.1% | 0.99x |
| filter: single_utf8view, 8192, nulls: 0, selectivity: 0.8 | 930.4 us | 957.3 us | +2.9% | 0.97x |
| filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.001 | 45.0 ms | 44.5 ms | -1.0% | 1.01x |
| filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.01 | 5.52 ms | 5.39 ms | -2.3% | 1.02x |
| filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.1 | 2.27 ms | 2.18 ms | -3.9% | 1.04x |
| filter: single_utf8view, 8192, nulls: 0.1, selectivity: 0.8 | 1.54 ms | 1.68 ms | +8.9% | 0.92x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.001 | 27.1 ms | 26.6 ms | -1.8% | 1.02x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.01 | 2.29 ms | 2.30 ms | +0.4% | 1.00x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.1 | 590.1 us | 601.6 us | +1.9% | 0.98x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0, selectivity: 0.8 | 766.7 us | 782.5 us | +2.1% | 0.98x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.001 | 41.8 ms | 42.2 ms | +0.8% | 0.99x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.01 | 3.95 ms | 4.28 ms | +8.5% | 0.92x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.1 | 1.06 ms | 1.05 ms | -0.8% | 1.01x |
| filter: single_utf8view (max_string_len=8), 8192, nulls: 0.1, selectivity: 0.8 | 1.54 ms | 1.52 ms | -0.7% | 1.01x |

</details>
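For reading the `delta` and `speedup` columns in the tables above, the arithmetic appears to be the usual relative change and time ratio. The sketch below assumes exactly that; the function names are hypothetical, and because it uses the rounded times printed in the table, the last digit can differ from the reported values, which were presumably computed from unrounded measurements.

```rust
/// Relative change of `current` versus `baseline`, in percent.
/// Negative values mean `current` took less time, i.e. it is faster.
fn delta_percent(baseline: f64, current: f64) -> f64 {
    (current - baseline) / baseline * 100.0
}

/// Ratio of baseline time to current time.
/// Values above 1.0 mean `current` is faster.
fn speedup(baseline: f64, current: f64) -> f64 {
    baseline / current
}

fn main() {
    // Times in ms for the row
    // "filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8",
    // as rounded in the table.
    let (baseline, current) = (4.41, 3.25);
    println!(
        "delta = {:+.1}%, speedup = {:.2}x",
        delta_percent(baseline, current),
        speedup(baseline, current)
    );
}
```

With these inputs the result agrees with the row's reported -26.3% and 1.36x after rounding.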
