rluvaton commented on issue #9083:
URL: https://github.com/apache/arrow-rs/issues/9083#issuecomment-3800851661
@Dandandan @alamb created an run a benchmark, link for the code in the end.
----
I created a benchmark with different implementations that all achieve the
same goal, shuffle to partitions.
the implementations are:
1. take into builders - I implemented `take` that accept a sink to save the
take data into to avoid reallocation or `concat`
2. take and concat - using the existing take and then doing concat on each
column
3. interleave
4. existing row format
5. optimized row format
For each implementation that is no row based I implemented 2 benchmarks one
when we go for all batches of column1 than all batches of column2 and so on
(ones that named _column wise_) and another implementation that goes per batch,
run the `take`/`interleave` on each column in that batch and than go to the
next batch.
each benchmark get an input laid out so no pre processing on the input is
needed before running (so `interleave` for example will already get the
`Vec<(batch_index, row_index)>`).
The benchmark is to shuffle 128 batches of size 8192 rows into 1500
partitions. so shuffling ~1M rows. this also mean that each partition will only
have ~700 rows in it.
I excluded the logic for splitting into batches from the benchmark since all
of it are running in
I also preallocate for the take into builders and both row format enough
data so no need to copy.
these are the results (sorted from fastest to slowest) running on my local
MacBook M3 run:
Benchmark | Time (mean) | Time Range
-- | -- | --
shuffle/optimized_row_format_approach going partition wise | 2.6734 s |
[2.6685 s - 2.6790 s]
shuffle/optimized_row_format_approach | 2.6930 s | [2.6899 s - 2.6958 s]
shuffle/row_format_approach going partition wise | 3.0178 s | [3.0075 s -
3.0283 s]
shuffle/row_format_approach | 3.0346 s | [3.0230 s - 3.0486 s]
shuffle/interleave_column_wise_approach | 3.4028 s | [3.3219 s - 3.4863 s]
shuffle/interleave_approach | 3.5126 s | [3.4128 s - 3.6161 s]
shuffle/take_to_builders_column_wise_approach | 4.2012 s | [4.1025 s -
4.3140 s]
shuffle/take_to_builders_approach | 4.2263 s | [4.0417 s - 4.4349 s]
shuffle/take_approach | 7.0364 s | [6.8709 s - 7.2323 s]
shuffle/take_column_wise_approach | 8.0610 s | [8.0290 s - 8.0964 s]
<details>
<summary>Raw results</summary>
```
Running benches/shuffle.rs
(target/release/deps/shuffle-b6a3c1bf993fb91d)
Gnuplot not found, using plotters backend
Benchmarking shuffle/take_to_builders_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 40.8s.
shuffle/take_to_builders_approach
time: [4.0417 s 4.2263 s 4.4349 s]
Benchmarking shuffle/take_to_builders_column_wise_approach: Warming up for
3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 44.6s.
shuffle/take_to_builders_column_wise_approach
time: [4.1025 s 4.2012 s 4.3140 s]
Found 3 outliers among 10 measurements (30.00%)
1 (10.00%) low mild
1 (10.00%) high mild
1 (10.00%) high severe
Benchmarking shuffle/take_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 107.1s.
shuffle/take_approach time: [6.8709 s 7.0364 s 7.2323 s]
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) high mild
Benchmarking shuffle/take_column_wise_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 80.8s.
shuffle/take_column_wise_approach
time: [8.0290 s 8.0610 s 8.0964 s]
Benchmarking shuffle/interleave_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 40.0s.
shuffle/interleave_approach
time: [3.4128 s 3.5126 s 3.6161 s]
Benchmarking shuffle/interleave_column_wise_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 34.8s.
shuffle/interleave_column_wise_approach
time: [3.3219 s 3.4028 s 3.4863 s]
Benchmarking shuffle/row_format_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 32.3s.
shuffle/row_format_approach
time: [3.0230 s 3.0346 s 3.0486 s]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Benchmarking shuffle/row_format_approach going partition wise: Warming up
for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 30.3s.
shuffle/row_format_approach going partition wise
time: [3.0075 s 3.0178 s 3.0283 s]
Benchmarking shuffle/optimized_row_format_approach: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 27.0s.
shuffle/optimized_row_format_approach
time: [2.6899 s 2.6930 s 2.6958 s]
Benchmarking shuffle/optimized_row_format_approach going partition wise:
Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase
target time to 26.7s.
shuffle/optimized_row_format_approach going partition wise
time: [2.6685 s 2.6734 s 2.6790 s]
```
</details>
[**Benchmark
implementation:**](https://github.com/rluvaton/bench_arrow_shuffle_implementation)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]