jhorstmann opened a new pull request, #8102:
URL: https://github.com/apache/arrow-rs/pull/8102
# Which issue does this PR close?
This is a proof of concept for what is possibly the fastest way of decoding
bitpacked data. It also showcases why writing short RLE runs can be
detrimental to performance when decoding of bitpacked data is fast.
- Related to #7739.
# Rationale for this change
At the moment I'm not proposing to integrate this into the arrow codebase.
The code would need further changes to support processing arbitrary batch
sizes, and it currently supports only a single bitwidth and only `u8` as the
target data type.
# What changes are included in this PR?
The benchmark includes a custom RLE/bitpacking hybrid encoder that only
supports a bitwidth of 1 and only writes bitpacked runs. On mostly random input
data, the size of the encoded buffer is comparable to the size created by the
standard `RleEncoder`. Decoding with the standard `RleDecoder` also shows that
bitpacked data decodes somewhat faster than hybrid data (about 2.63 Gelem/s
versus 1.93 Gelem/s in the results below).
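For readers unfamiliar with the layout, here is a minimal sketch of what a bitpacked-only encoder for bitwidth 1 could look like. It is not the encoder from this PR. It assumes the run layout described in the Parquet format specification (a ULEB128 header of `(groups_of_8 << 1) | 1`, followed by values packed LSB-first), and the names `write_uleb128` and `encode_bitpacked_width1` are made up for illustration.

```rust
/// Appends `value` to `out` as an unsigned LEB128 varint.
fn write_uleb128(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        // More bytes follow, so set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Encodes 0/1 values as one bit-packed run with bit width 1.
/// Hypothetical helper: the layout is assumed from the Parquet spec,
/// it is not taken from the PR's benchmark code.
fn encode_bitpacked_width1(values: &[u8]) -> Vec<u8> {
    // Bit-packed runs are counted in groups of 8 values.
    let groups = (values.len() + 7) / 8;
    let mut out = Vec::with_capacity(groups + 10);
    if groups == 0 {
        return out;
    }
    // The low bit of the header marks the run as bit-packed.
    write_uleb128(((groups as u64) << 1) | 1, &mut out);
    for chunk in values.chunks(8) {
        let mut byte = 0u8;
        for (i, &v) in chunk.iter().enumerate() {
            // Values are packed starting at the least significant bit.
            byte |= (v & 1) << i;
        }
        // A trailing partial group is implicitly padded with zero bits.
        out.push(byte);
    }
    out
}
```

With bitwidth 1 each group of 8 values occupies exactly one byte, so nine values encode to a one-byte header plus two data bytes.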
To run the benchmarks, you will need Rust 1.89, which stabilized AVX-512
support, and an AVX-512-capable machine (at least Intel Ice Lake or AMD Zen 4).
```
$ RUSTFLAGS="-Ctarget-cpu=native" cargo bench --features experimental --bench rle
```
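If you are unsure whether a machine qualifies, a small runtime check along these lines may help. This is only a sketch and not part of the PR. The helper name `has_avx512_vbmi` is hypothetical, and it relies on the standard library's `is_x86_feature_detected!` macro.

```rust
/// Returns true when the current CPU reports AVX-512 F and VBMI support.
/// Hypothetical helper, not part of the PR.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx512_vbmi() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
        && std::arch::is_x86_feature_detected!("avx512vbmi")
}

/// On other architectures the feature is never available.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx512_vbmi() -> bool {
    false
}
```

Note that `-Ctarget-cpu=native` enables these features at compile time, so a binary built that way on an AVX-512 machine should not be run on a CPU without them.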
```
rle_decoder/decode_bitpacked
                        time:   [398.06 µs 398.66 µs 399.32 µs]
                        thrpt:  [2.6259 Gelem/s 2.6302 Gelem/s 2.6342 Gelem/s]
rle_decoder/decode_hybrid
                        time:   [540.05 µs 542.22 µs 544.84 µs]
                        thrpt:  [1.9246 Gelem/s 1.9339 Gelem/s 1.9416 Gelem/s]
```
The results are more interesting when decoding with a custom, AVX512-VBMI
optimized decoder:
```
custom/decode_bitpacked time:   [17.642 µs 17.661 µs 17.683 µs]
                        thrpt:  [59.297 Gelem/s 59.372 Gelem/s 59.435 Gelem/s]
custom/decode_hybrid    time:   [87.593 µs 87.866 µs 88.184 µs]
                        thrpt:  [11.891 Gelem/s 11.934 Gelem/s 11.971 Gelem/s]
```
Decoding bitpacked data gets a speedup of about 22x, while decoding hybrid
RLE data *only* gets about 6x faster. My guess is that this is caused by
branch mispredictions, or by the call to a `memset`-like function that is not
optimized for short runs.
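To make that guess more concrete, here is a scalar sketch of a hybrid decoder for bitwidth 1. It is neither the AVX-512 decoder from this PR nor the existing `RleDecoder`. It assumes the run layout from the Parquet format specification, and `read_uleb128` and `decode_hybrid_width1` are hypothetical names. The point is only to show where the per-run branch and the `memset`-like fill occur.

```rust
/// Reads an unsigned LEB128 varint from `buf`, advancing `pos`.
fn read_uleb128(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // varint too long for a u64
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Decodes `num_values` 0/1 values from a hybrid RLE/bit-packed buffer
/// with bit width 1. Illustrative sketch only, not the PR's decoder.
fn decode_hybrid_width1(buf: &[u8], num_values: usize) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(num_values);
    let mut pos = 0usize;
    while out.len() < num_values {
        let header = read_uleb128(buf, &mut pos)?;
        // This branch is taken once per run; with many short runs of
        // alternating kinds it is hard for the CPU to predict.
        if header & 1 == 1 {
            // Bit-packed run: one byte per group of 8 values.
            let groups = (header >> 1) as usize;
            let bytes = buf.get(pos..pos.checked_add(groups)?)?;
            pos += groups;
            for &byte in bytes {
                for bit in 0..8 {
                    if out.len() == num_values {
                        break; // ignore padding bits in the last group
                    }
                    out.push((byte >> bit) & 1);
                }
            }
        } else {
            // RLE run: a single value repeated `run_len` times.
            let run_len = (header >> 1) as usize;
            let value = *buf.get(pos)? & 1;
            pos += 1;
            let n = run_len.min(num_values - out.len());
            // A memset-like fill, which carries fixed overhead for tiny runs.
            let new_len = out.len() + n;
            out.resize(new_len, value);
        }
    }
    Some(out)
}
```

A buffer that contains only bitpacked runs never takes the RLE branch, so the inner loop is the only hot path, which is the part that vectorizes well.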
# Are these changes tested?
# Are there any user-facing changes?