Dandandan opened a new pull request, #9577:
URL: https://github.com/apache/arrow-rs/pull/9577

   ## Summary
   - Avoid unnecessary buffer zero-fill in Snappy decompression by writing 
directly into spare Vec capacity instead of `resize(n, 0)` followed by overwrite
   - Reduce per-byte overhead in VLQ integer decoding by aligning once and 
iterating directly over the buffer slice, instead of calling `get_aligned` per 
byte
   
   ## Benchmark
   
   Profiled using the `arrow_reader_clickbench` benchmark (async_object_store, 
all queries).
   
   **Snappy zero-fill removal**: eliminates `__bzero` overhead visible in 
profiles (~2% of decompression time)
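
The zero-fill removal can be illustrated with a minimal sketch. It is not the code from this PR: `fake_decompress` is a hypothetical stand-in for the real Snappy decoder, and `decompress_append` is an illustrative name. It assumes the decompressed length is known up front and that the decoder reports how many bytes it wrote.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a decompressor (NOT the Snappy API):
/// copies `src` into `dst` and returns the number of bytes written.
fn fake_decompress(src: &[u8], dst: &mut [MaybeUninit<u8>]) -> usize {
    for (d, s) in dst.iter_mut().zip(src) {
        d.write(*s);
    }
    src.len().min(dst.len())
}

/// Appends decompressed bytes to `out` without zero-filling first.
/// The older pattern would be `out.resize(old_len + n, 0)` followed by
/// an overwrite of the same bytes.
fn decompress_append(src: &[u8], out: &mut Vec<u8>, decompressed_len: usize) -> usize {
    // Make sure at least `decompressed_len` bytes of spare capacity exist.
    out.reserve(decompressed_len);
    let spare = &mut out.spare_capacity_mut()[..decompressed_len];
    let written = fake_decompress(src, spare);
    // SAFETY: the first `written` bytes of the spare capacity were
    // initialized by `fake_decompress`, and `written <= decompressed_len`,
    // which fits in the reserved capacity.
    unsafe { out.set_len(out.len() + written) };
    written
}
```

The length is only extended by the number of bytes the decoder actually initialized, so no uninitialized memory becomes readable through the `Vec`.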
   
   **VLQ optimization**: `-3.0%` on Q10 (p=0.00), ~1-2% on other RLE-heavy 
queries
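
As a rough sketch of the VLQ idea, the function below decodes an unsigned LEB128-style integer by iterating directly over a byte slice that is assumed to already start at a byte boundary. The name `read_vlq` and its signature are illustrative assumptions, not the `bit_util` API touched by this PR.

```rust
/// Decodes an unsigned VLQ (LEB128-style) integer from the start of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before a terminating byte or exceeds 10 bytes.
fn read_vlq(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    // A u64 needs at most 10 groups of 7 bits.
    for (i, &byte) in buf.iter().take(10).enumerate() {
        // The low 7 bits carry payload, least-significant group first.
        // Payload bits that do not fit in 64 bits are silently dropped.
        value |= u64::from(byte & 0x7F) << (7 * i);
        // A clear high bit marks the final byte.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

The loop touches each byte once through the slice iterator, so any alignment or bounds bookkeeping happens a single time before the loop rather than once per byte.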
   
   | Query | Change |
   |-------|--------|
   | Q10 (string dict filter) | **-3.0%** |
   | Q37 (multi-predicate) | **-0.7%** |
   | Q41 (multi-predicate int) | **-1.7%** |
   
   ## Test plan
   - [x] All existing `test_codec_snappy`, RLE, `bit_util`, `bit_reader` tests 
pass (52 tests)
   - [x] Verified correct row counts on all 22 ClickBench queries
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)

