yordan-pavlov edited a comment on issue #200:
URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-840100979


   UPDATE: I was able to add benchmarks for int32 over the weekend; the latest 
changes can be found here: 
   
https://github.com/yordan-pavlov/arrow/commit/661ffd38c9a4807bae5c662ca0b1a94c6b9969c4
   
   This should be enough to provide a fairly comprehensive picture of the 
performance of the new `ArrowArrayReader` vs the old `PrimitiveArrayReader` and 
`ComplexObjectArrayReader`. However, the results from the int32 benchmarks are 
mixed - the new arrow array reader is faster than the old 
`PrimitiveArrayReader` in all cases except for mandatory columns, where it is 
roughly 2.5 times slower (both plain and dictionary encoded).
   
   @Dandandan @jorgecarleitao @alamb Let me know what you think: is the 
simplification of reading arrow arrays using a single, iterator-based 
abstraction worth a performance hit in a small number of cases (given the 
performance improvement in most cases, especially strings and NULLs)? Should I 
create a PR for this work next, or should I first try to make it even faster, 
or even try to replace the Iterators with async Streams, before creating a PR?
   
   Here are the benchmark results:
   read Int32Array, plain encoded, mandatory, no NULLs - old: time:   [8.8238 us 8.9269 us 9.0407 us]
   read Int32Array, plain encoded, mandatory, no NULLs - new: time:   [22.544 us 22.703 us 22.872 us]
   
   read Int32Array, plain encoded, optional, no NULLs - old: time:   [276.80 us 281.58 us 287.08 us]
   read Int32Array, plain encoded, optional, no NULLs - new: time:   [52.179 us 52.998 us 53.886 us]
   
   read Int32Array, plain encoded, optional, half NULLs - old: time:   [454.15 us 462.82 us 472.55 us]
   read Int32Array, plain encoded, optional, half NULLs - new: time:   [320.11 us 325.34 us 330.93 us]
   
   read Int32Array, dictionary encoded, mandatory, no NULLs - old: time:   [47.615 us 48.971 us 50.666 us]
   read Int32Array, dictionary encoded, mandatory, no NULLs - new: time:   [115.89 us 118.07 us 120.55 us]
   
   read Int32Array, dictionary encoded, optional, no NULLs - old: time:   [308.88 us 313.42 us 318.41 us]
   read Int32Array, dictionary encoded, optional, no NULLs - new: time:   [160.98 us 164.96 us 170.25 us]
   
   read Int32Array, dictionary encoded, optional, half NULLs - old: time:   [521.36 us 530.06 us 540.16 us]
   read Int32Array, dictionary encoded, optional, half NULLs - new: time:   [399.54 us 415.00 us 433.30 us]
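
   To make the old vs new comparison easier to read, the middle value of each 
reported interval above can be turned into a speedup ratio. The snippet below 
is only a rough sketch for that arithmetic: `speedup` and `MEDIANS_US` are 
hypothetical names introduced here, not part of the benchmark code, and it 
assumes the middle number of each triple is the representative estimate.

```rust
/// Speedup of the new reader relative to the old one.
/// Values above 1.0 mean the new reader is faster; below 1.0, slower.
fn speedup(old_us: f64, new_us: f64) -> f64 {
    old_us / new_us
}

/// (benchmark label, old time in us, new time in us).
/// The times are the middle values of the intervals listed above.
const MEDIANS_US: [(&str, f64, f64); 6] = [
    ("plain, mandatory, no NULLs", 8.9269, 22.703),
    ("plain, optional, no NULLs", 281.58, 52.998),
    ("plain, optional, half NULLs", 462.82, 325.34),
    ("dictionary, mandatory, no NULLs", 48.971, 118.07),
    ("dictionary, optional, no NULLs", 313.42, 164.96),
    ("dictionary, optional, half NULLs", 530.06, 415.00),
];

fn main() {
    // Print one ratio per benchmark pair.
    for (label, old_us, new_us) in MEDIANS_US.iter() {
        println!("{:<34} {:.2}x", label, speedup(*old_us, *new_us));
    }
}
```

   Read this way, the new reader is about 5.3x faster for plain encoded 
optional columns without NULLs, and about 2.4x to 2.5x slower for the two 
mandatory cases.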


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

