yordan-pavlov edited a comment on issue #200: URL: https://github.com/apache/arrow-rs/issues/200#issuecomment-840100979
UPDATE: I was able to add benchmarks for int32 over the weekend; the latest changes can be found here: https://github.com/yordan-pavlov/arrow/commit/661ffd38c9a4807bae5c662ca0b1a94c6b9969c4

This should be enough to provide a fairly comprehensive picture of the performance of the new `ArrowArrayReader` vs the old `PrimitiveArrayReader` and `ComplexObjectArrayReader`. However, the results from the int32 benchmarks are mixed: the new `ArrowArrayReader` is faster than the old `PrimitiveArrayReader` in all cases except for mandatory columns, where it is about 2.5x slower (for both plain and dictionary encoding).

@Dandandan @jorgecarleitao @alamb Let me know what you think: is the simplification of reading arrow arrays using a single, iterator-based abstraction worth a performance hit in a small number of cases, given the performance improvement in most cases (especially strings and NULLs)? Should I create a PR for this work next, or should I first try to make it even faster, or even try to replace the Iterators with async Streams, before creating a PR?

Here are the benchmark results:

read Int32Array, plain encoded, mandatory, no NULLs - old: time: [8.8238 us 8.9269 us 9.0407 us]
read Int32Array, plain encoded, mandatory, no NULLs - new: time: [22.544 us 22.703 us 22.872 us]
read Int32Array, plain encoded, optional, no NULLs - old: time: [276.80 us 281.58 us 287.08 us]
read Int32Array, plain encoded, optional, no NULLs - new: time: [52.179 us 52.998 us 53.886 us]
read Int32Array, plain encoded, optional, half NULLs - old: time: [454.15 us 462.82 us 472.55 us]
read Int32Array, plain encoded, optional, half NULLs - new: time: [320.11 us 325.34 us 330.93 us]
read Int32Array, dictionary encoded, mandatory, no NULLs - old: time: [47.615 us 48.971 us 50.666 us]
read Int32Array, dictionary encoded, mandatory, no NULLs - new: time: [115.89 us 118.07 us 120.55 us]
read Int32Array, dictionary encoded, optional, no NULLs - old: time: [308.88 us 313.42 us 318.41 us]
read Int32Array, dictionary encoded, optional, no NULLs - new: time: [160.98 us 164.96 us 170.25 us]
read Int32Array, dictionary encoded, optional, half NULLs - old: time: [521.36 us 530.06 us 540.16 us]
read Int32Array, dictionary encoded, optional, half NULLs - new: time: [399.54 us 415.00 us 433.30 us]
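
To make the old vs new comparison easier to read, the reported times can be turned into old/new ratios. This is only a minimal sketch: it assumes the middle value of each reported interval is the point estimate, and the `Bench` struct and `benches` function are illustrative names that are not part of the benchmark code or of arrow-rs.

```rust
/// One old/new benchmark pair, with times in microseconds.
/// `Bench` is a hypothetical helper type used only for this illustration.
struct Bench {
    name: &'static str,
    old_us: f64,
    new_us: f64,
}

impl Bench {
    /// Ratio of old time to new time.
    /// Above 1.0 the new reader is faster; below 1.0 it is slower.
    fn speedup(&self) -> f64 {
        self.old_us / self.new_us
    }
}

/// Middle values of each interval, copied from the results above.
fn benches() -> Vec<Bench> {
    vec![
        Bench { name: "plain, mandatory, no NULLs", old_us: 8.9269, new_us: 22.703 },
        Bench { name: "plain, optional, no NULLs", old_us: 281.58, new_us: 52.998 },
        Bench { name: "plain, optional, half NULLs", old_us: 462.82, new_us: 325.34 },
        Bench { name: "dictionary, mandatory, no NULLs", old_us: 48.971, new_us: 118.07 },
        Bench { name: "dictionary, optional, no NULLs", old_us: 313.42, new_us: 164.96 },
        Bench { name: "dictionary, optional, half NULLs", old_us: 530.06, new_us: 415.00 },
    ]
}

fn main() {
    for b in benches() {
        println!("{:<34} old/new = {:.2}x", b.name, b.speedup());
    }
}
```

With these inputs, the ratio is below 1.0 only for the two mandatory-column cases; every optional-column case comes out above 1.0.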