jonded94 commented on PR #9374:
URL: https://github.com/apache/arrow-rs/pull/9374#issuecomment-3890731263

   Actually, what I stated (and tested for) here was wrong: the issue is not 
caused by mixing v1 and v2 data pages; it can be reproduced with v2 data pages 
only. That is what my new commit 365bd9a covers.
   
   Additionally, I verified that the original file I had this issue with also 
contains only v2 data pages. For a very detailed overview of all data pages in 
the row group that triggers the errors, please read this comment: 
https://github.com/apache/arrow-rs/issues/9370#issuecomment-3889847488
   
   Regarding the actual bugfix introduced in this PR: it seems to cause some 
quite hefty performance regressions (1.12x to 1.54x slower in the rows below), 
especially in these benchmarks launched by @alamb:
   
   ```
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    160.5±0.72µs        ? ?/sec    1.22    195.6±2.97µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    300.9±4.42µs        ? ?/sec    1.12    338.1±6.72µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    164.7±2.33µs        ? ?/sec    1.21    199.1±3.93µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     76.8±1.39µs        ? ?/sec    1.54    118.2±4.05µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    257.3±1.94µs        ? ?/sec    1.17    300.9±5.17µs        ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     81.0±0.54µs        ? ?/sec    1.49    121.1±0.99µs        ? ?/sec
   arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    203.9±1.61µs        ? ?/sec    1.34    273.8±3.18µs        ? ?/sec
   ```
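
   As a rough cross-check of the factors in the benchmark output, the slowdown 
of each row can be recomputed from the two mean times. This is only a sketch: 
it assumes the first time column is the baseline and the second is this PR's 
branch, and the names `ROWS` and `slowdown` are made up for illustration. 
Because the means shown are rounded, a recomputed factor can differ from the 
reported one in the last digit.

```python
# Mean times in microseconds, copied from the benchmark rows above.
# Assumption: the first time column is the baseline, the second is this PR.
# Each tuple: (short row label, baseline_us, changed_us, reported_factor)
ROWS = [
    ("Float16Array/byte_stream_split, mandatory, no NULLs", 160.5, 195.6, 1.22),
    ("Float16Array/byte_stream_split, optional, half NULLs", 300.9, 338.1, 1.12),
    ("Float16Array/byte_stream_split, optional, no NULLs", 164.7, 199.1, 1.21),
    ("Float16Array/plain, mandatory, no NULLs", 76.8, 118.2, 1.54),
    ("Float16Array/plain, optional, half NULLs", 257.3, 300.9, 1.17),
    ("Float16Array/plain, optional, no NULLs", 81.0, 121.1, 1.49),
    ("FixedLenByteArray(16)/plain, optional, half NULLs", 203.9, 273.8, 1.34),
]


def slowdown(baseline_us: float, changed_us: float) -> float:
    """Return how many times slower the changed run is than the baseline."""
    return changed_us / baseline_us


# Recompute the factor for every row from its two mean times.
ratios = [slowdown(base, changed) for _, base, changed, _ in ROWS]

for (name, _, _, reported), ratio in zip(ROWS, ratios):
    print(f"{name}: {ratio:.2f}x recomputed, {reported:.2f}x reported")
```

   The plain encoded Float16Array cases without NULLs show the largest relative 
slowdown, so they are probably the most useful rows to profile first.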

