asubiotto opened a new issue, #9620:
URL: https://github.com/apache/arrow-rs/issues/9620

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   The comparison kernels in `arrow-ord` (`eq`, `neq`, `lt`, `lt_eq`, `gt`, 
`gt_eq`) do not support `RunEndEncoded` arrays. Passing a REE array to any of 
these functions hits an `unreachable!()` panic, forcing callers to manually 
decode REE to flat arrays before comparing. This defeats the compression 
benefit of REE — particularly for filter operations (scalar comparison) which 
are the dominant use case.
   
   This is part of the ongoing effort to support REE in compute kernels (#3520).
   
   **Describe the solution you'd like**
   
   Add REE support to `compare_op` in `arrow-ord/src/cmp.rs`, following the 
existing dictionary unwrapping pattern:
   
   1. **REE vs scalar** (the filter case): compare only the physical values 
(one per run), then bulk-expand the boolean result to logical length. This 
should be O(m) comparisons + O(n) expansion.
   
   2. **REE vs REE**: walk both run_ends arrays simultaneously, compare once 
per aligned segment, and bulk-fill the result. O(m1+m2) comparisons instead of 
O(n) per-element.
   
   3. **Mixed cases** (REE vs plain, REE wrapping dictionary): fall back to 
materializing a logical-length index vector and using the existing vectored 
comparison path.
   
   **Describe alternatives you've considered**
   
   - **Decode REE to flat before comparing**: simpler (zero changes to 
`cmp.rs`) but materializes the full logical array, losing all compression 
benefit.
   
   - **Treat REE as dictionary** (implement `AnyDictionaryArray` for 
`RunArray`): would reuse the existing dict path but would just hide the same 
logical-length allocation inside `normalized_keys()`. The access patterns 
differ fundamentally — dictionary keys are stored per-element, while REE 
run-ends are compressed.
   
   **Additional context**
   
   None.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to