Dandandan opened a new pull request, #23204:
URL: https://github.com/apache/datafusion/pull/23204

   ## Which issue does this PR close?
   
   <!-- No issue filed; self-contained window perf change. Happy to open one if 
preferred. -->
   
   - Closes #.
   
   ## Rationale for this change
   
   RANGE window-frame boundaries are computed **once per row** by 
`WindowFrameStateRange::calculate_index_of_row`, which calls `search_in_slice`. 
That scan calls `get_row_at_idx` at every probed index, allocating a 
`Vec<ScalarValue>` and running a dynamic `ScalarValue` comparison 
(`compare_rows` → `try_cmp`) per probe. Because the scan amortizes to O(n) per 
partition, this per-probe heap allocation and enum dispatch dominate RANGE frame 
evaluation. (ROWS frames already use allocation-free integer arithmetic; this 
change closes part of that gap.)
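
To make the ROWS/RANGE contrast concrete, here is a minimal std-only sketch. It is not the DataFusion code: `rows_frame_start` and `range_frame_starts` are hypothetical names, the column is a plain ascending `&[i64]` with no nulls, and `saturating_sub` stands in for the real checked target arithmetic. It only illustrates why a ROWS bound is index arithmetic while a RANGE bound needs a value scan, and why carrying the cursor across rows keeps that scan at O(n) per partition.

```rust
/// ROWS BETWEEN `n` PRECEDING AND CURRENT ROW: the frame start is pure
/// index arithmetic, with no look at the column values.
fn rows_frame_start(idx: usize, n: usize) -> usize {
    idx.saturating_sub(n)
}

/// RANGE BETWEEN `delta` PRECEDING AND CURRENT ROW over an ascending,
/// null-free column: the frame start depends on the values, so each row
/// scans for the first value >= (current value - delta).
/// `saturating_sub` is a simplification of the real target arithmetic.
fn range_frame_starts(values: &[i64], delta: i64) -> Vec<usize> {
    let mut starts = Vec::with_capacity(values.len());
    // The cursor is carried from row to row and only moves forward,
    // so the total number of probes is O(n) for the whole partition.
    let mut lo = 0;
    for &v in values {
        let target = v.saturating_sub(delta);
        while lo < values.len() && values[lo] < target {
            lo += 1; // one probe: here a native compare, no allocation
        }
        starts.push(lo);
    }
    starts
}
```

With `values = [1, 2, 4, 7, 8]` and `delta = 2`, `range_frame_starts` yields `[0, 0, 1, 3, 3]`. Each `values[lo] < target` probe in this sketch corresponds to a probe that the generic path pays for with a `Vec<ScalarValue>` allocation.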
   
   ## What changes are included in this PR?
   
   A fast path in `calculate_index_of_row` for the common case of a **single 
primitive integer/float ORDER BY column**:
   
   - The column is downcast once to `PrimitiveArray<T>` and scanned over native 
values, reproducing the generic predicate exactly — `compare_rows` for one 
column, including every NULLS FIRST/LAST × ASC/DESC combination and the float 
**total ordering** that `ScalarValue::try_cmp` uses (via 
`ArrowNativeTypeOp::compare`, which is `total_cmp` for floats).
   - The boundary-**target** arithmetic (`add_checked`/`sub_checked`, 
overflow-to-edge, unsigned-underflow) is left entirely unchanged, so 
decimal/temporal/overflow/underflow semantics are identical — only the 
comparison *scan* is specialized.
   - The generic `ScalarValue` path remains the fallback for multi-column 
frames, non-primitive types (decimal/temporal, whose scale/units would not 
match a raw native comparison), and any column/target type mismatch.
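
The shape of the specialized comparison can be sketched with the standard library alone. This is an illustration under assumptions, not the code in this PR: `SortOpts`, `compare_nullable`, and `first_not_before` are hypothetical names, the column is modelled as `&[Option<f64>]` rather than a `PrimitiveArray<T>`, and `f64::total_cmp` stands in for `ArrowNativeTypeOp::compare`.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the sort options of the ORDER BY column.
#[derive(Clone, Copy)]
struct SortOpts {
    descending: bool,
    nulls_first: bool,
}

/// Single-column comparison over native values: nulls are placed by
/// `nulls_first`, non-null values use the float total ordering, and
/// DESC reverses only the non-null comparison.
fn compare_nullable(a: Option<f64>, b: Option<f64>, opts: SortOpts) -> Ordering {
    match (a, b) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => {
            if opts.nulls_first { Ordering::Less } else { Ordering::Greater }
        }
        (Some(_), None) => {
            if opts.nulls_first { Ordering::Greater } else { Ordering::Less }
        }
        (Some(x), Some(y)) => {
            let ord = x.total_cmp(&y);
            if opts.descending { ord.reverse() } else { ord }
        }
    }
}

/// Linear scan from `start` to the first index whose value does not
/// sort strictly before `target`. No per-probe allocation is needed.
fn first_not_before(
    values: &[Option<f64>],
    target: Option<f64>,
    opts: SortOpts,
    start: usize,
) -> usize {
    let mut idx = start;
    while idx < values.len()
        && compare_nullable(values[idx], target, opts) == Ordering::Less
    {
        idx += 1;
    }
    idx
}
```

Using a total ordering matters for edge values: under `total_cmp`, `-0.0` sorts before `0.0` and NaN has a defined position, whereas `partial_cmp` treats the two zeros as equal and returns `None` for NaN.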
   
   GROUPS frames are left for a follow-up (they use a different group-boundary 
mechanism).
   
   ## Are these changes tested?
   
   Yes:
   - A new differential unit test asserts the native scan returns the **same** 
boundary index as the generic `search_in_slice` path for every position, 
target, and sort-option combination, including nulls, NaN, duplicates, and 
signed zero.
   - Full `window` sqllogictest suite passes (all 6 files).
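
For readers unfamiliar with the differential-test pattern, a minimal std-only sketch follows. It is not the test added in this PR: `Scalar` is a hypothetical stand-in for `ScalarValue`, the column is `&[Option<i64>]`, and `generic_scan`, `native_scan`, and `count_mismatches` are illustrative names. The idea is to run a reference path and a specialized path over every sort-option combination, target, and start position, and require identical boundary indices.

```rust
use std::cmp::Ordering;

/// Hypothetical dynamically typed scalar, standing in for `ScalarValue`.
#[derive(Clone, Debug, PartialEq)]
enum Scalar {
    Null,
    Int(i64),
}

/// Reference comparison with enum dispatch on every call.
fn generic_cmp(a: &Scalar, b: &Scalar, desc: bool, nulls_first: bool) -> Ordering {
    match (a, b) {
        (Scalar::Null, Scalar::Null) => Ordering::Equal,
        (Scalar::Null, _) => {
            if nulls_first { Ordering::Less } else { Ordering::Greater }
        }
        (_, Scalar::Null) => {
            if nulls_first { Ordering::Greater } else { Ordering::Less }
        }
        (Scalar::Int(x), Scalar::Int(y)) => {
            if desc { y.cmp(x) } else { x.cmp(y) }
        }
    }
}

/// Reference scan: builds a one-element row `Vec` for every probe.
fn generic_scan(col: &[Option<i64>], target: Option<i64>, desc: bool, nf: bool, start: usize) -> usize {
    let to_scalar = |v: Option<i64>| v.map_or(Scalar::Null, Scalar::Int);
    let target_row = vec![to_scalar(target)];
    let mut idx = start;
    while idx < col.len() {
        let row = vec![to_scalar(col[idx])]; // per-probe allocation
        if generic_cmp(&row[0], &target_row[0], desc, nf) != Ordering::Less {
            break;
        }
        idx += 1;
    }
    idx
}

/// Specialized scan: the same predicate expressed on native values.
fn native_scan(col: &[Option<i64>], target: Option<i64>, desc: bool, nf: bool, start: usize) -> usize {
    let sorts_before = |v: Option<i64>| match (v, target) {
        (None, None) => false,
        (None, Some(_)) => nf,
        (Some(_), None) => !nf,
        (Some(x), Some(y)) => if desc { x > y } else { x < y },
    };
    let mut idx = start;
    while idx < col.len() && sorts_before(col[idx]) {
        idx += 1;
    }
    idx
}

/// Counts disagreements over all option combinations, targets, and starts.
fn count_mismatches(col: &[Option<i64>], targets: &[Option<i64>]) -> usize {
    let mut mismatches = 0;
    for desc in [false, true] {
        for nf in [false, true] {
            for &t in targets {
                for start in 0..=col.len() {
                    if generic_scan(col, t, desc, nf, start) != native_scan(col, t, desc, nf, start) {
                        mismatches += 1;
                    }
                }
            }
        }
    }
    mismatches
}
```

A passing run means `count_mismatches` returns `0` for inputs that include nulls, duplicates, and targets outside the column's value range.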
   
   ## Are there any user-facing changes?
   
   No — results are identical; this is a performance-only change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


