Re: [PR] Unroll interleave -25-30% [arrow-rs]

via GitHub Sat, 14 Mar 2026 09:35:37 -0700


mbutrovich commented on code in PR #9542:
URL: https://github.com/apache/arrow-rs/pull/9542#discussion_r2935498510



##########
arrow-select/src/interleave.rs:
##########
@@ -154,13 +154,51 @@ fn interleave_primitive<T: ArrowPrimitiveType>(
     data_type: &DataType,
 ) -> Result<ArrayRef, ArrowError> {
     let interleaved = Interleave::<'_, PrimitiveArray<T>>::new(values, indices);
+    let arrays = &interleaved.arrays;
+    let len = indices.len();
+
+    let mut output = Vec::with_capacity(len);
+    let dst: *mut T::Native = output.as_mut_ptr();
+    let mut base = 0;
+
+    // Process 8 elements at a time to issue multiple independent loads
+    // and increase memory-level parallelism for random access patterns.
+    let chunks = indices.chunks_exact(8);
+    let remainder = chunks.remainder();
+    for chunk in chunks {
+        let v0 = arrays[chunk[0].0].value(chunk[0].1);

Review Comment:
   Even knowing architecture-specific details wouldn't necessarily help a user 
tune this. If all (or many) of the targets being gathered are on the same 
cache line, you won't get a huge benefit from this. Skew and cardinality per 
batch could also affect how effective it is. I think the current approach 
makes sense, and I don't see a good way to expose knobs here.
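
   To make the cache-line point concrete, here is a rough sketch (not code from 
the PR) that counts how many distinct cache lines a batch of indices touches. 
The function name `distinct_cache_lines`, the 64-byte line size, and the 
assumption that each array's buffer starts on a cache-line boundary are all 
hypothetical and only for illustration.

```rust
use std::collections::HashSet;

/// Assumed cache line size in bytes. 64 is common, but this is an
/// illustrative assumption, not something the PR depends on.
const CACHE_LINE_BYTES: usize = 64;

/// Hypothetical helper: counts the distinct (array, cache line) pairs touched
/// by a batch of (array_index, row_index) pairs. Assumes every array buffer
/// starts on a cache-line boundary and elements are `elem_size` bytes wide.
fn distinct_cache_lines(indices: &[(usize, usize)], elem_size: usize) -> usize {
    assert!(elem_size > 0, "element size must be non-zero");
    // Elements per cache line; at least 1 so very wide elements still work.
    let per_line = (CACHE_LINE_BYTES / elem_size).max(1);
    indices
        .iter()
        .map(|&(array, row)| (array, row / per_line))
        .collect::<HashSet<_>>()
        .len()
}
```

   With 4-byte elements, eight indices that all point at rows 0..8 of one array 
land on a single line, so issuing the loads independently has little to 
overlap. Eight indices spread over different arrays or distant rows touch up 
to eight lines, which is the case the unrolled loop is aimed at.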



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
