[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2646: Help LLVM vectorize comparison kernel ~50-80% faster

GitBox Sun, 04 Sep 2022 13:41:06 -0700


tustvold commented on code in PR #2646:
URL: https://github.com/apache/arrow-rs/pull/2646#discussion_r962366792



##########
arrow/src/buffer/mutable.rs:
##########
@@ -496,38 +531,9 @@ impl MutableBuffer {
         mut iterator: I,
     ) -> Self {
         let (_, upper) = iterator.size_hint();
-        let upper = upper.expect("from_trusted_len_iter requires an upper 
limit");
-
-        let mut result = {
-            let byte_capacity: usize = upper.saturating_add(7) / 8;
-            MutableBuffer::new(byte_capacity)
-        };
+        let len = upper.expect("from_trusted_len_iter requires an upper 
limit");
 
-        'a: loop {
-            let mut byte_accum: u8 = 0;
-            let mut mask: u8 = 1;
-
-            //collect (up to) 8 bits into a byte
-            while mask != 0 {
-                if let Some(value) = iterator.next() {
-                    byte_accum |= match value {
-                        true => mask,
-                        false => 0,
-                    };
-                    mask <<= 1;
-                } else {
-                    if mask != 1 {
-                        // Add last byte
-                        result.push_unchecked(byte_accum);
-                    }
-                    break 'a;
-                }
-            }
-
-            // Soundness: from_trusted_len
-            result.push_unchecked(byte_accum);
-        }
-        result
+        Self::collect_bool(len, |_| iterator.next().unwrap())

Review Comment:
   Gave this a go and honestly I don't really understand the results. The 
scalar comparison is on par with this PR, but the non-scalar variants are worse 
by 50% compared to master... My 2 cents is that not using iterators for 
performance critical code makes for code that is easier to reason about, both 
by humans and LLVM, and so even if we can somehow finagle 
from_trusted_len_iter_bool to perform the same, I'm more comfortable with a 
simple loop :sweat_smile: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2646: Help LLVM vectorize comparison kernel ~50-80% faster

Reply via email to