kumarUjjawal commented on code in PR #20116:
URL: https://github.com/apache/datafusion/pull/20116#discussion_r2759560813
##########
datafusion/spark/src/function/hash/sha2.rs:
##########
@@ -144,11 +257,18 @@ where
Arc::new(array)
}
+const HEX_CHARS: [u8; 16] = *b"0123456789abcdef";
+
+#[inline]
fn hex_encode<T: AsRef<[u8]>>(data: T) -> String {
- let mut s = String::with_capacity(data.as_ref().len() * 2);
- for b in data.as_ref() {
- // Writing to a string never errors, so we can unwrap here.
- write!(&mut s, "{b:02x}").unwrap();
+ let bytes = data.as_ref();
+ let mut out = Vec::with_capacity(bytes.len() * 2);
+ for &b in bytes {
+ let hi = b >> 4;
+ let lo = b & 0x0F;
+ out.push(HEX_CHARS[hi as usize]);
+ out.push(HEX_CHARS[lo as usize]);
Review Comment:
Since sha2 digests are fixed-size (28/32/48/64 bytes), the LUT approach is
already quite fast. I don't know if adding the `hex` will help here? What do
you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]