Re: [PR] Optimize numeric-to-string coercion in StringArrayDecoder [arrow-rs]

2025-03-12 Thread via GitHub


alamb commented on code in PR #7274:
URL: https://github.com/apache/arrow-rs/pull/7274#discussion_r1992280847


##
arrow-json/src/reader/string_array.rs:
##
@@ -30,15 +31,25 @@ const FALSE: &str = "false";
 pub struct StringArrayDecoder<O: OffsetSizeTrait> {
 coerce_primitive: bool,
 phantom: PhantomData<O>,
+number_buffer: Vec<u8>,
 }
 
 impl<O: OffsetSizeTrait> StringArrayDecoder<O> {
 pub fn new(coerce_primitive: bool) -> Self {
 Self {
 coerce_primitive,
 phantom: Default::default(),
+number_buffer: Vec::with_capacity(32),
 }
 }
+
+fn write_number<T: std::fmt::Display>(&mut self, n: T) -> &str {
+self.number_buffer.clear();
+write!(&mut self.number_buffer, "{}", n).unwrap();
+// SAFETY: We only write ASCII characters (digits, signs, decimal points,
+// exponent symbols) into `number_buffer`, which are guaranteed valid UTF-8.
+unsafe { std::str::from_utf8_unchecked(&self.number_buffer) }

Review Comment:
   I think you can avoid unsafe here by using 
   
   ```rust
   number_buffer: String,
   ```
   
   This is similar to how @zhuqi-lucas  did it here:
   - https://github.com/apache/arrow-rs/pull/7263
   
   ```rust
   // Temporary buffer to avoid per-iteration allocation for numeric types
   let mut tmp_buf = String::new();
   ...
   ```
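
For illustration, the `String`-based suggestion above might look roughly like the following. This is a minimal sketch, not the code from either PR: `NumberFormatter` is a hypothetical stand-in for the decoder's scratch state, and the `T: Display` bound is an assumption about what `write_number` accepts. Because `String` implements `std::fmt::Write`, the formatted digits land directly in a buffer that is already known to be valid UTF-8, so no `unsafe` conversion is needed.

```rust
use std::fmt::{Display, Write};

/// Hypothetical stand-in for the decoder's reusable scratch buffer
/// (not the actual arrow-rs `StringArrayDecoder`).
pub struct NumberFormatter {
    number_buffer: String,
}

impl NumberFormatter {
    pub fn new() -> Self {
        Self {
            // Pre-allocate once; the buffer is reused for every value.
            number_buffer: String::with_capacity(32),
        }
    }

    /// Formats `n` into the reusable buffer and returns it as `&str`.
    /// No `from_utf8_unchecked` is required because `String` is always UTF-8.
    pub fn write_number<T: Display>(&mut self, n: T) -> &str {
        self.number_buffer.clear();
        // Writing into a `String` only fails if the `Display` impl itself
        // returns an error, which the standard numeric types do not do.
        write!(self.number_buffer, "{}", n).unwrap();
        &self.number_buffer
    }
}
```

Calling `write_number(42_i64)` on a `NumberFormatter` returns `"42"`, and each later call overwrites the same allocation rather than creating a new `String`.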



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Optimize numeric-to-string coercion in StringArrayDecoder [arrow-rs]

2025-03-11 Thread via GitHub


ndemir closed pull request #7274: Optimize numeric-to-string coercion in StringArrayDecoder
URL: https://github.com/apache/arrow-rs/pull/7274




Re: [PR] Optimize numeric-to-string coercion in StringArrayDecoder [arrow-rs]

2025-03-11 Thread via GitHub


ndemir commented on PR #7274:
URL: https://github.com/apache/arrow-rs/pull/7274#issuecomment-2716310977

   I see that the performance is consistently BETTER even though it is not hitting the refactored code.
   Will investigate.

