adriangb commented on code in PR #7015:
URL: https://github.com/apache/arrow-rs/pull/7015#discussion_r1992326453


##########
arrow-json/src/writer/encoder.rs:
##########
@@ -196,9 +236,18 @@ impl Encoder for StructArrayEncoder<'_> {
         let mut is_first = true;
         // Nulls can only be dropped in explicit mode
         let drop_nulls = (self.struct_mode == StructMode::ObjectOnly) && 
!self.explicit_nulls;
-        for field_encoder in &mut self.encoders {
-            let is_null = is_some_and(field_encoder.nulls.as_ref(), |n| 
n.is_null(idx));
-            if drop_nulls && is_null {
+
+        // Collect all the field nulls buffers up front to avoid dynamic 
dispatch in the loop
+        // This creates a temporary Vec, but avoids repeated virtual calls 
which should be a net win

Review Comment:
   I had run it and there was a measurable (~30%) slowdown. This makes sense, 
it was doing 2 dynamic dispatch calls for every row instead of just 1.
   
   I've now refactored this to avoid the dynamic dispatch altogether by 
encapsulating the dynamic encoder + null buffer in a helper struct. So it's 
basically the same as before except that things get passed around as a struct 
instead of a tuple of unrelated values. I don't love the name 
`EncoderWithNullBuffer`, open to suggestions.
   
   Benchmarks now show no difference vs. `main`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to