adriangb commented on code in PR #7015:
URL: https://github.com/apache/arrow-rs/pull/7015#discussion_r1992335818
##########
arrow-json/src/writer/encoder.rs:
##########
@@ -196,9 +236,18 @@ impl Encoder for StructArrayEncoder<'_> {
let mut is_first = true;
// Nulls can only be dropped in explicit mode
let drop_nulls = (self.struct_mode == StructMode::ObjectOnly) &&
!self.explicit_nulls;
- for field_encoder in &mut self.encoders {
- let is_null = is_some_and(field_encoder.nulls.as_ref(), |n|
n.is_null(idx));
- if drop_nulls && is_null {
+
+ // Collect all the field nulls buffers up front to avoid dynamic
dispatch in the loop
+ // This creates a temporary Vec, but avoids repeated virtual calls
which should be a net win
Review Comment:
```
❯ git checkout main && cargo bench --bench json_writer && git checkout
arrow-union && cargo bench --bench json_writer
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 6 commits.
(use "git push" to publish your local commits)
Compiling arrow-json v54.2.1 (/Users/adriangb/GitHub/arrow-rs/arrow-json)
Compiling arrow v54.2.1 (/Users/adriangb/GitHub/arrow-rs/arrow)
Finished `bench` profile [optimized] target(s) in 3.03s
Running benches/json_writer.rs
(target/release/deps/json_writer-8d99cf5d0d08efba)
Gnuplot not found, using plotters backend
bench_integer time: [2.9239 ms 2.9488 ms 2.9810 ms]
change: [-1.4155% -0.3843% +0.7943%] (p = 0.52 >
0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
bench_float time: [3.4054 ms 3.4216 ms 3.4394 ms]
change: [-3.7264% -2.9504% -2.1609%] (p = 0.00 <
0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) high mild
7 (7.00%) high severe
bench_string time: [8.8852 ms 8.9466 ms 9.0106 ms]
change: [-1.3959% -0.5660% +0.3531%] (p = 0.21 >
0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
bench_mixed time: [11.975 ms 12.030 ms 12.087 ms]
change: [-3.7701% -3.0819% -2.4763%] (p = 0.00 <
0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
bench_dict_array time: [4.6221 ms 4.6771 ms 4.7345 ms]
change: [-2.4806% -0.5873% +1.2096%] (p = 0.53 >
0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
bench_struct time: [19.521 ms 19.623 ms 19.731 ms]
change: [-0.9078% +0.1000% +0.9733%] (p = 0.84 >
0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
bench_nullable_struct time: [9.4059 ms 9.4565 ms 9.5092 ms]
change: [-3.5737% -2.7898% -1.9920%] (p = 0.00 <
0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
bench_list time: [27.854 ms 27.976 ms 28.118 ms]
change: [-0.8182% -0.3309% +0.1897%] (p = 0.22 >
0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
7 (7.00%) high mild
3 (3.00%) high severe
bench_nullable_list time: [5.9283 ms 5.9849 ms 6.0429 ms]
change: [-8.7315% -6.8046% -5.0511%] (p = 0.00 <
0.05)
Performance has improved.
bench_struct_list time: [2.9264 ms 2.9694 ms 3.0130 ms]
change: [-5.8590% -3.8633% -1.8735%] (p = 0.00 <
0.05)
Performance has improved.
Switched to branch 'arrow-union'
Your branch is up to date with 'origin/arrow-union'.
Compiling arrow-json v54.2.1 (/Users/adriangb/GitHub/arrow-rs/arrow-json)
Compiling arrow v54.2.1 (/Users/adriangb/GitHub/arrow-rs/arrow)
Finished `bench` profile [optimized] target(s) in 3.03s
Running benches/json_writer.rs
(target/release/deps/json_writer-8d99cf5d0d08efba)
Gnuplot not found, using plotters backend
bench_integer time: [2.9105 ms 2.9313 ms 2.9542 ms]
change: [-1.8789% -0.5924% +0.5939%] (p = 0.36 >
0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
bench_float time: [3.3695 ms 3.3854 ms 3.4041 ms]
change: [-1.7458% -1.0557% -0.3581%] (p = 0.00 <
0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
bench_string time: [8.7954 ms 8.8473 ms 8.9044 ms]
change: [-2.0603% -1.1094% -0.2183%] (p = 0.02 <
0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
bench_mixed time: [12.034 ms 12.099 ms 12.170 ms]
change: [-0.1335% +0.5726% +1.3337%] (p = 0.13 >
0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
bench_dict_array time: [4.4275 ms 4.4639 ms 4.5020 ms]
change: [-5.9236% -4.5580% -3.1699%] (p = 0.00 <
0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
bench_struct time: [19.450 ms 19.538 ms 19.640 ms]
change: [-1.1211% -0.4330% +0.3004%] (p = 0.25 >
0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
bench_nullable_struct time: [9.4349 ms 9.4862 ms 9.5417 ms]
change: [-0.4570% +0.3147% +1.0725%] (p = 0.43 >
0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
bench_list time: [27.799 ms 27.901 ms 28.015 ms]
change: [-0.8950% -0.2687% +0.2907%] (p = 0.40 >
0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
bench_nullable_list time: [5.9567 ms 6.0222 ms 6.0886 ms]
change: [-0.9060% +0.6229% +2.1762%] (p = 0.41 >
0.05)
No change in performance detected.
bench_struct_list time: [2.8755 ms 2.9589 ms 3.0925 ms]
change: [-3.6152% -0.3533% +4.3395%] (p = 0.89 >
0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]