Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on code in PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#discussion_r2676536711


##
parquet/src/arrow/array_reader/struct_array.rs:
##
@@ -124,16 +124,15 @@ impl ArrayReader for StructArrayReader {
     return Err(general_err!("Not all children array length are the same!"));
 }
 
-// Now we can build array data
-let mut array_data_builder = ArrayDataBuilder::new(self.data_type.clone())
-    .len(children_array_len)
-    .child_data(
-        children_array
-            .into_iter()
-            .map(|x| x.into_data())

Review Comment:
   Converting the child arrays into `ArrayData` is wasteful for at least two reasons:
   1. They are just converted back to `ArrayRef`s below.
   2. Each `ArrayData` needs at least one new allocation (the `Vec` of buffers).
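
To make the two costs above concrete, here is a minimal std-only sketch. `ToyArray`, `ToyArrayData`, `ToyStruct`, `struct_via_data` and `struct_direct` are hypothetical names made up for illustration, not arrow-rs types. The sketch only mirrors the shape of the round trip (one temporary `Vec` per child, then a conversion straight back) and makes no claim about the real implementation.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a child array: shared, immutable values.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyArray {
    pub values: Arc<[i32]>,
}

/// Hypothetical stand-in for an untyped, `ArrayData`-like container.
/// The `Vec` of buffers is a fresh heap allocation on every conversion.
pub struct ToyArrayData {
    pub buffers: Vec<Arc<[i32]>>,
}

impl ToyArray {
    /// Child -> untyped data: allocates the `Vec` that holds the buffers.
    pub fn into_data(self) -> ToyArrayData {
        ToyArrayData { buffers: vec![self.values] }
    }

    /// Untyped data -> child: undoes the conversion above.
    pub fn from_data(mut data: ToyArrayData) -> ToyArray {
        ToyArray { values: data.buffers.remove(0) }
    }
}

/// Hypothetical stand-in for a struct array holding shared children.
pub struct ToyStruct {
    pub children: Vec<Arc<ToyArray>>,
}

/// Old shape: every child is converted to data and straight back again.
/// Returns the struct plus the number of temporary `Vec`s that were created.
pub fn struct_via_data(children: Vec<ToyArray>) -> (ToyStruct, usize) {
    let mut temp_vecs = 0;
    let children: Vec<Arc<ToyArray>> = children
        .into_iter()
        .map(|child| {
            let data = child.into_data(); // one extra allocation per child
            temp_vecs += 1;
            Arc::new(ToyArray::from_data(data))
        })
        .collect();
    (ToyStruct { children }, temp_vecs)
}

/// New shape: the children are handed over as they are.
pub fn struct_direct(children: Vec<Arc<ToyArray>>) -> ToyStruct {
    ToyStruct { children }
}
```

Both functions end up with the same children; the difference in this toy model is only the per-child temporary `Vec`, which scales with the number of columns.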



##
parquet/src/arrow/array_reader/struct_array.rs:
##
@@ -169,11 +166,18 @@ impl ArrayReader for StructArrayReader {
     return Err(general_err!("Failed to decode level data for struct array"));
 }
 
-array_data_builder = array_data_builder.null_bit_buffer(Some(bitmap_builder.into()));
+nulls = Some(NullBuffer::new(bitmap_builder.finish()));

Review Comment:
   `NullBuffer::new` counts the set bits, but so do the existing code paths, so this should not add extra work.
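
As a rough illustration of what counting the set bits involves, the sketch below derives a null count from a validity bitmap using only the standard library. It assumes the Arrow convention of LSB-first bits where a set bit marks a valid slot. `count_set_bits` and `null_count` are hypothetical helpers written for this note, not the arrow-rs functions.

```rust
/// Count the set bits among the first `len` bits of an LSB-first bitmap.
pub fn count_set_bits(bitmap: &[u8], len: usize) -> usize {
    assert!(len <= bitmap.len() * 8, "bitmap too short for len");
    let full_bytes = len / 8;
    // Whole bytes: a plain popcount per byte.
    let mut set: usize = bitmap[..full_bytes]
        .iter()
        .map(|b| b.count_ones() as usize)
        .sum();
    let remainder = len % 8;
    if remainder != 0 {
        // Last partial byte: mask off the padding bits beyond `len`.
        let mask = (1u8 << remainder) - 1;
        set += (bitmap[full_bytes] & mask).count_ones() as usize;
    }
    set
}

/// Null count = total slots minus valid (set) bits.
pub fn null_count(bitmap: &[u8], len: usize) -> usize {
    len - count_set_bits(bitmap, len)
}
```

The pass is linear in the bitmap size, so the cost is the same whichever constructor ends up triggering it.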



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3730702335

   The reason I think this will actually help ClickBench is that the ClickBench dataset has 105 columns.

   I could probably make a benchmark that shows this helping when reading really wide tables:

   > Read batch with 8192 rows and 105 columns
   





Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3730184899

   My reading of the results: not a huge win, but some improvement.
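
One way to sanity-check a reading like this against the bot's tables: assuming they follow the usual critcmp-style convention, where each ratio is that side's time divided by the faster of the two, the ratio pair can be recomputed from the two mean times. `relative_ratios` is a hypothetical helper name, and the convention is an assumption, not something stated in the thread.

```rust
/// Ratio of each side's time to the faster side, rounded to two decimals.
/// Assumes the critcmp-style convention: the faster side reads 1.00.
pub fn relative_ratios(branch_ms: f64, main_ms: f64) -> (f64, f64) {
    let fastest = branch_ms.min(main_ms);
    let round2 = |x: f64| (x * 100.0).round() / 100.0;
    (round2(branch_ms / fastest), round2(main_ms / fastest))
}
```

For example, the ClickBench results in this thread report 149.2±1.37ms on the branch against 122.6±1.57ms on main for `async/Q20`; `relative_ratios(149.2, 122.6)` gives `(1.22, 1.0)`, which matches the posted 1.22 and 1.00.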





Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3730131472

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                               alamb_less_parquet_allocations  main
   -----                               ------------------------------  ----
   arrow_reader_clickbench/async/Q1    1.00     2.3±0.05ms  ? ?/sec    1.01     2.3±0.01ms  ? ?/sec
   arrow_reader_clickbench/async/Q10   1.00    12.7±0.35ms  ? ?/sec    1.02    12.9±0.44ms  ? ?/sec
   arrow_reader_clickbench/async/Q11   1.00    14.3±0.33ms  ? ?/sec    1.04    14.8±0.66ms  ? ?/sec
   arrow_reader_clickbench/async/Q12   1.00    25.5±0.53ms  ? ?/sec    1.02    26.1±0.55ms  ? ?/sec
   arrow_reader_clickbench/async/Q13   1.00    31.0±0.60ms  ? ?/sec    1.00    31.0±0.70ms  ? ?/sec
   arrow_reader_clickbench/async/Q14   1.00    28.2±0.68ms  ? ?/sec    1.01    28.3±0.68ms  ? ?/sec
   arrow_reader_clickbench/async/Q19   1.00     5.2±0.13ms  ? ?/sec    1.03     5.3±0.14ms  ? ?/sec
   arrow_reader_clickbench/async/Q20   1.22   149.2±1.37ms  ? ?/sec    1.00   122.6±1.57ms  ? ?/sec
   arrow_reader_clickbench/async/Q21   1.08   169.9±2.38ms  ? ?/sec    1.00   156.6±2.79ms  ? ?/sec
   arrow_reader_clickbench/async/Q22   1.00   241.6±2.10ms  ? ?/sec    1.29  311.7±10.07ms  ? ?/sec
   arrow_reader_clickbench/async/Q23   1.00   408.7±2.83ms  ? ?/sec    1.00   406.7±4.25ms  ? ?/sec
   arrow_reader_clickbench/async/Q24   1.00    34.5±0.51ms  ? ?/sec    1.00    34.7±0.86ms  ? ?/sec
   arrow_reader_clickbench/async/Q27   1.00   100.4±1.00ms  ? ?/sec    1.00   100.4±1.43ms  ? ?/sec
   arrow_reader_clickbench/async/Q28   1.01    99.0±0.91ms  ? ?/sec    1.00    98.4±0.66ms  ? ?/sec
   arrow_reader_clickbench/async/Q30   1.00    30.7±0.58ms  ? ?/sec    1.01    30.9±0.59ms  ? ?/sec
   arrow_reader_clickbench/async/Q36   1.00   108.8±0.91ms  ? ?/sec    1.00   108.9±1.48ms  ? ?/sec
   arrow_reader_clickbench/async/Q37   1.00    84.9±0.61ms  ? ?/sec    1.01    85.6±1.61ms  ? ?/sec
   arrow_reader_clickbench/async/Q38   1.00    32.7±0.50ms  ? ?/sec    1.01    33.0±0.48ms  ? ?/sec
   arrow_reader_clickbench/async/Q39   1.00    46.4±0.60ms  ? ?/sec    1.00    46.3±0.71ms  ? ?/sec
   arrow_reader_clickbench/async/Q40   1.00    27.5±0.60ms  ? ?/sec    1.00    27.5±0.65ms  ? ?/sec
   arrow_reader_clickbench/async/Q41   1.00    21.8±0.51ms  ? ?/sec    1.05    23.0±0.64ms  ? ?/sec
   arrow_reader_clickbench/async/Q42   1.00    10.8±0.14ms  ? ?/sec    1.04    11.2±0.19ms  ? ?/sec
   arrow_reader_clickbench/sync/Q1     1.00     2.1±0.01ms  ? ?/sec    1.03     2.1±0.01ms  ? ?/sec
   arrow_reader_clickbench/sync/Q10    1.00     9.9±0.16ms  ? ?/sec    1.00     9.9±0.11ms  ? ?/sec
   arrow_reader_clickbench/sync/Q11    1.00    11.4±0.17ms  ? ?/sec    1.00    11.4±0.25ms  ? ?/sec
   arrow_reader_clickbench/sync/Q12    1.06    36.6±1.80ms  ? ?/sec    1.00    34.4±1.12ms  ? ?/sec
   arrow_reader_clickbench/sync/Q13    1.00    47.9±0.97ms  ? ?/sec    1.00    47.8±1.07ms  ? ?/sec
   arrow_reader_clickbench/sync/Q14    1.00    36.4±0.71ms  ? ?/sec    1.23    44.6±1.02ms  ? ?/sec
   arrow_reader_clickbench/sync/Q19    1.00     4.2±0.03ms  ? ?/sec    1.03     4.3±0.04ms  ? ?/sec
   arrow_reader_clickbench/sync/Q20    1.00   177.9±1.11ms  ? ?/sec    1.00   177.8±1.22ms  ? ?/sec
   arrow_reader_clickbench/sync/Q21    1.01   237.8±2.19ms  ? ?/sec    1.00   235.0±3.23ms  ? ?/sec
   arrow_reader_clickbench/sync/Q22    1.00   480.8±4.14ms  ? ?/sec    1.00   481.5±4.21ms  ? ?/sec
   arrow_reader_clickbench/sync/Q23    1.02  442.3±17.97ms  ? ?/sec    1.00  433.4±12.76ms  ? ?/sec
   arrow_reader_clickbench/sync/Q24    1.00    45.6±1.34ms  ? ?/sec    1.00    45.8±0.80ms  ? ?/sec
   arrow_reader_clickbench/sync/Q27    1.00   154.3±1.96ms  ? ?/sec    1.01   155.2±2.01ms  ? ?/sec
   arrow_reader_clickbench/sync/Q28    1.00   148.8±1.83ms  ? ?/sec    1.01   149.8±2.15ms  ? ?/sec
   arrow_reader_clickbench/sync/Q30    1.00    30.9±0.55ms  ? ?/sec    1.02    31.5±0.97ms  ? ?/sec
   arrow_reader_clickbench/sync/Q36    1.00   153.3±1.69ms  ? ?/sec    1.01   154.1±1.67ms  ? ?/sec
   arrow_reader_clickbench/sync/Q37    1.00    88.4±0.86ms  ? ?/sec    1.01    89.3±1.68ms  ? ?/sec
   arrow_reader_clickbench/sync/Q38    1.00    29.1±0.49ms  ? ?/sec    1.02    29.6±0.39ms  ? ?/sec
   arrow_reader_c
   ```

Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3730046979

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing alamb/less_parquet_allocations (75ea40bf536af42a1297fc11e5a47b18472b649d) to 96637fc8b928a94de53bbec3501337c0ecfbf936 [diff](https://github.com/apache/arrow-rs/compare/96637fc8b928a94de53bbec3501337c0ecfbf936..75ea40bf536af42a1297fc11e5a47b18472b649d)
   BENCH_NAME=arrow_reader_clickbench
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
   BENCH_FILTER=
   BENCH_BRANCH_NAME=alamb_less_parquet_allocations
   Results will be posted here when complete
   





Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3730046833

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                                                    alamb_less_parquet_allocations   main
   -----                                                                                                    ------------------------------   ----
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                         1.02  1288.5±72.50µs  ? ?/sec    1.00   1269.0±4.39µs  ? ?/sec
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                        1.00  1297.6±14.79µs  ? ?/sec    1.00  1292.0±16.26µs  ? ?/sec
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                          1.01   1285.3±7.71µs  ? ?/sec    1.00   1276.8±4.46µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                   1.09   519.1±13.71µs  ? ?/sec    1.00    477.7±5.81µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                  1.01    673.1±7.15µs  ? ?/sec    1.00   668.2±29.95µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                    1.04   509.4±27.48µs  ? ?/sec    1.00    488.1±5.00µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                        1.04    560.7±7.13µs  ? ?/sec    1.00    538.2±9.08µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                       1.00    727.7±7.03µs  ? ?/sec    1.02    742.8±4.69µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                         1.05    573.8±8.33µs  ? ?/sec    1.00    547.0±3.26µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                               1.00    237.6±3.93µs  ? ?/sec    1.05    249.3±2.97µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                              1.05    241.5±2.59µs  ? ?/sec    1.00    230.3±3.86µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                1.00    236.5±3.31µs  ? ?/sec    1.08    256.3±2.74µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                    1.01    295.2±9.58µs  ? ?/sec    1.00    292.6±2.22µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                      1.00   287.8±14.88µs  ? ?/sec    1.06    306.5±4.81µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                   1.06    285.0±3.48µs  ? ?/sec    1.00    268.9±3.39µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                     1.00    302.1±4.74µs  ? ?/sec    1.00   302.5±11.43µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs   1.05   1127.7±6.40µs  ? ?/sec    1.00   1071.6±9.95µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs  1.04   974.9±16.30µs  ? ?/sec    1.00   941.3±20.37µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs    1.04  1130.6±10.45µs  ? ?/sec    1.00  1089.5±41.68µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs               1.12    454.1±5.67µs  ? ?/sec    1.00    404.1±4.24µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs              1.09   637.7±10.53µs  ? ?/sec    1.00    586.2±3.94µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                1.11   459.0±14.97µs  ? ?/sec    1.00    415.3±3.92µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs      1.00    160.4±2.14µs  ? ?/sec    1.26    202.3±1.43µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs     1.00    287.4±2.25µs  ? ?/sec    1.10    317.5±1.58µs  ? ?/sec
   arrow_array_reader/FIXED_L
   ```

Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729810850

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing alamb/less_parquet_allocations (75ea40bf536af42a1297fc11e5a47b18472b649d) to 96637fc8b928a94de53bbec3501337c0ecfbf936 [diff](https://github.com/apache/arrow-rs/compare/96637fc8b928a94de53bbec3501337c0ecfbf936..75ea40bf536af42a1297fc11e5a47b18472b649d)
   BENCH_NAME=arrow_reader
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
   BENCH_FILTER=
   BENCH_BRANCH_NAME=alamb_less_parquet_allocations
   Results will be posted here when complete
   





Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729810577

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                               alamb_less_parquet_allocations  main
   -----                               ------------------------------  ----
   arrow_reader_clickbench/async/Q1    1.00     2.3±0.07ms  ? ?/sec    1.02     2.3±0.09ms  ? ?/sec
   arrow_reader_clickbench/async/Q10   1.00    12.8±0.42ms  ? ?/sec    1.05    13.3±0.76ms  ? ?/sec
   arrow_reader_clickbench/async/Q11   1.00    14.6±0.55ms  ? ?/sec    1.02    14.9±0.57ms  ? ?/sec
   arrow_reader_clickbench/async/Q12   1.00    25.9±0.58ms  ? ?/sec    1.02    26.4±0.70ms  ? ?/sec
   arrow_reader_clickbench/async/Q13   1.02    31.5±0.79ms  ? ?/sec    1.00    30.9±0.52ms  ? ?/sec
   arrow_reader_clickbench/async/Q14   1.01    28.8±0.65ms  ? ?/sec    1.00    28.6±0.59ms  ? ?/sec
   arrow_reader_clickbench/async/Q19   1.00     5.2±0.08ms  ? ?/sec    1.03     5.4±0.22ms  ? ?/sec
   arrow_reader_clickbench/async/Q20   1.21   149.2±1.53ms  ? ?/sec    1.00   123.3±0.69ms  ? ?/sec
   arrow_reader_clickbench/async/Q21   1.08   170.0±1.53ms  ? ?/sec    1.00   157.7±2.81ms  ? ?/sec
   arrow_reader_clickbench/async/Q22   1.00  308.4±36.14ms  ? ?/sec    1.01  312.9±12.38ms  ? ?/sec
   arrow_reader_clickbench/async/Q23   1.01   410.2±4.15ms  ? ?/sec    1.00   408.1±3.52ms  ? ?/sec
   arrow_reader_clickbench/async/Q24   1.00    34.6±0.94ms  ? ?/sec    1.02    35.1±0.71ms  ? ?/sec
   arrow_reader_clickbench/async/Q27   1.01   101.2±0.65ms  ? ?/sec    1.00   100.5±0.75ms  ? ?/sec
   arrow_reader_clickbench/async/Q28   1.01    99.2±1.48ms  ? ?/sec    1.00    98.5±0.63ms  ? ?/sec
   arrow_reader_clickbench/async/Q30   1.00    31.1±0.99ms  ? ?/sec    1.00    31.0±0.63ms  ? ?/sec
   arrow_reader_clickbench/async/Q36   1.00   109.9±0.99ms  ? ?/sec    1.00   110.4±4.08ms  ? ?/sec
   arrow_reader_clickbench/async/Q37   1.00    86.0±0.74ms  ? ?/sec    1.00    85.8±0.76ms  ? ?/sec
   arrow_reader_clickbench/async/Q38   1.00    33.3±0.64ms  ? ?/sec    1.00    33.2±0.53ms  ? ?/sec
   arrow_reader_clickbench/async/Q39   1.00    46.1±0.46ms  ? ?/sec    1.01    46.4±0.59ms  ? ?/sec
   arrow_reader_clickbench/async/Q40   1.00    27.3±0.69ms  ? ?/sec    1.02    27.8±0.73ms  ? ?/sec
   arrow_reader_clickbench/async/Q41   1.00    22.5±0.52ms  ? ?/sec    1.00    22.6±0.41ms  ? ?/sec
   arrow_reader_clickbench/async/Q42   1.00    11.0±0.20ms  ? ?/sec    1.01    11.1±0.16ms  ? ?/sec
   arrow_reader_clickbench/sync/Q1     1.00     2.1±0.01ms  ? ?/sec    1.04     2.1±0.06ms  ? ?/sec
   arrow_reader_clickbench/sync/Q10    1.00    10.0±0.18ms  ? ?/sec    1.00    10.0±0.23ms  ? ?/sec
   arrow_reader_clickbench/sync/Q11    1.00    11.4±0.25ms  ? ?/sec    1.01    11.6±0.23ms  ? ?/sec
   arrow_reader_clickbench/sync/Q12    1.07    36.9±2.00ms  ? ?/sec    1.00    34.4±0.61ms  ? ?/sec
   arrow_reader_clickbench/sync/Q13    1.00    48.0±1.21ms  ? ?/sec    1.02    48.7±1.27ms  ? ?/sec
   arrow_reader_clickbench/sync/Q14    1.00    44.2±1.11ms  ? ?/sec    1.02    45.0±1.08ms  ? ?/sec
   arrow_reader_clickbench/sync/Q19    1.00     4.2±0.03ms  ? ?/sec    1.02     4.3±0.03ms  ? ?/sec
   arrow_reader_clickbench/sync/Q20    1.00   178.7±1.50ms  ? ?/sec    1.00   178.4±1.57ms  ? ?/sec
   arrow_reader_clickbench/sync/Q21    1.01   237.9±2.33ms  ? ?/sec    1.00   236.5±2.75ms  ? ?/sec
   arrow_reader_clickbench/sync/Q22    1.00   478.2±4.24ms  ? ?/sec    1.01   484.9±4.94ms  ? ?/sec
   arrow_reader_clickbench/sync/Q23    1.02  444.7±18.71ms  ? ?/sec    1.00  437.1±13.75ms  ? ?/sec
   arrow_reader_clickbench/sync/Q24    1.00    46.4±0.90ms  ? ?/sec    1.01    46.9±0.79ms  ? ?/sec
   arrow_reader_clickbench/sync/Q27    1.00   154.8±1.48ms  ? ?/sec    1.00   155.4±1.50ms  ? ?/sec
   arrow_reader_clickbench/sync/Q28    1.00   149.8±1.36ms  ? ?/sec    1.01   151.7±2.33ms  ? ?/sec
   arrow_reader_clickbench/sync/Q30    1.00    31.4±0.73ms  ? ?/sec    1.01    31.6±1.18ms  ? ?/sec
   arrow_reader_clickbench/sync/Q36    1.00   154.4±2.27ms  ? ?/sec    1.02   156.8±1.59ms  ? ?/sec
   arrow_reader_clickbench/sync/Q37    1.00    88.7±0.96ms  ? ?/sec    1.02    90.6±1.31ms  ? ?/sec
   arrow_reader_clickbench/sync/Q38    1.00    29.4±0.64ms  ? ?/sec    1.01    29.7±0.81ms  ? ?/sec
   arrow_reader_c
   ```

Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729719775

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing alamb/less_parquet_allocations (75ea40bf536af42a1297fc11e5a47b18472b649d) to 96637fc8b928a94de53bbec3501337c0ecfbf936 [diff](https://github.com/apache/arrow-rs/compare/96637fc8b928a94de53bbec3501337c0ecfbf936..75ea40bf536af42a1297fc11e5a47b18472b649d)
   BENCH_NAME=arrow_reader_clickbench
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
   BENCH_FILTER=
   BENCH_BRANCH_NAME=alamb_less_parquet_allocations
   Results will be posted here when complete
   





Re: [PR] [Parquet] perf: Create StructArrays directly rather than via `ArrayData` [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729719550

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                                                    alamb_less_parquet_allocations   main
   -----                                                                                                    ------------------------------   ----
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                         1.00  1275.4±14.64µs  ? ?/sec    1.00   1269.4±5.96µs  ? ?/sec
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                        1.00  1297.5±16.24µs  ? ?/sec    1.00   1291.0±4.78µs  ? ?/sec
   arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                          1.00  1285.7±14.46µs  ? ?/sec    1.00  1280.7±19.13µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                   1.08   517.9±17.01µs  ? ?/sec    1.00   477.4±11.91µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                  1.01    672.5±5.42µs  ? ?/sec    1.00    665.5±3.98µs  ? ?/sec
   arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                    1.03    504.6±4.95µs  ? ?/sec    1.00    491.6±9.60µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                        1.04    559.1±2.87µs  ? ?/sec    1.00   538.0±10.80µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                       1.00    728.3±4.37µs  ? ?/sec    1.02   744.3±11.11µs  ? ?/sec
   arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                         1.04   570.1±29.19µs  ? ?/sec    1.00    549.2±5.53µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                               1.00    238.2±5.00µs  ? ?/sec    1.06   252.5±15.70µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                              1.05    241.3±1.81µs  ? ?/sec    1.00    229.9±2.92µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                1.00    235.2±3.05µs  ? ?/sec    1.09    255.7±3.29µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                    1.00    293.4±4.22µs  ? ?/sec    1.00    292.7±4.54µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                      1.00    284.1±3.15µs  ? ?/sec    1.07    305.1±3.08µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                   1.07    283.8±2.37µs  ? ?/sec    1.00    266.2±2.03µs  ? ?/sec
   arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                     1.00    300.7±6.12µs  ? ?/sec    1.00    302.0±1.53µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs   1.05   1124.7±6.56µs  ? ?/sec    1.00   1070.5±7.75µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs  1.04   976.8±11.17µs  ? ?/sec    1.00   942.3±10.08µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs    1.05  1136.5±18.59µs  ? ?/sec    1.00  1083.9±15.25µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs               1.13    453.9±9.08µs  ? ?/sec    1.00    402.1±3.35µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs              1.09   640.0±12.58µs  ? ?/sec    1.00    588.2±6.46µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                1.11    458.7±3.46µs  ? ?/sec    1.00    413.9±7.01µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs      1.00    160.4±1.14µs  ? ?/sec    1.26    202.2±1.23µs  ? ?/sec
   arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs     1.00    286.4±3.56µs  ? ?/sec    1.11    316.5±2.25µs  ? ?/sec
   arrow_array_reader/FIXED_L
   ```