Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729350637

   
   run benchmark arrow_reader arrow_reader_clickbench
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729350528

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing alamb/less_parquet_allocations (75ea40bf536af42a1297fc11e5a47b18472b649d) to 96637fc8b928a94de53bbec3501337c0ecfbf936 [diff](https://github.com/apache/arrow-rs/compare/96637fc8b928a94de53bbec3501337c0ecfbf936..75ea40bf536af42a1297fc11e5a47b18472b649d)
   BENCH_NAME=arrow_reader
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
   BENCH_FILTER=
   BENCH_BRANCH_NAME=alamb_less_parquet_allocations
   Results will be posted here when complete
   





Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729297605

   I suspect this is the biggest offender in terms of overhead. The same change can be made to the other readers, which I think will also reduce some overhead (a single allocation).





Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]

2026-01-09 Thread via GitHub


alamb-ghbot commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283447

   🤖 Hi @alamb, thanks for the request (https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283130).

   [`scrape_comments.py`](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/scrape_comments.py) only supports whitelisted benchmarks.
   - Standard: (none)
   - Criterion: array_iter, arrow_reader, arrow_reader_clickbench, arrow_reader_row_filter, arrow_statistics, arrow_writer, bitwise_kernel, boolean_kernels, buffer_bit_ops, cast_kernels, coalesce_kernels, comparison_kernels, concatenate_kernel, csv_writer, encoding, filter_kernels, interleave_kernels, metadata, row_format, take_kernels, union_array, variant_builder, variant_kernels, variant_validation, view_types, zip_kernels

   Please choose one or more of these with `run benchmark ` or `run benchmark  ...`





Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]

2026-01-09 Thread via GitHub


alamb commented on PR #9120:
URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283130

   run benchmarks arrow_reader arrow_reader_clickbench

