Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]
alamb commented on PR #9120: URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729350637 run benchmark arrow_reader arrow_reader_clickbench -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]
alamb-ghbot commented on PR #9120: URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729350528 🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing alamb/less_parquet_allocations (75ea40bf536af42a1297fc11e5a47b18472b649d) to 96637fc8b928a94de53bbec3501337c0ecfbf936 [diff](https://github.com/apache/arrow-rs/compare/96637fc8b928a94de53bbec3501337c0ecfbf936..75ea40bf536af42a1297fc11e5a47b18472b649d) BENCH_NAME=arrow_reader BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader BENCH_FILTER= BENCH_BRANCH_NAME=alamb_less_parquet_allocations Results will be posted here when complete
Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]
alamb commented on PR #9120: URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729297605 I suspect this is the biggest offender in terms of overhead. The same thing can be done to the other readers, which I think will also reduce some overhead (a single allocation)
Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]
alamb-ghbot commented on PR #9120: URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283447 🤖 Hi @alamb, thanks for the request (https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283130). [`scrape_comments.py`](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/scrape_comments.py) only supports whitelisted benchmarks. - Standard: (none) - Criterion: array_iter, arrow_reader, arrow_reader_clickbench, arrow_reader_row_filter, arrow_statistics, arrow_writer, bitwise_kernel, boolean_kernels, buffer_bit_ops, cast_kernels, coalesce_kernels, comparison_kernels, concatenate_kernel, csv_writer, encoding, filter_kernels, interleave_kernels, metadata, row_format, take_kernels, union_array, variant_builder, variant_kernels, variant_validation, view_types, zip_kernels Please choose one or more of these with `run benchmark ` or `run benchmark ...`
Re: [PR] [Parquet] perf: Create StructArrays directly rather than use ArrayData [arrow-rs]
alamb commented on PR #9120: URL: https://github.com/apache/arrow-rs/pull/9120#issuecomment-3729283130 run benchmarks arrow_reader arrow_reader_clickbench
