Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
tobixdev commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639345072 Hahaha no worries. We can just go with the new one if we can find another reviewer. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639288763 I am so sorry @tobixdev . I messed up this PR by accident. I did save the code in a new PR: https://github.com/apache/arrow-rs/pull/8981 I think you can revive the PR if you want to, by going to this commit: 885f31e26ec here is the command I used ```shell git reset --hard 885f31e26ec ``` If you push that branch again, I think we'll be able to reopen this PR Alternately, we can just go with https://github.com/apache/arrow-rs/pull/8981 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639278490 I am sorry -- I accidentally pushed to this branch. I will fix shortly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639278741 vv -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb closed pull request #8948: Check validity of indices in take_fixed_size_binary URL: https://github.com/apache/arrow-rs/pull/8948 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639068063 🤖: Benchmark completed Details ``` group maintake-fsb-null-indices - - take bool 1024 1.01 1353.0±129.03ns? ?/sec1.00 1336.0±10.41ns? ?/sec take bool 512 1.00730.4±1.73ns? ?/sec 1.01 735.8±17.52ns? ?/sec take bool null indices 1024 1.00 1624.0±8.54ns? ?/sec 1.05 1701.1±12.56ns? ?/sec take bool null values 1024 1.00 2.6±0.01µs? ?/sec 1.00 2.6±0.08µs? ?/sec take bool null values null indices 1024 1.00 3.1±0.04µs? ?/sec 1.03 3.2±0.08µs? ?/sec take check bounds i32 1024 1.02 1648.2±45.69ns? ?/sec 1.00 1619.3±16.88ns? ?/sec take check bounds i32 512 1.00824.1±1.97ns? ?/sec 1.08 893.2±27.21ns? ?/sec take i32 1024 1.00 723.2±17.24ns? ?/sec 1.00719.8±4.72ns? ?/sec take i32 512 1.00399.9±3.01ns? ?/sec 1.06425.3±2.10ns? ?/sec take i32 null indices 1024 1.00 998.9±24.13ns? ?/sec 1.00 997.0±15.89ns? ?/sec take i32 null values 1024 1.01 2.1±0.04µs? ?/sec 1.00 2.0±0.04µs? ?/sec take i32 null values null indices 1024 1.00 2.6±0.12µs? ?/sec 1.02 2.7±0.03µs? ?/sec take primitive fsb value len: 12, indices: 1024 2.31 8.4±0.15µs? ?/sec 1.00 3.6±0.05µs? ?/sec take primitive fsb value len: 12, null values, indices: 1024 1.66 8.3±0.33µs? ?/sec 1.00 5.0±0.02µs? ?/sec take primitive run logical len: 1024, physical len: 512, indices: 1024 1.00 20.5±0.13µs? ?/sec 1.01 20.6±0.30µs? ?/sec take str 1024 1.01 11.1±0.09µs? ?/sec 1.00 11.0±0.05µs? ?/sec take str 512 1.00 5.3±0.03µs? ?/sec 1.02 5.4±0.10µs? ?/sec take str null indices 1024 1.01 6.8±0.07µs? ?/sec 1.00 6.8±0.15µs? ?/sec take str null indices 512 1.00 3.3±0.03µs? ?/sec 1.03 3.4±0.02µs? ?/sec take str null values 1024 1.01 8.7±0.23µs? ?/sec 1.00 8.7±0.10µs? ?/sec take str null values null indices 1024 1.01 6.4±0.07µs? ?/sec 1.00 6.4±0.05µs? ?/sec take stringview 1024 1.08847.7±4.66ns? ?/sec 1.00782.0±9.77ns? ?/sec take stringview 512 1.14546.4±0.82ns? ?/sec 1.00 477.9±16.14ns? ?/sec take stringview null indices 1024 1.01 1398.4±24.26ns? ?/sec 1.00 1381.4±19.69ns? ?/sec take stringview null indices 512 1.00671.7±3.08ns? ?/sec 1.03693.4±8.12ns? ?/sec take stringview null values 1024 1.00 2.1±0.03µs? ?/sec 1.01 2.1±0.01µs? ?/sec take stringview null values null indices 1024 1.00 2.9±0.06µs? ?/sec 1.00 2.9±0.04µs? ?/sec ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639030467 🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing take-fsb-null-indices (885f31e26ec2019132f7b1512b9b3f08dc873c1a) to 1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c [diff](https://github.com/apache/arrow-rs/compare/1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c..885f31e26ec2019132f7b1512b9b3f08dc873c1a) BENCH_NAME=take_kernels BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench take_kernels BENCH_FILTER= BENCH_BRANCH_NAME=take-fsb-null-indices Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639030099 run benchmark take_kernels -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639029406 Looking nice now! Let's run it again to make sure it is reproduceable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3639016186 🤖: Benchmark completed Details ``` group main take-fsb-null-indices - - take bool 1024 1.00 1331.7±5.48ns? ?/sec1.00 1333.7±7.89ns? ?/sec take bool 512 1.00731.1±4.81ns? ?/sec1.00731.8±3.22ns? ?/sec take bool null indices 1024 1.00 1635.1±23.98ns? ?/sec1.05 1709.1±36.39ns? ?/sec take bool null values 1024 1.00 2.6±0.06µs? ?/sec1.00 2.6±0.02µs? ?/sec take bool null values null indices 1024 1.00 3.1±0.02µs? ?/sec1.07 3.3±0.04µs? ?/sec take check bounds i32 1024 1.00 1619.2±13.94ns? ?/sec1.00 1625.9±6.57ns? ?/sec take check bounds i32 512 1.00805.6±2.15ns? ?/sec1.11894.2±9.02ns? ?/sec take i32 1024 1.00710.7±2.03ns? ?/sec1.02725.6±3.23ns? ?/sec take i32 512 1.00389.3±6.96ns? ?/sec1.11 432.2±22.62ns? ?/sec take i32 null indices 1024 1.00 1005.7±88.58ns? ?/sec1.00 1002.6±6.38ns? ?/sec take i32 null values 1024 1.00 2.0±0.01µs? ?/sec1.00 2.0±0.02µs? ?/sec take i32 null values null indices 1024 1.00 2.6±0.01µs? ?/sec1.04 2.7±0.03µs? ?/sec take primitive fsb value len: 12, indices: 1024 2.31 8.4±0.14µs? ?/sec1.00 3.6±0.04µs? ?/sec take primitive fsb value len: 12, null values, indices: 1024 1.66 8.3±0.24µs? ?/sec1.00 5.0±0.03µs? ?/sec take primitive run logical len: 1024, physical len: 512, indices: 1024 1.01 20.8±0.31µs? ?/sec1.00 20.6±0.79µs? ?/sec take str 1024 1.00 11.0±0.16µs? ?/sec1.00 11.0±0.23µs? ?/sec take str 512 1.02 5.4±0.14µs? ?/sec1.00 5.3±0.06µs? ?/sec take str null indices 1024 1.00 6.9±0.08µs? ?/sec1.01 6.9±0.09µs? ?/sec take str null indices 512 1.00 3.4±0.11µs? ?/sec1.00 3.4±0.04µs? ?/sec take str null values 1024 1.00 8.7±0.16µs? ?/sec1.00 8.7±0.42µs? ?/sec take str null values null indices 1024 1.00 6.5±0.04µs? ?/sec1.00 6.5±0.04µs? ?/sec take stringview 1024 1.00 782.4±31.32ns? ?/sec1.00779.3±2.93ns? ?/sec take stringview 512 1.12548.4±1.35ns? ?/sec1.00 488.7±14.56ns? ?/sec take stringview null indices 1024 1.00 1386.1±41.76ns? ?/sec1.02 1413.2±18.39ns? ?/sec take stringview null indices 512 1.00676.0±3.94ns? ?/sec1.04706.2±8.41ns? ?/sec take stringview null values 1024 1.00 2.0±0.02µs? ?/sec1.01 2.1±0.06µs? ?/sec take stringview null values null indices 1024 1.01 2.9±0.02µs? ?/sec1.00 2.8±0.05µs? ?/sec ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
tobixdev commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638994272 > I am always a bit suspicious about benchmarks in the usec range as they are so suseptable to noise. I'll run some more I agree but I think it's possible that the initial change introduced a performance regression. I've pushed a new version that performed better on my PC so hopefully we can see that in the benchmarks. :crossed_fingers: Nice robotic sheep for the bot btw! :sheep: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638983429 I am always a bit suspicious about benchmarks in the usec range as they are so suseptable to noise. I'll run some more -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638984254 🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing take-fsb-null-indices (885f31e26ec2019132f7b1512b9b3f08dc873c1a) to 1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c [diff](https://github.com/apache/arrow-rs/compare/1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c..885f31e26ec2019132f7b1512b9b3f08dc873c1a) BENCH_NAME=take_kernels BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench take_kernels BENCH_FILTER= BENCH_BRANCH_NAME=take-fsb-null-indices Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638983803 🤖 Hi @alamb, thanks for the request (https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638983729). [`scrape_comments.py`](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/scrape_comments.py) only supports whitelisted benchmarks. - Standard: (none) - Criterion: arrow_reader, arrow_reader_clickbench, arrow_reader_row_filter, arrow_statistics, arrow_writer, coalesce_kernels, concatenate_kernels, filter_kernels, interleave_kernels, take_kernels, union_array, variant_builder, variant_kernels, variant_validation Please choose one or more of these with `run benchmark ` or `run benchmark ...` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638983729 run benchmarks take_kernels -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
tobixdev commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638637353 > take primitive fsb value len: 12, indices: 1024 1.00 8.4±0.08µs? ?/sec1.29 10.9±0.36µs? ?/sec > take primitive fsb value len: 12, null values, indices: 1024 1.00 8.5±0.10µs? ?/sec1.75 14.8±0.38µs? ?/sec Thanks! I guess this confirms the performance regression. I'll give the other approach (applying nulls at the end) a shot. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638596822 🤖: Benchmark completed Details ``` group main take-fsb-null-indices - - take bool 1024 1.00 1335.4±8.42ns? ?/sec1.00 1341.1±53.88ns? ?/sec take bool 512 1.00 735.3±31.78ns? ?/sec1.00 733.0±18.79ns? ?/sec take bool null indices 1024 1.02 1652.4±10.54ns? ?/sec1.00 1621.9±55.89ns? ?/sec take bool null values 1024 1.00 2.6±0.08µs? ?/sec1.00 2.6±0.01µs? ?/sec take bool null values null indices 1024 1.00 3.1±0.13µs? ?/sec1.06 3.3±0.05µs? ?/sec take check bounds i32 1024 1.00 1623.1±37.35ns? ?/sec1.00 1624.2±13.15ns? ?/sec take check bounds i32 512 1.00807.1±4.49ns? ?/sec1.01 811.2±17.09ns? ?/sec take i32 1024 1.00711.0±2.04ns? ?/sec1.01 715.8±23.29ns? ?/sec take i32 512 1.00389.5±5.54ns? ?/sec1.12436.2±7.66ns? ?/sec take i32 null indices 1024 1.01 999.9±24.26ns? ?/sec1.00992.5±3.65ns? ?/sec take i32 null values 1024 1.00 2.0±0.01µs? ?/sec1.00 2.0±0.02µs? ?/sec take i32 null values null indices 1024 1.02 2.6±0.10µs? ?/sec1.00 2.6±0.03µs? ?/sec take primitive fsb value len: 12, indices: 1024 1.00 8.4±0.08µs? ?/sec1.29 10.9±0.36µs? ?/sec take primitive fsb value len: 12, null values, indices: 1024 1.00 8.5±0.10µs? ?/sec1.75 14.8±0.38µs? ?/sec take primitive run logical len: 1024, physical len: 512, indices: 1024 1.01 20.9±0.17µs? ?/sec1.00 20.8±0.80µs? ?/sec take str 1024 1.00 11.0±0.06µs? ?/sec1.00 10.9±0.17µs? ?/sec take str 512 1.00 5.3±0.02µs? ?/sec1.00 5.3±0.04µs? ?/sec take str null indices 1024 1.00 7.0±0.10µs? ?/sec1.02 7.1±0.13µs? ?/sec take str null indices 512 1.00 3.3±0.02µs? ?/sec1.02 3.4±0.01µs? ?/sec take str null values 1024 1.01 8.7±0.10µs? ?/sec1.00 8.6±0.06µs? ?/sec take str null values null indices 1024 1.00 6.3±0.07µs? ?/sec1.02 6.4±0.14µs? ?/sec take stringview 1024 1.00777.2±9.62ns? ?/sec1.01783.0±7.73ns? ?/sec take stringview 512 1.00478.7±5.50ns? ?/sec1.00477.2±1.94ns? ?/sec take stringview null indices 1024 1.00 1377.1±15.56ns? ?/sec1.01 1391.7±29.53ns? ?/sec take stringview null indices 512 1.00695.1±3.90ns? ?/sec1.02 710.9±45.82ns? ?/sec take stringview null values 1024 1.00 2.1±0.06µs? ?/sec1.00 2.1±0.02µs? ?/sec take stringview null values null indices 1024 1.00 2.9±0.04µs? ?/sec1.01 2.9±0.02µs? ?/sec ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb-ghbot commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638557586 🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing take-fsb-null-indices (10425c73be49787a340273fbf1203c9ce4b05158) to 1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c [diff](https://github.com/apache/arrow-rs/compare/1c90efe717c1a2f99a0a97d6e9c93cd913eaca3c..10425c73be49787a340273fbf1203c9ce4b05158) BENCH_NAME=take_kernels BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench take_kernels BENCH_FILTER= BENCH_BRANCH_NAME=take-fsb-null-indices Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
alamb commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3638557252 run benchmark take_kernels -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Check validity of indices in take_fixed_size_binary [arrow-rs]
tobixdev commented on PR #8948: URL: https://github.com/apache/arrow-rs/pull/8948#issuecomment-3608941949 This might slow down the take kernels. A second approach I've thought of is that we keep the same loop and apply the null mask at the end (only if the null count is > 0 for indices). @alamb Could you maye be start the benchmarks so we know the effect of this change? Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
