Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764952182

   Looks almost on par now... now we can avoid `filter` altogether for views


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764367672

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                               coalesce_batches_filter           main
   -----                                                                               -----------------------           ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                              1.00  250.4±3.41ms     ? ?/sec    1.00  251.1±4.07ms     ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                               1.00  8.0±0.16ms       ? ?/sec    1.03  8.2±0.15ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                                1.00  3.9±0.10ms       ? ?/sec    1.05  4.1±0.11ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                                1.00  3.2±0.03ms       ? ?/sec    1.08  3.4±0.05ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                            1.00  238.1±2.80ms     ? ?/sec    1.03  245.9±5.10ms     ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                             1.00  9.0±0.10ms       ? ?/sec    1.01  9.0±0.13ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                              1.00  4.4±0.14ms       ? ?/sec    1.04  4.6±0.13ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                              1.00  3.5±0.03ms       ? ?/sec    1.34  4.7±0.06ms       ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                              1.00  53.1±0.70ms      ? ?/sec    1.09  57.7±0.42ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                               1.00  10.9±0.16ms      ? ?/sec    1.04  11.3±0.13ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                                1.00  9.3±0.39ms       ? ?/sec    1.03  9.5±0.39ms       ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                                1.00  7.8±0.17ms       ? ?/sec    1.27  9.9±0.18ms       ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                            1.00  67.2±1.63ms      ? ?/sec    1.01  68.2±0.93ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                             1.03  12.9±0.29ms      ? ?/sec    1.00  12.5±0.19ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                              1.00  9.7±0.34ms       ? ?/sec    1.07  10.4±0.40ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                              1.00  8.1±0.13ms       ? ?/sec    1.46  11.8±0.38ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001     1.00  42.4±0.27ms      ? ?/sec    1.09  46.3±0.32ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01      1.00  4.1±0.10ms       ? ?/sec    1.39  5.7±0.10ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1       1.00  1807.8±182.33µs  ? ?/sec    2.52  4.6±0.26ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8       1.00  1068.6±28.26µs   ? ?/sec    2.64  2.8±0.03ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001   1.04  58.7±0.84ms      ? ?/sec    1.00  56.4±0.44ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01    1.00  6.9±0.14ms       ? ?/sec    1.11  7.6±0.28ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1     1.00  3.0±0.16ms       ? ?/sec    1.80  5.4±0.17ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8     1.00  2.1±0.02ms       ? ?/sec    1.75  3.8±0.11ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001      1.00  38.3±0.23ms      ? ?/sec    1.09  41.9±0.40ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01       1.00  3.8±0.04ms       ? ?/sec    1.18  4.5±0.05ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1        1.00  1765.2±187.71µs  ? ?/sec    1.22  2.2±0.10ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764349753

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (58190b8e4d584284467a2c37ea293361ecc5e895) 
to 1db1a8869cceb179aa885ed58da9f0b49c03eafe 
[diff](https://github.com/apache/arrow-rs/compare/1db1a8869cceb179aa885ed58da9f0b49c03eafe..58190b8e4d584284467a2c37ea293361ecc5e895)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764349640

   
   run benchmark coalesce_kernels
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764348921

   🤖 Hi @Dandandan, thanks for the request 
(https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764348856).
   
   
[`scrape_comments.py`](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/scrape_comments.py)
 only supports whitelisted benchmarks.
   - Standard: (none)
   - Criterion: array_iter, arrow_reader, arrow_reader_clickbench, 
arrow_reader_row_filter, arrow_statistics, arrow_writer, bitwise_kernel, 
boolean_kernels, buffer_bit_ops, cast_kernels, coalesce_kernels, 
comparison_kernels, concatenate_kernel, csv_writer, encoding, filter_kernels, 
interleave_kernels, json-reader, metadata, row_format, take_kernels, 
union_array, variant_builder, variant_kernels, variant_validation, view_types, 
zip_kernels
   
   Please choose one or more of these with `run benchmark ` or `run 
benchmark  ...`





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3764348856

   run benchmarks coalesce_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763921327

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                               coalesce_batches_filter           main
   -----                                                                               -----------------------           ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                              1.00  275.6±2.70ms     ? ?/sec    1.00  274.7±2.23ms     ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                               1.13  11.3±0.66ms      ? ?/sec    1.00  10.0±0.43ms      ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                                1.03  4.4±0.14ms       ? ?/sec    1.00  4.2±0.09ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                                1.00  3.6±0.05ms       ? ?/sec    1.03  3.7±0.10ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                            1.00  260.9±3.34ms     ? ?/sec    1.28  333.7±2.87ms     ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                             1.22  11.6±0.32ms      ? ?/sec    1.00  9.5±0.14ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                              1.04  4.7±0.10ms       ? ?/sec    1.00  4.5±0.20ms       ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                              1.00  3.9±0.10ms       ? ?/sec    1.23  4.8±0.06ms       ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                              1.07  68.3±2.28ms      ? ?/sec    1.00  63.9±2.50ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                               1.00  12.1±0.33ms      ? ?/sec    1.00  12.1±0.27ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                                1.19  11.6±0.81ms      ? ?/sec    1.00  9.8±0.31ms       ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                                1.00  11.2±0.39ms      ? ?/sec    1.18  13.2±0.34ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                            1.03  72.7±1.54ms      ? ?/sec    1.00  70.4±1.59ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                             1.08  13.6±0.45ms      ? ?/sec    1.00  12.6±0.17ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                              1.06  10.9±0.44ms      ? ?/sec    1.00  10.2±0.48ms      ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                              1.00  10.7±0.27ms      ? ?/sec    1.08  11.6±0.28ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001     5.56  265.6±7.15ms     ? ?/sec    1.00  47.8±0.49ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01      3.78  22.4±0.71ms      ? ?/sec    1.00  5.9±0.15ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1       1.00  2.9±0.14ms       ? ?/sec    1.55  4.5±0.22ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8       1.00  1083.4±32.45µs   ? ?/sec    3.00  3.3±0.11ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001   4.94  287.3±6.26ms     ? ?/sec    1.00  58.1±0.40ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01    3.35  26.2±0.91ms      ? ?/sec    1.00  7.8±0.12ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1     1.00  4.3±0.16ms       ? ?/sec    1.29  5.5±0.23ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8     1.00  2.3±0.11ms       ? ?/sec    1.61  3.7±0.09ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001      6.18  259.8±8.88ms     ? ?/sec    1.00  42.1±0.24ms      ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01       4.87  21.9±0.58ms      ? ?/sec    1.00  4.5±0.10ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1        1.30  2.9±0.17ms       ? ?/sec    1.00  2.2±0.15ms       ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 
1.00  1073.4±13.4

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701178041


##
arrow-select/src/coalesce/byte_view.rs:
##
@@ -347,6 +348,33 @@ impl InProgressArray for InProgressByteViewArray {
         Ok(())
     }
 
+    fn copy_rows_by_filter(
+        &mut self,
+        filter: &FilterPredicate,
+        offset: usize,
+        len: usize,
+    ) -> Result<(), ArrowError> {
+        // Temporarily swap out the source with a filtered slice
+        let Some(source) = self.source.as_mut() else {
+            return Err(ArrowError::InvalidArgumentError(
+                "Internal Error: InProgressByteViewArray: source not set".to_string(),
+            ));
+        };
+        // TODO special case offset 0 to skip this nonsense
+        let source_slice = source.array.slice(offset, len);
+        let filtered = filter.filter(&source_slice)?;
+        let old_source = std::mem::replace(&mut source.array, filtered);

Review Comment:
   I think we have to call `set_source` to calculate `need_gc` etc. for the 
filtered array






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763868491

   (with this change the coverage is now quite good) -- there are a few more 
cases to cover but basically 👌 
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763862067

   > I found the bug -- fix incoming (then I really have to go)
   
   Oh I already had a look and pushed e76a5d73e9d5f5cd26011c5ed5a10a916642899c 
😆 





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763860926

   Ah, NM I see you did it here 
[e76a5d7](https://github.com/apache/arrow-rs/pull/8951/commits/e76a5d73e9d5f5cd26011c5ed5a10a916642899c)





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763857144

   I found the bug -- fix incoming (then I really have to go)





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701157664


##
arrow-select/src/coalesce/primitive.rs:
##
@@ -95,6 +96,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate
+    fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+        self.ensure_capacity();
+
+        let s = self
+            .source
+            .as_ref()
+            .ok_or_else(|| {
+                ArrowError::InvalidArgumentError(
+                    "Internal Error: InProgressPrimitiveArray: source not set".to_string(),
+                )
+            })?
+            .as_primitive::();
+
+        let values = s.values();
+        let count = filter.count();
+
+        // Use the predicate's strategy for optimal iteration
+        match filter.strategy() {
+            IterationStrategy::SlicesIterator => {
+                // Copy values, nulls using slices
+                if let Some(nulls) = s.nulls().filter(|n| n.null_count() > 0) {
+                    for (start, end) in SlicesIterator::new(filter.filter_array()) {
+                        // SAFETY: slices are derived from filter predicate
+                        self.current
+                            .extend_from_slice(unsafe { values.get_unchecked(start..end) });
+                        let slice = nulls.slice(start, end - start);

Review Comment:
   Yes that would probably be faster






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763841851

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (e76a5d73e9d5f5cd26011c5ed5a10a916642899c) 
to 1db1a8869cceb179aa885ed58da9f0b49c03eafe 
[diff](https://github.com/apache/arrow-rs/compare/1db1a8869cceb179aa885ed58da9f0b49c03eafe..e76a5d73e9d5f5cd26011c5ed5a10a916642899c)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763841582

   run benchmark coalesce_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763798580

   > I am about out of time today, but here is what I was thinking both for 
testing and to support multi-column batches (it doesn't pass tests yet)
   > 
   > * [Implement fallback copy_with_filter 
Dandandan/arrow-rs#13](https://github.com/Dandandan/arrow-rs/pull/13)
   
   Thanks, I'll take a look <3





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763795059

   I am about out of time today, but here is what I was thinking both for 
testing and to support multi-column batches (it doesn't pass tests yet)
   - https://github.com/Dandandan/arrow-rs/pull/13
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763734164

   The reason the existing test coverage for filter batch doesn't cover the new 
code is that it uses a multi-column record batch
   
   
   Another thing I noticed while trying to test is that the filter 
predicate passed to `copy_rows_by_filter` is for a sliced filter array, but the 
underlying `InProgressArray`s seem to treat it as though it started at the start 
of the array
   
   I think that may be a bug, but I am looking more carefully now
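
The offset concern described above can be illustrated with a small standalone sketch. It uses plain slices rather than arrow-rs types, and `selected_in_slice` and `copy_selected` are hypothetical helper names chosen for illustration only. The point is that positions produced from a sliced filter are relative to the slice, so they need the slice offset added before indexing into the full source.

```rust
/// Indices of `true` values in `filter[offset..offset + len]`,
/// expressed relative to the start of that slice.
fn selected_in_slice(filter: &[bool], offset: usize, len: usize) -> Vec<usize> {
    filter[offset..offset + len]
        .iter()
        .enumerate()
        .filter_map(|(i, &keep)| keep.then_some(i))
        .collect()
}

/// Copy the selected rows out of the full `source`, translating
/// slice-relative indices back into positions in `source`.
fn copy_selected(source: &[i32], filter: &[bool], offset: usize, len: usize) -> Vec<i32> {
    selected_in_slice(filter, offset, len)
        .into_iter()
        // Dropping `offset` here would read rows from the start of `source`.
        .map(|i| source[offset + i])
        .collect()
}
```

With `source = [10, 20, 30, 40, 50, 60]`, a filter of `[true, false, true, false, true, true]`, `offset = 2` and `len = 3`, the slice-relative indices are `[0, 2]`, which map back to rows 2 and 4 of the source.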





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701081585


##
arrow-select/src/filter.rs:
##
@@ -321,7 +341,13 @@ enum IterationStrategy {
 impl IterationStrategy {
     /// The default [`IterationStrategy`] for a filter of length `filter_length`
     /// and selecting `filter_count` rows
-    fn default_strategy(filter_length: usize, filter_count: usize) -> Self {
+    ///
+    /// Returns:
+    /// - [`IterationStrategy::None`] if `filter_count` is 0
+    /// - [`IterationStrategy::All`] if `filter_count == filter_length`
+    /// - [`IterationStrategy::SlicesIterator`] if selectivity > 80%

Review Comment:
   Yes






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701054448


##
arrow-select/src/coalesce/primitive.rs:
##
@@ -95,6 +96,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate
+    fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+        self.ensure_capacity();
+
+        let s = self
+            .source
+            .as_ref()
+            .ok_or_else(|| {
+                ArrowError::InvalidArgumentError(
+                    "Internal Error: InProgressPrimitiveArray: source not set".to_string(),
+                )
+            })?
+            .as_primitive::();
+
+        let values = s.values();
+        let count = filter.count();
+
+        // Use the predicate's strategy for optimal iteration
+        match filter.strategy() {
+            IterationStrategy::SlicesIterator => {
+                // Copy values, nulls using slices
+                if let Some(nulls) = s.nulls().filter(|n| n.null_count() > 0) {
+                    for (start, end) in SlicesIterator::new(filter.filter_array()) {
+                        // SAFETY: slices are derived from filter predicate
+                        self.current
+                            .extend_from_slice(unsafe { values.get_unchecked(start..end) });
+                        let slice = nulls.slice(start, end - start);

Review Comment:
   One thing we could also check is adding a `nulls.append_buffer_sliced` or 
something to avoid having to call "slice" on the buffer (and doing an atomic 
increment each time)
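
The idea in the comment above can be pictured with a standalone sketch. `BitBuilder` and `append_range` are hypothetical names and not arrow-rs APIs. The assumption is only that the builder reads bits `[start, start + len)` straight from the source bitmap, so no intermediate sliced buffer (and no reference-count increment) is needed per range. A production version would copy whole bytes or words rather than single bits.

```rust
/// Hypothetical bit-packed builder, LSB-first within each byte
/// (the same bit order Arrow validity bitmaps use).
#[derive(Default)]
struct BitBuilder {
    bytes: Vec<u8>,
    len: usize, // number of bits appended so far
}

impl BitBuilder {
    /// Append a single bit, growing the byte buffer when a new byte starts.
    fn append(&mut self, bit: bool) {
        if self.len % 8 == 0 {
            self.bytes.push(0);
        }
        if bit {
            let last = self.bytes.len() - 1;
            self.bytes[last] |= 1 << (self.len % 8);
        }
        self.len += 1;
    }

    /// Append bits `[start, start + len)` of `src` without first
    /// materializing a sliced copy or handle of `src`.
    fn append_range(&mut self, src: &[u8], start: usize, len: usize) {
        for i in start..start + len {
            let bit = (src[i / 8] >> (i % 8)) & 1 == 1;
            self.append(bit);
        }
    }
}
```

Calling `append_range` once per `(start, end)` pair from the slices iterator would then replace the `nulls.slice(start, end - start)` step in the loop, under the assumptions stated above.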



##
arrow-select/src/coalesce.rs:
##
@@ -605,6 +714,15 @@ trait InProgressArray: std::fmt::Debug + Send + Sync {
     /// Return an error if the source array is not set
     fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), ArrowError>;
 
+    /// Copy rows at the given indices from the current source array into the in-progress array
+    fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+        // Default implementation: iterate over indices from the filter
+        for idx in IndexIterator::new(filter.filter_array(), filter.count()) {

Review Comment:
   I found it strange that the default implementation copied with a different 
iteration strategy -- Upon review, it seems like this code should never be 
called
   
   Therefore I would recommend making the default implementation `panic`
   
   
   Another potential way to make this clearer would be to add a method like 
`supports_copy_rows_by_filter` to avoid having to special case "is primitive" 
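
A minimal sketch of the two suggestions above, using a simplified stand-in trait rather than the real `InProgressArray`. The trait name `InProgress`, the `&[bool]` filter, the `String` error type, and the two column structs are all assumptions made for illustration. The default method panics because callers are expected to check the capability first.

```rust
trait InProgress {
    /// Whether this implementation provides a specialized filtered copy.
    fn supports_copy_rows_by_filter(&self) -> bool {
        false
    }

    /// Default: unsupported. Callers should check the capability first.
    fn copy_rows_by_filter(&mut self, _filter: &[bool]) -> Result<(), String> {
        panic!("copy_rows_by_filter is not supported for this array type")
    }
}

/// Stand-in for a primitive column that implements the fast path.
struct PrimitiveCol {
    source: Vec<i32>,
    current: Vec<i32>,
}

impl InProgress for PrimitiveCol {
    fn supports_copy_rows_by_filter(&self) -> bool {
        true
    }

    fn copy_rows_by_filter(&mut self, filter: &[bool]) -> Result<(), String> {
        if filter.len() != self.source.len() {
            return Err("filter length does not match source length".to_string());
        }
        for (value, keep) in self.source.iter().zip(filter) {
            if *keep {
                self.current.push(*value);
            }
        }
        Ok(())
    }
}

/// Stand-in for a column type that keeps the default (unsupported) behavior.
struct OtherCol;

impl InProgress for OtherCol {}

/// The caller asks each column instead of special-casing "is primitive".
fn all_support_filter_copy(cols: &[Box<dyn InProgress>]) -> bool {
    cols.iter().all(|c| c.supports_copy_rows_by_filter())
}
```

With this shape the dispatch decision sits next to each implementation, so adding a new specialized array type does not require touching the caller's type check.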



##
arrow-select/src/filter.rs:
##
@@ -37,9 +37,9 @@ use arrow_schema::*;
 /// [`SlicesIterator`] to copy ranges of values. Otherwise iterate
 /// over individual rows using [`IndexIterator`]
 ///
-/// Threshold of 0.8 chosen based on 

+/// Threshold of 0.9 chosen based on benchmarking results
 ///
-const FILTER_SLICES_SELECTIVITY_THRESHOLD: f64 = 0.8;
+const FILTER_SLICES_SELECTIVITY_THRESHOLD: f64 = 0.9;

Review Comment:
   This change to the filter heuristic seems like we should pull it out into 
its own PR so we can test / make it clear what changed. 



##
arrow-select/src/coalesce.rs:
##
@@ -571,6 +788,15 @@ trait InProgressArray: std::fmt::Debug + Send + Sync {
     /// Return an error if the source array is not set
     fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), ArrowError>;
 
+    /// Copy rows at the given indices from the current source array into the in-progress array
+    fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+        // Default implementation: iterate over indices from the filter

Review Comment:
   - https://github.com/apache/arrow-rs/issues/9143



##
arrow-select/src/filter.rs:
##
@@ -321,7 +341,13 @@ enum IterationStrategy {
 impl IterationStrategy {
     /// The default [`IterationStrategy`] for a filter of length `filter_length`
     /// and selecting `filter_count` rows
-    fn default_strategy(filter_length: usize, filter_count: usize) -> Self {
+    ///
+    /// Returns:
+    /// - [`IterationStrategy::None`] if `filter_count` is 0
+    /// - [`IterationStrategy::All`] if `filter_count == filter_length`
+    /// - [`IterationStrategy::SlicesIterator`] if selectivity > 80%

Review Comment:
   I think you changed the threshold to 90%
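
To make the effect of the threshold concrete, here is a hedged sketch of the selection rule described in the doc comment under review. The `Strategy` enum and the explicit `threshold` parameter are assumptions for illustration; the real code uses `IterationStrategy` and the `FILTER_SLICES_SELECTIVITY_THRESHOLD` constant.

```rust
#[derive(Debug, PartialEq)]
enum Strategy {
    None,    // nothing selected
    All,     // everything selected
    Slices,  // copy contiguous ranges
    Indices, // copy row by row
}

/// Pick a strategy from the filter length, the number of selected rows,
/// and the selectivity threshold above which ranges are copied.
fn default_strategy(filter_length: usize, filter_count: usize, threshold: f64) -> Strategy {
    if filter_length == 0 || filter_count == 0 {
        return Strategy::None;
    }
    if filter_count == filter_length {
        return Strategy::All;
    }
    let selectivity = filter_count as f64 / filter_length as f64;
    if selectivity > threshold {
        Strategy::Slices
    } else {
        Strategy::Indices
    }
}
```

A filter selecting 85 of 100 rows takes the slices path with a threshold of 0.8 but the index path with 0.9, which is why the doc comment and the constant need to agree.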



##
arrow-select/src/coalesce.rs:
##
@@ -238,10 +245,112 @@ impl BatchCoalescer {
         batch: RecordBatch,
         filter: &BooleanArray,
     ) -> Result<(), ArrowError> {
-        // TODO: optimize this to avoid materializing (copying the results
-        // of filter to a new batch)
-        let filtered_batch = filter_record_batch(&batch, filter)?;
-        self.push_batch(filtered_batch)
+        // We on

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701047749


##
arrow-select/src/filter.rs:
##
@@ -37,9 +37,9 @@ use arrow_schema::*;
 /// [`SlicesIterator`] to copy ranges of values. Otherwise iterate
 /// over individual rows using [`IndexIterator`]
 ///
-/// Threshold of 0.8 chosen based on 

+/// Threshold of 0.9 chosen based on benchmarking results
 ///
-const FILTER_SLICES_SELECTIVITY_THRESHOLD: f64 = 0.8;
+const FILTER_SLICES_SELECTIVITY_THRESHOLD: f64 = 0.9;

Review Comment:
   I believe the optimal value for this might be even higher (also making the 
filter kernel faster).






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763452027

   > I expect the filter kernels should not be changed, except for the 80-90% 
threshold for using the sliced kernel.
   
   Yeah, that is why I was running the benchmark





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701042297


##
arrow-select/src/coalesce.rs:
##
@@ -238,10 +245,112 @@ impl BatchCoalescer {
         batch: RecordBatch,
         filter: &BooleanArray,
     ) -> Result<(), ArrowError> {
-        // TODO: optimize this to avoid materializing (copying the results
-        // of filter to a new batch)
-        let filtered_batch = filter_record_batch(&batch, filter)?;
-        self.push_batch(filtered_batch)
+        // We only support primitve now, fallback to filter_record_batch for other types
+        // Also, skip optimization when filter is not very selective
+        if batch
+            .schema()
+            .fields()
+            .iter()
+            .any(|field| !field.data_type().is_primitive())

Review Comment:
   Because of the all-or-nothing check it has no effect on tpch/clickbench 
performance yet. I am not sure if we can somehow use the conventional filter 
kernel for non-supported arrays 🤔 
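
The all-or-nothing behavior mentioned above can be sketched in isolation. `ColType` is a simplified, hypothetical stand-in for a column data type and `can_use_fast_path` is an illustrative name; the real check in the hunk uses `field.data_type().is_primitive()` over the batch schema.

```rust
/// Simplified stand-in for a column data type (not arrow's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColType {
    Int32,
    Float64,
    Utf8,
    Utf8View,
}

impl ColType {
    fn is_primitive(self) -> bool {
        matches!(self, ColType::Int32 | ColType::Float64)
    }
}

/// All-or-nothing gate: the filtered-copy fast path is used only
/// when every column in the batch is a primitive type.
fn can_use_fast_path(schema: &[ColType]) -> bool {
    schema.iter().all(|t| t.is_primitive())
}
```

Under this gate a single string column sends the whole batch down the fallback path, which is consistent with the observation that benchmarks with mixed schemas see no change yet.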






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701042297


##
arrow-select/src/coalesce.rs:
##
@@ -238,10 +245,112 @@ impl BatchCoalescer {
         batch: RecordBatch,
         filter: &BooleanArray,
     ) -> Result<(), ArrowError> {
-        // TODO: optimize this to avoid materializing (copying the results
-        // of filter to a new batch)
-        let filtered_batch = filter_record_batch(&batch, filter)?;
-        self.push_batch(filtered_batch)
+        // We only support primitve now, fallback to filter_record_batch for other types
+        // Also, skip optimization when filter is not very selective
+        if batch
+            .schema()
+            .fields()
+            .iter()
+            .any(|field| !field.data_type().is_primitive())

Review Comment:
   Because of the all-or-nothing check it has no effect on tpch/clickbench 
performance yet. I am not sure if we can somehow use the conventional filter 
for non-supported arrays 🤔 






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701038833


##
arrow-select/src/coalesce.rs:
##
@@ -238,10 +245,112 @@ impl BatchCoalescer {
         batch: RecordBatch,
         filter: &BooleanArray,
     ) -> Result<(), ArrowError> {
-        // TODO: optimize this to avoid materializing (copying the results
-        // of filter to a new batch)
-        let filtered_batch = filter_record_batch(&batch, filter)?;
-        self.push_batch(filtered_batch)
+        // We only support primitve now, fallback to filter_record_batch for other types
+        // Also, skip optimization when filter is not very selective
+        if batch
+            .schema()
+            .fields()
+            .iter()
+            .any(|field| !field.data_type().is_primitive())
+            || self

Review Comment:
   I think benchmarks show it is better to just use the kernel always (so I 
think this condition needs to go).









Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763386077

   > run benchmark filter_kernels
   
   I expect the filter kernels should not be changed, except for the 80-90% 
threshold for using the sliced kernel.
   It might make sense to file a ticket for benchmarking the threshold; I think 
the optimal threshold is likely even a bit higher.
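
For context, the threshold discussed here decides between copying contiguous runs of selected rows ("slices") and copying individual indices, based on the fraction of rows the filter keeps. A rough std-only sketch of such a choice follows. The enum, the function name and the idea of passing the threshold as a parameter are illustrative assumptions, not the arrow-rs definitions.

```rust
/// Illustrative strategy choice (names are assumptions, not the arrow-rs API).
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Copy contiguous runs of selected rows.
    Slices,
    /// Copy selected rows one index at a time.
    Indices,
    /// No rows are selected.
    None,
    /// Every row is selected.
    All,
}

/// Pick a strategy from the filter length and the number of selected rows.
/// `threshold` is the selectivity above which the slice-based path is used.
fn choose_strategy(filter_len: usize, selected: usize, threshold: f64) -> Strategy {
    if filter_len == 0 || selected == 0 {
        return Strategy::None;
    }
    if selected == filter_len {
        return Strategy::All;
    }
    let selectivity = selected as f64 / filter_len as f64;
    if selectivity > threshold {
        Strategy::Slices
    } else {
        Strategy::Indices
    }
}
```

With a batch of 8192 rows of which 7000 are kept (selectivity about 0.85), a threshold of 0.8 picks `Slices` while 0.9 picks `Indices`, which is the kind of boundary case a threshold benchmark would need to cover.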





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763394227

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                      coalesce_batches_filter         main
   -----                                                                      -----------------------         ----
   filter context decimal128 (kept 1/2)                                       1.00     45.0±4.74µs ? ?/sec    1.02     45.9±6.40µs ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                1.04     51.2±1.19µs ? ?/sec    1.00     49.3±1.35µs ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                    1.01    232.5±2.94ns ? ?/sec    1.00    229.6±5.50ns ? ?/sec
   filter context f32 (kept 1/2)                                              1.00     77.9±1.18µs ? ?/sec    1.12     87.3±0.36µs ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                       1.00     10.4±0.58µs ? ?/sec    1.00     10.5±0.54µs ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                           1.06    442.8±8.24ns ? ?/sec    1.00    419.0±4.99ns ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                         1.00     60.7±0.88µs ? ?/sec    1.16     70.7±0.88µs ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)  1.00     60.6±0.72µs ? ?/sec    1.16     70.6±0.58µs ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)      1.00     60.6±0.61µs ? ?/sec    1.17     70.8±1.16µs ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                          1.00     60.6±0.42µs ? ?/sec    1.17     70.7±0.67µs ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)   1.00     60.8±0.77µs ? ?/sec    1.16     70.5±0.38µs ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)       1.00     60.6±0.24µs ? ?/sec    1.17     70.8±1.47µs ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                         1.00     60.7±0.48µs ? ?/sec    1.16     70.5±0.26µs ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)  1.00     60.5±0.16µs ? ?/sec    1.17     70.6±0.28µs ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)      1.00     60.7±0.55µs ? ?/sec    1.17     71.0±2.85µs ? ?/sec
   filter context i32 (kept 1/2)                                              1.01     16.5±0.10µs ? ?/sec    1.00     16.4±0.10µs ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                       1.00      6.6±0.57µs ? ?/sec    1.02      6.8±0.51µs ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                           1.00    219.7±2.40ns ? ?/sec    1.00    219.3±1.44ns ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                      1.00     77.6±0.46µs ? ?/sec    1.13     87.4±0.40µs ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)               1.00     10.4±0.53µs ? ?/sec    1.04     10.8±0.54µs ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                   1.04    442.0±7.30ns ? ?/sec    1.00   423.2±19.31ns ? ?/sec
   filter context mixed string view (kept 1/2)                                1.00    106.1±6.85µs ? ?/sec    1.16    122.9±5.98µs ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)         1.01     55.8±0.87µs ? ?/sec    1.00     55.1±1.88µs ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)             1.06   641.8±11.18ns ? ?/sec    1.00    603.2±2.19ns ? ?/sec
   filter context short string view (kept 1/2)                                1.00    109.3±6.75µs ? ?/sec    1.07    117.4±7.72µs ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)         1.00     54.0±1.61µs ? ?/sec    1.01     54.7±1.33µs ? ?/sec
   filter context short string view low selectivity (kept 1/1024)             1.05    500.4±5.49ns ? ?/sec    1.00   478.2±14.02ns ? ?/sec
   filter context string (kept 1/2)                                           1.02   584.5±18.08µs ? ?/sec    1.00   575.0±16.63µs ? ?/sec
   filter context string dictionary (kept 1/2)                                1.02     17.1±0.20µs ? ?/sec    1.00     16.8±0.08µs ? ?/sec
   filter context stri
   ```

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763372597

   > BTW I am working on tests and a more thorough review of this PR
   
   Thank you <3 I'll probably have a bit of time later today again to check / 
review.





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701032624


##
arrow-buffer/src/builder/null.rs:
##
@@ -412,4 +439,47 @@ mod tests {
 
         assert_eq!(builder.finish(), None);
     }
+
+    #[test]
+    fn test_extend() {

Review Comment:
   Yeah probably redundant by now






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763268667

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (9773da741e264ec4ed26afaab856959e4cf0f0db) 
to 1db1a8869cceb179aa885ed58da9f0b49c03eafe 
[diff](https://github.com/apache/arrow-rs/compare/1db1a8869cceb179aa885ed58da9f0b49c03eafe..9773da741e264ec4ed26afaab856959e4cf0f0db)
   BENCH_NAME=filter_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench filter_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763267983

   run benchmark filter_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701012658


##
arrow-select/src/filter.rs:
##
@@ -193,6 +197,21 @@ pub fn filter(values: &dyn Array, predicate: 
&BooleanArray) -> Result bool 
{

Review Comment:
   I recommend putting this as a method on `FilterBuilder` so it is easier to 
discover






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763247485

   BTW I am working on tests and a more thorough review of this PR





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3763239717

   > Then it seems like we could/should do the same thing for the other types. 
I'll file a ticket to summarize the work to do
   
   I found we already have tickets and they are organized in an epic
   - https://github.com/apache/arrow-rs/issues/7761
   
   I am not going to spend the time documenting them in detail as I don't think 
these are good new contributor tickets -- they require quite a bit of knowledge 
about arrow-rs and how to optimize it





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-16 Thread via GitHub


alamb commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2698481846


##
arrow-buffer/src/builder/null.rs:
##
@@ -412,4 +439,47 @@ mod tests {
 
         assert_eq!(builder.finish(), None);
     }
+
+    #[test]
+    fn test_extend() {

Review Comment:
   this is fine to have, but technically speaking should these paths be covered 
in the lower BooleanBufferBuilder::extend_trusted_len ?



##
arrow-select/src/coalesce/primitive.rs:
##
@@ -92,6 +93,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate
+    fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+        self.ensure_capacity();
+
+        let s = self
+            .source
+            .as_ref()
+            .ok_or_else(|| {
+                ArrowError::InvalidArgumentError(
+                    "Internal Error: InProgressPrimitiveArray: source not set".to_string(),
+                )
+            })?
+            .as_primitive::<T>();
+
+        let values = s.values();
+        let count = filter.count();
+
+        // Use the predicate's strategy for optimal iteration
+        match filter.strategy() {
+            IterationStrategy::SlicesIterator => {
+                // Copy values, nulls using slices
+                if let Some(nulls) = s.nulls().filter(|n| n.null_count() > 0) {
+                    for (start, end) in SlicesIterator::new(filter.filter_array()) {
+                        // SAFETY: slices are derived from filter predicate
+                        self.current
+                            .extend_from_slice(unsafe { values.get_unchecked(start..end) });
+                        let slice = nulls.slice(start, end - start);
+                        self.nulls.append_buffer(&slice);
+                    }
+                } else {
+                    for (start, end) in SlicesIterator::new(filter.filter_array()) {
+                        // SAFETY: SlicesIterator produces valid ranges derived from filter
+                        self.current
+                            .extend_from_slice(unsafe { values.get_unchecked(start..end) });
+                    }
+                    self.nulls.append_n_non_nulls(count);
+                }
+            }
+            IterationStrategy::Slices(slices) => {

Review Comment:
   This code still needs tests (it is still uncovered I think)
   
   I can try and help find time to write some given how exciting this PR is
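
As a possible starting point for such tests, the slice-based path can be modelled without Arrow types: derive contiguous `(start, end)` runs from a boolean filter, then append the values and validity of each run to buffers that already hold data. This is a std-only sketch under that simplification. `slices` and `copy_rows_by_slices` are hypothetical helpers written for illustration, not the PR's code, and `Vec<bool>` stands in for the null buffer.

```rust
/// Return `(start, end)` for each contiguous run of `true` in `filter`
/// (`end` is exclusive).
fn slices(filter: &[bool]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &keep) in filter.iter().enumerate() {
        match (keep, start) {
            // A new run begins.
            (true, None) => start = Some(i),
            // The current run ends just before `i`.
            (false, Some(s)) => {
                runs.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the end of the filter.
    if let Some(s) = start {
        runs.push((s, filter.len()));
    }
    runs
}

/// Append the selected rows of `values` / `valid` to in-progress buffers.
fn copy_rows_by_slices(
    values: &[i32],
    valid: Option<&[bool]>,
    filter: &[bool],
    out_values: &mut Vec<i32>,
    out_valid: &mut Vec<bool>,
) {
    for (start, end) in slices(filter) {
        out_values.extend_from_slice(&values[start..end]);
        match valid {
            Some(v) => out_valid.extend_from_slice(&v[start..end]),
            // No null mask on the source: every copied row is valid.
            None => out_valid.extend(std::iter::repeat(true).take(end - start)),
        }
    }
}
```

A test along these lines would push two batches in a row, one with a null mask and one without, and check that the second call appends to the existing buffers rather than overwriting them.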



##
arrow-buffer/src/builder/null.rs:
##
@@ -412,4 +439,47 @@ mod tests {
 
         assert_eq!(builder.finish(), None);
     }
+
+    #[test]
+    fn test_extend() {

Review Comment:
   e.g. these tests: 
https://github.com/apache/arrow-rs/blob/840653bc0b1f072c6526aebce09fabbf4d938ce2/arrow-buffer/src/builder/boolean.rs#L580-L579



##
arrow-select/src/coalesce/primitive.rs:
##
@@ -95,6 +96,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate

Review Comment:
   That is a good point -- perhaps we can consolidate the implementations as a 
follow on PR
   
   What I was basically observing is that the specialized filtering iteration 
was needed in Coalesce -- but it was currently bound to FilterPredicate. 
Figuring out how to reuse it was the trick.
   
   






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-16 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3759934437

   checking this one out





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-16 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2698422312


##
arrow-select/src/coalesce/primitive.rs:
##
@@ -95,6 +96,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate

Review Comment:
   The refactor looks nice, but what parts could be reused? I think it is not 
possible to reuse the filtering directly, as the existing code allocates a new 
vec / buffer each time. I think the other way around could work (existing 
filter code uses some form of this code).
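
The direction suggested here (the allocating filter built on top of an append-style routine, rather than the reverse) can be illustrated with a small std-only sketch. Both function names are hypothetical, and plain `Vec`s stand in for Arrow buffers.

```rust
/// Append the rows of `values` selected by `filter` to an existing buffer.
/// The only allocation is whatever growth `out` itself needs.
fn filter_into(values: &[u64], filter: &[bool], out: &mut Vec<u64>) {
    assert_eq!(values.len(), filter.len(), "filter must match values length");
    // Reserve once up front, based on the number of selected rows.
    let selected = filter.iter().filter(|&&keep| keep).count();
    out.reserve(selected);
    for (value, &keep) in values.iter().zip(filter) {
        if keep {
            out.push(*value);
        }
    }
}

/// The conventional, allocating filter expressed via the append-style routine.
fn filter_alloc(values: &[u64], filter: &[bool]) -> Vec<u64> {
    let mut out = Vec::new();
    filter_into(values, filter, &mut out);
    out
}
```

Going the other way, implementing `filter_into` by calling `filter_alloc` and then copying, would reintroduce the intermediate buffer the coalescer is trying to avoid.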






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3754778386

   The null-handling code using `BitIterator` might be slightly slower than the 
earlier version (but it removes some complex code, and the earlier version didn't 
handle sliced arrays). I think we can recover the performance later.
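
To make the sliced-array point concrete: an Arrow validity bitmap is packed least-significant-bit first, and a sliced array starts at a non-zero bit offset into it, so any null-copying code has to honour that offset. A std-only sketch of offset-aware bit iteration follows. `bits` and `copy_validity` are hypothetical helpers for illustration, not the arrow-rs `BitIterator`.

```rust
/// Iterate `len` bits of an LSB-first packed bitmap, starting at bit `offset`.
fn bits(bitmap: &[u8], offset: usize, len: usize) -> impl Iterator<Item = bool> + '_ {
    assert!(offset + len <= bitmap.len() * 8, "range exceeds bitmap");
    (offset..offset + len).map(move |i| (bitmap[i / 8] >> (i % 8)) & 1 == 1)
}

/// Append the validity of the rows kept by `filter` to `out`,
/// reading the source bitmap from the slice's bit offset.
fn copy_validity(bitmap: &[u8], offset: usize, filter: &[bool], out: &mut Vec<bool>) {
    out.extend(
        bits(bitmap, offset, filter.len())
            .zip(filter)
            .filter(|(_, keep)| **keep)
            .map(|(valid, _)| valid),
    );
}
```

Code that ignores `offset` and always starts at bit 0 gives correct results only for unsliced arrays, which appears to be the gap in the earlier version mentioned above.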





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3754760980

   ```
   filter: primitive, 8192, nulls: 0, selectivity: 0.001    1.00     53.4±1.46ms ? ?/sec    1.81     97.0±0.43ms ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.01     1.00      5.4±0.03ms ? ?/sec    1.66      8.9±0.03ms ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.1      1.00      2.9±0.04ms ? ?/sec    1.20      3.5±0.16ms ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.8      1.00  1659.6±14.46µs ? ?/sec    1.70      2.8±0.02ms ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.001  1.00     60.5±1.04ms ? ?/sec    2.07    125.0±2.53ms ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.01   1.00      9.4±0.05ms ? ?/sec    1.56     14.6±0.14ms ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.1    1.00      6.5±0.39ms ? ?/sec    1.00      6.5±0.14ms ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.8    1.00      4.5±0.06ms ? ?/sec    1.93      8.6±0.04ms ? ?/sec
   ```
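
For readers unfamiliar with this output format: each side reports a relative factor followed by the mean time, and 1.00 marks the faster side. The factor can be reproduced approximately from the rounded means. The sketch below assumes the left pair is the PR branch and the right pair is `main`, as in the bot-generated tables elsewhere in this thread; the helper name is made up for illustration.

```rust
/// Ratio of each mean time to the faster of the two.
fn relative(branch_ms: f64, main_ms: f64) -> (f64, f64) {
    let fastest = branch_ms.min(main_ms);
    (branch_ms / fastest, main_ms / fastest)
}
```

For the first row, `relative(53.4, 97.0)` gives 1.00 for the branch and roughly 1.8 for `main`; the small difference from the printed 1.81 comes from the table being computed on unrounded means.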





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3754467270

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                              coalesce_batches_filter         main
   -----                                                                              -----------------------         ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00    252.8±3.00ms ? ?/sec    1.04    261.7±5.39ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.00      8.4±0.27ms ? ?/sec    1.00      8.4±0.19ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.06      4.2±0.12ms ? ?/sec    1.00      3.9±0.13ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00      3.2±0.06ms ? ?/sec    1.09      3.4±0.03ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00    237.7±2.36ms ? ?/sec    1.31    312.0±5.94ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.02      9.3±0.08ms ? ?/sec    1.00      9.1±0.24ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.00      4.5±0.13ms ? ?/sec    1.01      4.5±0.08ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00      3.5±0.07ms ? ?/sec    1.28      4.5±0.10ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.01     58.6±1.71ms ? ?/sec    1.00     57.8±0.68ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00     11.1±0.24ms ? ?/sec    1.00     11.1±0.26ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.00      9.4±0.42ms ? ?/sec    1.03      9.8±0.86ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.04      8.4±0.32ms ? ?/sec    1.00      8.1±0.22ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.03     69.0±0.97ms ? ?/sec    1.00     67.2±1.60ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.00     12.5±0.41ms ? ?/sec    1.03     12.9±0.76ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.00      9.7±0.13ms ? ?/sec    1.01      9.8±0.31ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00      8.0±0.20ms ? ?/sec    1.17      9.4±0.18ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.03     47.8±0.65ms ? ?/sec    1.00     46.5±0.79ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.00      5.6±0.03ms ? ?/sec    1.00      5.6±0.15ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.01      4.1±0.18ms ? ?/sec    1.00      4.1±0.10ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00      2.5±0.11ms ? ?/sec    1.14      2.8±0.04ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.05     58.1±0.32ms ? ?/sec    1.00     55.6±0.37ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00      7.6±0.08ms ? ?/sec    1.01      7.7±0.23ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.00      5.3±0.14ms ? ?/sec    1.01      5.4±0.30ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00  1978.6±56.95µs ? ?/sec    1.85      3.7±0.02ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.03     42.9±0.53ms ? ?/sec    1.00     41.6±0.59ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.01      4.4±0.07ms ? ?/sec    1.00      4.4±0.06ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.03      2.2±0.15ms ? ?/sec    1.00      2.2±0.07ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8       1.00   983.7±16.2
   ```

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


alamb commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2694135915


##
arrow-select/src/coalesce/primitive.rs:
##
@@ -95,6 +96,145 @@ impl InProgressArray for InProgressPrimitiveArray
         Ok(())
     }
 
+    /// Copy rows using a predicate

Review Comment:
   BTW I think this PR may be related
   - https://github.com/apache/arrow-rs/pull/7771
   
   That was my attempt to avoid having to copy the logic from filter






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3754395618

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (a1d40979049ca963cd9324b25295d3c11b2b84b6) 
to c70804a728287f38fc51b04c8dbb1e7b72bceb9b 
[diff](https://github.com/apache/arrow-rs/compare/c70804a728287f38fc51b04c8dbb1e7b72bceb9b..a1d40979049ca963cd9324b25295d3c11b2b84b6)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3754395314

   run benchmark coalesce_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2026-01-09 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3732104575

   Still have to allocate time to finish this up.





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-17 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3667033373

   > I suggest we break this PR up into several smaller ones (now that you have 
proof the benchmarks are working well):
   
   
   @Dandandan would you like help getting this PR into shape / creating the 
smaller PRs?





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654541687

   ```
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.1    1.00      6.4±0.25ms ? ?/sec    1.16      7.4±0.36ms ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.8    1.00      4.5±0.03ms ? ?/sec    2.02      9.1±0.05ms ? ?/sec
   ```
   
   nice





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654531991

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                              coalesce_batches_filter         main
   -----                                                                              -----------------------         ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00    257.2±2.50ms ? ?/sec    1.00    256.6±3.74ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.04      8.8±0.18ms ? ?/sec    1.00      8.5±0.10ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.03      4.2±0.13ms ? ?/sec    1.00      4.1±0.12ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00      3.3±0.04ms ? ?/sec    1.08      3.5±0.03ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00    244.5±3.19ms ? ?/sec    1.27    310.3±4.52ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.01      9.4±0.17ms ? ?/sec    1.00      9.3±0.24ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.03      4.6±0.11ms ? ?/sec    1.00      4.5±0.10ms ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00      3.7±0.07ms ? ?/sec    1.23      4.6±0.09ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.00     59.7±0.49ms ? ?/sec    1.00     59.5±0.60ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00     11.5±0.19ms ? ?/sec    1.01     11.5±0.15ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.04      9.5±0.33ms ? ?/sec    1.00      9.2±0.40ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.00      7.7±0.15ms ? ?/sec    1.33     10.3±0.21ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.01     70.3±0.37ms ? ?/sec    1.00     69.6±2.61ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.00     12.8±0.15ms ? ?/sec    1.00     12.8±0.23ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.01      9.6±0.33ms ? ?/sec    1.00      9.5±0.21ms ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00      8.3±0.22ms ? ?/sec    1.17      9.8±0.21ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.02     49.5±0.41ms ? ?/sec    1.00     48.4±0.46ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.00      5.9±0.06ms ? ?/sec    1.00      5.9±0.10ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.03      4.6±0.25ms ? ?/sec    1.00      4.5±0.23ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00      2.6±0.05ms ? ?/sec    1.14      3.0±0.04ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.01     59.2±0.47ms ? ?/sec    1.00     58.5±0.79ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00      8.0±0.25ms ? ?/sec    1.01      8.1±0.23ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.00      5.5±0.25ms ? ?/sec    1.03      5.7±0.22ms ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00      2.3±0.02ms ? ?/sec    1.73      3.9±0.04ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.02     43.5±0.42ms ? ?/sec    1.00     42.6±0.28ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.00      4.7±0.03ms ? ?/sec    1.00      4.7±0.10ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.10      2.5±0.19ms ? ?/sec    1.00      2.3±0.18ms ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8       1.00  1176.8±34.7
   ```

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654463272

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (b23524363f6297f2ffce3bd901e0b0ae6149) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..b23524363f6297f2ffce3bd901e0b0ae6149)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654463032

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                              coalesce_batches_filter        main
   -----                                                                              -----------------------        ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00  257.2±3.16ms  ? ?/sec    1.02  263.0±3.88ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.00    8.6±0.14ms  ? ?/sec    1.00    8.6±0.11ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.00    4.0±0.08ms  ? ?/sec    1.02    4.1±0.08ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00    3.3±0.10ms  ? ?/sec    1.08    3.5±0.03ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00  242.3±3.47ms  ? ?/sec    1.31  317.6±4.11ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.00    9.3±0.13ms  ? ?/sec    1.00    9.3±0.34ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.01    4.6±0.06ms  ? ?/sec    1.00    4.5±0.10ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00    3.7±0.04ms  ? ?/sec    1.24    4.6±0.05ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.00   59.6±0.52ms  ? ?/sec    1.01   59.9±1.07ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00   11.2±0.11ms  ? ?/sec    1.03   11.6±0.15ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.00    9.1±0.27ms  ? ?/sec    1.03    9.3±0.29ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.00    7.8±0.20ms  ? ?/sec    1.41   11.0±0.27ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.00   69.7±0.42ms  ? ?/sec    1.00   69.4±0.68ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.00   12.8±0.10ms  ? ?/sec    1.00   12.7±0.14ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.02   10.0±0.35ms  ? ?/sec    1.00    9.8±0.28ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00    8.6±0.20ms  ? ?/sec    1.14    9.8±0.22ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.00   48.8±0.30ms  ? ?/sec    1.00   48.6±0.63ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.00    5.9±0.04ms  ? ?/sec    1.01    6.0±0.17ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.00    4.3±0.11ms  ? ?/sec    1.05    4.6±0.20ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00    2.6±0.03ms  ? ?/sec    1.22    3.1±0.08ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.00   58.0±0.49ms  ? ?/sec    1.01   58.4±0.73ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00    7.9±0.06ms  ? ?/sec    1.00    7.9±0.06ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.00    5.4±0.20ms  ? ?/sec    1.00    5.5±0.15ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00    2.2±0.02ms  ? ?/sec    1.75    3.9±0.06ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.01   42.7±0.22ms  ? ?/sec    1.00   42.5±0.84ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.00    4.7±0.03ms  ? ?/sec    1.00    4.7±0.03ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.05    2.5±0.21ms  ? ?/sec    1.00    2.4±0.19ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8       1.00   1100.1±30.4

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654429756

   
   run benchmark coalesce_kernels
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654384848

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (f718f2ec4fc5c069b8ed2d5a1892487e6340dc20) to ed9efe78e4cc958cc96707557818e754419debb0 [diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..f718f2ec4fc5c069b8ed2d5a1892487e6340dc20)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench coalesce_kernels
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3654384118

   run benchmark coalesce_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-15 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2618385313


##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+/// # use arrow_buffer::NullBufferBuilder;
+/// let mut builder = NullBufferBuilder::new(8);
+/// let validities = [true, false, true, true];
+/// builder.extend(validities.iter().copied());
+/// assert_eq!(builder.len(), 4);
+/// ```
+pub fn extend<I: Iterator<Item = bool>>(&mut self, iter: I) {
+let (lower, upper) = iter.size_hint();
+let len = upper.expect("Iterator must have exact size_hint");
+debug_assert_eq!(lower, len, "Iterator must have exact size_hint");
+
+if len == 0 {
+return;
+}
+
+// Materialize since we're about to append bits
+self.materialize_if_needed();
+
+let buf = self.bitmap_builder.as_mut().unwrap();
+let start_len = buf.len();
+// Advance to allocate space, initializing new bits to 0
+buf.advance(len);
+
+let slice = buf.as_slice_mut();
+let mut bit_idx = start_len;
+let end_bit = start_len + len;
+
+// Process in chunks of 64 bits when byte-aligned for better performance

Review Comment:
   I just checked - this seems an additional ~30% improvement for null handling:
   
   ```
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.1
                           time:   [2.4060 ms 2.4096 ms 2.4133 ms]
                           change: [−33.920% −32.902% −32.274%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.8
                           time:   [2.1610 ms 2.1666 ms 2.1728 ms]
                           change: [−29.488% −28.499% −27.767%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     4 (4.00%) high mild
     3 (3.00%) high severe
   ```






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-14 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2618331428


##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+/// # use arrow_buffer::NullBufferBuilder;
+/// let mut builder = NullBufferBuilder::new(8);
+/// let validities = [true, false, true, true];
+/// builder.extend(validities.iter().copied());
+/// assert_eq!(builder.len(), 4);
+/// ```
+pub fn extend<I: Iterator<Item = bool>>(&mut self, iter: I) {
+let (lower, upper) = iter.size_hint();
+let len = upper.expect("Iterator must have exact size_hint");
+debug_assert_eq!(lower, len, "Iterator must have exact size_hint");
+
+if len == 0 {
+return;
+}
+
+// Materialize since we're about to append bits
+self.materialize_if_needed();
+
+let buf = self.bitmap_builder.as_mut().unwrap();
+let start_len = buf.len();
+// Advance to allocate space, initializing new bits to 0
+buf.advance(len);
+
+let slice = buf.as_slice_mut();
+let mut bit_idx = start_len;
+let end_bit = start_len + len;
+
+// Process in chunks of 64 bits when byte-aligned for better performance

Review Comment:
   I think this is the same comment as @alamb has?
   
   Yeah perhaps this can improve performance (will see guided by benchmarks).






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-14 Thread via GitHub


mapleFU commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2617124197


##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+/// # use arrow_buffer::NullBufferBuilder;
+/// let mut builder = NullBufferBuilder::new(8);
+/// let validities = [true, false, true, true];
+/// builder.extend(validities.iter().copied());
+/// assert_eq!(builder.len(), 4);
+/// ```
+pub fn extend<I: Iterator<Item = bool>>(&mut self, iter: I) {
+let (lower, upper) = iter.size_hint();
+let len = upper.expect("Iterator must have exact size_hint");
+debug_assert_eq!(lower, len, "Iterator must have exact size_hint");
+
+if len == 0 {
+return;
+}
+
+// Materialize since we're about to append bits
+self.materialize_if_needed();
+
+let buf = self.bitmap_builder.as_mut().unwrap();
+let start_len = buf.len();
+// Advance to allocate space, initializing new bits to 0
+buf.advance(len);
+
+let slice = buf.as_slice_mut();
+let mut bit_idx = start_len;
+let end_bit = start_len + len;
+
+// Process in chunks of 64 bits when byte-aligned for better performance

Review Comment:
   I'm a bit curious why this doesn't have separate parts for unaligned and aligned handling, e.g.:
   ```
   handle_unaligned() // handled start_len % 8 header
   handle_aligned() // handle inner payloads
   handle_unaligned() // handle_trailer
   ```
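   
   For readers following along, the header / aligned body / trailer split described above could look roughly like the sketch below. It is an illustration only, not code from the PR: `extend_bitmap`, `set_bit`, and the plain `Vec<u8>` bitmap are hypothetical stand-ins for the builder internals, and the sketch assumes Arrow's LSB-first bit order and that any unused bits in the last byte are already zero.
   
   ```rust
   /// Appends `bits` to an LSB-first bitmap in `buf` that already holds `len` bits.
   /// Returns the new length in bits. (Hypothetical helper, not the PR's API.)
   fn extend_bitmap(buf: &mut Vec<u8>, len: usize, bits: &[bool]) -> usize {
       let new_len = len + bits.len();
       // Grow the buffer; new bytes start as 0, so only `true` bits need writing.
       buf.resize((new_len + 7) / 8, 0);
   
       let mut pos = len; // next bit position to write
       let mut i = 0; // next index into `bits`
   
       // 1. Unaligned header: single bits until `pos` reaches a byte boundary.
       while pos % 8 != 0 && i < bits.len() {
           set_bit(buf, pos, bits[i]);
           pos += 1;
           i += 1;
       }
   
       // 2. Aligned body: pack 64 bools into a u64 and store it as 8 bytes.
       while bits.len() - i >= 64 {
           let mut word = 0u64;
           for (k, &b) in bits[i..i + 64].iter().enumerate() {
               word |= (b as u64) << k;
           }
           let byte = pos / 8;
           buf[byte..byte + 8].copy_from_slice(&word.to_le_bytes());
           pos += 64;
           i += 64;
       }
   
       // 3. Trailer: the remaining (fewer than 64) bits, one at a time.
       while i < bits.len() {
           set_bit(buf, pos, bits[i]);
           pos += 1;
           i += 1;
       }
       new_len
   }
   
   /// Sets the bit at `pos` if `value` is true; the bit is assumed to be 0 beforehand.
   fn set_bit(buf: &mut [u8], pos: usize, value: bool) {
       if value {
           buf[pos / 8] |= 1 << (pos % 8);
       }
   }
   ```
   
   With this shape the 64-bit path is also reached when the builder starts at a non-byte-aligned length, since the header first advances to the next byte boundary.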






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-13 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3649114672

   > Thanks @Dandandan -- this looks really exciting; I had a few comments.
   > 
   > I suggest we break this PR up into several smaller ones (now that you have proof the benchmarks are working well):
   > 
   > 1. Add `BooleanBufferBuilder::extend` (and tests)
   > 2. Add `BooleanBuffer::find_nth_set_bit_position` (and tests)
   > 3. Add the changes to coalesce
   
   Sounds like a good plan, I'll follow that!





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-13 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2616173444


##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+/// # use arrow_buffer::NullBufferBuilder;
+/// let mut builder = NullBufferBuilder::new(8);
+/// let validities = [true, false, true, true];
+/// builder.extend(validities.iter().copied());
+/// assert_eq!(builder.len(), 4);
+/// ```
+pub fn extend<I: Iterator<Item = bool>>(&mut self, iter: I) {
+let (lower, upper) = iter.size_hint();
+let len = upper.expect("Iterator must have exact size_hint");
+debug_assert_eq!(lower, len, "Iterator must have exact size_hint");
+
+if len == 0 {
+return;
+}
+
+// Materialize since we're about to append bits
+self.materialize_if_needed();
+
+let buf = self.bitmap_builder.as_mut().unwrap();
+let start_len = buf.len();
+// Advance to allocate space, initializing new bits to 0
+buf.advance(len);
+
+let slice = buf.as_slice_mut();
+let mut bit_idx = start_len;
+let end_bit = start_len + len;
+
+// Process in chunks of 64 bits when byte-aligned for better performance
+if start_len % 8 == 0 {
+let start_byte = start_len / 8;
+let mut iter = iter.peekable();
+
+// Process full u64 chunks (64 bits at a time)
+while bit_idx + 64 <= end_bit && iter.peek().is_some() {

Review Comment:
   :+1: 






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-13 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2616173620


##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+/// # use arrow_buffer::NullBufferBuilder;
+/// let mut builder = NullBufferBuilder::new(8);
+/// let validities = [true, false, true, true];
+/// builder.extend(validities.iter().copied());
+/// assert_eq!(builder.len(), 4);
+/// ```
+pub fn extend<I: Iterator<Item = bool>>(&mut self, iter: I) {
+let (lower, upper) = iter.size_hint();

Review Comment:
   Sounds good, I'll do that






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-13 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2616171354


##
arrow-select/src/coalesce.rs:
##
@@ -526,6 +631,118 @@ impl BatchCoalescer {
 }
 }
 
+/// Find the position after the n-th set bit in a boolean array starting from `start`.
+/// Returns the position after the n-th set bit, or the end of the array if fewer than n bits are set.

Review Comment:
   Good idea






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-13 Thread via GitHub


Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2616171034


##
arrow-select/src/coalesce.rs:
##
@@ -571,6 +788,15 @@ trait InProgressArray: std::fmt::Debug + Send + Sync {
 /// Return an error if the source array is not set
 fn copy_rows(&mut self, offset: usize, len: usize) -> Result<(), ArrowError>;
 
+/// Copy rows at the given indices from the current source array into the in-progress array
+fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+// Default implementation: iterate over indices from the filter

Review Comment:
   Correct, for views & bytes array






Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-12 Thread via GitHub


alamb commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2615576833


##
arrow-select/src/coalesce.rs:
##
@@ -526,6 +631,118 @@ impl BatchCoalescer {
 }
 }
 
+/// Find the position after the n-th set bit in a boolean array starting from `start`.
+/// Returns the position after the n-th set bit, or the end of the array if fewer than n bits are set.

Review Comment:
   I recommend we move this code into BooleanBuffer as well so it is easier to find / reuse
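   
   As a rough illustration of the behavior the doc comment describes, a standalone version over a raw LSB-first bitmap might look like the sketch below. The name `find_nth_set_bit_position` is borrowed from the follow-up plan in this thread, but the signature, the `get_bit` helper, the byte-at-a-time popcount skip, and the choice to return `start` for `n == 0` are all assumptions made for this example, not the PR's implementation.
   
   ```rust
   /// Returns true if bit `pos` is set in an LSB-first bitmap. (Hypothetical helper.)
   fn get_bit(bitmap: &[u8], pos: usize) -> bool {
       (bitmap[pos / 8] >> (pos % 8)) & 1 == 1
   }
   
   /// Returns the position just after the n-th set bit at or after `start`,
   /// or `len` if fewer than `n` bits are set in `start..len`.
   fn find_nth_set_bit_position(bitmap: &[u8], len: usize, start: usize, n: usize) -> usize {
       if n == 0 {
           // Assumption: with nothing to find, stay at `start` (clamped to `len`).
           return start.min(len);
       }
       let mut remaining = n;
       let mut pos = start;
   
       // Scan single bits up to the next byte boundary.
       while pos < len && pos % 8 != 0 {
           if get_bit(bitmap, pos) {
               remaining -= 1;
               if remaining == 0 {
                   return pos + 1;
               }
           }
           pos += 1;
       }
   
       // Skip whole bytes with popcount while they cannot contain the n-th set bit.
       while pos + 8 <= len {
           let ones = bitmap[pos / 8].count_ones() as usize;
           if ones >= remaining {
               break;
           }
           remaining -= ones;
           pos += 8;
       }
   
       // Finish bit by bit inside the byte (or tail) that holds the target.
       while pos < len {
           if get_bit(bitmap, pos) {
               remaining -= 1;
               if remaining == 0 {
                   return pos + 1;
               }
           }
           pos += 1;
       }
       len
   }
   ```
   
   A production version would more likely count over 64-bit words; the byte-level loop here only keeps the example short.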



##
arrow-buffer/src/builder/null.rs:
##
@@ -402,4 +481,43 @@ mod tests {
 
 assert_eq!(builder.finish(), None);
 }
+
+#[test]
+fn test_extend() {
+// Test small extend (less than 64 bits)
+let mut builder = NullBufferBuilder::new(0);
+builder.extend([true, false, true, true].iter().copied());
+// bits: 0=true, 1=false, 2=true, 3=true -> 0b1101 = 13
+assert_eq!(builder.as_slice().unwrap(), &[0b1101_u8]);
+
+// Test extend with exactly 64 bits
+let mut builder = NullBufferBuilder::new(0);
+let pattern: Vec<bool> = (0..64).map(|i| i % 2 == 0).collect();
+builder.extend(pattern.iter().copied());
+// Even positions are true: 0, 2, 4, ... -> bits 0, 2, 4, ...
+// In little-endian: 0b01010101 repeated
+assert_eq!(
+builder.as_slice().unwrap(),
+&[0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55]
+);
+
+// Test extend with more than 64 bits (tests chunking)
+let mut builder = NullBufferBuilder::new(0);
+let pattern: Vec<bool> = (0..100).map(|i| i % 3 == 0).collect();
+builder.extend(pattern.iter().copied());
+assert_eq!(builder.len(), 100);
+// Verify a few specific bits
+let buf = builder.finish().unwrap();
+assert!(buf.is_valid(0)); // 0 % 3 == 0
+assert!(!buf.is_valid(1)); // 1 % 3 != 0
+assert!(!buf.is_valid(2)); // 2 % 3 != 0
+assert!(buf.is_valid(3)); // 3 % 3 == 0
+assert!(buf.is_valid(99)); // 99 % 3 == 0
+
+// Test extend with non-aligned start (tests bit-by-bit path)
+let mut builder = NullBufferBuilder::new(0);
+builder.append_non_null(); // Start at bit 1 (non-aligned)
+builder.extend([false, true, false, true].iter().copied());

Review Comment:
   I think we should probably test non aligned writes with more than 64 bits as well (this only copies 4 bits)



##
arrow-select/src/coalesce/primitive.rs:
##
@@ -92,6 +93,145 @@ impl InProgressArray for InProgressPrimitiveArray
 Ok(())
 }
 
+/// Copy rows using a predicate
+fn copy_rows_by_filter(&mut self, filter: &FilterPredicate) -> Result<(), ArrowError> {
+self.ensure_capacity();
+
+let s = self
+.source
+.as_ref()
+.ok_or_else(|| {
+ArrowError::InvalidArgumentError(
+"Internal Error: InProgressPrimitiveArray: source not set".to_string(),
+)
+})?
+.as_primitive::<T>();
+
+let values = s.values();
+let count = filter.count();
+
+// Use the predicate's strategy for optimal iteration
+match filter.strategy() {
+IterationStrategy::SlicesIterator => {
+// Copy values, nulls using slices
+if let Some(nulls) = s.nulls().filter(|n| n.null_count() > 0) {
+for (start, end) in SlicesIterator::new(filter.filter_array()) {
+// SAFETY: slices are derived from filter predicate
+self.current
+.extend_from_slice(unsafe { values.get_unchecked(start..end) });
+let slice = nulls.slice(start, end - start);
+self.nulls.append_buffer(&slice);
+}
+} else {
+for (start, end) in SlicesIterator::new(filter.filter_array()) {
+// SAFETY: SlicesIterator produces valid ranges derived from filter
+self.current
+.extend_from_slice(unsafe { values.get_unchecked(start..end) });
+}
+}
+self.nulls.append_n_non_nulls(count);
+}
+}
+IterationStrategy::Slices(slices) => {

Review Comment:
   I think this function needs some tests
   
   I ran code coverage like this
   
   ```shell
   cargo llvm-cov test --html -p arrow-buffer  -p arrow-select
   ```
   
   And there appears to be no coverage
   https://github.com/user-attachments/assets/0e644657-d723-4f5f-925a-3c4ce86a78bc
   
   



##
arrow-buffer/src/builder/null.rs:
##
@@ -193,6 +193,85 @@ impl NullBufferBuilder {
 }
 }
 
+/// Extends this builder with validity values.
+///
+///
+/// # Example
+/// ```
+   

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3644095246

   Ok, thank you -- I will plan to review this one carefully in the morning





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3644088331

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                      coalesce_batches_filter         main
   -----                                                                      -----------------------         ----
   filter context decimal128 (kept 1/2)                                       1.06    47.0±7.08µs  ? ?/sec    1.00    44.3±1.94µs  ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                1.02    51.2±1.84µs  ? ?/sec    1.00    50.0±2.98µs  ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                    1.00   241.4±2.88ns  ? ?/sec    1.00   240.8±4.26ns  ? ?/sec
   filter context f32 (kept 1/2)                                              1.00    77.9±0.90µs  ? ?/sec    1.00    77.7±0.95µs  ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                       1.02    10.5±0.43µs  ? ?/sec    1.00    10.3±0.44µs  ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                           1.00   442.0±7.36ns  ? ?/sec    1.03   454.6±1.42ns  ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                         1.00    60.6±0.67µs  ? ?/sec    1.00    60.9±0.62µs  ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)  1.00    60.7±0.74µs  ? ?/sec    1.00    60.8±0.52µs  ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)      1.00    60.6±0.30µs  ? ?/sec    1.00    60.8±0.61µs  ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                          1.00    60.7±0.52µs  ? ?/sec    1.01    61.1±0.80µs  ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)   1.00    60.9±1.82µs  ? ?/sec    1.00    60.7±0.56µs  ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)       1.00    60.7±0.43µs  ? ?/sec    1.00    60.6±0.16µs  ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                         1.00    60.8±0.32µs  ? ?/sec    1.02    61.9±5.83µs  ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)  1.00    60.9±1.96µs  ? ?/sec    1.00    60.7±0.69µs  ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)      1.00    60.6±0.42µs  ? ?/sec    1.00    60.7±0.34µs  ? ?/sec
   filter context i32 (kept 1/2)                                              1.02    17.0±0.11µs  ? ?/sec    1.00    16.6±0.09µs  ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                       1.00     6.5±0.45µs  ? ?/sec    1.00     6.6±0.38µs  ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                           1.02   239.3±1.79ns  ? ?/sec    1.00   234.5±2.04ns  ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                      1.01    78.0±0.70µs  ? ?/sec    1.00    77.5±0.39µs  ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)               1.05    10.6±0.52µs  ? ?/sec    1.00    10.1±0.53µs  ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                   1.00  450.4±20.45ns  ? ?/sec    1.03   463.5±3.66ns  ? ?/sec
   filter context mixed string view (kept 1/2)                                1.01   105.2±5.62µs  ? ?/sec    1.00   104.2±4.47µs  ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)         1.00    54.9±1.34µs  ? ?/sec    1.01    55.5±1.30µs  ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)             1.00   663.5±1.51ns  ? ?/sec    1.02   678.2±2.96ns  ? ?/sec
   filter context short string view (kept 1/2)                                1.00   103.8±4.66µs  ? ?/sec    1.01   104.5±5.17µs  ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)         1.00    53.3±1.67µs  ? ?/sec    1.04    55.5±1.91µs  ? ?/sec
   filter context short string view low selectivity (kept 1/1024)             1.00   471.2±8.20ns  ? ?/sec    1.01   476.8±5.12ns  ? ?/sec
   filter context string (kept 1/2)                                           1.00  576.6±11.44µs  ? ?/sec    1.01  584.2±12.53µs  ? ?/sec
   filter context string dictionary (kept 1/2)                                1.00    17.0±0.13µs  ? ?/sec    1.01    17.1±0.11µs  ? ?/sec
   filter context stri

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3644014310

   🤖 `./gh_compare_arrow.sh` [gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5) to ed9efe78e4cc958cc96707557818e754419debb0 [diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5)
   BENCH_NAME=filter_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench filter_kernels
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3644009066

   > Hi @Dandandan -- I am working through the arrow-rs review backlog. coalesce benchmarks look better. Filter kernels look potentially slower. I'll rerun and try to see if we can reproduce the results
   
   I don't think this PR should have any impact on the filter kernels other than the threshold change, but that isn't covered by a benchmark.





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3643949618

   
   run benchmark filter_kernels
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3643908996

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                      coalesce_batches_filter         main
   -----                                                                      -----------------------         ----
   filter context decimal128 (kept 1/2)                                       1.00    45.9±6.33µs  ? ?/sec    1.01    46.4±6.05µs  ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                1.03    51.1±2.22µs  ? ?/sec    1.00    49.4±1.18µs  ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                    1.00   241.1±2.07ns  ? ?/sec    1.01   244.2±1.03ns  ? ?/sec
   filter context f32 (kept 1/2)                                              1.00    78.0±1.94µs  ? ?/sec    1.00    77.9±0.71µs  ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                       1.00    10.4±0.49µs  ? ?/sec    1.00    10.4±0.31µs  ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                           1.00   451.1±1.68ns  ? ?/sec    1.02   458.4±7.82ns  ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                         1.00    60.6±0.28µs  ? ?/sec    1.00    60.6±0.24µs  ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)  1.00    60.8±0.96µs  ? ?/sec    1.00    60.7±0.90µs  ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)      1.00    60.5±0.24µs  ? ?/sec    1.00    60.8±0.81µs  ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                          1.00    61.1±1.47µs  ? ?/sec    1.00    60.8±0.80µs  ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)   1.00    60.8±0.80µs  ? ?/sec    1.00    60.7±0.76µs  ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)       1.00    60.7±0.66µs  ? ?/sec    1.00    61.0±1.31µs  ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                         1.00    60.6±0.93µs  ? ?/sec    1.00    60.8±0.33µs  ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)  1.00    60.9±0.62µs  ? ?/sec    1.00    60.7±0.25µs  ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)      1.00    60.6±0.46µs  ? ?/sec    1.00    60.7±0.60µs  ? ?/sec
   filter context i32 (kept 1/2)                                              1.01    16.6±0.09µs  ? ?/sec    1.00    16.5±0.17µs  ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                       1.00     6.4±0.40µs  ? ?/sec    1.02     6.6±0.40µs  ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                           1.01   238.3±1.05ns  ? ?/sec    1.00   235.9±3.08ns  ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                      1.00    77.7±0.54µs  ? ?/sec    1.00    77.7±1.59µs  ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)               1.00    10.2±0.45µs  ? ?/sec    1.05    10.8±0.41µs  ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                   1.00   454.0±5.90ns  ? ?/sec    1.02   462.1±7.68ns  ? ?/sec
   filter context mixed string view (kept 1/2)                                1.00   104.5±4.74µs  ? ?/sec    1.02   106.5±4.35µs  ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)         1.00    53.9±1.70µs  ? ?/sec    1.04    56.2±1.61µs  ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)             1.00   675.7±7.20ns  ? ?/sec    1.01   684.0±2.67ns  ? ?/sec
   filter context short string view (kept 1/2)                                1.00   104.8±5.30µs  ? ?/sec    1.02   107.0±3.28µs  ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)         1.01    55.5±1.57µs  ? ?/sec    1.00    54.9±1.38µs  ? ?/sec
   filter context short string view low selectivity (kept 1/1024)             1.01   482.5±4.17ns  ? ?/sec    1.00   477.0±1.44ns  ? ?/sec
   filter context string (kept 1/2)                                           1.00  585.3±12.49µs  ? ?/sec    1.03  603.2±17.73µs  ? ?/sec
   filter context string dictionary (kept 1/2)                                1.01    17.4±0.13µs  ? ?/sec    1.00    17.2±0.16µs  ? ?/sec
   filter context stri

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3643825662

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5)
   BENCH_NAME=filter_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench filter_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3643666446

   run benchmark filter_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-11 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3643665704

   Hi @Dandandan  -- I am working through the arrow-rs review backlog. The coalesce 
benchmarks look better, but the filter kernels look potentially slower. I'll rerun them 
and see whether the slowdown reproduces.





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-09 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3634291940

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                      coalesce_batches_filter         main
   -----                                                                      -----------------------         ----
   filter context decimal128 (kept 1/2)                                       1.00    44.1±1.01µs  ? ?/sec    1.02    45.1±1.75µs  ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                1.00    50.7±0.96µs  ? ?/sec    1.00    50.9±1.35µs  ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                    1.00   240.2±4.97ns  ? ?/sec    1.04  248.8±27.57ns  ? ?/sec
   filter context f32 (kept 1/2)                                              1.00    77.8±0.69µs  ? ?/sec    1.00    77.7±0.42µs  ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                       1.00    10.5±0.26µs  ? ?/sec    1.00    10.5±0.32µs  ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                           1.00   442.5±1.33ns  ? ?/sec    1.02   453.5±4.38ns  ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                         1.00    60.7±0.53µs  ? ?/sec    1.00    60.8±0.68µs  ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)  1.00    61.0±2.62µs  ? ?/sec    1.00    60.8±0.82µs  ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)      1.00    60.7±0.52µs  ? ?/sec    1.00    60.8±0.44µs  ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                          1.00    60.7±0.58µs  ? ?/sec    1.00    60.8±0.43µs  ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)   1.00    60.8±0.62µs  ? ?/sec    1.00    60.7±0.34µs  ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)       1.00    60.7±0.86µs  ? ?/sec    1.00    60.9±2.06µs  ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                         1.00    61.0±2.13µs  ? ?/sec    1.00    61.2±1.53µs  ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)  1.00    60.7±0.48µs  ? ?/sec    1.00    60.9±1.44µs  ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)      1.00    60.6±0.19µs  ? ?/sec    1.00    60.7±0.21µs  ? ?/sec
   filter context i32 (kept 1/2)                                              1.01    16.7±0.08µs  ? ?/sec    1.00    16.4±0.25µs  ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                       1.00     6.8±0.20µs  ? ?/sec    1.00     6.8±0.21µs  ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                           1.00   233.0±2.05ns  ? ?/sec    1.00   233.4±2.00ns  ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                      1.00    78.0±1.49µs  ? ?/sec    1.00    77.7±0.48µs  ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)               1.07    11.0±0.24µs  ? ?/sec    1.00    10.3±0.34µs  ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                   1.00   444.5±4.31ns  ? ?/sec    1.03   457.1±6.10ns  ? ?/sec
   filter context mixed string view (kept 1/2)                                1.00   103.8±1.23µs  ? ?/sec    1.02   105.9±4.60µs  ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)         1.00    53.7±1.68µs  ? ?/sec    1.07    57.7±1.29µs  ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)             1.00   627.1±2.50ns  ? ?/sec    1.05   658.2±2.46ns  ? ?/sec
   filter context short string view (kept 1/2)                                1.00   106.2±3.93µs  ? ?/sec    1.00   105.9±5.09µs  ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)         1.00    54.8±1.60µs  ? ?/sec    1.01    55.3±0.75µs  ? ?/sec
   filter context short string view low selectivity (kept 1/1024)             1.00   464.9±5.81ns  ? ?/sec    1.03  479.0±25.84ns  ? ?/sec
   filter context string (kept 1/2)                                           1.00  644.2±18.04µs  ? ?/sec    1.00  641.2±18.57µs  ? ?/sec
   filter context string dictionary (kept 1/2)                                1.00    17.0±0.14µs  ? ?/sec    1.03    17.5±0.29µs  ? ?/sec
   filter context stri

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-09 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3634206105

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..bb025cf4db6c36e1f2144ec87a29f8374bd1d6c5)
   BENCH_NAME=filter_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench filter_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-09 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3634205957

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                              coalesce_batches_filter         main
   -----                                                                              -----------------------         ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00   277.1±1.94ms  ? ?/sec    1.00   273.9±2.22ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.04     9.6±0.29ms  ? ?/sec    1.00     9.2±0.32ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.02     4.4±0.13ms  ? ?/sec    1.00     4.3±0.10ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00     3.5±0.05ms  ? ?/sec    1.06     3.7±0.05ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00   261.6±1.88ms  ? ?/sec    1.28   333.6±2.60ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.00    10.0±0.36ms  ? ?/sec    1.00    10.1±0.43ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.05     4.9±0.05ms  ? ?/sec    1.00     4.6±0.09ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00     3.8±0.04ms  ? ?/sec    1.25     4.8±0.06ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.04    66.0±1.43ms  ? ?/sec    1.00    63.2±1.47ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00    11.9±0.17ms  ? ?/sec    1.01    12.1±0.21ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.01    10.6±0.47ms  ? ?/sec    1.00    10.5±0.26ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.00     9.9±0.21ms  ? ?/sec    1.34    13.2±0.38ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.01    73.2±1.24ms  ? ?/sec    1.00    72.3±0.90ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.03    13.5±0.19ms  ? ?/sec    1.00    13.1±0.16ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.06    11.2±0.40ms  ? ?/sec    1.00    10.6±0.32ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00    10.2±0.27ms  ? ?/sec    1.14    11.6±0.46ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.01    49.7±0.34ms  ? ?/sec    1.00    49.0±1.09ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.01     6.2±0.06ms  ? ?/sec    1.00     6.1±0.20ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.04     5.0±0.20ms  ? ?/sec    1.00     4.8±0.13ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00     3.1±0.11ms  ? ?/sec    1.15     3.6±0.13ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.03    60.2±1.30ms  ? ?/sec    1.00    58.6±0.81ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00     8.2±0.11ms  ? ?/sec    1.00     8.2±0.09ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.10     6.2±0.16ms  ? ?/sec    1.00     5.6±0.13ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00     2.3±0.04ms  ? ?/sec    1.74     4.0±0.06ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.02    43.6±0.62ms  ? ?/sec    1.00    42.9±0.57ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.01     4.8±0.05ms  ? ?/sec    1.00     4.8±0.10ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.09     2.6±0.20ms  ? ?/sec    1.00     2.4±0.09ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 
1.00  1181.2±18.7

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-09 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3634120510

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (1a7d04b97d59bae4cddeab847f7bc3ce8b77) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..1a7d04b97d59bae4cddeab847f7bc3ce8b77)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-09 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3634119941

   
   run benchmark coalesce_kernels filter_kernels
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3629135486

   on my list





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627509541

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                      coalesce_batches_filter         main
   -----                                                                      -----------------------         ----
   filter context decimal128 (kept 1/2)                                       1.12    48.0±6.74µs  ? ?/sec    1.00    42.7±2.30µs  ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                1.04    51.8±1.43µs  ? ?/sec    1.00    49.8±1.98µs  ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                    1.00   246.6±2.17ns  ? ?/sec    1.02   250.7±8.70ns  ? ?/sec
   filter context f32 (kept 1/2)                                              1.00    77.6±0.34µs  ? ?/sec    1.00    77.7±1.04µs  ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                       1.00     9.6±0.30µs  ? ?/sec    1.04    10.0±0.35µs  ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                           1.00   453.9±8.60ns  ? ?/sec    1.03   465.9±1.78ns  ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                         1.00    60.7±0.44µs  ? ?/sec    1.00    60.6±0.23µs  ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)  1.00    60.8±1.16µs  ? ?/sec    1.00    60.7±0.47µs  ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)      1.00    60.6±0.25µs  ? ?/sec    1.00    60.6±0.22µs  ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                          1.00    60.9±2.04µs  ? ?/sec    1.00    61.0±1.50µs  ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)   1.00    60.8±1.38µs  ? ?/sec    1.00    60.6±0.39µs  ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)       1.00    60.5±0.17µs  ? ?/sec    1.00    60.7±0.37µs  ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                         1.00    60.8±1.05µs  ? ?/sec    1.00    60.6±0.41µs  ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)  1.00    60.6±0.30µs  ? ?/sec    1.00    60.7±0.78µs  ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)      1.00    60.7±0.54µs  ? ?/sec    1.00    60.7±0.50µs  ? ?/sec
   filter context i32 (kept 1/2)                                              1.00    16.6±0.15µs  ? ?/sec    1.00    16.5±0.11µs  ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                       1.03     6.4±0.26µs  ? ?/sec    1.00     6.2±0.27µs  ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                           1.00   236.9±0.69ns  ? ?/sec    1.02   240.5±1.53ns  ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                      1.00    77.8±1.01µs  ? ?/sec    1.00    77.5±0.35µs  ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)               1.01    10.4±0.28µs  ? ?/sec    1.00    10.3±0.50µs  ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                   1.00   445.7±1.90ns  ? ?/sec    1.06   470.6±0.97ns  ? ?/sec
   filter context mixed string view (kept 1/2)                                1.02   103.3±2.61µs  ? ?/sec    1.00   101.6±1.87µs  ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)         1.01    55.4±1.79µs  ? ?/sec    1.00    55.0±1.21µs  ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)             1.01   673.2±3.53ns  ? ?/sec    1.00   668.5±7.34ns  ? ?/sec
   filter context short string view (kept 1/2)                                1.00   102.9±3.52µs  ? ?/sec    1.02   104.8±4.15µs  ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)         1.00    53.4±1.23µs  ? ?/sec    1.01    54.0±1.40µs  ? ?/sec
   filter context short string view low selectivity (kept 1/1024)             1.00  470.4±10.21ns  ? ?/sec    1.02   479.2±3.80ns  ? ?/sec
   filter context string (kept 1/2)                                           1.00  571.6±11.64µs  ? ?/sec    1.01   575.3±6.77µs  ? ?/sec
   filter context string dictionary (kept 1/2)                                1.00    17.3±0.11µs  ? ?/sec    1.01    17.5±0.15µs  ? ?/sec
   filter context stri

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627485469

   > 🤖: Benchmark completed
   > 
   > Details
   > 
   > ```
   > group                                                                              coalesce_batches_filter         main
   > -----                                                                              -----------------------         ----
   > filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00   263.9±2.33ms  ? ?/sec    1.00   263.2±2.58ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.00     8.5±0.08ms  ? ?/sec    1.00     8.5±0.11ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.01     4.1±0.10ms  ? ?/sec    1.00     4.0±0.05ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00     3.3±0.06ms  ? ?/sec    1.07     3.5±0.05ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00   248.2±1.98ms  ? ?/sec    1.28   318.8±4.18ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.00     9.4±0.10ms  ? ?/sec    1.00     9.4±0.10ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.01     4.5±0.10ms  ? ?/sec    1.00     4.5±0.06ms  ? ?/sec
   > filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00     3.7±0.05ms  ? ?/sec    1.26     4.6±0.06ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.00    59.1±0.55ms  ? ?/sec    1.00    59.3±0.77ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00    11.6±0.15ms  ? ?/sec    1.00    11.6±0.11ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.01     9.4±0.14ms  ? ?/sec    1.00     9.3±0.15ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.00     8.1±0.17ms  ? ?/sec    1.34    10.8±0.28ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.00    69.6±0.48ms  ? ?/sec    1.02    70.8±1.08ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.00    12.7±0.10ms  ? ?/sec    1.02    13.0±0.18ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.00     9.8±0.20ms  ? ?/sec    1.00     9.8±0.22ms  ? ?/sec
   > filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00     8.6±0.21ms  ? ?/sec    1.18    10.1±0.28ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.00    48.6±0.43ms  ? ?/sec    1.00    48.7±0.30ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.01     6.0±0.08ms  ? ?/sec    1.00     6.0±0.04ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.02     4.6±0.12ms  ? ?/sec    1.00     4.5±0.06ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00     2.6±0.03ms  ? ?/sec    1.17     3.1±0.05ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.00    58.4±0.76ms  ? ?/sec    1.00    58.5±0.46ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00     7.9±0.12ms  ? ?/sec    1.02     8.1±0.10ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.00     5.5±0.12ms  ? ?/sec    1.02     5.7±0.16ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00     2.2±0.01ms  ? ?/sec    1.75     3.9±0.04ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.00    42.5±0.61ms  ? ?/sec    1.00    42.7±0.65ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.00     4.7±0.10ms  ? ?/sec    1.01     4.7±0.04ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.01     2.3±0.07ms  ? ?/sec    1.00     2.3±0.04ms  ? ?/sec
   > filter: mixed_utf8view (max_string_len=20),

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627439628

   > Andrew Lamb Weekly-ish Open Source plan - 2025-12-08 
apache/datafusion#19210
   
   Nice, so this confirms that the higher threshold is also a speedup for the 
filter kernels.





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627376515

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                              coalesce_batches_filter         main
   -----                                                                              -----------------------         ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                             1.00   263.9±2.33ms  ? ?/sec    1.00   263.2±2.58ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                              1.00     8.5±0.08ms  ? ?/sec    1.00     8.5±0.11ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                               1.01     4.1±0.10ms  ? ?/sec    1.00     4.0±0.05ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                               1.00     3.3±0.06ms  ? ?/sec    1.07     3.5±0.05ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                           1.00   248.2±1.98ms  ? ?/sec    1.28   318.8±4.18ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                            1.00     9.4±0.10ms  ? ?/sec    1.00     9.4±0.10ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                             1.01     4.5±0.10ms  ? ?/sec    1.00     4.5±0.06ms  ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                             1.00     3.7±0.05ms  ? ?/sec    1.26     4.6±0.06ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                             1.00    59.1±0.55ms  ? ?/sec    1.00    59.3±0.77ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                              1.00    11.6±0.15ms  ? ?/sec    1.00    11.6±0.11ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                               1.01     9.4±0.14ms  ? ?/sec    1.00     9.3±0.15ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                               1.00     8.1±0.17ms  ? ?/sec    1.34    10.8±0.28ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                           1.00    69.6±0.48ms  ? ?/sec    1.02    70.8±1.08ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                            1.00    12.7±0.10ms  ? ?/sec    1.02    13.0±0.18ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                             1.00     9.8±0.20ms  ? ?/sec    1.00     9.8±0.22ms  ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                             1.00     8.6±0.21ms  ? ?/sec    1.18    10.1±0.28ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001    1.00    48.6±0.43ms  ? ?/sec    1.00    48.7±0.30ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01     1.01     6.0±0.08ms  ? ?/sec    1.00     6.0±0.04ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1      1.02     4.6±0.12ms  ? ?/sec    1.00     4.5±0.06ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8      1.00     2.6±0.03ms  ? ?/sec    1.17     3.1±0.05ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001  1.00    58.4±0.76ms  ? ?/sec    1.00    58.5±0.46ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01   1.00     7.9±0.12ms  ? ?/sec    1.02     8.1±0.10ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1    1.00     5.5±0.12ms  ? ?/sec    1.02     5.7±0.16ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8    1.00     2.2±0.01ms  ? ?/sec    1.75     3.9±0.04ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001     1.00    42.5±0.61ms  ? ?/sec    1.00    42.7±0.65ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01      1.00     4.7±0.10ms  ? ?/sec    1.01     4.7±0.04ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1       1.01     2.3±0.07ms  ? ?/sec    1.00     2.3±0.04ms  ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8 
1.00  1137.8±16.6

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627376800

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (d2b5d296471710319ecc8fa0d0662ea466f5ab80) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..d2b5d296471710319ecc8fa0d0662ea466f5ab80)
   BENCH_NAME=filter_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench filter_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3627282606

   🤖 `./gh_compare_arrow.sh` 
[gh_compare_arrow.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh)
 Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (d2b5d296471710319ecc8fa0d0662ea466f5ab80) 
to ed9efe78e4cc958cc96707557818e754419debb0 
[diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..d2b5d296471710319ecc8fa0d0662ea466f5ab80)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental 
--bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-08 Thread via GitHub


rluvaton commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3626040859

   @Dandandan, it looks like the benchmark machines are stuck or something: they 
are still running the aggregate vectorized benchmark I started more than 10h ago
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-07 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3625209650

   run benchmark coalesce_kernels filter_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-06 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3619763588

   It is now faster in all cases on my machine 🚀 





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-06 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3619751276

   run benchmark coalesce_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-06 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3619751135

   @alamb it seems we can play with the filter threshold value. A value of 
>=0.9 will probably give nice speedups, and we might even go further based on 
some benchmark results.
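
   To make the threshold idea concrete, here is a minimal, std-only sketch of a selectivity threshold switching between two copy strategies. Everything in it is hypothetical and for illustration only: `FilterStrategy`, `selectivity`, `choose_strategy` and `filter_i32` are not arrow-rs names, and the real kernels work on Arrow buffers rather than `&[bool]` masks.

```rust
/// Hypothetical strategies, for illustration only (not arrow-rs types).
#[derive(Debug, PartialEq)]
enum FilterStrategy {
    /// Copy contiguous runs of kept rows with a slice copy.
    CopyRuns,
    /// Visit rows one at a time and push the kept ones.
    GatherIndices,
}

/// Fraction of rows kept by the mask (0.0 for an empty mask).
fn selectivity(mask: &[bool]) -> f64 {
    if mask.is_empty() {
        return 0.0;
    }
    mask.iter().filter(|&&keep| keep).count() as f64 / mask.len() as f64
}

/// Pick a strategy: at or above `threshold` most rows survive, so long
/// runs are likely and bulk copies should pay off.
fn choose_strategy(mask: &[bool], threshold: f64) -> FilterStrategy {
    if selectivity(mask) >= threshold {
        FilterStrategy::CopyRuns
    } else {
        FilterStrategy::GatherIndices
    }
}

/// Filter `values` by `mask`; both strategies produce the same output.
fn filter_i32(values: &[i32], mask: &[bool], threshold: f64) -> Vec<i32> {
    assert_eq!(values.len(), mask.len());
    let mut out = Vec::new();
    match choose_strategy(mask, threshold) {
        FilterStrategy::CopyRuns => {
            let mut i = 0;
            while i < mask.len() {
                if mask[i] {
                    // Extend the run as far as the mask stays true.
                    let start = i;
                    while i < mask.len() && mask[i] {
                        i += 1;
                    }
                    out.extend_from_slice(&values[start..i]);
                } else {
                    i += 1;
                }
            }
        }
        FilterStrategy::GatherIndices => {
            for (value, &keep) in values.iter().zip(mask) {
                if keep {
                    out.push(*value);
                }
            }
        }
    }
    out
}
```

   With `threshold = 0.9`, a mask keeping 3 of 4 rows (selectivity 0.75) takes the gather path, while a mask keeping every row takes the run-copy path. Which cutoff is actually best is exactly what the benchmarks have to tell us.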





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-06 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3619747417

   run benchmark coalesce_kernels filter_kernels





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-06 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3619745121

   run benchmarks





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617337060

   > Pretty good I would say... probably have to look a bit more at the 
null-handling speed
   
   I feel there is a bunch of null handling performance to be had via work in
   - #8561 
   
   I'll try and review this PR more carefully later today





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617296926

   ```
   filter: primitive, 8192, nulls: 0, selectivity: 0.001        1.00   54.1±1.62ms        ? ?/sec    1.80   97.2±0.21ms        ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.01         1.00    5.9±0.03ms        ? ?/sec    1.57    9.3±0.02ms        ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.1          1.00    3.2±0.09ms        ? ?/sec    1.17    3.7±0.05ms        ? ?/sec
   filter: primitive, 8192, nulls: 0, selectivity: 0.8          1.00    2.7±0.01ms        ? ?/sec    1.14    3.1±0.02ms        ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.001      1.00   60.9±0.09ms        ? ?/sec    2.06  125.4±0.26ms        ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.01       1.00   10.9±0.04ms        ? ?/sec    1.38   15.1±0.06ms        ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.1        1.16    8.6±0.20ms        ? ?/sec    1.00    7.4±0.36ms        ? ?/sec
   filter: primitive, 8192, nulls: 0.1, selectivity: 0.8        1.62   14.7±0.04ms        ? ?/sec    1.00    9.1±0.04ms        ? ?/sec
   ```
   
   Pretty good I would say... probably have to look a bit more at the null-handling speed
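
The ratio columns in the table above can be sanity-checked by hand. The following is a minimal sketch, assuming each ratio is the row's mean time divided by the fastest mean time in that row, which is consistent with the numbers shown. The function name `relative_ratios` is hypothetical and is not part of the benchmark tooling or of arrow-rs.

```rust
/// Hypothetical helper (not part of any benchmark tool): given the mean
/// times of one benchmark row, returns each time divided by the fastest
/// time in that row, so the fastest entry gets a ratio of exactly 1.0.
fn relative_ratios(times_ms: &[f64]) -> Vec<f64> {
    // Find the smallest mean time in the row.
    let fastest = times_ms.iter().copied().fold(f64::INFINITY, f64::min);
    // Express every entry relative to that fastest time.
    times_ms.iter().map(|t| t / fastest).collect()
}
```

For the first row, `relative_ratios(&[54.1, 97.2])` gives roughly `[1.0, 1.797]`, which rounds to the `1.00` and `1.80` shown. For the last row, `relative_ratios(&[14.7, 9.1])` gives roughly `[1.615, 1.0]`, matching `1.62` and `1.00`: that is the one case in this table where the first column is the slower one.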





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617254346

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group                                                                                coalesce_batches_filter              main
   -----                                                                                -----------------------              ----
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.001                               1.01  261.9±3.23ms        ? ?/sec    1.00  259.4±2.06ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.01                                1.00    8.6±0.14ms        ? ?/sec    1.01    8.7±0.10ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.1                                 1.00    4.1±0.06ms        ? ?/sec    1.01    4.1±0.09ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0, selectivity: 0.8                                 1.00    3.5±0.01ms        ? ?/sec    1.02    3.5±0.02ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.001                             1.00  245.6±2.39ms        ? ?/sec    1.27  312.5±3.08ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.01                              1.01    9.4±0.09ms        ? ?/sec    1.00    9.4±0.07ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.1                               1.00    4.5±0.08ms        ? ?/sec    1.02    4.6±0.08ms        ? ?/sec
   filter: mixed_dict, 8192, nulls: 0.1, selectivity: 0.8                               1.00    4.6±0.03ms        ? ?/sec    1.01    4.6±0.02ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.001                               1.01   59.6±1.58ms        ? ?/sec    1.00   59.2±0.34ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.01                                1.00   11.6±0.18ms        ? ?/sec    1.00   11.6±0.18ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.1                                 1.01    9.3±0.18ms        ? ?/sec    1.00    9.2±0.09ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0, selectivity: 0.8                                 1.00    8.2±0.22ms        ? ?/sec    1.28   10.4±0.24ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.001                             1.01   70.3±0.25ms        ? ?/sec    1.00   69.9±0.25ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.01                              1.01   12.9±0.14ms        ? ?/sec    1.00   12.8±0.06ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.1                               1.00    9.8±0.05ms        ? ?/sec    1.06   10.4±0.16ms        ? ?/sec
   filter: mixed_utf8, 8192, nulls: 0.1, selectivity: 0.8                               1.00   10.0±0.25ms        ? ?/sec    1.02   10.1±0.20ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001      1.05   50.7±0.30ms        ? ?/sec    1.00   48.1±0.17ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01       1.03    6.2±0.06ms        ? ?/sec    1.00    6.0±0.05ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1        1.00    4.5±0.11ms        ? ?/sec    1.00    4.5±0.15ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8        1.02    3.1±0.03ms        ? ?/sec    1.00    3.0±0.02ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.001    1.04   60.3±0.24ms        ? ?/sec    1.00   58.1±0.25ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.01     1.03    8.2±0.03ms        ? ?/sec    1.00    7.9±0.03ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.1      1.00    5.6±0.13ms        ? ?/sec    1.07    6.0±0.11ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0.1, selectivity: 0.8      1.00    3.9±0.02ms        ? ?/sec    1.01    3.9±0.01ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.001       1.03   43.5±0.56ms        ? ?/sec    1.00   42.5±0.09ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01        1.05    4.9±0.22ms        ? ?/sec    1.00    4.7±0.01ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.1         1.03    2.4±0.05ms        ? ?/sec    1.00    2.3±0.04ms        ? ?/sec
   filter: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.8         1.00  1466.4±10.3

Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


alamb-ghbot commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617181714

   🤖 `./gh_compare_arrow.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_arrow.sh) Running
   Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing coalesce_batches_filter (dcf48642209f21659775a8ca17957441d54778bd) to ed9efe78e4cc958cc96707557818e754419debb0 [diff](https://github.com/apache/arrow-rs/compare/ed9efe78e4cc958cc96707557818e754419debb0..dcf48642209f21659775a8ca17957441d54778bd)
   BENCH_NAME=coalesce_kernels
   BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench coalesce_kernels 
   BENCH_FILTER=
   BENCH_BRANCH_NAME=coalesce_batches_filter
   Results will be posted here when complete
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617181393

   run benchmark coalesce_kernels
   
   





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-05 Thread via GitHub


alamb commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3617181011

   
   > run benchmark coalesce_kernels
   
   I added this to the allowed benchmarks





Re: [PR] Make `push_batch_with_filter` up to 3x faster for primitive types [arrow-rs]

2025-12-04 Thread via GitHub


Dandandan commented on PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#issuecomment-3614262486

   @alamb I think it's ok now - I called AI (Opus 4.5) for some help on the `find_nth_set_bit_position` function.
   
   It mainly needs some polish, and we should see if we can improve the `filter: primitive, 8192, nulls: 0.1, selectivity: 0.8` case.
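
For readers unfamiliar with the idea behind a function like `find_nth_set_bit_position`, here is a minimal sketch of one way to locate the n-th set bit in a bitmap. It is an illustration only and is not the code from this PR: the names `nth_set_bit_in_word` and `nth_set_bit_position`, the `&[u64]` bitmap layout, and the 0-based indexing are all assumptions made for the example.

```rust
/// Hypothetical helper: returns the index (0 = least significant bit) of the
/// `n`-th set bit in `word`, with `n` counted from 0, or `None` if `word`
/// has `n` or fewer set bits.
fn nth_set_bit_in_word(mut word: u64, n: u32) -> Option<u32> {
    if n >= word.count_ones() {
        return None;
    }
    // Clear the lowest set bit `n` times; `word` stays non-zero because
    // it has more than `n` set bits, so `word - 1` cannot underflow.
    for _ in 0..n {
        word &= word - 1;
    }
    // The lowest remaining set bit is the one we are looking for.
    Some(word.trailing_zeros())
}

/// Hypothetical helper: scans a bitmap stored as 64-bit words (bit 0 of
/// word 0 is position 0) and returns the position of the `n`-th set bit,
/// skipping whole words by their popcount.
fn nth_set_bit_position(words: &[u64], mut n: usize) -> Option<usize> {
    for (i, &word) in words.iter().enumerate() {
        let ones = word.count_ones() as usize;
        if n < ones {
            // The target bit lives in this word.
            let bit = nth_set_bit_in_word(word, n as u32)?;
            return Some(i * 64 + bit as usize);
        }
        // Not in this word: discount its set bits and move on.
        n -= ones;
    }
    None
}
```

Skipping whole words by popcount keeps the scan cheap when the target bit is far into the bitmap; only the final word is inspected bit by bit. A tuned implementation might replace the inner loop with a bit-deposit instruction or a lookup table, but that is beyond what this sketch attempts.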

