albel727 opened a new issue, #3579: URL: https://github.com/apache/arrow-rs/issues/3579
**Describe the bug** `nullif(left, right)` incorrectly calculates `null_count` for the result array, whenever `left` doesn't have a null_buffer and has `len % 64 == 0`. It can even panic, if there are less than 64 true values in `right`. **To Reproduce** ```rust let n = 64; let left = Int32Array::from((0..n as i32).into_iter().collect::<Vec<_>>()); let right = BooleanArray::from((0..n).into_iter().map(|_| None).collect::<Vec<_>>()); arrow_select::nullif::nullif(&left, &right); ``` **Expected behavior** It works. **Actual behavior** It panics with 'attempt to subtract with overflow' at this line: https://github.com/apache/arrow-rs/blob/796b670338ce33806a39777ea18cf6fae8fa7ee4/arrow-select/src/nullif.rs#L107 **Additional context** The problem evaded fixes #3034 and #3263, which were incomplete. The wrong calculation occurs in these lines: https://github.com/apache/arrow-rs/blob/796b670338ce33806a39777ea18cf6fae8fa7ee4/arrow-select/src/nullif.rs#L81-L90 The reason it is wrong is the un-intuitive, if not to say wrong, behavior of `bitwise_unary_op_helper()` and friends. They're calling `op()` **unconditionally** on the remainder bits, even if there are none: https://github.com/apache/arrow-rs/blob/3a90654f4cb98ac7fe278991a6cbc11384664e2e/arrow-buffer/src/buffer/ops.rs#L119-L123 It doesn't make a difference for the produced output buffer, because it is then just extended with slice of 0 bytes, i.e. remains unaffected. But it does matter for FnMut closures with side effects, such as the one found in `nullif`, which, as a result of this behavior, adds an excess of 64 to `valid_count`, when there are no remainder bits, i.e. bit length is a multiple of 64. The quick fix is to do `valid_count -= 64 - remainder_len;` in `nullif()` unconditionally, i.e. just remove the `if remainder_len != 0 {` condition around it. It was clearly written in assumption, that `bitwise_unary_op_helper()` doesn't call `op()` in excess, when there are no remainder bits. A better fix could have been to avoid making excess `op()` calls in `bitwise_unary_op_helper()` and friends, but since they're public, it could lead to bugs in external consumers, which might have come to rely on this weird behavior. In any case, I suggest at least checking whether other usages of `bitwise_unary_op_helper()` and friends in arrow-rs make the same incorrect assumption. Temporarily changing the type of `op` parameter from `FnMut` to `Fn` and looking at the compilation error fallout should point out all suspicious places. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org