alamb opened a new issue, #9085:
URL: https://github.com/apache/arrow-rs/issues/9085
**Describe the bug**
When calling `nullif(a, b)` where `b` has no nulls, the null count of
the returned buffer is sometimes incorrect.
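
For reference, here is a minimal std-only sketch of the semantics I expect from `nullif`, using `Option` values instead of Arrow arrays. The names `nullif_reference` and `null_count` are hypothetical helpers for illustration and are not part of arrow-rs. An output slot is null when `a` is null or when `b` is `Some(true)`; a null in `b` is treated like `false`.

```rust
/// Hypothetical reference model of `nullif` on plain `Option` values:
/// the output is null wherever `b` is `Some(true)`, otherwise it is `a`.
fn nullif_reference(a: &[Option<i32>], b: &[Option<bool>]) -> Vec<Option<i32>> {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(a, b)| match b {
            Some(true) => None, // nulled out by `b`
            _ => *a,            // `b` is false or null: keep `a` (which may itself be null)
        })
        .collect()
}

/// Number of null slots in the (hypothetical) reference output.
fn null_count(values: &[Option<i32>]) -> usize {
    values.iter().filter(|v| v.is_none()).count()
}

fn main() {
    let a = [Some(1), None, Some(3), Some(4)];
    // `b` has no nulls, which is the case this issue is about.
    let b = [Some(true), Some(false), Some(false), Some(true)];
    let out = nullif_reference(&a, &b);
    println!("out = {:?}, null_count = {}", out, null_count(&out));
}
```

The reported null count of the kernel output should match what a model like this computes.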
**To Reproduce**
Add this to the `nullif_fuzz` test:
```diff
@@ -518,11 +546,16 @@ mod tests {
             let b_start_offset = rng.random_range(0..i);
             let b_end_offset = rng.random_range(0..i);
+            // b with 50% nulls
             let b: BooleanArray = (0..a_length + b_start_offset + b_end_offset)
                 .map(|_| rng.random_bool(0.5).then(|| rng.random_bool(0.5)))
                 .collect();
             let b = b.slice(b_start_offset, a_length);
+            test_nullif(&a, &b);
+            // b with no nulls
+            let b = make_array(b.into_data().into_builder().nulls(None).build().unwrap());
+            let b = b.as_boolean().slice(b_start_offset, a_length);
             test_nullif(&a, &b);
         }
```
**Expected behavior**
The test should pass.
**Additional context**
I found this while debugging issues in applying the same pattern in
https://github.com/apache/arrow-rs/pull/8996 to this kernel.
I am pretty sure I introduced this in
https://github.com/apache/arrow-rs/pull/8996
Basically, there is an implicit assumption in the `nullif` kernel that the
`op` will only be called on words whose bits are completely contained
within the offset/length of the input arrays, which is not true when the
result is created using word-aligned `Vec`s.
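
To illustrate the kind of miscount I mean, here is a std-only sketch. It is not the kernel code, and the helper names are hypothetical. It assumes the valid count is accumulated with `count_ones` over whole 64-bit words and the null count is derived as `len - valid_count`. If a word also carries set bits outside the logical offset/length, the unmasked count is too high and the derived null count is too low.

```rust
/// Hypothetical helper: sums popcounts of whole words, ignoring offset/length.
fn valid_count_unmasked(words: &[u64]) -> usize {
    words.iter().map(|w| w.count_ones() as usize).sum()
}

/// Hypothetical helper: counts set bits only inside the logical bit range
/// `[offset, offset + len)`, using LSB-first bit numbering within each word.
fn valid_count_masked(words: &[u64], offset: usize, len: usize) -> usize {
    (offset..offset + len)
        .filter(|&i| (words[i / 64] >> (i % 64)) & 1 == 1)
        .count()
}

fn main() {
    // Bits 0 and 1 are set but lie outside the logical range;
    // bits 3..=6 are set and lie inside it.
    let words = [0b0111_1011u64];
    let (offset, len) = (3usize, 10usize);

    let unmasked = valid_count_unmasked(&words);
    let masked = valid_count_masked(&words, offset, len);

    // Deriving the null count from the unmasked total under-reports nulls.
    println!("null count (unmasked) = {}", len.saturating_sub(unmasked));
    println!("null count (masked)   = {}", len - masked);
}
```

Any fix along these lines would need either to mask the leading and trailing words before counting, or to guarantee that bits outside the range are zero.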