Jefffrey commented on code in PR #9373:
URL: https://github.com/apache/arrow-rs/pull/9373#discussion_r2786664514
##########
arrow-select/src/dictionary.rs:
##########
@@ -275,12 +275,12 @@ pub(crate) fn merge_dictionary_values<K:
ArrowDictionaryKeyType>(
for (value_idx, value) in values {
mapping[value_idx] =
- *interner.intern(value, || match
K::Native::from_usize(indices.len()) {
- Some(idx) => {
- indices.push((dictionary_idx, value_idx));
- Ok(idx)
- }
- None => Err(ArrowError::DictionaryKeyOverflowError),
+ *interner.intern(value, || -> Result<K::Native,
ArrowError> {
Review Comment:
Please ensure you carefully verify LLM output, which it seems this PR is
entirely. This specific code change fixes nothing; it is claimed in the PR body:
> ### Issue 1: `arrow-select/src/dictionary.rs` (line 292)
> The boundary check happens **after** pushing to the indices vector instead
of before:
>
> ```rust
> // Before (buggy):
> K::Native::from_usize(indices.len()) // Checked after push
> ```
Simply looking at the code on main reveals this is completely inaccurate:
https://github.com/apache/arrow-rs/blob/0c1ec0b31d507bf051d9b3eca2642a6f867a3c42/arrow-select/src/dictionary.rs#L278-L284
- We are pushing after we check that `from_usize()` returns `Ok(_)`
And reverting this code change to main and running tests shows no failing
tests.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]