tinezivic commented on PR #49797: URL: https://github.com/apache/arrow/pull/49797#issuecomment-4282403238
Thank you for the review and for pointing to #49798! **Regarding the test from #49798:** I have added `TestMakeEmptyArrayPreservesDictionaryOrdered` to `array_test.cc` (just amended the commit). It verifies that `MakeEmptyArray` with an `ordered=true` `DictionaryType` produces an array whose type still has `ordered()==true`. Since `MakeEmptyArray` calls `MakeBuilder` internally, this test directly exercises the fix path and covers the scenario from #49674. **Regarding ordering of values:** You are right to flag this. The `ordered` flag in Arrow `DictionaryType` is a semantic annotation for comparisons (analogous to pandas `Categorical(ordered=True)`), not a guarantee that dictionary values are physically sorted. A `DictionaryBuilder` that sees values in insertion order may produce a dictionary where the physical ordering of values does not match the logical ordering implied by `ordered=true`. This is an inherent characteristic of builder-based encoding: if the caller knows the complete dictionary in advance and wants a truly sorted encoding, they should construct a `DictionaryArray` directly rather than via a builder. Preserving the `ordered` flag here is still correct — it prevents silent data loss where a type annotation is simply dropped — but the responsibility for ensuring value ordering rests with the caller. I can add a note to this effect in the PR description if that would be helpful. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
