tinezivic commented on PR #49797:
URL: https://github.com/apache/arrow/pull/49797#issuecomment-4282403238

   Thank you for the review and for pointing to #49798!
   
   **Regarding the test from #49798:** I have added 
`TestMakeEmptyArrayPreservesDictionaryOrdered` to `array_test.cc` (just amended 
the commit). It verifies that `MakeEmptyArray` with an `ordered=true` 
`DictionaryType` produces an array whose type still has `ordered()==true`. 
Since `MakeEmptyArray` calls `MakeBuilder` internally, this test directly 
exercises the fix path and covers the scenario from #49674.
   
   **Regarding ordering of values:** You are right to flag this. The `ordered` 
flag in Arrow `DictionaryType` is a semantic annotation for comparisons 
(analogous to pandas `Categorical(ordered=True)`), not a guarantee that 
dictionary values are physically sorted. A `DictionaryBuilder` that sees values 
in insertion order may produce a dictionary where the physical ordering of 
values does not match the logical ordering implied by `ordered=true`. This is 
an inherent characteristic of builder-based encoding: if the caller knows the 
complete dictionary in advance and wants a truly sorted encoding, they should 
construct a `DictionaryArray` directly rather than via a builder. Preserving 
the `ordered` flag here is still correct — it prevents silent data loss where a 
type annotation is simply dropped — but the responsibility for ensuring value 
ordering rests with the caller. I can add a note to this effect in the PR 
description if that would be helpful.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to