jorisvandenbossche commented on a change in pull request #10101:
URL: https://github.com/apache/arrow/pull/10101#discussion_r620264160
##########
File path: python/pyarrow/tests/test_array.py
##########
@@ -688,6 +688,69 @@ def test_dictionary_from_numpy():
assert d2[i].as_py() == dictionary[indices[i]]
+def test_dictionary_to_numpy():
+    expected = pa.array(
+        ["foo", "bar", None, "foo"]
+    ).to_numpy(zero_copy_only=False)
+    a = pa.DictionaryArray.from_arrays(
+        pa.array([0, 1, None, 0]),
+        pa.array(['foo', 'bar'])
+    )
+    assert (a.to_numpy(zero_copy_only=False) == expected).all()
+
+    with pytest.raises(pa.ArrowInvalid):
+        # to_numpy takes for granted that when zero_copy_only=True
+        # there will be no nulls.
+        # If that changes, nulls handling will have to be updated in to_numpy
+        # as we won't be able to rely anymore on decoding the DictionaryArray
+        # to handle nulls.
Review comment:
I see, I understand it now, but I still find the original comment a bit
confusing. How about something like:
```
# If this would be changed to no longer raise in the future, ensure to
# test the actual result because, currently, to_numpy takes for granted
# that when zero_copy_only=True there will be no nulls (it's the decoding
# of the DictionaryArray that handles the nulls and this is only
# activated with zero_copy_only=False)
```