AlenkaF commented on a change in pull request #12178:
URL: https://github.com/apache/arrow/pull/12178#discussion_r797358139



##########
File path: python/pyarrow/tests/test_pandas.py
##########
@@ -4082,6 +4082,66 @@ def test_array_to_pandas():
         # tm.assert_series_equal(result, expected)
 
 
+def test_to_pandas_types_mapper():
+    # https://issues.apache.org/jira/browse/ARROW-9664
+    if Version(pd.__version__) < Version("1.0.0"):
+        pytest.skip("ExtensionDtype to_pandas method missing")
+
+    data = pa.array([1, 2, 3], pa.int64())
+
+    # Test with mapper function
+    types_mapper = {pa.int64(): pd.Int64Dtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == types_mapper(data.type)
+
+    # Test mapper function returning None
+    types_mapper = {pa.int64(): None}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == data.type.to_pandas_dtype()
+
+    # Test mapper function not containing the dtype
+    types_mapper = {pa.float64(): pd.Float64Dtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)
+    assert result.dtype == data.type.to_pandas_dtype()
+
+    # Test for the interval extension dtype
+    # -> ignores mapping and uses default conversion
+    types_mapper = {pa.float64(): pd.IntervalDtype()}.get
+    result = data.to_pandas(types_mapper=types_mapper)

Review comment:
       I have thought about this before and am not sure what the best 
approach is. I agree that `types_mapper` doesn't make as much sense for an 
Array as it does for a Table. Something like a `dtype` keyword should do.
   
   If I use the `IntervalDtype` example you mentioned, it works fine 
without a `types_mapper`, so I am not sure what should be done 
here (you can't convert to any other dtype anyway?):
   
   ```
   >>> interval = pd.Series(pd.interval_range(0, 5, 5))
   >>> data = pa.array(interval)
   >>> data.to_pandas()
   0    (0, 1]
   1    (1, 2]
   2    (2, 3]
   3    (3, 4]
   4    (4, 5]
   dtype: interval
   ```
   
   Thinking more generally about `IntervalDtype` or any other pandas extension 
dtype, the only thing I could come up with is to add:
   ```
   if hasattr(types_mapper(dtype), '__from_arrow__'):
       array = types_mapper(dtype).__from_arrow__(array)
   ```
   but that didn't really work because, if I remember correctly, the array is 
already a NumPy array at that point (in `_array_like_to_pandas()`).




