[ https://issues.apache.org/jira/browse/ARROW-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche updated ARROW-5271:
-----------------------------------------
    Description: 
Related to ARROW-2428, which covers converting back to an ExtensionArray in 
{{to_pandas}}.

To start supporting the conversion of custom ExtensionArrays (e.g. the nullable 
Int64Dtype in pandas, or the arrow-backed fletcher arrays, ...) to arrow Arrays 
(e.g. in {{pyarrow.array(..)}}), I think it would be good to define an interface 
or hook that external projects can implement and that pyarrow will call if 
available. 
This would allow external projects to define how their objects are converted to 
arrow arrays, without pyarrow itself having to accumulate a lot of special-cased 
code for certain types (like pandas' nullable Int64).

This could be similar to how numpy looks for the {{\_\_array\_\_}} method, so we 
might call it {{\_\_arrow_array\_\_}}.
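
As a rough illustration of the hook idea, here is a minimal pure-Python sketch. 
It is not pyarrow code: {{FakeArrowArray}}, {{to_arrow}} and {{NullableInts}} 
are hypothetical stand-ins for a pyarrow Array, for {{pyarrow.array(..)}} and 
for an external array type, and the no-argument signature of the hook is an 
assumption.

```python
# Sketch only: FakeArrowArray, to_arrow and NullableInts are hypothetical
# stand-ins, not pyarrow or pandas APIs.


class FakeArrowArray:
    """Stand-in for a pyarrow Array: values plus a validity mask."""

    def __init__(self, values, valid):
        self.values = list(values)
        self.valid = list(valid)  # True marks a valid (non-null) slot

    def to_pylist(self):
        return [v if ok else None for v, ok in zip(self.values, self.valid)]


def to_arrow(obj):
    """Stand-in for pyarrow.array(..): call the hook if the object has one."""
    # Look the hook up on the type, as Python does for special methods.
    hook = getattr(type(obj), "__arrow_array__", None)
    if hook is not None:
        result = hook(obj)
        if not isinstance(result, FakeArrowArray):
            raise TypeError("__arrow_array__ must return an arrow array")
        return result
    # Generic fallback for plain sequences: None is treated as missing.
    values = list(obj)
    return FakeArrowArray(values, [v is not None for v in values])


class NullableInts:
    """Toy external array type: integer data plus a separate missing mask."""

    def __init__(self, data, mask):
        self._data = data
        self._mask = mask  # True marks a missing value

    def __arrow_array__(self):
        # The external project decides how its own layout maps to arrow.
        return FakeArrowArray(self._data, [not m for m in self._mask])


arr = to_arrow(NullableInts([1, 0, 3], [False, True, False]))
print(arr.to_pylist())
```

The point of the sketch is that the conversion function contains no code 
specific to {{NullableInts}}; all knowledge of the masked layout lives in the 
external type.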

See also https://github.com/pandas-dev/pandas/issues/20612 for an issue 
discussing this on the pandas side.


> [Python] Interface for converting pandas ExtensionArray / other custom array 
> objects to pyarrow Array
> -----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5271
>                 URL: https://issues.apache.org/jira/browse/ARROW-5271
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>
> Related to ARROW-2428, which covers converting back to an ExtensionArray in 
> {{to_pandas}}.
> To start supporting the conversion of custom ExtensionArrays (e.g. the 
> nullable Int64Dtype in pandas, or the arrow-backed fletcher arrays, ...) to 
> arrow Arrays (e.g. in {{pyarrow.array(..)}}), I think it would be good to 
> define an interface or hook that external projects can implement and that 
> pyarrow will call if available. 
> This would allow external projects to define how their objects are converted 
> to arrow arrays, without pyarrow itself having to accumulate a lot of 
> special-cased code for certain types (like pandas' nullable Int64).
> This could be similar to how numpy looks for the {{\_\_array\_\_}} method, so 
> we might call it {{\_\_arrow_array\_\_}}.
> See also https://github.com/pandas-dev/pandas/issues/20612 for an issue 
> discussing this on the pandas side.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
