[jira] [Commented] (ARROW-1664) [Python] Support for xarray.DataArray and xarray.Dataset

Joris Van den Bossche (Jira) Wed, 18 Sep 2019 13:51:38 -0700


    [ https://issues.apache.org/jira/browse/ARROW-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932827#comment-16932827 ]


Joris Van den Bossche commented on ARROW-1664:
----------------------------------------------

In general, xarray Datasets/DataArrays do not necessarily match Arrow's data 
model (e.g. they can have multiple dimensions). Of course, there is a subset 
of cases where an xarray object would map nicely to an Arrow table.
Also, given that xarray uses contiguous (possibly multi-dimensional) NumPy 
arrays while Arrow uses 1D arrays, I am not sure that Arrow is well suited 
to zero-copy serialization of such objects? (Converting to Arrow could be 
zero-copy, but not the other way around?)

So given that, I am not sure pyarrow should necessarily support xarray objects 
specifically. 
We could indeed think about a "table protocol", but for that I think it would 
be nice to have some more practical use cases.


> [Python] Support for xarray.DataArray and xarray.Dataset
> --------------------------------------------------------
>
>                 Key: ARROW-1664
>                 URL: https://issues.apache.org/jira/browse/ARROW-1664
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Mitar
>            Priority: Minor
>
> DataArray and Dataset are efficient in-memory representations for 
> multi-dimensional data. It would be great if one could share them between 
> processes using Arrow.
> http://xarray.pydata.org/en/stable/generated/xarray.DataArray.html#xarray.DataArray
> http://xarray.pydata.org/en/stable/generated/xarray.Dataset.html#xarray.Dataset



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
