Adding dev@

This is one purpose of the Arrow C data interface, which was developed
after the __arrow_array__ protocol and is worth investigating:

https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst

On Sat, Sep 12, 2020 at 2:16 PM Marc Garcia <garcia.m...@gmail.com> wrote:
>
> Hi there,
>
> I'm writing a document analyzing different options for a Python dataframe 
> exchange protocol. And I wanted to ask a question regarding the 
> __arrow_array__ protocol.
>
> I checked the code, and it looks like the producer is expected to send an
> Arrow array, and the consumer just receives it. This is the code I'm
> checking; I guess it's the right one:
> https://github.com/apache/arrow/blob/master/python/pyarrow/array.pxi#L110
>
> Compared to the array interface (the NumPy buffer protocol), it works a bit 
> differently. In the NumPy one, the producer exposes the pointer, the size... 
> So, the producer doesn't need to depend on NumPy or any other library, and 
> then the consumer can simply use `numpy.array(obj)` and generate the NumPy 
> array. Or if other implementations support the protocol (not sure if they 
> do), they could call something like `tensorflow.Tensor(obj)`, and NumPy would 
> not be used at all.
>
> Am I understanding the `__arrow_array__` protocol correctly? And if I am, is
> there anything else similar to the NumPy protocol that can be used to
> exchange data without relying on a particular implementation?
>
> Thanks in advance!
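
For reference, the NumPy-style exchange described in the quoted message can be sketched with only the standard library. This is a minimal illustration, not code from either project: the `Producer` class and the `consume` function are hypothetical names, and the sketch assumes a one-dimensional float64 buffer in native byte order. The producer publishes a pointer, shape and type string through `__array_interface__`, and the consumer reads the memory through that pointer without NumPy being imported on either side.

```python
import array
import ctypes
import sys

# Type string for a native-endian 8-byte float, as used by the array interface.
NATIVE_F8 = ("<" if sys.byteorder == "little" else ">") + "f8"


class Producer:
    """Hypothetical data owner that does not depend on NumPy."""

    def __init__(self, values):
        # array.array keeps the values in one contiguous C buffer of doubles.
        self._buf = array.array("d", values)

    @property
    def __array_interface__(self):
        address, length = self._buf.buffer_info()
        return {
            "version": 3,
            "shape": (length,),
            "typestr": NATIVE_F8,
            "data": (address, False),  # (pointer, read_only flag)
        }


def consume(obj):
    """Hypothetical consumer: read the exposed buffer via its pointer.

    The producer must stay alive while the memory is being read.
    """
    iface = obj.__array_interface__
    if iface["typestr"] != NATIVE_F8 or len(iface["shape"]) != 1:
        raise TypeError("this sketch only handles 1-D native float64 data")
    address, _read_only = iface["data"]
    (length,) = iface["shape"]
    # View the producer's memory as a C array of doubles, then copy it out.
    c_view = (ctypes.c_double * length).from_address(address)
    return list(c_view)


producer = Producer([1.0, 2.5, 4.0])
print(consume(producer))  # [1.0, 2.5, 4.0]
```

If NumPy is installed, passing the same `producer` object to `numpy.asarray` should also work, since NumPy looks for the `__array_interface__` attribute. The Arrow C data interface linked above follows the same idea, but with C structs that describe Arrow's columnar layout instead of a Python dictionary.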
