kkraus14 commented on issue #38325: URL: https://github.com/apache/arrow/issues/38325#issuecomment-1771680556
The downside of using a separate dunder method is that you run into ecosystem fragmentation and inefficiencies. For example, a similar effect happened with `__array_interface__` (https://numpy.org/doc/stable/reference/arrays.interface.html) and `__cuda_array_interface__` (https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html). Libraries and applications primarily developed against NumPy, despite the `__array_function__` protocol, which led them to think and develop in a CPU-centric way even though they were only using array APIs. GPU libraries then needed to either implicitly copy to CPU to support `__array_interface__` or raise an exception to try to guide the developer in the right direction, neither of which is a good developer UX.

Compare that to dlpack (https://github.com/dmlc/dlpack) and the associated Python protocol surrounding it (https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__dlpack__.html), which has had device support as a primary goal from its inception. Deep learning libraries with multiple device backends, like PyTorch, TensorFlow, JAX, etc., all support dlpack, but they don't support `__array_interface__` or `__cuda_array_interface__` because of those issues.

Similarly, the `__dataframe__` protocol (https://data-apis.org/dataframe-protocol/latest/API.html) has device support as a primary goal as well.
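To make the tradeoff concrete, here is a minimal sketch (the `GpuArray` class and `consume` function are hypothetical, not any real library's API) of the dilemma described above: a device array has no good way to satisfy the CPU-only `__array_interface__`, while the dlpack protocol lets producer and consumer negotiate the device up front via `__dlpack_device__`.

```python
import numpy as np


class GpuArray:
    """Hypothetical device array wrapping a CUDA allocation (illustration only)."""

    def __init__(self, device_ptr, shape, typestr="<f4"):
        self._device_ptr = device_ptr
        self._shape = shape
        self._typestr = typestr

    # CPU-centric protocol: no good option for a GPU library.
    @property
    def __array_interface__(self):
        # Option A: implicitly copy device -> host (silent performance cliff).
        # Option B (shown here): refuse and point the caller at a device-aware protocol.
        raise TypeError(
            "Implicit GPU->CPU conversion is not allowed; "
            "use __dlpack__ or __cuda_array_interface__ instead."
        )

    # GPU-only protocol: works, but excludes CPU-only consumers.
    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,
            "typestr": self._typestr,
            "data": (self._device_ptr, False),
            "version": 3,
        }

    # dlpack: the device is part of the protocol from the start.
    def __dlpack_device__(self):
        # (device_type, device_id); 2 is kDLCUDA in the DLPack spec.
        return (2, 0)

    def __dlpack__(self, stream=None):
        # Would return a PyCapsule wrapping a DLManagedTensor; elided in this sketch.
        raise NotImplementedError("capsule construction elided in this sketch")


def consume(array_like):
    """Hypothetical consumer that checks the device before deciding how to ingest."""
    if hasattr(array_like, "__dlpack_device__"):
        device_type, device_id = array_like.__dlpack_device__()
        if device_type == 1:  # kDLCPU
            return np.from_dlpack(array_like)
        # Otherwise dispatch to a GPU-capable path: no implicit copy, no exception.
        return f"dispatching to GPU backend for device {device_id}"
    # Fall back to the CPU-only path for plain host arrays.
    return np.asarray(array_like)
```

With a single device-aware protocol, the consumer decides explicitly how to handle non-CPU data instead of every producer choosing between a hidden copy and an exception.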
