kkraus14 commented on issue #38325: URL: https://github.com/apache/arrow/issues/38325#issuecomment-1771680556
The downside of using a separate dunder method is that you run into ecosystem fragmentation and inefficiencies. For example, a similar effect happened with `__array_interface__` (https://numpy.org/doc/stable/reference/arrays.interface.html) and `__cuda_array_interface__` (https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html). Libraries and applications primarily developed against NumPy, despite the `__array_function__` protocol, which led them to think and develop in a CPU-centric way even though they were only using array APIs. GPU libraries then needed to either implicitly copy to CPU to support `__array_interface__` or raise an exception to try to guide the developer in the right direction, neither of which is a good developer UX.

Compare that to dlpack (https://github.com/dmlc/dlpack) and the associated Python protocol surrounding it (https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__dlpack__.html), which has had device support as a primary goal from its inception. Deep learning libraries with multiple device backends, like PyTorch, TensorFlow, JAX, etc., all support dlpack, but they don't support `__array_interface__` or `__cuda_array_interface__` because of those issues.

Similarly, the `__dataframe__` protocol (https://data-apis.org/dataframe-protocol/latest/API.html) has device support as a primary goal as well.
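To make the tradeoff concrete, here is a minimal sketch (the `GpuArray` class and `consume` function are hypothetical, not any real library's API) of the dilemma described above: a device array has no good way to satisfy the CPU-only `__array_interface__`, while the dlpack protocol lets producer and consumer negotiate the device up front via `__dlpack_device__`.

```python
import numpy as np


class GpuArray:
    """Hypothetical device array wrapping a CUDA allocation (illustration only)."""

    def __init__(self, device_ptr, shape, typestr="<f4"):
        self._device_ptr = device_ptr
        self._shape = shape
        self._typestr = typestr

    # CPU-centric protocol: no good option for a GPU library.
    @property
    def __array_interface__(self):
        # Option A: implicitly copy device -> host (silent performance cliff).
        # Option B (shown here): refuse and point the caller at a device-aware protocol.
        raise TypeError(
            "Implicit GPU->CPU conversion is not allowed; "
            "use __dlpack__ or __cuda_array_interface__ instead."
        )

    # GPU-only protocol: works, but excludes CPU-only consumers.
    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,
            "typestr": self._typestr,
            "data": (self._device_ptr, False),
            "version": 3,
        }

    # dlpack: the device is part of the protocol from the start.
    def __dlpack_device__(self):
        # (device_type, device_id); 2 is kDLCUDA in the DLPack spec.
        return (2, 0)

    def __dlpack__(self, stream=None):
        # Would return a PyCapsule wrapping a DLManagedTensor; elided in this sketch.
        raise NotImplementedError("capsule construction elided in this sketch")


def consume(array_like):
    """Hypothetical consumer that checks the device before deciding how to ingest."""
    if hasattr(array_like, "__dlpack_device__"):
        device_type, device_id = array_like.__dlpack_device__()
        if device_type == 1:  # kDLCPU
            return np.from_dlpack(array_like)
        # Otherwise dispatch to a GPU-capable path: no implicit copy, no exception.
        return f"dispatching to GPU backend for device {device_id}"
    # Fall back to the CPU-only path for plain host arrays.
    return np.asarray(array_like)
```

With a single device-aware protocol, the consumer decides explicitly how to handle non-CPU data instead of every producer choosing between a hidden copy and an exception.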
