Re: [I] [Python] pyarrow.array should special-case array.array objects [arrow]

via GitHub Wed, 11 Oct 2023 02:34:59 -0700


jorisvandenbossche commented on issue #38137:
URL: https://github.com/apache/arrow/issues/38137#issuecomment-1757264027


   The `pa.array(..)` function indeed currently has dedicated support only for 
numpy.ndarray and pandas array-likes (the latter is essentially duck typing on 
the `.dtype` attribute, followed by a conversion to a numpy ndarray). All other 
objects are treated as general sequences and converted as such, by iterating 
over the elements.
   
   We should probably expand that to support any buffer-like object as well, 
and in that case take the same fast code path that is used for ndarrays.
   
   A manual workaround for now is to create a Buffer object (which does support 
objects that implement the buffer protocol, and thus supports array.array 
objects), and then create an Array from that Buffer with the low-level 
`from_buffers`:
   
   ```
   In [12]: %timeit pa.array(x)
   2.53 ms ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
   
   In [13]: %timeit pa.Array.from_buffers(pa.uint8(), len(x), [None, pa.py_buffer(x)])
   1.54 µs ± 105 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

