jorisvandenbossche opened a new issue, #42018: URL: https://github.com/apache/arrow/issues/42018
With the upcoming NumPy 2.0 release, NumPy introduces a new variable-length string dtype, [StringDType](https://numpy.org/devdocs/reference/routines.dtypes.html#numpy.dtypes.StringDType) (https://numpy.org/devdocs/release/2.0.0-notes.html#highlights).

Currently, trying to convert an array of this dtype to Arrow raises an error:

```python
>>> import numpy as np
>>> import pyarrow as pa
>>> arr = np.array(["some", "strings"], dtype=np.dtypes.StringDType())
>>> arr
array(['some', 'strings'], dtype=StringDType())
>>> pa.array(arr)
...
ArrowNotImplementedError: Unsupported numpy type 2056
```

Ideally we should support this dtype for both the numpy -> arrow conversion and the arrow -> numpy conversion (although the latter should probably be opt-in and still return object dtype by default).

NumPy provides a C API to access the individual string elements: https://numpy.org/devdocs/reference/c-api/strings.html
