Try to give Arrow the JSON text containing all the records. Working one
record at a time goes against the philosophy of vectorized array processing.

https://arrow.apache.org/docs/python/generated/pyarrow.json.read_json.html

Instead of getting an array of structs, you will get a table where each
key/column is its own array. Nested JSON objects become struct columns
in the table, and JSON arrays become list columns.

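If you really want an array of structs, call .to_struct_array() on the
table. Note that .flatten() goes the other way: it expands struct
columns into one top-level column per field.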

I hope this helps.

--
Felipe

On Mon, Sep 11, 2023 at 2:05 PM Priyam Roy <roypriyam7...@gmail.com> wrote:

> Hi, just checking in if someone knows this?
>
> On Mon, 11 Sept 2023 at 02:35, Priyam Roy <roypriyam7...@gmail.com> wrote:
>
>> I have an array, for example:
>>
>> >>> arr = pa.array(["[{'key': 1}]", None, None])
>>
>> When I do -
>> >>> arr.cast(pa.list_(pa.struct([('key', pa.int64())])))
>>
>> I get this traceback:
>> File "pyarrow/_compute.pyx", line 572, in pyarrow._compute.call_function
>>   File "pyarrow/_compute.pyx", line 367, in pyarrow._compute.Function.call
>>   File "pyarrow/error.pxi", line 144, in
>> pyarrow.lib.pyarrow_internal_check_status
>>   File "pyarrow/error.pxi", line 121, in pyarrow.lib.check_status
>> pyarrow.lib.ArrowNotImplementedError: Unsupported cast from string to
>> list using function cast_list
>>
>> Any idea how can I get it happen please?
>>
>
