Hi Yevgeni,
The main Arrow classes (such as Array, ChunkedArray, RecordBatch, Table)
are immutable, so they support multi-threaded use out of the box.
We also have mutable classes (e.g. IO classes, ArrayBuilders, mutable
Buffers...), and those are not thread-safe.
Regards
Antoine.
On 26/09/2019 at 06:03, Yevgeni Litvin wrote:
> Where in the documentation can I find information about the thread-safety
> guarantees of Arrow classes? In particular, is the following usage of
> pyarrow.Table, shown in pseudo-code, thread-safe?
>
>
> arrow_table = pa.Table.from_pandas(df)
>
>
> def other_thread_worker_impl(arrow_table):
>     arrow_table.column('some_column')[row].as_py()
>
>
> run_in_parallel(other_thread_worker_impl, arrow_table)
>
>
> I tried using pandas.DataFrame in the same multi-threaded setup and it
> turned out to be unsafe (https://github.com/pandas-dev/pandas/issues/28439).
>
> Thank you.
>
> - Yevgeni
>