Hello there,
I'm recording a number of entries per column that is known in advance, and
I want to create a Table from these entries. I'm currently using
numpy.empty to pre-allocate the arrays, then creating a Table from them via
the pyarrow.table(data={}) constructor.
It seems a bit silly to create a bunch of NumPy arrays, only to convert
them to Arrow arrays for serialization. Is there any benefit to creating
and populating Arrow arrays directly (e.g. via pyarrow.array()), and if so,
how do I do that? Otherwise, is the recommendation to first create a
DataFrame in pandas (or a number of NumPy arrays, as I'm doing currently),
then convert that to a Table?
What I think I want is a way to create a fixed-size Table consisting of a
number of columns, then set the values in each column one by one (similar
to iloc/iat in pandas). Is this a sensible thing to try to do?
Best,
Jonathan