[
https://issues.apache.org/jira/browse/ARROW-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364904#comment-17364904
]
Joris Van den Bossche commented on ARROW-13089:
-----------------------------------------------
{{RecordBatch}} is the more fundamental building block: it is defined in the
specification and used in the IPC format
([https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc]),
and so the Python {{RecordBatch}} class reflects that data structure.
The {{Table}}, on the other hand, is more of a convenience class for handling
chunked arrays / multiple record batches in a single object (so the field arrays
in a {{RecordBatch}} are single, contiguous arrays, while in a {{Table}} they are
chunked arrays).
> [Python] Allow creating RecordBatch from Python dict
> ----------------------------------------------------
>
> Key: ARROW-13089
> URL: https://issues.apache.org/jira/browse/ARROW-13089
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Andrei Nesterov
> Priority: Major
> Labels: beginner
>
> There are already {{Table.to_pydict()}} and {{Table.from_pydict()}} methods,
> but only {{RecordBatch.to_pydict()}}. Perhaps we should also add
> {{RecordBatch.from_pydict()}} to make the API consistent.
>