[jira] [Commented] (ARROW-13089) [Python] Allow creating RecordBatch from Python dict

Joris Van den Bossche (Jira) Thu, 17 Jun 2021 05:02:06 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364904#comment-17364904
 ]


Joris Van den Bossche commented on ARROW-13089:
-----------------------------------------------

{{RecordBatch}} is the more fundamental building block: it is defined in the 
specification and used in the IPC format 
(https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc), 
so the Python {{RecordBatch}} class reflects that data structure.

The {{Table}}, on the other hand, is more of a convenience class for handling 
chunked arrays / multiple record batches in a single object. The field arrays 
in a {{RecordBatch}} are single, contiguous arrays, while in a {{Table}} they 
are chunked arrays.

> [Python] Allow creating RecordBatch from Python dict
> ----------------------------------------------------
>
>                 Key: ARROW-13089
>                 URL: https://issues.apache.org/jira/browse/ARROW-13089
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Andrei Nesterov
>            Priority: Major
>              Labels: beginner
>
> There are already {{Table.to_pydict()}} and {{Table.from_pydict()}} methods, 
> but only {{RecordBatch.to_pydict()}}. Perhaps we should also add 
> {{RecordBatch.from_pydict()}} to make the API consistent.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
