[
https://issues.apache.org/jira/browse/ARROW-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661459#comment-17661459
]
Rok Mihevc commented on ARROW-4437:
-----------------------------------
This issue has been migrated to [issue
#20996|https://github.com/apache/arrow/issues/20996] on GitHub. Please see the
[migration documentation|https://github.com/apache/arrow/issues/14542] for
further details.
> [Python] Add builder API
> ------------------------
>
> Key: ARROW-4437
> URL: https://issues.apache.org/jira/browse/ARROW-4437
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Environment: Python 3.7.0 pyarrow-0.12.0
> Reporter: Zhuang Tianyi
> Priority: Minor
>
> There is no [Array
> Builder|https://arrow.apache.org/docs/cpp/api/builder.html#_CPPv3N5arrow12ArrayBuilderE]
> API in the Python bindings. When I generate data from a stream, I have to build
> a Python list (high overhead) or a pandas object, then finalize it by calling
> pa.array, which copies the data. It seems that we could build an Array directly
> from two or three pa.ResizableBuffer objects in O(1) time.
> It is possible to maintain these buffers (value buffer, null bitmap, offset
> buffer) manually with the currently exported API, but that is not safe enough.
>
> I found an undocumented StringBuilder API in
> [python/pyarrow/builder.pxi|https://github.com/apache/arrow/blob/master/python/pyarrow/builder.pxi],
> corresponding to
> [https://arrow.apache.org/docs/cpp/api/builder.html#classarrow_1_1_string_builder].
> Will other ArrayBuilder APIs be added to the Python bindings?
>
> ----
> One more thing:
> a BatchBuilder API would be even better, if possible.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)