[
https://issues.apache.org/jira/browse/ARROW-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124048#comment-17124048
]
Krisztian Szucs commented on ARROW-9017:
----------------------------------------
I started to factor out the elementwise conversion code required to convert
single Python objects to an intermediate C representation. I hit a couple of
roadblocks in that conversion code, and some utilities were also missing, such
as the GetScalar that Ben implemented recently.
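As a rough illustration of the factoring described above, here is a pure-Python
model of the idea. It is not the actual C++ code in python_to_arrow.cc, and the
names Intermediate, convert_element, make_scalar and make_array are all
hypothetical. The point is only that a single per-element converter can be
shared by scalar construction and array construction:

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Intermediate:
    """Hypothetical stand-in for the intermediate C representation."""
    type_name: str
    value: Optional[Any]  # None models a null


def convert_element(obj: Any, type_name: str) -> Intermediate:
    """Convert one Python object; only two toy types are modelled here."""
    if obj is None:
        return Intermediate(type_name, None)
    if type_name == "int64":
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(obj, bool) or not isinstance(obj, int):
            raise TypeError(f"expected int, got {type(obj).__name__}")
        if not -2**63 <= obj < 2**63:
            raise OverflowError("value does not fit in int64")
        return Intermediate(type_name, obj)
    if type_name == "string":
        if not isinstance(obj, str):
            raise TypeError(f"expected str, got {type(obj).__name__}")
        return Intermediate(type_name, obj.encode("utf-8"))
    raise NotImplementedError(type_name)


def make_scalar(obj: Any, type_name: str) -> Intermediate:
    # A scalar is a single converted element.
    return convert_element(obj, type_name)


def make_array(objs: List[Any], type_name: str) -> List[Intermediate]:
    # An array reuses the same per-element conversion.
    return [convert_element(o, type_name) for o in objs]
```

With this shape, validation rules such as the overflow check live in one place
and apply equally to scalars and to array elements.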
We also have an outstanding issue with auto-chunking during conversion: for
nested types, a binary/string field gets chunked when the size limit is
reached, but the rest of the fields remain in a single chunk, resulting in a
corrupted nested array.
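The chunking mismatch can be modelled in a few lines of plain Python. This is
not pyarrow code: the chunk_by_size and split_like helpers, the 8-byte limit,
and the field names are assumptions made for the illustration. It models a
struct with a string child and an integer child, where only the string child is
split once a byte limit is exceeded, and then shows one possible way to keep
the children aligned:

```python
def chunk_by_size(values, limit):
    """Split strings into chunks whose total UTF-8 size stays within limit.

    Hypothetical stand-in for auto-chunking of a binary/string field.
    """
    chunks, current, size = [], [], 0
    for v in values:
        n = len(v.encode("utf-8"))
        if current and size + n > limit:
            chunks.append(current)
            current, size = [], 0
        current.append(v)
        size += n
    if current:
        chunks.append(current)
    return chunks


def split_like(values, chunk_lengths):
    """Split values at the same row boundaries as another child's chunks."""
    out, start = [], 0
    for n in chunk_lengths:
        out.append(values[start:start + n])
        start += n
    return out


names = ["aaaa", "bbbb", "cccc"]  # string child of the struct
ids = [1, 2, 3]                   # integer child of the struct

# Only the string child is chunked; the integer child stays in one chunk.
name_chunks = chunk_by_size(names, limit=8)
id_chunks = [ids]

# The children now disagree on chunk layout, so they can no longer be
# zipped chunk-by-chunk into struct chunks.
mismatched = [len(c) for c in name_chunks] != [len(c) for c in id_chunks]

# One possible fix: split every child at the same row boundaries.
aligned_id_chunks = split_like(ids, [len(c) for c in name_chunks])
```

Here name_chunks ends up with two chunks while id_chunks has one, which is the
inconsistent state described above; after split_like, both children share the
same chunk lengths.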
> [Python] Refactor the Scalar classes
> ------------------------------------
>
> Key: ARROW-9017
> URL: https://issues.apache.org/jira/browse/ARROW-9017
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
>
> The situation regarding scalars in Python is currently not optimal.
> We have two different "types" of scalars:
> - {{ArrayValue(Scalar)}} (and subclasses of that for all types): this is
> used when you access a single element of an array (eg {{arr[0]}})
> - {{ScalarValue(Scalar)}} (and subclasses of that for _some_ types): this is
> used when wrapping a C++ scalar into a python scalar, eg when you get back a
> scalar from a reduction like {{arr.sum()}}.
> And while we have two versions of scalars, neither of them can actually
> easily be used as a scalar, as neither can be constructed from a python
> scalar (there is no {{scalar(1)}} function to use when calling a kernel, for
> example).
> I think we should try to unify those scalar classes? (which probably means
> getting rid of the ArrayValue scalar)
> In addition, there is an issue of trying to re-use python scalar <-> arrow
> conversion code, as there is also logic for this in the {{python_to_arrow.cc}}
> code. But this is probably a bigger change. cc [~kszucs]