[ https://issues.apache.org/jira/browse/ARROW-229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146125#comment-16146125 ]
Wes McKinney commented on ARROW-229:
------------------------------------
Hi [~ankit5012], would it be OK if I took an initial pass at writing an
implementation of this, and we can discuss further in code review on GitHub?
This is one of the most complex issues currently in the Arrow JIRA, as it
touches most aspects of the C++ library: memory management, array containers,
array builders, multiple dispatch, etc. Because Arrow is not a compile-time C++
library, we must compile cast functions for each pair of supported input and
output types so that the specific implementations are available in the shared
library.
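
To make the "one compiled function per type pair" idea concrete, here is a minimal sketch. It is not Arrow's actual design: {{TypeId}}, {{CastFn}}, {{CastKernel}}, and {{GetCastFunction}} are hypothetical names invented for illustration, and the cast is a plain unchecked {{static_cast}}. The point is only that each template instantiation is emitted into the library and selected at runtime from type tags.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <utility>

// Hypothetical runtime type tags (assumption; not Arrow's real type IDs).
enum class TypeId { INT32, INT64, DOUBLE };

// Type-erased signature shared by every compiled cast kernel.
using CastFn = void (*)(const void* in, void* out, std::size_t length);

// One template, instantiated once per supported (input, output) pair.
template <typename In, typename Out>
void CastKernel(const void* in, void* out, std::size_t length) {
  const In* src = static_cast<const In*>(in);
  Out* dst = static_cast<Out*>(out);
  for (std::size_t i = 0; i < length; ++i) {
    dst[i] = static_cast<Out>(src[i]);  // unchecked; no overflow detection here
  }
}

// Runtime dispatch: taking the address of each instantiation forces the
// compiler to emit it, so it is available to callers of the library.
CastFn GetCastFunction(TypeId in, TypeId out) {
  static const std::map<std::pair<TypeId, TypeId>, CastFn> registry = {
      {{TypeId::INT32, TypeId::INT64}, &CastKernel<int32_t, int64_t>},
      {{TypeId::INT64, TypeId::INT32}, &CastKernel<int64_t, int32_t>},
      {{TypeId::INT32, TypeId::DOUBLE}, &CastKernel<int32_t, double>},
      {{TypeId::INT64, TypeId::DOUBLE}, &CastKernel<int64_t, double>},
  };
  auto it = registry.find({in, out});
  if (it == registry.end()) {
    throw std::invalid_argument("unsupported cast");
  }
  return it->second;
}
```

A caller that only knows the types at runtime would look up {{GetCastFunction(TypeId::INT32, TypeId::INT64)}} and invoke the returned pointer on raw buffers. The registry grows with the number of supported pairs, which is part of why this touches so much of the library.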
TensorFlow's implementation of cast might give us some ideas; see all of the
files starting with cast_op_impl* in:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/kernels
And the dynamic dispatch here:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/cast_op.cc#L99
Arrow is actually more complicated since we have non-numeric and nested types
to think about. I'd like to do something for this JIRA for numeric, non-nested
types, while leaving the door open for extending to more complicated types
(e.g. you could imagine casting {{List<Int64>}} to {{List<Int32>}}).
> [C++] Implement safe casts for primitive types
> ----------------------------------------------
>
> Key: ARROW-229
> URL: https://issues.apache.org/jira/browse/ARROW-229
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Uwe L. Korn
> Fix For: 0.7.0
>
>
> In some situations, you want to cast the data in a PrimitiveArray to a
> different (but similar) data type, e.g. from {{uint32_t}} to {{int32_t}} or
> {{uint32_t}} to {{uint8_t}}. This can be done either by reinterpreting the
> data or, if the size of the underlying type changes, by copying it.
> There is already an implementation for this in {{parquet-cpp}} that could be
> pulled out into Arrow:
> https://github.com/apache/parquet-cpp/blob/9a0407e684c0a6299d0e6ab98c11c1162915c0ee/src/parquet/arrow/writer.cc#L71
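
As a rough illustration of what a "safe" cast between integer types could mean, here is a minimal sketch that copies values and fails if any value is not representable in the output type. It is an assumption about the intended semantics, not the parquet-cpp code linked above or a proposed Arrow API; {{FitsIn}} and {{SafeCast}} are hypothetical names, and {{std::vector}} stands in for Arrow's buffers.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Returns true if `value` is exactly representable in integer type Out.
// Relies on modular narrowing conversions (guaranteed since C++20,
// implementation-defined but universal on two's-complement targets before).
template <typename Out, typename In>
bool FitsIn(In value) {
  static_assert(std::is_integral<In>::value && std::is_integral<Out>::value,
                "this sketch handles integer types only");
  const Out converted = static_cast<Out>(value);
  // The value must survive a round trip and keep its sign.
  return static_cast<In>(converted) == value &&
         (value < In(0)) == (converted < Out(0));
}

// Copies `in` to `out`, returning false on the first value that would
// overflow or change sign. `out` may be partially filled on failure.
template <typename In, typename Out>
bool SafeCast(const std::vector<In>& in, std::vector<Out>* out) {
  out->clear();
  out->reserve(in.size());
  for (In v : in) {
    if (!FitsIn<Out>(v)) {
      return false;
    }
    out->push_back(static_cast<Out>(v));
  }
  return true;
}
```

With this sketch, casting {{uint32_t}} values {{0, 7, 255}} to {{uint8_t}} succeeds, while casting {{4000000000}} from {{uint32_t}} to {{int32_t}} is rejected. For same-width pairs such as {{uint32_t}} to {{int32_t}}, the copy could in principle be replaced by reinterpreting the existing data once the values have been validated; the sketch always copies for simplicity.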