[ 
https://issues.apache.org/jira/browse/ARROW-229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146125#comment-16146125
 ] 

Wes McKinney commented on ARROW-229:
------------------------------------

Hi [~ankit5012], would it be OK if I took an initial pass at writing an 
implementation of this, so that we can discuss further in code review on GitHub? 

This is one of the most complex issues currently in the Arrow JIRA, as it 
touches most aspects of the C++ library: memory management, array containers, 
array builders, multiple dispatch, etc. Because Arrow is not a compile-time C++ 
library, we must compile cast functions for each pair of supported input and 
output types so that the specific implementations are available in the shared 
library.
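
To make the "one compiled function per type pair" point concrete, here is a 
minimal sketch of runtime dispatch over a table of instantiated kernels. The 
names {{TypeId}}, {{CastKernel}}, {{CastFn}}, {{ResolveOutput}} and 
{{GetCastFunction}} are hypothetical and are not part of Arrow's API; real 
arrays also carry validity bitmaps and buffers, which this ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical runtime type tags (illustration only, not Arrow's type system).
enum class TypeId { INT32, INT64, DOUBLE };

// One templated kernel. Every (In, Out) pair referenced below is instantiated
// at library build time, so the compiled code lives in the shared library.
template <typename In, typename Out>
void CastKernel(const void* in, void* out, std::size_t length) {
  assert(length == 0 || (in != nullptr && out != nullptr));
  const In* src = static_cast<const In*>(in);
  Out* dst = static_cast<Out*>(out);
  for (std::size_t i = 0; i < length; ++i) {
    dst[i] = static_cast<Out>(src[i]);  // unchecked conversion
  }
}

using CastFn = void (*)(const void*, void*, std::size_t);

// Second dispatch level: the input type is fixed, pick the output type.
template <typename In>
CastFn ResolveOutput(TypeId out_type) {
  switch (out_type) {
    case TypeId::INT32:  return &CastKernel<In, std::int32_t>;
    case TypeId::INT64:  return &CastKernel<In, std::int64_t>;
    case TypeId::DOUBLE: return &CastKernel<In, double>;
  }
  throw std::invalid_argument("unsupported output type");
}

// First dispatch level: map the runtime input tag to a compile-time type.
CastFn GetCastFunction(TypeId in_type, TypeId out_type) {
  switch (in_type) {
    case TypeId::INT32:  return ResolveOutput<std::int32_t>(out_type);
    case TypeId::INT64:  return ResolveOutput<std::int64_t>(out_type);
    case TypeId::DOUBLE: return ResolveOutput<double>(out_type);
  }
  throw std::invalid_argument("unsupported input type");
}
```

A caller that only knows the types at runtime would look up the function once, 
e.g. {{GetCastFunction(TypeId::INT32, TypeId::INT64)}}, and then invoke it on 
the raw input and output buffers. With N supported types this instantiates N*N 
kernels, which is part of why the scope of the first pass matters.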

TensorFlow's implementation of cast might give us some ideas; see all of the 
files starting with cast_op_impl* in:

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/kernels

And the dynamic dispatch here:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/cast_op.cc#L99

Arrow is actually more complicated since we have non-numeric and nested types 
to think about. For this JIRA I'd like to handle numeric, non-nested types 
only, while leaving the door open for extending to more complicated types 
(e.g. you could imagine casting {{List<Int64>}} to {{List<Int32>}}). 
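
For the numeric, non-nested case, one possible reading of a "safe" cast is an 
element-wise range check before converting. The sketch below covers integer 
types only and assumes that "safe" means "fail if any value does not fit in 
the output type"; that definition, and the names {{FitsInRange}} and 
{{SafeCast}}, are assumptions for illustration rather than anything decided in 
this JIRA. It uses {{std::vector}} in place of Arrow arrays.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>
#include <utility>
#include <vector>

// Returns true if `value` is representable in the integer type Out.
// Hypothetical helper; integer types only.
template <typename In, typename Out>
bool FitsInRange(In value) {
  static_assert(std::is_integral<In>::value && std::is_integral<Out>::value,
                "this sketch only handles integer types");
  if (std::is_signed<In>::value && static_cast<std::intmax_t>(value) < 0) {
    // A negative value never fits in an unsigned type; for a signed Out,
    // compare against its lower bound.
    return std::is_signed<Out>::value &&
           static_cast<std::intmax_t>(value) >=
               static_cast<std::intmax_t>(std::numeric_limits<Out>::min());
  }
  // Non-negative values: compare against the upper bound as unsigned.
  return static_cast<std::uintmax_t>(value) <=
         static_cast<std::uintmax_t>(std::numeric_limits<Out>::max());
}

// Casts every element of `in` into `*out`. Returns false and leaves `*out`
// untouched if any element would overflow or underflow the output type.
template <typename In, typename Out>
bool SafeCast(const std::vector<In>& in, std::vector<Out>* out) {
  std::vector<Out> result;
  result.reserve(in.size());
  for (In v : in) {
    if (!FitsInRange<In, Out>(v)) return false;
    result.push_back(static_cast<Out>(v));
  }
  *out = std::move(result);
  return true;
}
```

With this sketch, casting {{uint32_t}} values {{0, 1, 255}} to {{uint8_t}} 
succeeds, while a {{uint32_t}} value of 256 or an {{int32_t}} value of -1 cast 
to an unsigned type is rejected. Floating-point inputs and null handling would 
need separate treatment.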

> [C++] Implement safe casts for primitive types
> ----------------------------------------------
>
>                 Key: ARROW-229
>                 URL: https://issues.apache.org/jira/browse/ARROW-229
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Uwe L. Korn
>             Fix For: 0.7.0
>
>
> In some situations, you want to cast the data in a PrimitiveArray to a 
> different (but similar) data type, e.g. from {{uint32_t}} to {{int32_t}} or 
> {{uint32_t}} to {{uint8_t}}. This can be done either by reinterpreting the 
> data or, if the size of the underlying type changes, by making a copy. 
> There is already an implementation for this in {{parquet-cpp}} that could be 
> pulled out into Arrow: 
> https://github.com/apache/parquet-cpp/blob/9a0407e684c0a6299d0e6ab98c11c1162915c0ee/src/parquet/arrow/writer.cc#L71



