[ 
https://issues.apache.org/jira/browse/ARROW-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468754#comment-17468754
 ] 

Eduardo Ponce commented on ARROW-15029:
---------------------------------------

I agree with separating ASCII and UTF-8 string kernels for the following 
reasons:
* The symmetric split makes sense and can make it easier to compare 
corresponding kernels.
* Support for UTF-8 is controlled by CMake variable {{-DARROW_WITH_UTF8PROC}} 
so this will allow skipping {{scalar_string_utf8.cc}} at the CMake level and 
not with C++ {{#ifdef}}.
* If additional string encodings are to be added, then they can be placed in 
their own source files.

> [C++] Split compute/kernels/scalar_string.cc
> --------------------------------------------
>
>                 Key: ARROW-15029
>                 URL: https://issues.apache.org/jira/browse/ARROW-15029
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Jeroen van Straten
>            Priority: Minor
>              Labels: good-first-issue, good-second-issue
>
> {{compute/kernels/scalar_string.cc}}, which defines scalar string kernels, is 
> getting pretty large (and probably long-ish to compile). It would be nice to 
> split it up thematically into 2 or 3 source files. Common utilities may be 
> factored into a {{scalar_string_internal.h}} header, for example.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to