[ https://issues.apache.org/jira/browse/ARROW-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468754#comment-17468754 ]
Eduardo Ponce commented on ARROW-15029:
---------------------------------------
I agree with separating ASCII and UTF-8 string kernels for the following
reasons:
* A symmetric split makes it easier to compare corresponding ASCII and UTF-8
kernels.
* UTF-8 support is controlled by the CMake option {{ARROW_WITH_UTF8PROC}}, so
{{scalar_string_utf8.cc}} could be skipped at the CMake level instead of being
guarded with C++ {{#ifdef}}.
* If additional string encodings are to be added, then they can be placed in
their own source files.
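
To make the proposed split concrete, here is a minimal, self-contained sketch of
one registration entry point per encoding. It is not Arrow's actual code:
{{KernelRegistry}}, {{StringKernel}} and both kernel bodies are hypothetical
stand-ins, and the two marked sections are assumed to live in separate source
files ({{scalar_string_ascii.cc}} and {{scalar_string_utf8.cc}}). The UTF-8
kernel here counts code points by hand rather than calling utf8proc.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a function registry (not Arrow's FunctionRegistry).
using StringKernel = std::function<std::string(const std::string&)>;

class KernelRegistry {
 public:
  // Registers a kernel under a unique name.
  void Add(const std::string& name, StringKernel kernel) {
    const bool inserted = kernels_.emplace(name, std::move(kernel)).second;
    assert(inserted && "kernel names must be unique");
    (void)inserted;  // silence unused-variable warnings when NDEBUG is set
  }

  bool Has(const std::string& name) const { return kernels_.count(name) != 0; }

  // Throws std::out_of_range if the kernel was never registered.
  std::string Call(const std::string& name, const std::string& input) const {
    return kernels_.at(name)(input);
  }

 private:
  std::map<std::string, StringKernel> kernels_;
};

// ---- assumed to live in scalar_string_ascii.cc ----
// Uppercases only the bytes 'a'..'z'; every other byte is left untouched.
std::string AsciiUpper(const std::string& input) {
  std::string out = input;
  for (char& c : out) {
    if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
  }
  return out;
}

void RegisterScalarStringAscii(KernelRegistry* registry) {
  registry->Add("ascii_upper", AsciiUpper);
}

// ---- assumed to live in scalar_string_utf8.cc ----
// Counts code points by skipping continuation bytes (10xxxxxx).
// Assumes the input is valid UTF-8; returns the count as a decimal string.
std::string Utf8Length(const std::string& input) {
  std::size_t count = 0;
  for (unsigned char c : input) {
    if ((c & 0xC0) != 0x80) ++count;
  }
  return std::to_string(count);
}

void RegisterScalarStringUtf8(KernelRegistry* registry) {
  registry->Add("utf8_length", Utf8Length);
}
```

With this layout, a build without UTF-8 support would simply not compile the
second section and not call {{RegisterScalarStringUtf8}}. The caller still has
to know whether that function exists, so one guard at the call site may remain
even if the kernels themselves are free of {{#ifdef}}.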
> [C++] Split compute/kernels/scalar_string.cc
> --------------------------------------------
>
> Key: ARROW-15029
> URL: https://issues.apache.org/jira/browse/ARROW-15029
> Project: Apache Arrow
> Issue Type: Task
> Components: C++
> Reporter: Antoine Pitrou
> Assignee: Jeroen van Straten
> Priority: Minor
> Labels: good-first-issue, good-second-issue
>
> {{compute/kernels/scalar_string.cc}}, which defines scalar string kernels, is
> getting pretty large (and probably long-ish to compile). It would be nice to
> split it up thematically into 2 or 3 source files. Common utilities may be
> factored into a {{scalar_string_internal.h}} header, for example.