felipecrv commented on code in PR #41827:
URL: https://github.com/apache/arrow/pull/41827#discussion_r1623498992
##########
cpp/src/arrow/compute/kernels/scalar_cast_string.cc:
##########
@@ -510,6 +510,60 @@ void AddBinaryToFixedSizeBinaryCast(CastFunction* func) {
   AddBinaryToFixedSizeBinaryCast<FixedSizeBinaryType>(func);
 }
 
+// ----------------------------------------------------------------------
+// Union to String
+
+template <typename O>
+struct UnionToStringCastFunctor {
+  using BuilderType = typename TypeTraits<O>::BuilderType;
+
+  static Status Exec(KernelContext* ctx, const ExecSpan& batch, ExecResult* out) {
+    const ArraySpan& input = batch[0].array;
+    const auto& union_type = checked_cast<const UnionType&>(*input.type);
+    const auto type_ids = input.GetValues<int8_t>(1);
+    const auto& offsets = input.GetValues<int32_t>(2);
+
+    BuilderType builder(input.type->GetSharedPtr(), ctx->memory_pool());
+    RETURN_NOT_OK(builder.Reserve(input.length));
+
+    for (int64_t i = 0; i < input.length; ++i) {

Review Comment:
   > In this way, we can unify it with other type's implementations and shield the logic of converting strings in the current file.

   True, but that can prevent optimizations in the future. The approach of taking a scalar function and turning it into an array function by mapping, as in `array::map(scalar_function: scalar -> scalar) -> array`, is appealing but prevents vectorization techniques.
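
To make the concern concrete, here is a minimal sketch in plain C++. It deliberately does not use Arrow's API: `AddOneScalar`, `MapScalar`, and `AddOneArray` are hypothetical names invented for illustration, and `std::vector<int32_t>` stands in for an array buffer. The first form maps an opaque scalar function over each element; the second is written against the whole buffer, which is the shape compilers can usually auto-vectorize.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical scalar-level function: one value in, one value out.
int32_t AddOneScalar(int32_t v) { return v + 1; }

// "Map a scalar function over the array". The kernel only sees an opaque
// callable, so every element goes through an indirect call, which generally
// blocks inlining and SIMD code generation.
std::vector<int32_t> MapScalar(const std::vector<int32_t>& in,
                               const std::function<int32_t(int32_t)>& fn) {
  std::vector<int32_t> out;
  out.reserve(in.size());
  for (int32_t v : in) {
    out.push_back(fn(v));
  }
  return out;
}

// Array-level kernel. The operation is visible inside a tight loop over
// contiguous memory, so the compiler is free to vectorize it (assumption:
// optimizations are enabled; nothing here forces SIMD).
std::vector<int32_t> AddOneArray(const std::vector<int32_t>& in) {
  std::vector<int32_t> out(in.size());
  const int32_t* src = in.data();
  int32_t* dst = out.data();
  const std::size_t n = in.size();
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = src[i] + 1;
  }
  return out;
}
```

Both functions produce the same result; the difference is only in how much of the computation the kernel author (and the compiler) can see and optimize at once.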