Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20926 )

Change subject: IMPALA-12718: Provides UTF-8 support for the trim functions
......................................................................

IMPALA-12718: Provides UTF-8 support for the trim functions

Currently, the trim function (including BTRIM, LTRIM, RTRIM) cannot
correctly handle strings containing multi-byte UTF-8 characters.
Multi-byte UTF-8 characters are interpreted as multiple single-byte
characters, leading to unexpected results.

This patch provides UTF-8 support for the trim functions, enabling
these functions to correctly handle multi-byte UTF-8 characters (when
set utf8_mode=true). It also introduces a set of trim functions with
the 'utf8_' prefix, offering the same capability even when utf8_mode
is not enabled.

Testing:
- Added new BE test case in ExprTest#Utf8Test
- Added new E2E test case in TestUtf8StringFunctions

Change-Id: I5cfaffd71009f16eae75910af835bd2a34410856
Reviewed-on: http://gerrit.cloudera.org:8080/20926
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M be/src/util/bit-util.h
M be/src/util/string-util.cc
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test
7 files changed, 342 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/20926
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5cfaffd71009f16eae75910af835bd2a34410856
Gerrit-Change-Number: 20926
Gerrit-PatchSet: 14
Gerrit-Owner: Zihao Ye <eyiz...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Zihao Ye <eyiz...@163.com>
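
Illustrative note on the problem described in the commit message: the sketch below is not taken from the patch (which is C++ in be/src/exprs) and does not reproduce Impala's implementation. It is a minimal Python illustration of why treating a multi-byte UTF-8 character as several single-byte characters can give unexpected trim results. The helper names trim_bytes and trim_utf8 are hypothetical.

```python
def trim_bytes(s: bytes, chars: bytes) -> bytes:
    """Byte-wise trim: every byte of `chars` is treated as its own character."""
    return s.strip(chars)


def trim_utf8(s: bytes, chars: bytes) -> bytes:
    """Character-wise trim: decode first so whole code points are compared."""
    return s.decode("utf-8").strip(chars.decode("utf-8")).encode("utf-8")


text = "你好".encode("utf-8")   # bytes E4 BD A0 | E5 A5 BD
to_trim = "你".encode("utf-8")  # bytes E4 BD A0

# Byte-wise: the trailing byte BD of "好" is also in the trim set, so it is
# removed and the remaining bytes E5 A5 are no longer valid UTF-8.
broken = trim_bytes(text, to_trim)

# Character-wise: only the whole character "你" is removed.
correct = trim_utf8(text, to_trim)
```

Here the byte-wise result is a truncated sequence, while the character-wise result is the intact encoding of "好"; this is the kind of mismatch the commit message refers to as "unexpected results".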