[ https://issues.apache.org/jira/browse/ARROW-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633817#comment-17633817 ]
Antoine Pitrou commented on ARROW-18319:
----------------------------------------

cc [~rok]

> `binary_replace_slice` should not work with `string` types
> ----------------------------------------------------------
>
>                 Key: ARROW-18319
>                 URL: https://issues.apache.org/jira/browse/ARROW-18319
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Kshiteej K
>            Priority: Major
>
> `binary_replace_slice` can give invalid output when used with string
> types. Given that there is `utf8_replace_slice`, I think
> `binary_replace_slice` should not support string types.
> If a user actually wants to play with the bytes of a string type, they should
> explicitly cast to binary type and use `binary_replace_slice`.
> {code:java}
> >>> pc.binary_replace_slice(["hé"], 1, 2, "x")
> <pyarrow.lib.StringArray object at 0x7fdbc09937c0>
> [
>   "hx�"
> ]
> >>> pc.binary_replace_slice(["hé"], 1, 2, "x").validate(full=True)
> Traceback (most recent call last):
>   ...
> ArrowInvalid: Invalid UTF8 sequence at string index 0
> {code}
> Ref: [https://github.com/apache/arrow/pull/14550#discussion_r1021545816]
>
> cc: [~apitrou]
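
To illustrate why the reported output is invalid, here is a minimal standard-library sketch of the two slicing semantics. It does not use pyarrow: the helper names `replace_slice_bytes`, `replace_slice_codepoints` and `is_valid_utf8` are hypothetical stand-ins, and they only handle non-negative indices, so they should be read as an approximation of what `binary_replace_slice` and `utf8_replace_slice` do rather than as their implementation. In UTF-8, "é" is encoded as the two bytes 0xC3 0xA9, so replacing only byte 1 leaves a stray continuation byte behind.

```python
def replace_slice_bytes(data: bytes, start: int, stop: int, replacement: bytes) -> bytes:
    """Replace data[start:stop], with positions counted in bytes."""
    return data[:start] + replacement + data[stop:]


def replace_slice_codepoints(text: str, start: int, stop: int, replacement: str) -> str:
    """Replace text[start:stop], with positions counted in code points."""
    return text[:start] + replacement + text[stop:]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "h\u00e9" is "hé"; its UTF-8 encoding is b"h\xc3\xa9" (three bytes).
raw = "h\u00e9".encode("utf-8")

# Byte-based replacement overwrites only the first byte of "é".
byte_result = replace_slice_bytes(raw, 1, 2, b"x")

# Code-point-based replacement overwrites the whole character.
codepoint_result = replace_slice_codepoints("h\u00e9", 1, 2, "x")

print(byte_result, is_valid_utf8(byte_result))
print(codepoint_result, is_valid_utf8(codepoint_result.encode("utf-8")))
```

The byte-based result is legitimate as binary data but is not a valid string, which is the argument for requiring an explicit cast to binary before operating on bytes.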