Kshiteej K created ARROW-18319:
----------------------------------

             Summary: `binary_replace_slice` should not work with `string` types
                 Key: ARROW-18319
                 URL: https://issues.apache.org/jira/browse/ARROW-18319
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Kshiteej K

`binary_replace_slice` can give invalid output when used with string types: it replaces a range of bytes, so a slice boundary can fall inside a multi-byte UTF-8 character.

Given that there is `utf8_replace_slice`, I think `binary_replace_slice` should not support string types. If a user actually wants to operate on the bytes of a string, they should explicitly cast to binary type and then use `binary_replace_slice`.

{code:python}
>>> import pyarrow.compute as pc
>>> pc.binary_replace_slice(["hé"], 1, 2, "x")
<pyarrow.lib.StringArray object at 0x7fdbc09937c0>
[
  "hx�"
]
>>> pc.binary_replace_slice(["hé"], 1, 2, "x").validate(full=True)
Traceback (most recent call last):
...
ArrowInvalid: Invalid UTF8 sequence at string index 0
{code}

Ref: [https://github.com/apache/arrow/pull/14550#discussion_r1021545816]

cc: [~apitrou]
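
For illustration, here is a plain-Python sketch (no pyarrow involved) of why byte-offset replacement breaks the "hé" example while codepoint-offset replacement does not. The helpers `replace_slice_bytes` and `replace_slice_codepoints` are hypothetical names invented for this note; they only approximate the two kernels for non-negative offsets with start <= stop and are not Arrow's implementation.

```python
def replace_slice_bytes(data: bytes, start: int, stop: int, replacement: bytes) -> bytes:
    """Replace the byte range [start, stop); hypothetical stand-in for byte-level slicing."""
    return data[:start] + replacement + data[stop:]


def replace_slice_codepoints(text: str, start: int, stop: int, replacement: str) -> str:
    """Replace the codepoint range [start, stop); hypothetical stand-in for UTF-8-aware slicing."""
    return text[:start] + replacement + text[stop:]


# "é" is encoded as two bytes in UTF-8, so "hé" is three bytes long.
raw = "hé".encode("utf-8")  # b'h\xc3\xa9'

# Replacing byte 1 removes only the lead byte of "é" and leaves a
# dangling continuation byte behind.
broken = replace_slice_bytes(raw, 1, 2, b"x")

try:
    broken.decode("utf-8")
    is_valid = True
except UnicodeDecodeError:
    is_valid = False

# Counting in codepoints replaces the whole character instead.
fixed = replace_slice_codepoints("hé", 1, 2, "x")

print(raw, broken, is_valid, fixed)
```

The byte-level result is not decodable as UTF-8, which matches the `ArrowInvalid` error in the report; the codepoint-level result is the valid string a `utf8_replace_slice` user would presumably expect.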