gaborcsardi commented on code in PR #43351: URL: https://github.com/apache/arrow/pull/43351#discussion_r1686735951
########## r/src/arrow_cpp11.h: ########## @@ -148,7 +148,7 @@ inline SEXP utf8_strings(SEXP x) { for (R_xlen_t i = 0; i < n; i++, ++p_x) { SEXP s = *p_x; - if (s != NA_STRING) { + if (s != NA_STRING && ALTREP(s)) { Review Comment: Yeah, that C++ looks correct, except you don't need to translate twice: ```c++ SEXP new_s = Rf_mkCharCE(Rf_translateCharUTF8(s); if (new_s != s) { SET_STRING_ELT(x, i, Rf_mkCharCE(new_s)); } ``` However, I think the issue is that if `x` is ALTREP, then it does not matter if it is materialized of not, it is still ALTREP, so you still cannot change it with `SET_STRING_ELT`. To avoid this, you'd need to call `Rf_duplicate()` on `x`. That should create a non-altrep copy of it. You can always call `Rf_duplicate()`, or you can call it when you encounter a non-utf8 string. Note that you need to `PROTECT()` the result of `Rf_duplicate()` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org