Hello,

There is a long-standing problem with processing UTF-8 source code in R on Windows. Because Windows does not have a UTF-8 locale and R passes source code through the OS before executing it, some characters are "simplified" by the OS before processing, which leads to undesirable changes.
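Because the substitution is silent, it can help to compare code points rather than printed glyphs. The following is only a minimal sketch, assuming base R and nothing else; `code_points` is a hypothetical helper introduced here for illustration, not an existing R function. Unicode escapes are written in plain ASCII, so the character they denote should not depend on the locale.

```r
# Build the character without typing it, so the source stays pure ASCII.
r_caron <- intToUtf8(0x159)   # LATIN SMALL LETTER R WITH CARON
escaped <- "\u0159"           # the same character as a Unicode escape

# Both strings should hold the single code point U+0159 (decimal 345).
stopifnot(identical(utf8ToInt(r_caron), 345L))
stopifnot(identical(utf8ToInt(escaped), 345L))

# Hypothetical helper: list the code points of one string, so that a
# substituted character (for example a plain "r") is easy to spot.
code_points <- function(x) sprintf("U+%04X", utf8ToInt(enc2utf8(x)))

print(code_points(escaped))
print(code_points("r"))
```

Applying `code_points` to a string literal typed directly into the console or read from a sourced file shows whether the character arrived intact or was replaced on the way in.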
Minimal example:

Type "ř" (LATIN SMALL LETTER R WITH CARON) in the RGui console:

    > "ř"
    [1] "r"

Assume the following script:

    # file [script.R]
    if ("ř" != "\U00159") {
      stop("Problem: Unexpected character conversion.")
    } else {
      cat("o.k.\n")
    }

Problem:

    source("script.R", encoding = "UTF-8")

OK (see https://stackoverflow.com/questions/5031630/how-to-source-r-file-saved-using-utf-8-encoding):

    eval(parse("script.R", encoding = "UTF-8"))

Although the script is in UTF-8, the characters are replaced by "simplified" substitutes uncontrollably (depending on the OS locale). The same happens when the statements are simply entered in the R console.

The problem does not occur on operating systems with a UTF-8 locale (macOS, Linux, ...).

Best regards,
Tomas Boril

    > R.version
                   _
    platform       x86_64-w64-mingw32
    arch           x86_64
    os             mingw32
    system         x86_64, mingw32
    status         alpha
    major          3
    minor          6.0
    year           2019
    month          04
    day            07
    svn rev        76333
    language       R
    version.string R version 3.6.0 alpha (2019-04-07 r76333)
    nickname

    > Sys.getlocale()
    [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel