On 13-09-19 5:06 AM, Maxim Linchits wrote:
Have any of the thread participants sent a bug report to R? If not,
let me know if you intend to do so. Otherwise, I'll send a report
myself.

There's no bug, as far as I know. The issue is that various functions (by design) convert strings to the local encoding. In the example you were trying, the local encoding can't represent all the characters, so those characters are shown as hex codes and the text ends up garbled.
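
As a rough illustration of the conversion described above (a sketch only, not the code path that read.table actually takes), the following converts a UTF-8 string to an encoding that cannot represent it. The sample string and the choice of CP1252 as the target are assumptions made for the example, and the escape format that R itself prints on Windows may differ from what iconv() produces here.

```r
# A Cyrillic string written with \u escapes, which R marks as UTF-8.
x <- "\u041f\u0440\u0438\u0432\u0435\u0442"
Encoding(x)

# CP1252 (Western European) has no Cyrillic letters. With sub = "byte",
# iconv() replaces each unconvertible byte with its hex code as <xx>.
y <- iconv(x, from = "UTF-8", to = "CP1252", sub = "byte")
y

# With the default sub = NA, the unconvertible input is mapped to NA.
z <- iconv(x, from = "UTF-8", to = "CP1252")
is.na(z)
```

Either way the original characters are lost once the string has passed through the narrower encoding, which is why the result cannot simply be converted back afterwards.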

I'm currently looking into changing the design, so that there is more use of UTF-8 internally. This is likely to have side effects, which need to be investigated carefully.

Duncan Murdoch


thanks

On Tue, Sep 17, 2013 at 5:01 PM, Duncan Murdoch
<murdoch.dun...@gmail.com> wrote:
On 13-09-17 8:15 AM, Milan Bouchet-Valat wrote:

On Monday, 16 September 2013 at 20:04 +0400, Maxim Linchits wrote:

Here is that old post:

http://r.789695.n4.nabble.com/read-csv-and-FileEncoding-in-Windows-version-of-R-2-13-0-td3567177.html

A taste: "Again, the issue is that opening this UTF-8 encoded file
under R 2.13.0 yields an error, but opening it under R 2.12.2 works
without any issues. (...)"

I have tried with R 2.12.2 both 32 and 64 bit on Windows Server 2008
with the French (CP1252) locale, and I still experience an error with
the test case I provided in previous messages. So it does not sound like
it is the same issue.



I can reproduce the error with a file sent to me by Maxim.  From a quick
look, I suspect that changes will be needed to read.table to handle this,
and they'll be large enough that they won't make it into 3.0.2, but
hopefully will go into R-patched after the release.

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
