Dear developers,

I have come across a (somewhat strange) change in the encoding of Sweave output from R-2.9.2pat to R-2.10.0beta (apparently specific to Rgui) on Windows installations. Of course, the NEWS file contains quite a few changes concerning encoding, but I was not able to locate an entry which explains the observed behaviour. I am not very familiar with encodings/locales/codepages, but I will try to explain my observations as best I can.

In R-2.9.2pat, when invoking R via Rgui --vanilla (output of sessionInfo() below), the Sweave output for .rnw files containing German umlauts (latin1-encoded) is again latin1-encoded: the resulting .tex file compiles with \usepackage[latin1]{inputenc} and \usepackage[german]{babel}. In R-2.10.0beta, however, when invoking R via Rgui --vanilla (output of sessionInfo() below), part of Sweave's output is utf-8 encoded. More precisely, Soutput environments containing German umlauts are utf-8 encoded (with some extra characters at the start and the end, which could be BOMs), whereas Sinput environments containing German umlauts are still latin1. Surprisingly, when R is invoked from the (Windows) command line (R --vanilla or Rterm --vanilla), the output is entirely latin1 again (as in R-2.9.2pat). So the change to utf-8 encoding for parts of Sweave's output seems to be specific to Rgui.
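To make the mixed encoding visible without relying on an editor's guess, the generated .tex file can be inspected byte by byte. The following is only a minimal sketch, not part of the original report: the helper names (classify_line, has_bom, inspect_encoding) and the file name in the usage comment are hypothetical, and validUTF8() is assumed to be available, which is only the case in much later R versions than the ones discussed here.

```r
# Classify a single line by its raw bytes (assumes validUTF8(), R >= 3.3.0).
classify_line <- function(line) {
  bytes <- as.integer(charToRaw(line))
  if (all(bytes < 128L)) {
    "ascii"
  } else if (validUTF8(line)) {
    "utf8"
  } else {
    "not-utf8"  # e.g. latin1 / CP1252 bytes such as 0xe4 for "a umlaut"
  }
}

# Does the line contain the UTF-8 byte order mark (EF BB BF) anywhere?
has_bom <- function(line) {
  bom <- rawToChar(as.raw(c(0xef, 0xbb, 0xbf)))
  grepl(bom, line, fixed = TRUE, useBytes = TRUE)
}

# Report the classification for every line of a file.
inspect_encoding <- function(path) {
  lines <- readLines(path, warn = FALSE)
  data.frame(
    line  = seq_along(lines),
    class = vapply(lines, classify_line, character(1), USE.NAMES = FALSE),
    bom   = vapply(lines, has_bom, logical(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Hypothetical usage on a Sweave result:
# inspect_encoding("umlaute.tex")
```

If the behaviour is as described above, the Sinput lines with umlauts would be reported as "not-utf8" and the Soutput lines as "utf8", possibly with bom set to TRUE.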

Of course, I can work around this problem by using Rterm instead of Rgui when running Sweave, but I am not sure whether the current behaviour of R via Rgui is as intended. I will try to attach the .rnw file as well as the resulting .tex files (and hope that the attachments pass through).

Best wishes,

  Martin



sessionInfo() for R-2.9.2pat (same for Rgui, R, Rterm):
R version 2.9.2 Patched (2009-09-24 r50041)
i386-pc-mingw32

locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

sessionInfo() for R-2.10.0beta (same for Rgui, R, Rterm):
R version 2.10.0 beta (2009-10-11 r50037)
i386-pc-mingw32

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

--
Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 206
66123 Saarbruecken
Germany

\documentclass{article} 
\usepackage{c:/Programme/R/R-2.10.0beta/share/texmf/Sweave}
\usepackage[latin1]{inputenc}
%\usepackage[utf8x]{inputenc}
\usepackage[german]{babel}

\begin{document}
<<>>=
Umlaute <- c("ä","ö","ü","ß")
Umlaute
@
\end{document}
\documentclass{article} 
\usepackage{c:/Programme/R/R-2.10.0beta/share/texmf/Sweave}
\usepackage[latin1]{inputenc}
%\usepackage[utf8x]{inputenc}
\usepackage[german]{babel}

\begin{document}
\begin{Schunk}
\begin{Sinput}
> Umlaute <- c("ä", "ö", "ü", "ß")
> Umlaute
\end{Sinput}
\begin{Soutput}
[1] "ä" "ö" "ü" "ß"
\end{Soutput}
\end{Schunk}
\end{document}
\documentclass{article} 
\usepackage{c:/Programme/R/R-2.10.0beta/share/texmf/Sweave}
\usepackage[latin1]{inputenc}
%\usepackage[utf8x]{inputenc}
\usepackage[german]{babel}

\begin{document}
\begin{Schunk}
\begin{Sinput}
> Umlaute <- c("ä", "ö", "ü", "ß")
> Umlaute
\end{Sinput}
\begin{Soutput}
[1] "ä" "ö" "ü" "ß"
\end{Soutput}
\end{Schunk}
\end{document}
\documentclass{article} 
\usepackage{c:/Programme/R/R-2.10.0beta/share/texmf/Sweave}
\usepackage[latin1]{inputenc}
%\usepackage[utf8x]{inputenc}
\usepackage[german]{babel}

\begin{document}
\begin{Schunk}
\begin{Sinput}
> Umlaute <- c("ä", "ö", "ü", "ß")
> Umlaute
\end{Sinput}
\begin{Soutput}
[1] "ä" "ö" "ü" "ß"
\end{Soutput}
\end{Schunk}
\end{document}
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
