On Sep 7, 2010, at 2:29 PM, Matt Shotwell wrote:

Weird, my (Ubuntu, shhhh don't tell Dirk) iconv doesn't add the
backticks or single quotes.


I don't see any promise in the help page that iconv should substitute anything for the accents. It just says that each OS may have its own behavior, which suggests that you are using glibc's iconv while I am using libiconv, and it warns to expect different results.
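
Since the transliteration output depends on which iconv the platform supplies, one way to sidestep it is an explicit character map. The following is only a sketch, not something proposed in this thread: strip_accents is a hypothetical helper name, and the map covers only the 22 grave- and acute-accented letters from the test vector.

```r
# Explicit one-to-one mapping with chartr(); the \u escapes keep the
# source ASCII-only so it does not depend on the file's encoding.
# Assumption: only these 22 accented letters need handling.
accented <- paste0(
  "\u00e0\u00e8\u00ec\u00f2\u00f9",        # a e i o u, grave, lower case
  "\u00c0\u00c8\u00cc\u00d2\u00d9",        # A E I O U, grave, upper case
  "\u00e1\u00e9\u00ed\u00f3\u00fa\u00fd",  # a e i o u y, acute, lower case
  "\u00c1\u00c9\u00cd\u00d3\u00da\u00dd"   # A E I O U Y, acute, upper case
)
plain <- "aeiouAEIOUaeiouyAEIOUY"

# Hypothetical helper: replace each accented letter by its plain counterpart.
strip_accents <- function(x) chartr(accented, plain, x)

stripped <- strip_accents(c("\u00e0 la carte", "caf\u00e9"))
```

Unlike the iconv route, this leaves any character outside the map untouched, so it would need extending for other diacritics.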

--
David.

tst <- c("à", "è", "ì", "ò", "ù" , "À", "È", "Ì", "Ò", "Ù", "á",
"é", "í", "ó", "ú", "ý" , "Á", "É", "Í", "Ó", "Ú", "Ý")
iconv(tst, to="ASCII//TRANSLIT")
 [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I"
[20] "O" "U" "Y"

By the way, I'll take this moment to remind anyone interested that R
still has trouble with embedded zeros in character strings. I may be
abusing terminology, but I think that makes R "8-bit dirty".
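
A minimal sketch of the embedded-zero limitation, assuming a reasonably recent R: a raw vector can hold a zero byte, but rawToChar() signals an error rather than build a string around it. The variable names here are illustrative only.

```r
bytes <- as.raw(c(0x48, 0x00, 0x69))   # "H", NUL, "i"

# rawToChar() signals an error for a NUL byte; capture the message
# instead of stopping the script.
res <- tryCatch(rawToChar(bytes), error = function(e) conditionMessage(e))

# The raw vector itself stores the zero byte without complaint ...
n_bytes <- length(bytes)

# ... and dropping the NUL first gives an ordinary string.
nul_free <- rawToChar(bytes[bytes != as.raw(0)])
```

So binary data with zeros has to stay in raw vectors, or be cleaned, before it can become a character string.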

-Matt

On Tue, 2010-09-07 at 14:01 -0400, David Winsemius wrote:
On Sep 7, 2010, at 1:35 PM, Matt Shotwell wrote:

If you know the encoding of the string, or if its encoding is the
current locale encoding, then you can use the iconv function to
convert the string to ASCII. Something like:

iconv(accented.string, to="ASCII//TRANSLIT")

While 7-bit ASCII does not permit accented characters, extended
(8-bit) ASCII does. Hence, I'm not sure this will work. But it's
worth a try.

tst <- c("à", "è", "ì", "ò", "ù" , "À", "È", "Ì", "Ò", "Ù", "á",
"é", "í", "ó", "ú", "ý" , "Á", "É", "Í", "Ó", "Ú", "Ý")
iconv(tst, to="ASCII//TRANSLIT")
 [1] "`a" "`e" "`i" "`o" "`u" "`A" "`E" "`I" "`O" "`U" "'a" "'e" "'i" "'o" "'u" "'y"
[17] "'A" "'E" "'I" "'O" "'U" "'Y"
gsub("`|\\'", "", iconv(tst, to="ASCII//TRANSLIT"))
 [1] "a" "e" "i" "o" "u" "A" "E" "I" "O" "U" "a" "e" "i" "o" "u" "y" "A" "E" "I" "O"
[21] "U" "Y"

Notice that the acute accent gets converted to a single quote, which
I double-backslashed (\\') in the R regex pattern.
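
As an alternative to the alternation pattern, a bracketed character class lists both marks without any backslashes. This is only a sketch on literal strings shaped like the libiconv output above, so it does not depend on what iconv returns on a given platform; translit and cleaned are illustrative names.

```r
# Strings shaped like the libiconv transliteration shown above.
translit <- c("`a", "'e", "`A", "'y")

# Inside [...] the backtick and the single quote are ordinary characters,
# so no escaping is needed within a double-quoted R string.
cleaned <- gsub("[`']", "", translit)
```

Note that either pattern also removes genuine apostrophes and backticks from the text.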

On a Mac with: locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8


--
Matthew S. Shotwell
Graduate Student
Division of Biostatistics and Epidemiology
Medical University of South Carolina


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
