Thanks, David

I need an all-R solution for this, because the author.csv file is exported from a database that enforces the HTML encoding and the import into R may have to be repeated several times as the database is updated.
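
For reference, here is a minimal base-R sketch of the kind of translation being asked for. It is not an existing R function: `html_entities`, `decode_numeric` and `decode_entities` are hypothetical names, and the lookup table is an assumption that covers only the entities appearing in the examples quoted below, so it would need extending for other data.

```r
# Hypothetical lookup table: only the named entities seen in the examples.
# "&amp;" is deliberately last so that "&amp;egrave;" is not decoded twice.
html_entities <- c(
  "&egrave;" = "\u00e8",
  "&eacute;" = "\u00e9",
  "&uuml;"   = "\u00fc",
  "&ocirc;"  = "\u00f4",
  "&auml;"   = "\u00e4",
  "&szlig;"  = "\u00df",
  "&amp;"    = "&"
)

# Decode decimal numeric references such as "&#039;".
decode_numeric <- function(x) {
  m <- gregexpr("&#[0-9]+;", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    codes <- as.integer(gsub("[&#;]", "", hits))
    vapply(codes, intToUtf8, character(1))
  })
  x
}

# Decode a character vector: numeric references first, then the named
# entities in table order, using literal (fixed) matching.
decode_entities <- function(x, table = html_entities) {
  x <- decode_numeric(x)
  for (ent in names(table)) {
    x <- gsub(ent, table[[ent]], x, fixed = TRUE)
  }
  x
}

decode_entities(c("Fr&egrave;re de Montizon", "C&ocirc;te-d&#039;Or"))
```

The result is UTF-8; if latin1 is really required, something like `iconv(x, "UTF-8", "latin1")` could be applied afterwards. Because the function works on a plain character vector, it could be rerun on `author$lname` and `author$birthplace` each time the file is re-imported.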

-Michael

On 8/10/2012 12:40 PM, David L Carlson wrote:
It's not quite an R solution, but I just pasted your examples into a script
window in R and saved it as chars.html. Then I opened it in Firefox and
pasted the results here (with returns inserted to match your original).

grep("&", author$lname, value=TRUE)
[1] "Frère de Montizon" "Lumière"
[3] "Lumière" "Niépce"
[5] "Süssmilch" "Schüpbach"
grep("&", author$birthplace, value=TRUE)
[1] "Marbach, Württemberg"
[2] "Côte-d'Or"
[3] "Chalon-sur-Saône, Saône-et-Loire"
[4] "Groß Särchen, Germany"
apropos("HTML")
For a CSV file you would want to preserve the lines by adding <br> to the
end of each line first.
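
The line-preserving step could be sketched in R roughly as follows. This only writes the file that the browser then renders; `write_entity_page` is a hypothetical helper name, and the default file name is taken from the description above.

```r
# Hypothetical helper: write each element of a character vector as one line
# ending in <br>, so a browser keeps the line breaks when rendering.
write_entity_page <- function(x, path = "chars.html") {
  writeLines(paste0(x, "<br>"), con = path)
  invisible(path)
}

# Illustrative input taken from the examples in this thread.
page <- write_entity_page(c("Lumi&egrave;re", "Ni&eacute;pce"),
                          path = tempfile(fileext = ".html"))
readLines(page)
```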

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
On Behalf Of Michael Friendly
Sent: Friday, August 10, 2012 11:15 AM
To: R-help
Subject: [R] translating HTML character entities to accented characters

I've imported a .csv file where character strings that contained accented
characters were written as HTML character entities.  Is there a function
that works on a vector to translate them back to accented (latin1)
characters?

Some examples:

  > grep("&", author$lname, value=TRUE)
[1] "Fr&egrave;re de Montizon" "Lumi&egrave;re"
[3] "Lumi&egrave;re"           "Ni&eacute;pce"
[5] "S&uuml;ssmilch"           "Sch&uuml;pbach"
  > grep("&", author$birthplace, value=TRUE)
[1] "Marbach, W&uuml;rttemberg"
[2] "C&ocirc;te-d&#039;Or"
[3] "Chalon-sur-Sa&ocirc;ne, Sa&ocirc;ne-et-Loire"
[4] "Gro&szlig; S&auml;rchen, Germany"
  > apropos("HTML")

thx,
-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
