R 3.0.2
OS X Mavericks
Colleagues
I have a file that I converted from SAS (sas7bdat) to CSV (filename:
ORIGINAL.csv). I try to read it with read.csv and I receive the error message:
Error in type.convert(data[[i]], as.is = as.is[i], dec = dec,
na.strings = character(0L)) :
invalid multibyte string at '<b0>C'
The problem resolves if I delete a single character from each of lines 2 and 4
of the file (filename: FIXED.csv).
readLines can read both files without problem and displays the offending
character as:
\xb0
which appears to be a degree sign.
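A quick way to check that reading is sketched below. It assumes the byte is Latin-1 encoded, which I have not confirmed; the three lines in TEMP are made up and only stand in for the file contents.

```r
# The single byte 0xb0, as readLines displays it
x <- rawToChar(as.raw(0xb0))

# Treated as Latin-1 and converted to UTF-8, it becomes the two-byte
# sequence c2 b0, which is U+00B0 (DEGREE SIGN)
y <- iconv(x, from = "latin1", to = "UTF-8")
print(charToRaw(y))

# Hypothetical lines standing in for the file contents
TEMP <- c("ID,READING", paste0("1,37", x, "C"), "2,98.6")

# Lines that cannot be converted to plain ASCII come back as NA,
# so this flags every line holding a special character
bad <- which(is.na(iconv(TEMP, from = "latin1", to = "ASCII")))
print(bad)
```

The same iconv test applied to readLines("ORIGINAL.csv") should point to the lines that trip read.csv, without my having to know in advance which characters are involved.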
I also tried:
read.csv(textConnection(readLines("ORIGINAL.csv")))
and encountered the same error message.
In the past, I have encountered the same problem with Greek symbols (e.g., mu)
and other special characters.
Short of editing the input file, is there a simple solution within R so that I
can read the input data into a dataframe?
One possible (but ugly) solution would be:
TEMP <- readLines(FILENAME)
TEMP <- gsub(offendingcharacter, replacementcharacter, TEMP)
However, this would require that I find all possible offending characters and
the corresponding replacements.
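A less manual alternative might be to tell R which encoding the file uses, rather than replacing characters one at a time. The sketch below assumes the file is Latin-1 (a guess based on \xb0 displaying as a degree sign; the SAS export could equally be Windows-1252 or something else) and uses a small temporary file with made-up columns in place of ORIGINAL.csv.

```r
# Build a small stand-in for ORIGINAL.csv containing the raw byte 0xb0.
# The column names and values are hypothetical.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(c(charToRaw("ID,READING\n1,37"), as.raw(0xb0), charToRaw("C\n"),
           charToRaw("2,38"), as.raw(0xb0), charToRaw("C\n")), con)
close(con)

# Option 1: declare the encoding so R re-encodes the file as it is read.
# "latin1" is an assumption about how the file was written.
DATA <- read.csv(tmp, fileEncoding = "latin1", stringsAsFactors = FALSE)
print(DATA)

# Option 2: read the raw lines, convert them explicitly, then parse the text
TEMP <- readLines(tmp)
TEMP <- iconv(TEMP, from = "latin1", to = "UTF-8")
DATA2 <- read.csv(text = TEMP, stringsAsFactors = FALSE)
print(DATA2)
```

If the encoding guess is wrong, the special characters would presumably come out as the wrong symbols rather than raising an error, so the result needs a visual check.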
The files are available for inspection at:
http://www.plessthan.com/FILES/ARCHIVE.zip
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com