Try this; it looks for strings of numbers containing commas and puts quotes around them:

> x <- readLines(textConnection("Time,Value
+ 32,-7,183246E-02
+ 32,05,3,469364E-02"))
> # first put quotes around the numbers in scientific notation
> x.new1 <- gsub("(-?[0-9]+,[0-9]+E-?[0-9]+)", '"\\1"', x)
> x.new1
[1] "Time,Value" "32,\"-7,183246E-02\"" "32,05,\"3,469364E-02\""
> # then put quotes around the remaining plain decimal numbers
> x.new2 <- gsub("(-?[0-9]+,[0-9]+)(,|$)", '"\\1"\\2', x.new1)
> x.new2
[1] "Time,Value" "32,\"-7,183246E-02\"" "\"32,05\",\"3,469364E-02\""
> temp <- tempfile()
> writeLines(x.new2, temp)
> x.input <- read.csv(temp)
> x.input
   Time         Value
1    32 -7,183246E-02
2 32,05  3,469364E-02

On Mon, Jul 23, 2012 at 9:06 AM, Guillaume Meurice <guillaume.meur...@igr.fr> wrote:
> Dear all,
>
> I have an encoding problem that I'm not familiar with.
> Here is the case:
> I'm reading data files that may have been generated on a computer with
> global settings in either French or English.
>
> Here is an example of a data file:
>
> * English output
> Time,Value
> 17,-0.0753953
> 17.05,-6.352454E-02
>
> * French output
> Time,Value
> 32,-7,183246E-02
> 32,05,3,469364E-02
>
> In the first case, I can retrieve both columns by splitting each line
> on the comma separator.
> In the second case, that is impossible, since the comma (in French) is also
> used as the decimal mark. Usually, the French CSV format adds quotes to
> distinguish the comma used as column separator from the comma used as
> decimal mark, like the following:
>
> Time,Value
> 32,"-7,183246E-02"
> "32,05","3,469364E-02"
>
> Since I'm expecting 2 numbers, I can decide that if there are 3 commas, the
> first two numbers go together, as do the two remaining ones.
> But with only two commas, which number is the decimal one? (I know that
> it is the second one, but this is a great source of bugs ...)
>
> The Unix tool "file" returns:
> ===
> $ file P23_RD06_High\ Sensitivity\ DNA\ Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv
> $ P23_RD06_High Sensitivity DNA Assay_DE04103664_2012-06-27_11-57-29_Sample1.csv: ASCII text, with CRLF line terminators
> ===
>
> Unfortunately, the raw file doesn't contain the precious quotes. Sorry to
> bother you with this question, which is not entirely related to R (which I'm
> using). Do you know if there are any facilities in R to get the data into
> the right format?
>
> Best,
> --
> Guillaume Meurice - PhD
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
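
Note that x.input in the reply still holds both columns as non-numeric text, because the decimal commas are not parsed as numbers. A minimal sketch of one way to finish the conversion, assuming the two gsub() calls from the reply have produced correctly quoted lines: pass dec = "," to read.csv so that the quoted fields are converted to numeric. The names x.q and x.num are made up for this illustration, and read.csv(text = ...) is used instead of a temporary file only to keep the example self-contained.

```r
# Sample French-locale lines from the thread (decimal comma, no quoting).
x <- c("Time,Value", "32,-7,183246E-02", "32,05,3,469364E-02")

# Same two substitutions as in the reply: quote the numbers in
# scientific notation first, then any remaining plain decimal number.
x.q <- gsub("(-?[0-9]+,[0-9]+E-?[0-9]+)", '"\\1"', x)
x.q <- gsub("(-?[0-9]+,[0-9]+)(,|$)", '"\\1"\\2', x.q)

# dec = "," makes read.csv treat the comma inside each quoted field
# as the decimal mark, so both columns come back as numeric.
x.num <- read.csv(text = x.q, dec = ",")
str(x.num)
```

This only works as well as the quoting step does. A row with exactly two commas and no exponent (for example 32,0,0753) stays ambiguous: the second regular expression would quote its first two numbers together, which may or may not be what the instrument meant.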