Looks very neat, Gabor! I just cannot fathom why anyone would want to write numerics with those separators in a flat file. Such a file is usually not for human consumption, and computers don't need those separators!
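For readers who see this message before the quoted thread, here is a minimal sketch of the idea being praised: strip the separators while the column is read, so that no post-processing is needed. The class name `num.with.commas` and the sample data are made up for illustration, and `read.table(text = ...)` assumes a newer R than the one used in the thread (older versions need `textConnection()`, as in Gabor's post).

```r
library(methods)

# Hypothetical class name; it only serves as a label for colClasses.
setClass("num.with.commas")

# Coercion that read.table() applies to columns declared as "num.with.commas".
# gsub() removes every comma; sub() would remove only the first one.
setAs("character", "num.with.commas",
      function(from) as.numeric(gsub(",", "", from, fixed = TRUE)))

# Made-up sample data with one and two separators per value.
input <- "A B
1,000 1
2,000 2
1,123,456.789 3
"

df <- read.table(text = input, header = TRUE,
                 colClasses = c("num.with.commas", "numeric"))
str(df)

# Contrast between sub() and gsub() on a value with two separators.
first_only <- sub(",", "", "1,123,456.789", fixed = TRUE)   # "1123,456.789"
all_gone   <- gsub(",", "", "1,123,456.789", fixed = TRUE)  # "1123456.789"
```

The `sub()` result still contains a comma, so `as.numeric()` would turn it into `NA` with a warning; that is the reason the thread settles on `gsub()`.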
Andy

From: Gabor Grothendieck
>
> That could be accomplished using a custom class like this:
>
> library(methods)
> setClass("num.with.junk")
> setAs("character", "num.with.junk",
>       function(from) as.numeric(gsub(",", "", from)))
>
>
> ### test ###
>
> Input <- "A B
> 1,000 1
> 2,000 2
> 3,000 3
> "
> DF <- read.table(textConnection(Input), header = TRUE,
>       colClasses = c("num.with.junk", "numeric"))
> str(DF)
>
>
>
> On 4/30/07, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> > Still, though, it would be nice to have the data read in correctly in
> > the first place, instead of having to do this kind of post-processing
> > afterwards...
> >
> > Andy
> >
> > From: Bert Gunter
> > >
> > > Nothing! My mistake! gsub -- not sub -- is what you want to
> > > get 'em all.
> > >
> > > -- Bert
> > >
> > >
> > > Bert Gunter
> > > Genentech Nonclinical Statistics
> > >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Marc Schwartz
> > > Sent: Monday, April 30, 2007 10:18 AM
> > > To: Bert Gunter
> > > Cc: r-help@stat.math.ethz.ch
> > > Subject: Re: [R] thousand separator (was RE: weight)
> > >
> > > Bert,
> > >
> > > What am I missing?
> > >
> > > > print(as.numeric(gsub(",", "", "1,123,456.789")), 10)
> > > [1] 1123456.789
> > >
> > >
> > > FWIW, this is using:
> > >
> > > R version 2.5.0 Patched (2007-04-27 r41355)
> > >
> > > Marc
> > >
> > > On Mon, 2007-04-30 at 10:13 -0700, Bert Gunter wrote:
> > > > Except this doesn't work for "1,123,456.789" Marc.
> > > >
> > > > I hesitate to suggest it, but gregexpr() will do it, as it captures the
> > > > position of **every** match to ",". This could be then used to process the
> > > > vector via some sort of loop/apply statement.
> > > >
> > > > But I think there **must** be a more elegant way using regular expressions
> > > > alone, so I, too, await a clever reply.
> > > >
> > > > -- Bert
> > > >
> > > >
> > > > Bert Gunter
> > > > Genentech Nonclinical Statistics
> > > >
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > [mailto:[EMAIL PROTECTED] On Behalf Of Marc Schwartz
> > > > Sent: Monday, April 30, 2007 10:02 AM
> > > > To: Liaw, Andy
> > > > Cc: r-help@stat.math.ethz.ch
> > > > Subject: Re: [R] thousand separator (was RE: weight)
> > > >
> > > > One possibility would be to use something like the following
> > > > post-import:
> > > >
> > > > > WTPP
> > > > [1] 1,106.8250 1,336.5138
> > > >
> > > > > str(WTPP)
> > > > Factor w/ 2 levels "1,106.8250","1,336.5138": 1 2
> > > >
> > > > > as.numeric(gsub(",", "", WTPP))
> > > > [1] 1106.825 1336.514
> > > >
> > > >
> > > > Essentially strip the ',' characters from the factors and then coerce
> > > > the resultant character vector to numeric.
> > > >
> > > > HTH,
> > > >
> > > > Marc Schwartz
> > > >
> > > >
> > > > On Mon, 2007-04-30 at 12:26 -0400, Liaw, Andy wrote:
> > > > > I've run into this occasionally. My current solution is simply to read
> > > > > it into Excel, re-format the offending column(s) by unchecking the
> > > > > "thousand separator" box, and write it back out. Not exactly ideal to
> > > > > say the least. If anyone can provide a better solution in R, I'm all
> > > > > ears...
> > > > >
> > > > > Andy
> > > > >
> > > > > From: Natalie O'Toole
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > These are the variables in my file. I think the variable I'm having
> > > > > > problems with is WTPP which is of the Factor type. Does anyone know
> > > > > > how to fix this, please?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Nat
> > > > > >
> > > > > > 'data.frame': 290 obs. of 5 variables:
> > > > > > $ PROV : num 48 48 48 48 48 48 48 48 48 48 ...
> > > > > > $ REGION: num 4 4 4 4 4 4 4 4 4 4 ...
> > > > > > $ GRADE : num 7 7 7 7 7 7 7 7 7 7 ...
> > > > > > $ Y_Q10A: num 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 ...
> > > > > > $ WTPP : Factor w/ 1884 levels "1,106.8250","1,336.5138",..: 1544 67 1568 40 221 1702 1702 1434 310 310 ...
> > > > > >
> > > > > >
> > > > > > __________________
> > > > > >
> > > > > >
> > > > > > --- Douglas Bates <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > > On 4/28/07, John Kane <[EMAIL PROTECTED]> wrote:
> > > > > > > > IIRC you have a yes/no smoking variable scored 1/2?
> > > > > > > >
> > > > > > > > It is possibly being read in as a factor not as an
> > > > > > > > integer.
> > > > > > > >
> > > > > > > > try
> > > > > > > > class(df$smoking.variable)
> > > > > > > > to see.
> > > > > > >
> > > > > > > Good point. In general I would recommend using
> > > > > > >
> > > > > > > str(df)
> > > > > > >
> > > > > > > to check on the class or storage type of all variables in a data frame
> > > > > > > if you are getting unexpected results when manipulating it. That
> > > > > > > function is carefully written to provide a maximum of information in a
> > > > > > > minimum of space.
> > > > > >
> > > > > > Yes but I'm a relative newbie at R and didn't realise
> > > > > > that str() would do that. I always thought it was
> > > > > > some kind of string function.
> > > > > >
> > > > > > Thanks, it makes life much easier.
> > > > > > > >
> > > > > > > > --- Natalie O'Toole <[EMAIL PROTECTED]> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I'm getting an error message:
> > > > > > > > >
> > > > > > > > > Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator
> > > > > > > > > In addition: Warning message:
> > > > > > > > > Incompatible methods ("Ops.data.frame", "Ops.factor") for "*"
> > > > > > > > >
> > > > > > > > > here is my code:
> > > > > > > > >
> > > > > > > > > ##reading in the file
> > > > > > > > > happyguys<-read.table("c:/test4.dat", header=TRUE, row.names=1)
> > > > > > > > >
> > > > > > > > > ##subset the file based on Select If
> > > > > > > > >
> > > > > > > > > test<-subset(happyguys, PROV==48 & GRADE == 7 & Y_Q10A < 9)
> > > > > > > > >
> > > > > > > > > ##sorting the file
> > > > > > > > >
> > > > > > > > > mydata<-test
> > > > > > > > > mydataSorted<-mydata[ order(mydata$Y_Q10A), ]
> > > > > > > > > print(mydataSorted)
> > > > > > > > >
> > > > > > > > > ##assigning a different name to file
> > > > > > > > >
> > > > > > > > > happyguys<-mydataSorted
> > > > > > > > >
> > > > > > > > > ##trying to weight my data
> > > > > > > > >
> > > > > > > > > data.frame<-happyguys
> > > > > > > > > df<-data.frame
> > > > > > > > > df1<-df[, 1:4] * df[, 5]
> > > > > > > > >
> > > > > > > > > ##getting error message here??
> > > > > > > > >
> > > > > > > > > Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator
> > > > > > > > > In addition: Warning message:
> > > > > > > > > Incompatible methods ("Ops.data.frame", "Ops.factor") for "*"
> > > > > > > > >
> > > > > > > > > Does anyone know what this error message means?
> > > > > > > > >
> > > > > > > > > I've been reviewing R code all day & getting more
> > > > > > > > > familiar with it
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Nat
> > > >
> > > > ______________________________________________
> > > > R-help@stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> >
> > ------------------------------------------------------------------------------
> > Notice: This e-mail message, together with any attachments,...{{dropped}}
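The diagnosis reached in the thread above can be reproduced in a few lines. This is only a sketch with made-up data: `dat` is a hypothetical stand-in for the poster's data frame, and `stringsAsFactors = TRUE` is set explicitly because that was R's default at the time of the thread but is no longer the default in R 4.0.0 and later.

```r
# Made-up stand-in for the poster's data: WTPP contains thousands separators,
# so it cannot be parsed as numeric and ends up as a factor.
dat <- data.frame(PROV  = c(48, 48, 48),
                  GRADE = c(7, 7, 7),
                  WTPP  = c("1,106.8250", "1,336.5138", "1,106.8250"),
                  stringsAsFactors = TRUE)

# str() lists the class of every column; WTPP shows up as a Factor.
str(dat)

# as.numeric() applied directly to a factor returns the internal level codes,
# not the values, so go through character and strip the commas first.
codes <- as.numeric(dat$WTPP)
dat$WTPP <- as.numeric(gsub(",", "", as.character(dat$WTPP), fixed = TRUE))

# The weighting step from the original post now operates on numeric columns.
weighted <- dat[, 1:2] * dat[, 3]
```

Multiplying by the factor column instead of the converted one is what triggers the "non-numeric argument to binary operator" error quoted in the original post.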