That's quite nice. Three comments:

- colClasses() in R.utils is similar to expandClasses() here, except for
  the particular codes and classes supported.

- Not sure if this is important, but if as() were the last possibility
  tried rather than the first, then in most cases (in fact, all cases
  handled by expandClasses()) there would be no use of the methods
  package.

- paste("as", ...) handles all the common cases, including all cases
  handled by expandClasses() except NA_character_, and could be used as
  a poor man's doCoerce().

On Thu, Jun 25, 2009 at 3:43 AM, Bengoechea Bartolomé Enrique (SIES 73)
<enrique.bengoec...@credit-suisse.com> wrote:
> Hi Mark,
>
> I frequently need to do that when importing data. This one-liner works:
>
>> data.frame(mapply(as, x, c("integer", "character", "factor"),
>>     SIMPLIFY=FALSE), stringsAsFactors=FALSE)
>
> but it has two problems:
>
> 1) as() is an S4 method that does not always work
> 2) writing the vector of classes for 60 variables is rather tedious.
>
> Both issues can be solved with the following two helper functions. The first
> function tries to use as(x, class); if that doesn't work, it tries
> as.<class>(x); if that still doesn't work, it tries <class>(x). The second
> function transforms a single string into a character vector of classes by
> mapping each letter in the string to a class name (i.e. "D" is transformed
> to "Date", "i" to "integer", etc.), so that writing 60 classes is fast.
>
> doCoerce <- function(x, class) {
>     # Try S4 coercion first, then as.<class>(x), then <class>(x).
>     if (canCoerce(x, class))
>         as(x, class)
>     else {
>         result <- try(match.fun(paste("as", class, sep="."))(x),
>                       silent=TRUE)
>         if (inherits(result, "try-error"))
>             result <- match.fun(class)(x)
>         result
>     }
> }
>
> expandClasses <- function (x) {
>     # Map each one-letter code in x to a class name; warn on unknown codes.
>     unknowns <- character(0)
>     result <- lapply(strsplit(as.character(x), NULL, fixed = TRUE),
>         function(y) {
>             sapply(y, function(z) switch(z,
>                 i = "integer", n = "numeric",
>                 l = "logical", c = "character", x = "complex",
>                 r = "raw", f = "factor", D = "Date", P = "POSIXct",
>                 t = "POSIXlt", N = NA_character_, {
>                     unknowns <<- c(unknowns, z)
>                     NA_character_
>                 }), USE.NAMES = FALSE)
>         })
>     if (length(unknowns)) {
>         unknowns <- unique(unknowns)
>         warning(sprintf(ngettext(length(unknowns), "code %s not recognized",
>             "codes %s not recognized"),
>             paste(dQuote(unknowns), collapse=", ")))
>     }
>     result
> }
>
> An example:
>
>> x <- data.frame(X="2008-01-01", Y=1.1:3.1, Z=letters[1:3])
>> data.frame(mapply(doCoerce, x, expandClasses("Dif")[[1L]], SIMPLIFY=FALSE),
>>     stringsAsFactors=FALSE)
>
> Regards,
>
> Enrique
>
>
> ------------------------------
>
> Message: 99
> Date: Tue, 23 Jun 2009 15:23:54 -0600
> From: Mark Na <mtb...@gmail.com>
> Subject: [R] Apply as.factor (or as.numeric etc) to multiple columns
> To: r-help@r-project.org
> Message-ID:
>        <e40d78ce0906231423m4c3da14i2f6270f92463c...@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi R-helpers,
>
> I have a dataframe with 60 columns and I would like to convert several
> columns to factor, others to numeric, and yet others to dates. Rather
> than having 60 lines like this:
>
> data$Var1<-as.factor(data$Var1)
>
> I wonder if it's possible to write one line of code (per data type,
> e.g. factor) that would apply a function (e.g., as.factor) to several
> (non-contiguous) columns. So, I could then use 3 or 4 lines of code
> (for 3 or 4 data types) instead of 60.
>
> I have tried writing an apply function, but it failed.
>
> Thanks for any help you might be able to provide.
>
> Mark Na
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
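
A minimal sketch of the "poor man's doCoerce()" idea from the third comment, for anyone who wants to try it. The name poorMansCoerce and the column selection are hypothetical, not from the thread, and treating an NA class as "leave the column unchanged" is an assumption about what the N code is meant to do. The last two lines show one plain way to handle the original question (one line per data type for non-contiguous columns).

```r
# Hypothetical helper (not from the thread): call as.<class>() directly,
# so the methods package is never involved.
poorMansCoerce <- function(x, class) {
  # Assumption: an NA class means "leave this column unchanged".
  if (is.na(class)) return(x)
  match.fun(paste("as", class, sep = "."))(x)
}

x <- data.frame(X = "2008-01-01", Y = c(1.1, 2.1, 3.1), Z = letters[1:3],
                stringsAsFactors = FALSE)
classes <- c("Date", "integer", "factor")

# Coerce each column to the class at the same position in `classes`.
y <- data.frame(mapply(poorMansCoerce, x, classes, SIMPLIFY = FALSE),
                stringsAsFactors = FALSE)
str(y)

# One line per data type for several non-contiguous columns.
factorCols <- c("X", "Z")  # hypothetical column selection
x[factorCols] <- lapply(x[factorCols], as.factor)
```

This only covers classes that have an as.<class>() function, which is why doCoerce() in the quoted message keeps as() and <class>(x) as alternatives.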