On Fri, Jan 13, 2012 at 7:02 AM, Francisco <franciscororol...@google.com> wrote:
> Hello,
> I have a csv file with many variables, both characters and integers.
> I would like to load it on R and do some operations on integer variables,
> the problem is that R loads the entire dataset considering all variables as
> characters, instead I would like that R makes the distinction between the
> two types, because there are too many variables to do:
> x1<-as.integer(x1)
> x2<-as.integer(x2)
> x3<-as.integer(x3)
> ...
>
> I tried to specify read.table(... stringsAsFactors=FALSE) but it doesn't
> work.

There must be non-integers in some of the columns that are supposed to
be integer.  Let's assume that the first row has no such garbage.  Then
we can get the desired classes from that row and apply them to the
entire data frame.  In this example the second column has such garbage:

# test data
Lines <- "a,b,c
D,2,3
a,b,9
C,5,6"

# read in just row 1, and read in all rows
DF1 <- read.csv(text = Lines, nrows = 1, as.is = TRUE)
DF <- DF0 <- read.csv(text = Lines, as.is = TRUE)

# there will be a warning as it converts the garbage to NAs
to.int <- function(v, v1) if (inherits(v1, "integer")) as.integer(v) else v
DF <- mapply(to.int, DF0, DF1, SIMPLIFY = FALSE)
DF <- as.data.frame(DF)

As we see here, the second column becomes integer despite the garbage in it:

> str(DF0) # as read in
'data.frame':   3 obs. of  3 variables:
 $ a: chr  "D" "a" "C"
 $ b: chr  "2" "b" "5"
 $ c: int  3 9 6
> str(DF) # as converted
'data.frame':   3 obs. of  3 variables:
 $ a: Factor w/ 3 levels "a","C","D": 3 1 2
 $ b: int  2 NA 5
 $ c: int  3 9 6

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
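
Before coercing, it may help to see which entries are the garbage that would be turned into NA. The following is only a sketch and not part of the reply above: the helper name bad.entries is hypothetical, and it relies on the same assumption that the first data row is clean. It reuses the test data from the example and, for each column whose first-row value is an integer, lists the entries that do not look like integers.

```r
# Same test data as in the example above
Lines <- "a,b,c
D,2,3
a,b,9
C,5,6"

DF1 <- read.csv(text = Lines, nrows = 1, as.is = TRUE)  # first row only
DF0 <- read.csv(text = Lines, as.is = TRUE)             # all rows

# Hypothetical helper: v is a full column, v1 the same column read from
# row 1 only.  Returns the entries of v that do not parse as integers.
bad.entries <- function(v, v1) {
  # nothing to report if the column is not meant to be integer,
  # or if it was already read as integer
  if (!inherits(v1, "integer") || is.integer(v)) return(character(0))
  v <- as.character(v)
  ok <- grepl("^[[:space:]]*[-+]?[0-9]+[[:space:]]*$", v)
  v[!ok & !is.na(v)]
}

bad <- Map(bad.entries, DF0, DF1)   # one result per column, named by column
bad <- bad[lengths(bad) > 0]        # keep only columns with offending entries
print(bad)
```

With the test data this should report the single entry "b" in column b. On a real file the list shows which columns need cleaning before as.integer() is applied.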