2009/9/26 "Jens Oehlschlägel" <oehl_l...@gmx.de>:
> Hi,
>
> Is there any official way to determine the colClasses of a data.frame?
> Why has POSIXct such a strange class structure?
> Why is colClasses "ordered" not allowed (and doesn't work)?
>
> Background
> ==========
> I am writing a chunked csv reader that provides the functionality of
> read.table for large files (in the next version of package ff). In chunked
> reading, one wants to learn the colClasses from the data.frame returned for
> the first chunk and submit this as argument colClasses= to the following
> chunks (following calls to read.table).
>
> for most column types
>   colClasses <- sapply(data.frame, class)
> works fine. However, two column types have more than one class:
>
> "ordered" has c("ordered", "factor") - currently we can't tell read.table
> that a column is an ordered factor

Possibly more complex than one would wish, but it is possible to do this:

Lines <- "A
B
D
C"
setOldClass("ordered")
setAs("character", "ordered", function(from) ordered(from))
DF <- read.table(textConnection(Lines), colClasses = "ordered")
str(DF)

> "POSIXct" has c("POSIXt","POSIXct") - here the LESS specific class "POSIXt"
> is in the first position and would win in class-dispatch over the MORE
> specific class "POSIXct". Why?
>

It's a historical error that is too late to correct now. See the discussion
in Chambers' recent book.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help