My first thought was that it all looked a bit complicated for something that should be straightforward.
I created a file called t.txt. I worked out the way I would have done it and then tested to see which was fastest. One little hiccup is that the two objects are not identical, and I thought they would be. Of course I could have made a typo somewhere. But then there may be something I have not come across. Guess it's time to see what identical really means.

> system.time({
+     file <- read.csv("t.txt",header=F,
+                      col.names =c("c_field_1",
+                                   "n_field_2",
+                                   "d_field_3",
+                                   "d_field_4",
+                                   "n_field_5"),
+                      colClasses = c("character",
+                                     "numeric",
+                                     "character",
+                                     "character",
+                                     "numeric")
+                      )
+     file$d_field_3 <- as.POSIXct(strptime(file$d_field_3,format="%m/%d/%Y" ))
+     file$d_field_4 <- as.POSIXct(strptime(file$d_field_4,format="%m/%d/%Y %I:%M:%S %p" ))
+ })
[1] 0.00 0.00 0.02   NA   NA
>
> read_file <- function(file,nrows=-1) {
+
+     # create temp classes
+     setClass("t_class_",representation("character"))
+     setAs("character", "t_class_", function(from)
+           as.POSIXct(strptime(from,format="%m/%d/%Y")))
+
+     setClass("t_class2_", representation("character"))
+     setAs("character", "t_class2_", function(from)
+           as.POSIXct(strptime(from,format="%m/%d/%Y %I:%M:%S %p")))
+
+     # read the file
+     file <- read.csv(file,
+                      header=FALSE,
+                      comment.char = "",
+                      nrows=nrows,
+                      as.is=FALSE,
+                      col.names=c("c_field_1",
+                                  "n_field_2",
+                                  "d_field_3",
+                                  "d_field_4",
+                                  "n_field_5"),
+                      colClasses=c("character",
+                                   "numeric",
+                                   "t_class_",
+                                   "t_class2_",
+                                   "numeric")
+                      )
+
+     # remove them now that we are done with them
+     removeClass("t_class_")
+     removeClass("t_class2_")
+
+     return(file)
+
+ }
> system.time(file2 <- read_file("t.txt"))
[1] 0.14 0.00 0.16   NA   NA
>
> identical(file, file2)
[1] FALSE
>
> file
  c_field_1 n_field_2  d_field_3           d_field_4 n_field_5
1       MHK     76.53 2004-05-21 2004-05-04 16:00:00        60
2       MHK     76.53 2004-06-21 2004-05-05 16:00:00        60
3       MHK     76.53 2004-07-21 2004-05-06 16:00:00        65
4       MHK     76.53 2004-08-21 2004-05-07 16:00:00        65
5       MHK     76.53 2004-09-21 2004-05-08 16:00:00        70
> file2
  c_field_1 n_field_2  d_field_3           d_field_4 n_field_5
1       MHK     76.53 2004-05-21 2004-05-04 16:00:00        60
2       MHK     76.53 2004-06-21 2004-05-05 16:00:00        60
3       MHK     76.53 2004-07-21 2004-05-06 16:00:00        65
4       MHK     76.53 2004-08-21 2004-05-07 16:00:00        65
5       MHK     76.53 2004-09-21 2004-05-08 16:00:00        70
> str(file)
`data.frame':   5 obs. of  5 variables:
 $ c_field_1: chr  "MHK" "MHK" "MHK" "MHK" ...
 $ n_field_2: num  76.5 76.5 76.5 76.5 76.5
 $ d_field_3:`POSIXct', format: chr  "2004-05-21" "2004-06-21" "2004-07-21" "2004-08-21" ...
 $ d_field_4:`POSIXct', format: chr  "2004-05-04 16:00:00" "2004-05-05 16:00:00" "2004-05-06 16:00:00" "2004-05-07 16:00:00" ...
 $ n_field_5: num  60 60 65 65 70
> str(file2)
`data.frame':   5 obs. of  5 variables:
 $ c_field_1: chr  "MHK" "MHK" "MHK" "MHK" ...
 $ n_field_2: num  76.5 76.5 76.5 76.5 76.5
 $ d_field_3:`POSIXct', format: chr  "2004-05-21" "2004-06-21" "2004-07-21" "2004-08-21" ...
 $ d_field_4:`POSIXct', format: chr  "2004-05-04 16:00:00" "2004-05-05 16:00:00" "2004-05-06 16:00:00" "2004-05-07 16:00:00" ...
 $ n_field_5: num  60 60 65 65 70
>

> -----Original Message-----
> From: Charles and Kimberly Maner [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 8 February 2005 12:08 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] RE: Reading Dates in a csv File
>
> Hi all.  Thanks for all of your help/suggestions.  I found an old email in
> the R-help archives, pieced together a couple things and arrived at the
> solution below.  As an additional followup, I thought I would go ahead and
> post it should other readers come across this same situation.  Here goes..
>
> Raw data:
> MHK,76.53,05/21/2004,5/4/2004 4:00:00 PM,60
> MHK,76.53,06/21/2004,5/5/2004 4:00:00 PM,60
> MHK,76.53,07/21/2004,5/6/2004 4:00:00 PM,65
> MHK,76.53,08/21/2004,5/7/2004 4:00:00 PM,65
> MHK,76.53,09/21/2004,5/8/2004 4:00:00 PM,70
>
> Code:
> read_file <- function(file,nrows=-1) {
>
>     # create temp classes
>     setClass("t_class_",representation("character"))
>     setAs("character", "t_class_", function(from)
>           as.POSIXct(strptime(from,format="%m/%d/%Y")))
>
>     setClass("t_class2_", representation("character"))
>     setAs("character", "t_class2_", function(from)
>           as.POSIXct(strptime(from,format="%m/%d/%Y %I:%M:%S %p")))
>
>     # read the file
>     file <- read.csv(file,
>                      header=FALSE,
>                      comment.char = "",
>                      nrows=nrows,
>                      as.is=FALSE,
>                      col.names=c("c_field_1",
>                                  "n_field_2",
>                                  "d_field_3",
>                                  "d_field_4",
>                                  "n_field_5"),
>                      colClasses=c("character",
>                                   "numeric",
>                                   "t_class_",
>                                   "t_class2_",
>                                   "numeric")
>                      )
>
>     # remove them now that we are done with them
>     removeClass("t_class_")
>     removeClass("t_class2_")
>
>     return(file)
>
> }
>
> If any of you folks know a better way and/or have comments/enhancements to
> this code, feel free to post/email your critique.
>
>
> Thanks,
> Charles
>
>
> > _____________________________________________
> > From: Charles and Kimberly Maner [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, February 03, 2005 8:35 AM
> > To: 'r-help@stat.math.ethz.ch'
> > Subject: Reading Dates in a csv File
> >
> >
> > Hi all.  I'm reading in a flat, comma-delimited flat file using read.csv.
> > It works marvelously for the most part.  I am using the colClasses
> > argument to, basically, create numeric, factor and character classes for
> > the columns I'm reading in.  However, a couple of the fields in the file
> > are date fields.  I'm fully aware that POSIXct can be used as a class,
> > however the field must obey, (I think), the standard/default POSIXct
> > format.  Hence the following question:  Does anyone have a method they can
> > share to read in a non-standard formatted date to convert to POSIXct?  I
> > can read it in then convert it, but that's a two pass approach and not as
> > elegant as a single pass through read.csv.  I've read, from the
> > documentation, that "[o]therwise there needs to be an as method (from
> > package methods) for conversion from "character" to the specified formal
> > class" but I do not know and have not figured out how to do that.
> >
> > Any suggestion(s) would be greatly appreciated.
> >
> >
> > Thanks,
> > Charles
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
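
On the identical() question at the top of this message: I have not established what actually differs between file and file2, so what follows is only a minimal sketch of how one might narrow it down. The helper name compare_frames and the small data frames x and y are made up for illustration and are not from the thread. The idea is to check each pair of columns with identical(), which is strict about storage type and attributes, and with all.equal(), which is more tolerant, and then to look at the attributes of whichever column disagrees.

```r
# Hypothetical helper (not from the thread): report, column by column,
# whether two data frames match strictly (identical) or loosely (all.equal).
compare_frames <- function(a, b) {
  data.frame(
    column    = names(a),
    identical = unname(mapply(identical, a, b)),
    all_equal = unname(mapply(function(p, q) isTRUE(all.equal(p, q)), a, b)),
    stringsAsFactors = FALSE
  )
}

# Made-up stand-in data: the values print the same, but column n is stored
# as double in x and as integer in y.
x <- data.frame(id = c("MHK", "MHK"), n = c(60, 65), stringsAsFactors = FALSE)
y <- data.frame(id = c("MHK", "MHK"), n = c(60L, 65L), stringsAsFactors = FALSE)

# id matches on both tests; n fails identical() but passes all.equal().
print(compare_frames(x, y))

# For a column that disagrees, comparing attributes (class, tzone, and so on)
# is a reasonable next step.
str(attributes(x$n))
str(attributes(y$n))
```

Applied to the session above, this would be compare_frames(file, file2), followed by attributes() on any column where the identical entry is FALSE.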