Hello Mr. Holtman,

Thank you very much for your reply and suggestion. This is what each year's data looks like:

    tmp1 <- structure(list(FIPS = c(1001L, 1003L, 1005L),
        X2026.01.01.1 = c(285.5533142, 285.5533142, 286.2481079),
        X2026.01.01.2 = c(283.4977112, 283.4977112, 285.0860291),
        X2026.01.01.3 = c(281.9733887, 281.9733887, 284.1548767),
        X2026.01.01.4 = c(280.0234985, 280.0234985, 282.6075745),
        X2026.01.01.5 = c(278.7125854, 278.7125854, 281.2553711),
        X2026.01.01.6 = c(278.5204773, 278.5204773, 280.6148071)),
        .Names = c("FIPS", "X2026.01.01.1", "X2026.01.01.2", "X2026.01.01.3",
        "X2026.01.01.4", "X2026.01.01.5", "X2026.01.01.6"),
        class = "data.frame", row.names = c(NA, -3L))

The data is in 3-hour blocks for every day by US FIPS code from 2026-2045; each year's data is in a different csv. My goal is to compute max, min, and mean by week and month. I used the following code to assign week numbers to the observations:

    nweek <- function(x, format="%Y-%m-%d", origin){
      if(missing(origin)){
        as.integer(format(strptime(x, format=format), "%W"))
      }else{
        x <- as.Date(x, format=format)
        o <- as.Date(origin, format=format)
        w <- as.integer(format(strptime(x, format=format), "%w"))
        2 + as.integer(x - o - w) %/% 7
      }
    }

Then the following:

    for (i in filelist) {
      nweek(tmp2$date)
    }
    for (i in filelist) {
      nweek(dates, origin="2026-01-01")
    }
    for (i in filelist) {
      wkn <- nweek(tmp2$date)
    }

Is this efficient? Thank you so much again. I really appreciate it.

Sincerely,

Shouro

On Sun, Feb 1, 2015 at 1:22 AM, jim holtman <jholt...@gmail.com> wrote:

> It would have been nice if you had at least supplied a subset (~10 lines)
> from a couple of files so we could see what the data looks like and test
> out any solution. Since you are using 'data.table', you should probably
> also use 'fread' for reading in the data.
> Here is a possible approach of reading the data into a list and then
> creating a single, large data.table:
>
> -------
> myDTs <- lapply(filelist, function(.file) {
>     tmp1 <- fread(.file, sep=",")
>     tmp2 <- melt(tmp1, id="FIPS")
>     tmp2$year <- as.numeric(substr(tmp2$variable,2,5))
>     tmp2$month <- as.numeric(substr(tmp2$variable,7,8))
>     tmp2$day <- as.numeric(substr(tmp2$variable,10,11))
>     tmp2  # return value
> })
>
> bigDT <- rbindlist(myDTs)  # rbind all the data.tables together
>
> # then you should be able to do:
>
> mean.temp <- bigDT[, list(temp.mean=lapply(.SD, mean)),
>                    by=c("FIPS","year","month"), .SDcols=c("temp")]
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Sat, Jan 31, 2015 at 5:57 PM, Shouro Dasgupta <sho...@gmail.com> wrote:
>
>> I have climate data for 20 years for US counties (FIPS) in csv format;
>> each file represents one year of data. I have extracted the data and
>> reshaped the yearly data files using melt():
>>
>> > for (i in filelist) {
>> >   tmp1 <- as.data.table(read.csv(i,header=T, sep=","))
>> >   tmp2 <- melt(tmp1, id="FIPS")
>> >   tmp2$year <- as.numeric(substr(tmp2$variable,2,5))
>> >   tmp2$month <- as.numeric(substr(tmp2$variable,7,8))
>> >   tmp2$day <- as.numeric(substr(tmp2$variable,10,11))
>> > }
>>
>> Should I *rbind* in the loop here, as I have the memory?
>> So, the file (i) tmp2 looks like this:
>>
>> > FIPS  temp      year  month  date
>> > 1001  276.7936  2045  1      1/1/2045
>> > 1003  276.7936  2045  1      1/1/2045
>> > 1005  279.6452  2045  1      1/1/2045
>> > 1007  276.7936  2045  1      1/1/2045
>> > 1009  272.3748  2045  1      1/1/2045
>> > 1011  279.6452  2045  1      1/1/2045
>>
>> My goal is to calculate the mean by FIPS code by month/week; however,
>> when I use the following code, I get a NULL value.
>>
>> > mean.temp <- for (i in filelist) {tmp2[, list(temp.mean=lapply(.SD, mean)),
>> >     by=c("FIPS","year","month"), .SDcols=c("temp")]}
>>
>> This works fine for individual years but not with *for (i in filelist)*.
>> What am I doing wrong? Can I include an rbind/rbindlist in the loop to
>> make a big data.frame? Any suggestions will be highly appreciated.
>> Thank you.
>>
>> Sincerely,
>>
>> Shouro
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
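
A note on the NULL result asked about in the thread: in R, a for loop always returns NULL, so assigning a loop to mean.temp cannot capture the per-file results. The sketch below is one possible way to combine the suggestions above (reshape each year, bind everything, then aggregate). It is not code from either poster. The helper name reshape_year and the column names slot, temp, and week are assumptions made for illustration, and the sample table is a three-column subset of the dput() output quoted in the thread rather than a real csv file.

```r
library(data.table)

# Subset of one year's wide table, copied from the dput() output in the thread.
tmp1 <- data.table(
  FIPS = c(1001L, 1003L, 1005L),
  X2026.01.01.1 = c(285.5533142, 285.5533142, 286.2481079),
  X2026.01.01.2 = c(283.4977112, 283.4977112, 285.0860291),
  X2026.01.01.3 = c(281.9733887, 281.9733887, 284.1548767)
)

# Hypothetical helper: reshape one wide yearly table to long form
# and derive date, year, month, and week columns from the column names.
reshape_year <- function(wide) {
  long <- melt(wide, id.vars = "FIPS",
               variable.name = "slot", value.name = "temp")
  # Column names look like "X2026.01.01.1"; characters 2-11 hold the date.
  long[, date := as.Date(substr(as.character(slot), 2, 11),
                         format = "%Y.%m.%d")]
  long[, `:=`(year  = year(date),
              month = month(date),
              week  = as.integer(format(date, "%W")))]  # Monday-based week number
  long[]
}

# With real files this would be something like:
#   lapply(filelist, function(f) reshape_year(fread(f)))
bigDT <- rbindlist(lapply(list(tmp1), reshape_year))

# max, min, and mean by FIPS within each month and within each week
monthly <- bigDT[, .(temp.max = max(temp), temp.min = min(temp),
                     temp.mean = mean(temp)),
                 by = .(FIPS, year, month)]
weekly <- bigDT[, .(temp.max = max(temp), temp.min = min(temp),
                    temp.mean = mean(temp)),
                by = .(FIPS, year, week)]
```

Because the week number is computed once as a vectorised column, no per-file loop over nweek() is needed. Whether "%W" (weeks starting on Monday, counted within each calendar year) matches the intended definition of a week is a choice the analyst would need to confirm.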