In the real data the months are all complete, but the years can be missing. So years can be missing up front, in the middle, or at the end, but if a year is present then every month has a value or NA.
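To make that layout concrete, here is a minimal base-R sketch of padding an absent year with NA rows so that a regular monthly ts can be built. The toy data frame `station`, its values, and the names `all_years`, `padded`, and `monthly` are made up for illustration and are not from the real data set.

```r
# Hypothetical toy data: one station, months Jan..Dec as columns, year 1992 absent.
station <- data.frame(Year = c(1990, 1991, 1993),
                      matrix(1:36, nrow = 3, byrow = TRUE,
                             dimnames = list(NULL, month.abb)))

# Full run of years for this station; the left join turns absent years into NA rows.
all_years <- data.frame(Year = seq(min(station$Year), max(station$Year)))
padded <- merge(all_years, station, by = "Year", all.x = TRUE)

# Flatten year by year (transpose, then read column-wise) into one regular monthly series.
monthly <- ts(c(t(as.matrix(padded[month.abb]))),
              frequency = 12, start = c(min(padded$Year), 1))
```

This only fills gaps between a station's first and last year. To line up years missing up front or at the end across stations, `all_years` would need to span a common range instead.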
To create regular R ts objects I had to plow through the data frame, collect a year, and calculate an index to put it into the final time series. I had tried zoo and it handled the irregularly spaced data, but a large data structure of zoo objects had stumped me, especially since I need to do matching and selecting of the zoo objects.

In the real data, there are about 7000 time series of 1500 months, and those 7000 get averaged and combined in different ways.

On Sat, Aug 7, 2010 at 8:45 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:

> On Sat, Aug 7, 2010 at 9:18 PM, steven mosher <mosherste...@gmail.com> wrote:
> > Very Slick.
> > Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
> > bunch of working code.
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> > Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> > Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
> > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
> > Data
> >    Index Year Jan  Feb Mar Apr Jun
> > 1  67543 1989  12  6.0  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14
> > 3  67543 1991  34 17.0  34  34  34
> > 4  67543 1992  21 10.5  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54
> > 6  12345 1993  65 32.5  65  65  65
> > 7  12345 1994  23 11.5  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12
> > 9  89765 1992  13  6.5  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13
> > 11 89765 1994  13  6.5  13  13  13
> > 12 89765 1995  14  7.0  14  14  14
> >
> > # Gabor's solution
> > f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> > do.call(cbind, by(Data, Data$Index, f))
> > 12345 67543 89765
>
> The original data had consecutive months in each series (actually
> there was a missing 1992 in one case but I assumed that was an
> inadvertent omission and the actual data was complete); however, here
> we have missing 6 month chunks in addition. That makes the series
> non-consecutive so to solve that we could either apply this to the
> data (after putting the missing 1992 year back in):
>
> Data <- cbind(Data, NA, NA, NA, NA, NA, NA)
>
> or we could use a time series class that can handle irregularly spaced
> data:
>
> library(zoo)
> f <- function(x) {
>   dat <- x[-(1:2)]
>   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
>   zoo(c(as.matrix(dat)), tim)
> }
> do.call(cbind, by(Data, Data$Index, f))
>
> The last line is unchanged from before. This code will also handle
> the original situation correctly even if the missing 1992 is truly
> missing.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.