Re: [R] Data frame reordering to time series
On Sun, Aug 8, 2010 at 5:54 PM, steven mosher wrote:
> z<-as.zooreg(as.ts(g))
> > z
>          X12345 X34567 X56789
> 1989(1)      NA      3      6
> 1989(2)      NA      3      6
> 1989(3)      NA      3      6
> 1989(4)      NA      3      6
> 1989(5)      NA      3      6
> 1989(6)      NA      3      6
> 1989(7)      NA      3      6
> 1989(8)      NA      3      6
> 1989(9)      NA      3      6
> 1989(10)     NA      3      6
> 1989(11)     NA      3      6
> 1989(12)     NA      3      6
> 1990(1)       2      4      6
> 1990(2)       2      4      6
> 1990(3)       2      4      6
> 1990(4)       2      4      6
> 1990(5)       2      4      6
> 1990(6)       2      4      6
> 1990(7)       2      4      6
> 1990(8)       2      4      6
> 1990(9)       2      4      6
> 1990(10)      2      4      6
> 1990(11)      2      4      6
> 1990(12)      2      4      6
> 1991(1)      NA      5     NA
> 1991(2)      NA      5     NA
> 1991(3)      NA      5     NA
> 1991(4)      NA      5     NA
> 1991(5)      NA      5     NA
> 1991(6)      NA      5     NA
> 1991(7)      NA      5     NA
> 1991(8)      NA      5     NA
> 1991(9)      NA      5     NA
> 1991(10)     NA      5     NA
> 1991(11)     NA      5     NA
> 1991(12)     NA      5     NA
> 1992(1)       2     NA     NA
> 1992(2)       2     NA     NA
>
> ***
> The interesting thing is the change from months to the (1)...

as.zooreg converts a ts series to a zooreg series with a numeric index and
the same frequency. You can convert the index to "yearmon" class if you wish:

z <- as.zooreg(as.ts(g))
time(z) <- as.yearmon(time(z))

or

z <- aggregate(as.zooreg(as.ts(g)), as.yearmon, identity)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
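A small sketch of how the two index styles relate, in case it helps. It uses only base R on a made-up monthly series (`s` is hypothetical, not the `g` from this thread) and rebuilds "Jan 1989"-style labels from the numeric index that the 1989(1) display is based on. With zoo installed, `as.yearmon(time(z))` as suggested in the message should do the same job directly.

```r
# Hypothetical monthly series: 14 observations starting January 1989.
s <- ts(1:14, start = c(1989, 1), frequency = 12)

idx <- as.numeric(time(s))          # numeric index: year + (month - 1)/12
yr  <- floor(idx + 1e-8)            # small offset guards against rounding error
mo  <- as.integer(cycle(s))         # position within the year, 1..12

labels <- paste(month.abb[mo], yr)  # "Jan 1989", "Feb 1989", ...
print(head(labels, 3))
print(tail(labels, 2))
```

The 13th and 14th labels come out as "Jan 1990" and "Feb 1990", which correspond to 1990(1) and 1990(2) in the zooreg display.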
Re: [R] Data frame reordering to time series
Thanks again,

They worked for me as well. I did a simpler example with fewer years just to
show that it worked (shortened here for display):

f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> g<-do.call(cbind, by(Data, Data$Index, f))
> g
         X12345 X34567 X56789
Jan 1989     NA      3      6
Feb 1989     NA      3      6
Mar 1989     NA      3      6
Apr 1989     NA      3      6
May 1989     NA      3      6
Jun 1989     NA      3      6
Jul 1989     NA      3      6
Aug 1989     NA      3      6
Sep 1989     NA      3      6
Oct 1989     NA      3      6
Nov 1989     NA      3      6
Dec 1989     NA      3      6
Jan 1990      2      4      6
Feb 1990      2      4      6
Mar 1990      2      4      6
Apr 1990      2      4      6
May 1990      2      4      6
Jun 1990      2      4      6
Jul 1990      2      4      6
Aug 1990      2      4      6
Sep 1990      2      4      6
Oct 1990      2      4      6
Nov 1990      2      4      6
Dec 1990      2      4      6
Jan 1991     NA      5     NA
.
z<-as.zooreg(as.ts(g))
> z
         X12345 X34567 X56789
1989(1)      NA      3      6
1989(2)      NA      3      6
1989(3)      NA      3      6
1989(4)      NA      3      6
1989(5)      NA      3      6
1989(6)      NA      3      6
1989(7)      NA      3      6
1989(8)      NA      3      6
1989(9)      NA      3      6
1989(10)     NA      3      6
1989(11)     NA      3      6
1989(12)     NA      3      6
1990(1)       2      4      6
1990(2)       2      4      6
1990(3)       2      4      6
1990(4)       2      4      6
1990(5)       2      4      6
1990(6)       2      4      6
1990(7)       2      4      6
1990(8)       2      4      6
1990(9)       2      4      6
1990(10)      2      4      6
1990(11)      2      4      6
1990(12)      2      4      6
1991(1)      NA      5     NA
1991(2)      NA      5     NA
1991(3)      NA      5     NA
1991(4)      NA      5     NA
1991(5)      NA      5     NA
1991(6)      NA      5     NA
1991(7)      NA      5     NA
1991(8)      NA      5     NA
1991(9)      NA      5     NA
1991(10)     NA      5     NA
1991(11)     NA      5     NA
1991(12)     NA      5     NA
1992(1)       2     NA     NA
1992(2)       2     NA     NA

***
The interesting thing is the change from months to the (1)...

On Sun, Aug 8, 2010 at 8:55 AM, Gabor Grothendieck wrote:
> On Sun, Aug 8, 2010 at 11:21 AM, steven mosher wrote:
> > Ok,
> > I'm a bit confused by what you mean by "regularly spaced".
> > After I do the do.call I do get a data structure with all the times
> > present, and every time has an NA or a data value.
> > Steve
> >
>
> Regularly spaced means that every observation is one month later than
> the prior. If there are missing 6 month chunks or missing entire
> years then the observations are not regularly spaced since there are
> some months not present.
>
> It works for me:
>
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> > Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> > Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
> > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> + Oct=Values,Nov=Values,Dec=Values2)
> >
> > library(zoo)
> > f <- function(x) {
> +   dat <- x[-(1:2)]
> +   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
> +   zoo(c(as.matrix(dat)), tim)
> + }
> > do.call(cbind, by(Data, Data$Index, f))
>          X12345 X67543 X89765
> Jan 1989     NA  12.00     NA
> Feb 1989     NA   6.00     NA
> Mar 1989     NA  12.00     NA
> Apr 1989     NA  12.00     NA
> May 1989     NA  12.00     NA
> Jun 1989     NA   4.00     NA
> Jul 1989     NA  12.00     NA
> Aug 1989     NA  12.00     NA
> Sep 1989     NA  12.00     NA
> Oct 1989     NA  12.00     NA
> Nov 1989     NA  12.00     NA
> Jan 1990     NA  14.00     NA
> Feb 1990     NA   7.00     NA
> Mar 1990     NA     NA     NA
> Apr 1990     NA     NA     NA
> May 1990     NA  14.00     NA
> Jun 1990     NA   4.67     NA
> Jul 1990     NA     NA     NA
> Aug 1990     NA  14.00     NA
> Sep 1990     NA  14.00     NA
> Oct 1990     NA  14.00     NA
> Nov 1990     NA     NA     NA
> Jan 1991  54.00  34.00  12.00
> Feb 1991  27.00  17.00   6.00
> Mar 1991     NA  34.00     NA
> Apr 1991     NA  34.00     NA
> May 1991  54.00  34.00  12.00
> Jun 1991  18.00  11.33   4.00
> Jul 1991     NA  34.00     NA
> Aug 1991  54.00  34.00  12.00
> Sep 1991  54.00  34.00  12.
Re: [R] Data frame reordering to time series
On Sun, Aug 8, 2010 at 11:55 AM, Gabor Grothendieck wrote:
> On Sun, Aug 8, 2010 at 11:21 AM, steven mosher wrote:
>> Ok,
>> I'm a bit confused by what you mean by "regularly spaced"
>> After I do the do.call I do get a data structure with all the times present
>> and every time has a NA or a data value.
>> Steve
>>
>
> regularly spaced means that every observation is one month later than
> the prior. If there are missing 6 month chunks or missing entire
> years then the observations are not regularly spaced since there are
> some months not present.
>

And here it is with as.ts

> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+ Oct=Values,Nov=Values,Dec=Values2)
>
> library(zoo)
> f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> z <- do.call(cbind, by(Data, Data$Index, f))
> as.ts(z)
         X12345 X67543 X89765
Jan 1989     NA  12.00     NA
Feb 1989     NA   6.00     NA
Mar 1989     NA  12.00     NA
Apr 1989     NA  12.00     NA
May 1989     NA  12.00     NA
Jun 1989     NA   4.00     NA
Jul 1989     NA  12.00     NA
Aug 1989     NA  12.00     NA
Sep 1989     NA  12.00     NA
Oct 1989     NA  12.00     NA
Nov 1989     NA  12.00     NA
Dec 1989     NA     NA     NA
Jan 1990     NA  14.00     NA
Feb 1990     NA   7.00     NA
Mar 1990     NA     NA     NA
Apr 1990     NA     NA     NA
May 1990     NA  14.00     NA
Jun 1990     NA   4.67     NA
Jul 1990     NA     NA     NA
Aug 1990     NA  14.00     NA
Sep 1990     NA  14.00     NA
Oct 1990     NA  14.00     NA
Nov 1990     NA     NA     NA
Dec 1990     NA     NA     NA
Jan 1991  54.00  34.00  12.00
Feb 1991  27.00  17.00   6.00
Mar 1991     NA  34.00     NA
Apr 1991     NA  34.00     NA
May 1991  54.00  34.00  12.00
Jun 1991  18.00  11.33   4.00
Jul 1991     NA  34.00     NA
Aug 1991  54.00  34.00  12.00
Sep 1991  54.00  34.00  12.00
Oct 1991  54.00  34.00  12.00
Nov 1991     NA  34.00     NA
Dec 1991     NA     NA     NA
Jan 1992     NA  21.00  13.00
Feb 1992     NA  10.50   6.50
Mar 1992     NA  21.00  13.00
Apr 1992     NA  21.00  13.00
May 1992     NA  21.00  13.00
Jun 1992     NA   7.00   4.33
Jul 1992     NA  21.00  13.00
Aug 1992     NA  21.00  13.00
Sep 1992     NA  21.00  13.00
Oct 1992     NA  21.00  13.00
Nov 1992     NA  21.00  13.00
Dec 1992     NA     NA     NA
Jan 1993  65.00     NA  13.00
Feb 1993  32.50     NA   6.50
Mar 1993  65.00     NA     NA
Apr 1993  65.00     NA     NA
May 1993  65.00     NA  13.00
Jun 1993  21.67     NA   4.33
Jul 1993  65.00     NA     NA
Aug 1993  65.00     NA  13.00
Sep 1993  65.00     NA  13.00
Oct 1993  65.00     NA  13.00
Nov 1993  65.00     NA     NA
Dec 1993     NA     NA     NA
Jan 1994  23.00     NA  13.00
Feb 1994  11.50     NA   6.50
Mar 1994  23.00     NA  13.00
Apr 1994  23.00     NA  13.00
May 1994  23.00     NA  13.00
Jun 1994   7.67     NA   4.33
Jul 1994  23.00     NA  13.00
Aug 1994  23.00     NA  13.00
Sep 1994  23.00     NA  13.00
Oct 1994  23.00     NA  13.00
Nov 1994  23.00     NA  13.00
Dec 1994     NA     NA     NA
Jan 1995     NA     NA  14.00
Feb 1995     NA     NA   7.00
Mar 1995     NA     NA  14.00
Apr 1995     NA     NA  14.00
May 1995     NA     NA  14.00
Jun 1995     NA     NA   4.67
Jul 1995     NA     NA  14.00
Aug 1995     NA     NA  14.00
Sep 1995     NA     NA  14.00
Oct 1995     NA     NA  14.00
Nov 1995     NA     NA  14.00
Re: [R] Data frame reordering to time series
On Sun, Aug 8, 2010 at 11:21 AM, steven mosher wrote:
> Ok,
> I'm a bit confused by what you mean by "regularly spaced"
> After I do the do.call I do get a data structure with all the times present
> and every time has a NA or a data value.
> Steve
>

regularly spaced means that every observation is one month later than
the prior. If there are missing 6 month chunks or missing entire
years then the observations are not regularly spaced since there are
some months not present.

It works for me:

> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+ Oct=Values,Nov=Values,Dec=Values2)
>
> library(zoo)
> f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> do.call(cbind, by(Data, Data$Index, f))
         X12345 X67543 X89765
Jan 1989     NA  12.00     NA
Feb 1989     NA   6.00     NA
Mar 1989     NA  12.00     NA
Apr 1989     NA  12.00     NA
May 1989     NA  12.00     NA
Jun 1989     NA   4.00     NA
Jul 1989     NA  12.00     NA
Aug 1989     NA  12.00     NA
Sep 1989     NA  12.00     NA
Oct 1989     NA  12.00     NA
Nov 1989     NA  12.00     NA
Jan 1990     NA  14.00     NA
Feb 1990     NA   7.00     NA
Mar 1990     NA     NA     NA
Apr 1990     NA     NA     NA
May 1990     NA  14.00     NA
Jun 1990     NA   4.67     NA
Jul 1990     NA     NA     NA
Aug 1990     NA  14.00     NA
Sep 1990     NA  14.00     NA
Oct 1990     NA  14.00     NA
Nov 1990     NA     NA     NA
Jan 1991  54.00  34.00  12.00
Feb 1991  27.00  17.00   6.00
Mar 1991     NA  34.00     NA
Apr 1991     NA  34.00     NA
May 1991  54.00  34.00  12.00
Jun 1991  18.00  11.33   4.00
Jul 1991     NA  34.00     NA
Aug 1991  54.00  34.00  12.00
Sep 1991  54.00  34.00  12.00
Oct 1991  54.00  34.00  12.00
Nov 1991     NA  34.00     NA
Jan 1992     NA  21.00  13.00
Feb 1992     NA  10.50   6.50
Mar 1992     NA  21.00  13.00
Apr 1992     NA  21.00  13.00
May 1992     NA  21.00  13.00
Jun 1992     NA   7.00   4.33
Jul 1992     NA  21.00  13.00
Aug 1992     NA  21.00  13.00
Sep 1992     NA  21.00  13.00
Oct 1992     NA  21.00  13.00
Nov 1992     NA  21.00  13.00
Jan 1993  65.00     NA  13.00
Feb 1993  32.50     NA   6.50
Mar 1993  65.00     NA     NA
Apr 1993  65.00     NA     NA
May 1993  65.00     NA  13.00
Jun 1993  21.67     NA   4.33
Jul 1993  65.00     NA     NA
Aug 1993  65.00     NA  13.00
Sep 1993  65.00     NA  13.00
Oct 1993  65.00     NA  13.00
Nov 1993  65.00     NA     NA
Jan 1994  23.00     NA  13.00
Feb 1994  11.50     NA   6.50
Mar 1994  23.00     NA  13.00
Apr 1994  23.00     NA  13.00
May 1994  23.00     NA  13.00
Jun 1994   7.67     NA   4.33
Jul 1994  23.00     NA  13.00
Aug 1994  23.00     NA  13.00
Sep 1994  23.00     NA  13.00
Oct 1994  23.00     NA  13.00
Nov 1994  23.00     NA  13.00
Jan 1995     NA     NA  14.00
Feb 1995     NA     NA   7.00
Mar 1995     NA     NA  14.00
Apr 1995     NA     NA  14.00
May 1995     NA     NA  14.00
Jun 1995     NA     NA   4.67
Jul 1995     NA     NA  14.00
Aug 1995     NA     NA  14.00
Sep 1995     NA     NA  14.00
Oct 1995     NA     NA  14.00
Nov 1995     NA     NA  14.00
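To make "regularly spaced" concrete, here is a minimal base R sketch. The helper `is_regular_monthly` is a name made up for illustration; it counts months from a common origin and checks that consecutive observations are exactly one month apart.

```r
# Hypothetical helper: TRUE when each observation is one month after the prior.
is_regular_monthly <- function(year, month) {
  k <- sort(year * 12 + (month - 1))   # months counted from a common origin
  all(diff(k) == 1)
}

# Series 12345 in the example has years 1991, 1993 and 1994 (no 1992),
# so there is a jump from Dec 1991 to Jan 1993.
gappy <- is_regular_monthly(rep(c(1991, 1993, 1994), each = 12),
                            rep(1:12, times = 3))

# Two complete consecutive years have no jump.
complete <- is_regular_monthly(rep(1993:1994, each = 12),
                               rep(1:12, times = 2))

print(c(gappy = gappy, complete = complete))   # FALSE, TRUE
```

A series where every time point is present but holds NA still passes this check, which is the distinction being drawn in the message: NA values are fine, absent months are not.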
Re: [R] Data frame reordering to time series
Ok,
I'm a bit confused by what you mean by "regularly spaced".
After I do the do.call I do get a data structure with all the times present,
and every time has an NA or a data value.
Steve

On Sun, Aug 8, 2010 at 2:46 AM, Gabor Grothendieck wrote:
> On Sun, Aug 8, 2010 at 2:01 AM, steven mosher wrote:
> > In the real data the months are all complete, but the years can be missing.
> > So years can be missing up front, in the middle, at the end. But if a year
> > is present then every month has a value or NA.
> > To create a regular R ts I had to plow through the data frame, collect a
> > year, and calculate an index to put it into the final time series.
> >
> > I had tried zoo out and it handled the irregularly spaced data, but a large
> > data structure of zoo objects had stumped me, especially since I need to do
> > matching and selecting of the zoo objects.
> > In the real data, there are about 7000 time series of 1500 months, and
> > those 7000 get averaged and combined in different ways.
>
> If there are missing years and you want to get a regularly spaced
> series out then use the zoo version of f (rather than the ts version of f)
> and if this is the last statement (same as before but assigning
> it to the variable z):
>
> z <- do.call(cbind, by(Data, Data$Index, f))
>
> then to get a regularly spaced ts object just do this:
>
> as.ts(z)
>
> or
>
> as.zooreg(as.ts(z))
>
> to create a regularly spaced zooreg object.
Re: [R] Data frame reordering to time series
On Sun, Aug 8, 2010 at 2:01 AM, steven mosher wrote:
> In the real data the months are all complete, but the years can be missing.
> So years can be missing up front, in the middle, at the end. But if a year
> is present then every month has a value or NA.
> To create a regular R ts I had to plow through the data frame, collect a
> year, and calculate an index to put it into the final time series.
>
> I had tried zoo out and it handled the irregularly spaced data, but a large
> data structure of zoo objects had stumped me, especially since I need to do
> matching and selecting of the zoo objects.
> In the real data, there are about 7000 time series of 1500 months, and
> those 7000 get averaged and combined in different ways.

If there are missing years and you want to get a regularly spaced
series out then use the zoo version of f (rather than the ts version of f)
and if this is the last statement (same as before but assigning
it to the variable z):

z <- do.call(cbind, by(Data, Data$Index, f))

then to get a regularly spaced ts object just do this:

as.ts(z)

or

as.zooreg(as.ts(z))

to create a regularly spaced zooreg object.
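What `as.ts(z)` achieves for a series with missing years can be pictured roughly as follows. This is a base R sketch of the idea only, not how zoo implements it; `regularize_monthly` and its month-count argument `k` are hypothetical names.

```r
# Hypothetical helper: spread irregular monthly observations over a full
# month-by-month grid, leaving NA where nothing was observed.
# k is a month count (year * 12 + month - 1); value holds the observations.
regularize_monthly <- function(k, value) {
  grid <- seq(min(k), max(k))              # every month from first to last
  out  <- rep(NA_real_, length(grid))
  out[match(k, grid)] <- value             # observed months keep their values
  ts(out, start = c(min(k) %/% 12, min(k) %% 12 + 1), frequency = 12)
}

# All of 1991, then January and February 1993; 1992 is absent.
k <- c(1991 * 12 + 0:11, 1993 * 12 + 0:1)
r <- regularize_monthly(k, seq_along(k))

print(length(r))      # 26 months from Jan 1991 to Feb 1993
print(sum(is.na(r)))  # 12 NA values filling in 1992
```

The result is an ordinary `ts` object, so per-month averages across several such series line up by position.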
Re: [R] Data frame reordering to time series
In the real data the months are all complete, but the years can be missing.
So years can be missing up front, in the middle, at the end. But if a year
is present then every month has a value or NA.
To create a regular R ts I had to plow through the data frame, collect a
year, and calculate an index to put it into the final time series.

I had tried zoo out and it handled the irregularly spaced data, but a large
data structure of zoo objects had stumped me, especially since I need to do
matching and selecting of the zoo objects.
In the real data, there are about 7000 time series of 1500 months, and those
7000 get averaged and combined in different ways.

On Sat, Aug 7, 2010 at 8:45 PM, Gabor Grothendieck wrote:
> On Sat, Aug 7, 2010 at 9:18 PM, steven mosher wrote:
> > Very slick.
> > Gabor, this is a huge speed-up for me. Thanks. Ha, now I want to rewrite a
> > bunch of working code.
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> > Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> > Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
> > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
> > Data
> >    Index Year Jan  Feb Mar Apr Jun
> > 1  67543 1989  12  6.0  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14
> > 3  67543 1991  34 17.0  34  34  34
> > 4  67543 1992  21 10.5  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54
> > 6  12345 1993  65 32.5  65  65  65
> > 7  12345 1994  23 11.5  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12
> > 9  89765 1992  13  6.5  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13
> > 11 89765 1994  13  6.5  13  13  13
> > 12 89765 1995  14  7.0  14  14  14
> > # Gabor's solution
> > f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> > do.call(cbind, by(Data, Data$Index, f))
> >          12345 67543 89765
> >
>
> The original data had consecutive months in each series (actually
> there was a missing 1992 in one case but I assumed that was an
> inadvertent omission and the actual data was complete); however, here
> we have missing 6 month chunks in addition. That makes the series
> non-consecutive so to solve that we could either apply this to the
> data (after putting the missing 1992 year back in):
>
> Data <- cbind(Data, NA, NA, NA, NA, NA, NA)
>
> or we could use a time series class that can handle irregularly spaced
> data:
>
> library(zoo)
> f <- function(x) {
>   dat <- x[-(1:2)]
>   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
>   zoo(c(as.matrix(dat)), tim)
> }
> do.call(cbind, by(Data, Data$Index, f))
>
> The last line is unchanged from before. This code will also handle
> the original situation correctly even if the missing 1992 is truly
> missing.
Re: [R] Data frame reordering to time series
On Sat, Aug 7, 2010 at 9:18 PM, steven mosher wrote:
> Very slick.
> Gabor, this is a huge speed-up for me. Thanks. Ha, now I want to rewrite a
> bunch of working code.
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
> Data
>    Index Year Jan  Feb Mar Apr Jun
> 1  67543 1989  12  6.0  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14
> 3  67543 1991  34 17.0  34  34  34
> 4  67543 1992  21 10.5  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54
> 6  12345 1993  65 32.5  65  65  65
> 7  12345 1994  23 11.5  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12
> 9  89765 1992  13  6.5  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13
> 11 89765 1994  13  6.5  13  13  13
> 12 89765 1995  14  7.0  14  14  14
> # Gabor's solution
> f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> do.call(cbind, by(Data, Data$Index, f))
>          12345 67543 89765

The original data had consecutive months in each series (actually
there was a missing 1992 in one case but I assumed that was an
inadvertent omission and the actual data was complete); however, here
we have missing 6 month chunks in addition. That makes the series
non-consecutive so to solve that we could either apply this to the
data (after putting the missing 1992 year back in):

Data <- cbind(Data, NA, NA, NA, NA, NA, NA)

or we could use a time series class that can handle irregularly spaced
data:

library(zoo)
f <- function(x) {
  dat <- x[-(1:2)]
  tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
  zoo(c(as.matrix(dat)), tim)
}
do.call(cbind, by(Data, Data$Index, f))

The last line is unchanged from before. This code will also handle
the original situation correctly even if the missing 1992 is truly
missing.
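One thing to watch in the example data: as far as I can tell, neither data frame in this thread has a May column (the columns run Jan, Feb, Mar, Apr, Jun, ...), and both versions of `f` place values by column position. Every value from June onward therefore ends up labeled one month early; the 4.00 shown against Jun 1989 in the zoo output elsewhere in the thread is the July value. By the same count, the reduced five-column frame would need seven NA padding columns per year, not six, to fill a year for the ts version. A hedged sketch of placing values by column name instead follows; `month_index` is a made-up helper, and it assumes each column name starts with the three-letter English month abbreviation.

```r
# Hypothetical helper: compute a time index from the month column names
# rather than from their positions.
month_index <- function(x) {
  dat <- x[-(1:2)]                                    # drop Index and Year
  mon <- match(substr(names(dat), 1, 3), month.abb)   # "Jun" -> 6, "July" -> 7
  stopifnot(!anyNA(mon))                              # fail on unrecognised names
  tim <- outer(x$Year, (mon - 1) / 12, "+")           # rows: years, columns: months
  list(time = c(tim), value = c(as.matrix(dat)))      # both unrolled column by column
}

# Toy input with the same gap as the thread's data: no May column.
x <- data.frame(Index = 67543, Year = c(1989, 1990),
                Jan = c(12, 14), Feb = c(6, 7), Jun = c(12, 14))
r <- month_index(x)
print(round((r$time %% 1) * 12) + 1)    # month numbers: 1 1 2 2 6 6

# With zoo installed, the pieces can go straight into a zoo series.
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::zoo(r$value, zoo::as.yearmon(r$time))
  print(z)
}
```

Inside the zoo version of `f`, this would amount to replacing `seq(0, length = ncol(dat))/12` with `(mon - 1)/12`; the rest of the approach is unchanged.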
Re: [R] Data frame reordering to time series
Very slick.
Gabor, this is a huge speed-up for me. Thanks. Ha, now I want to rewrite a
bunch of working code.

Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
Data
   Index Year Jan  Feb Mar Apr Jun
1  67543 1989  12  6.0  12  12  12
2  67543 1990  14  7.0  NA  NA  14
3  67543 1991  34 17.0  34  34  34
4  67543 1992  21 10.5  21  21  21
5  12345 1991  54 27.0  NA  NA  54
6  12345 1993  65 32.5  65  65  65
7  12345 1994  23 11.5  23  23  23
8  89765 1991  12  6.0  NA  NA  12
9  89765 1992  13  6.5  13  13  13
10 89765 1993  13  6.5  NA  NA  13
11 89765 1994  13  6.5  13  13  13
12 89765 1995  14  7.0  14  14  14
# Gabor's solution
f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
do.call(cbind, by(Data, Data$Index, f))
         12345 67543 89765
Jan 1989    NA  12.0    NA
Feb 1989    NA   6.0    NA
Mar 1989    NA  12.0    NA
Apr 1989    NA  12.0    NA
May 1989    NA  12.0    NA
Jun 1989    NA  14.0    NA
Jul 1989    NA   7.0    NA
Aug 1989    NA    NA    NA
Sep 1989    NA    NA    NA
Oct 1989    NA  14.0    NA
Nov 1989    NA  34.0    NA
Dec 1989    NA  17.0    NA
Jan 1990    NA  34.0    NA
Feb 1990    NA  34.0    NA
Mar 1990    NA  34.0    NA
Apr 1990    NA  21.0    NA
May 1990    NA  10.5    NA
Jun 1990    NA  21.0    NA
Jul 1990    NA  21.0    NA
Aug 1990    NA  21.0    NA
Sep 1990    NA    NA    NA
Oct 1990    NA    NA    NA
Nov 1990    NA    NA    NA
Dec 1990    NA    NA    NA
Jan 1991  54.0    NA  12.0
Feb 1991  27.0    NA   6.0
...

On Sat, Aug 7, 2010 at 5:09 PM, steven mosher wrote:
> Thanks Gabor, I probably should have done an example with fewer columns.
>
> I will rework the example and post it so the next person who has this
> issue can have a clear example with a solution.
>
> On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
>
> > On Sat, Aug 7, 2010 at 4:49 PM, steven mosher wrote:
> > > Given a data frame, or it could be a matrix if I choose to.
> > > The data consists of an ID, a year, and data for all 12 months.
> > > Missing values are a factor, AND so are missing years.
> > >
> > > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> > > Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> > > Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> > >
> > > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> > > + Oct=Values,Nov=Values,Dec=Values2)
> > > Data
> > >    Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
> > > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> > > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> > > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> > > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> > > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> > > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> > > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> > > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> > > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> > > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> > > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> > > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
> > >
> > > The goal is to return a time series object for each ID. Alternatively,
> > > one could return a matrix that I can turn into a time series.
> > > The final structure would be something like this (done in matrix form
> > > for illustration):
> > >
> > >        1989.0 1989.083 ... 1991 ... 1992 ... 1993 ... 1994 ... 1995
> > > 67543  12 6.0 12 12 12 4.00 12 12 12 12 12 ... 34 ... 21 ... NA ... NA ... NA
> > > 12345  NA, NA, NA, ... 54 27
> > >
> > > Basically the time series will have patches at the front, middle and
> > > end where you may have years of NA.
> > > They must be column ordered by time and aligned so that averages for
> > > all series can be computed per month.
> > >
> > > Now I have looping code to do this, where I loop through all the IDs
> > > and map each row of data into the correct column, and create column
> > > names based on the data and row names based on the ID, but it's
> > > painfully slow. Any wizardry would help.
> >
> > Your email came out a bit garbled so it's not clear what you want to
> > get out, but this code will produce a multivariate ts series, i.e. an
> > mts series, with one column for each series:
> >
> > f <- function(x) ts(c(t(x[-(1:2
Re: [R] Data frame reordering to time series
Thanks Gabor, I probably should have done an example with fewer columns.

I will rework the example and post it so the next person who has this
issue can have a clear example with a solution.

On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck wrote:
> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher wrote:
> > Given a data frame, or it could be a matrix if I choose to.
> > The data consists of an ID, a year, and data for all 12 months.
> > Missing values are a factor, AND so are missing years.
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> > Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> > Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
> > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> > + Oct=Values,Nov=Values,Dec=Values2)
> > Data
> >    Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
> > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
> >
> > The goal is to return a time series object for each ID. Alternatively,
> > one could return a matrix that I can turn into a time series.
> > The final structure would be something like this (done in matrix form
> > for illustration):
> >
> >        1989.0 1989.083 ... 1991 ... 1992 ... 1993 ... 1994 ... 1995
> > 67543  12 6.0 12 12 12 4.00 12 12 12 12 12 ... 34 ... 21 ... NA ... NA ... NA
> > 12345  NA, NA, NA, ... 54 27
> >
> > Basically the time series will have patches at the front, middle and
> > end where you may have years of NA.
> > They must be column ordered by time and aligned so that averages for
> > all series can be computed per month.
> >
> > Now I have looping code to do this, where I loop through all the IDs
> > and map each row of data into the correct column, and create column
> > names based on the data and row names based on the ID, but it's
> > painfully slow. Any wizardry would help.
>
> Your email came out a bit garbled so it's not clear what you want to
> get out, but this code will produce a multivariate ts series, i.e. an
> mts series, with one column for each series:
>
> f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> do.call(cbind, by(Data, Data$Index, f))
Re: [R] Data frame reordering to time series
On Sat, Aug 7, 2010 at 4:49 PM, steven mosher wrote:
> Given a data frame, or it could be a matrix if I choose to.
> The data consists of an ID, a year, and data for all 12 months.
> Missing values are a factor, AND so are missing years.
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> + Oct=Values,Nov=Values,Dec=Values2)
> Data
>    Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
> 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
>
> The goal is to return a time series object for each ID. Alternatively,
> one could return a matrix that I can turn into a time series.
> The final structure would be something like this (done in matrix form
> for illustration):
>
>        1989.0 1989.083 ... 1991 ... 1992 ... 1993 ... 1994 ... 1995
> 67543  12 6.0 12 12 12 4.00 12 12 12 12 12 ... 34 ... 21 ... NA ... NA ... NA
> 12345  NA, NA, NA, ... 54 27
>
> Basically the time series will have patches at the front, middle and
> end where you may have years of NA.
> They must be column ordered by time and aligned so that averages for
> all series can be computed per month.
>
> Now I have looping code to do this, where I loop through all the IDs
> and map each row of data into the correct column, and create column
> names based on the data and row names based on the ID, but it's
> painfully slow. Any wizardry would help.

Your email came out a bit garbled so it's not clear what you want to
get out, but this code will produce a multivariate ts series, i.e. an
mts series, with one column for each series:

f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
do.call(cbind, by(Data, Data$Index, f))
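For readers wondering what `c(t(x[-(1:2)]))` does in that one-liner, here is a minimal sketch on made-up data. The frame `toy` is hypothetical and, unlike the example frame in this thread, has all twelve month columns and no missing years, which is what the ts approach relies on.

```r
# Hypothetical toy data: one station, two consecutive years, 12 monthly columns.
toy <- data.frame(Index = 1, Year = c(2000, 2001),
                  matrix(1:24, nrow = 2, byrow = TRUE,
                         dimnames = list(NULL, month.abb)))

# toy[-(1:2)] drops Index and Year. t() puts months in rows and years in
# columns, so c() unrolls the values year by year: Jan..Dec 2000, Jan..Dec 2001.
vals <- c(t(toy[-(1:2)]))

# ts() then only needs the first year and the number of observations per year.
s <- ts(vals, frequency = 12, start = toy$Year[1])
print(s)
```

Without the `t()`, `c(as.matrix(...))` would unroll month by month across years instead, which is why the zoo version in this thread builds its time index with `outer()` in the matching column-wise order.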
[R] Data frame reordering to time series
Given a data frame, or it could be a matrix if I choose to.
The data consists of an ID, a year, and data for all 12 months.
Missing values are a factor, AND so are missing years.

Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+ Oct=Values,Nov=Values,Dec=Values2)
Data
   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14

The goal is to return a time series object for each ID. Alternatively, one
could return a matrix that I can turn into a time series.
The final structure would be something like this (done in matrix form for
illustration):

       1989.0 1989.083 ... 1991 ... 1992 ... 1993 ... 1994 ... 1995
67543  12 6.0 12 12 12 4.00 12 12 12 12 12 ... 34 ... 21 ... NA ... NA ... NA
12345  NA, NA, NA, ... 54 27

Basically the time series will have patches at the front, middle and end
where you may have years of NA.
They must be column ordered by time and aligned so that averages for all
series can be computed per month.

Now I have looping code to do this, where I loop through all the IDs and map
each row of data into the correct column, and create column names based on
the data and row names based on the ID, but it's painfully slow.
Any wizardry would help.