Very slick.

Gabor, this is a huge speed-up for me. Thanks. Ha, now I want to rewrite a
bunch of working code.




Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
Years<-c(seq(1989,1992,by=1),1991,1993,1994,seq(1991,1995,by=1))
Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)

Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
Data
   Index Year Jan  Feb Mar Apr Jun
1  67543 1989  12  6.0  12  12  12
2  67543 1990  14  7.0  NA  NA  14
3  67543 1991  34 17.0  34  34  34
4  67543 1992  21 10.5  21  21  21
5  12345 1991  54 27.0  NA  NA  54
6  12345 1993  65 32.5  65  65  65
7  12345 1994  23 11.5  23  23  23
8  89765 1991  12  6.0  NA  NA  12
9  89765 1992  13  6.5  13  13  13
10 89765 1993  13  6.5  NA  NA  13
11 89765 1994  13  6.5  13  13  13
12 89765 1995  14  7.0  14  14  14

#  Gabor's solution

 f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
 do.call(cbind, by(Data, Data$Index, f))
             12345 67543 89765
Jan 1989    NA  12.0    NA
Feb 1989    NA   6.0    NA
Mar 1989    NA  12.0    NA
Apr 1989    NA  12.0    NA
May 1989    NA  12.0    NA
Jun 1989    NA  14.0    NA
Jul 1989    NA   7.0    NA
Aug 1989    NA    NA    NA
Sep 1989    NA    NA    NA
Oct 1989    NA  14.0    NA
Nov 1989    NA  34.0    NA
Dec 1989    NA  17.0    NA
Jan 1990    NA  34.0    NA
Feb 1990    NA  34.0    NA
Mar 1990    NA  34.0    NA
Apr 1990    NA  21.0    NA
May 1990    NA  10.5    NA
Jun 1990    NA  21.0    NA
Jul 1990    NA  21.0    NA
Aug 1990    NA  21.0    NA
Sep 1990    NA    NA    NA
Oct 1990    NA    NA    NA
Nov 1990    NA    NA    NA
Dec 1990    NA    NA    NA
Jan 1991  54.0    NA  12.0
Feb 1991  27.0    NA   6.0
.......
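
One caveat on the output above, in case it helps the next person: f() simply strings each ID's rows together, so the result lines up with the calendar only when every ID has all 12 month columns and no skipped years. In this cut-down example there are only five month columns and ID 12345 skips 1992, so values drift into the wrong months (the 14 shown at Jun 1989 is really the Jan 1990 value). Below is a rough sketch of one way to pad the gaps before building the series. The helper name to_monthly_ts and the matching of column names on their first three letters are my own assumptions, not part of Gabor's code, and it assumes at most one row per ID and year.

```r
# Example data from this thread (cut-down version: only five month columns).
Id      <- c(rep(67543, 4), rep(12345, 3), rep(89765, 5))
Years   <- c(1989:1992, 1991, 1993, 1994, 1991:1995)
Values2 <- c(12, NA, 34, 21, NA, 65, 23, NA, 13, NA, 13, 14)
Values  <- c(12, 14, 34, 21, 54, 65, 23, 12, 13, 13, 13, 14)
Data <- data.frame(Index = Id, Year = Years, Jan = Values, Feb = Values / 2,
                   Mar = Values2, Apr = Values2, Jun = Values)

# Hypothetical helper (not from the original post): turn one ID's rows into a
# monthly ts, filling skipped years and absent month columns with NA.
# Assumes at most one row per year within an ID.
to_monthly_ts <- function(x, months = month.abb) {
  years <- seq(min(x$Year), max(x$Year))
  # Start from an all-NA year-by-month grid covering the full span.
  grid <- matrix(NA_real_, nrow = length(years), ncol = length(months),
                 dimnames = list(years, months))
  # Match columns on their first three letters so "July" maps to "Jul".
  short   <- substr(names(x), 1, 3)
  present <- intersect(months, short)
  cols    <- match(present, short)
  grid[as.character(x$Year), present] <- as.matrix(x[, cols, drop = FALSE])
  # Flatten row by row: all months of the first year, then the next, etc.
  ts(c(t(grid)), frequency = 12, start = c(min(x$Year), 1))
}

# Same do.call(cbind, by(...)) idiom as Gabor's solution, with the padded
# series; cbind pads each column with NA to the union of the time ranges.
aligned <- do.call(cbind, by(Data, Data$Index, to_monthly_ts))

# Per-month average across IDs, ignoring NAs (NaN where every ID is missing).
monthly_mean <- ts(rowMeans(aligned, na.rm = TRUE),
                   frequency = 12, start = start(aligned))
```

With the series aligned this way, each row of aligned is one calendar month, so the per-month average across IDs is just a row mean, as in the last lines.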

On Sat, Aug 7, 2010 at 5:09 PM, steven mosher <mosherste...@gmail.com> wrote:

> Thanks Gabor, I probably should have done an example with fewer columns.
>
> I will rework the example and post it so the next person who has this
> issue can have a clear example with a solution.
>
>
>
> On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck <
> ggrothendi...@gmail.com> wrote:
>
>> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher <mosherste...@gmail.com>
>> wrote:
>> > Given a data frame, or it could be a matrix if I choose.
>> > The data consist of an ID, a year, and data for all 12 months.
>> > Both missing values and missing years are a factor.
>> >
>> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>> >  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>> >
>> > Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
>> > + Oct=Values,Nov=Values,Dec=Values2)
>> >  Data
>> >   Index Year Jan  Feb Mar Apr Jun      July Aug Sep Oct Nov Dec
>> > 1  67543 1989  12  6.0  12  12  12  4.000000  12  12  12  12  12
>> > 2  67543 1990  14  7.0  NA  NA  14  4.666667  NA  14  14  14  NA
>> > 3  67543 1991  34 17.0  34  34  34 11.333333  34  34  34  34  34
>> > 4  67543 1992  21 10.5  21  21  21  7.000000  21  21  21  21  21
>> > 5  12345 1991  54 27.0  NA  NA  54 18.000000  NA  54  54  54  NA
>> > 6  12345 1993  65 32.5  65  65  65 21.666667  65  65  65  65  65
>> > 7  12345 1994  23 11.5  23  23  23  7.666667  23  23  23  23  23
>> > 8  89765 1991  12  6.0  NA  NA  12  4.000000  NA  12  12  12  NA
>> > 9  89765 1992  13  6.5  13  13  13  4.333333  13  13  13  13  13
>> > 10 89765 1993  13  6.5  NA  NA  13  4.333333  NA  13  13  13  NA
>> > 11 89765 1994  13  6.5  13  13  13  4.333333  13  13  13  13  13
>> > 12 89765 1995  14  7.0  14  14  14  4.666667  14  14  14  14  14
>> >
>> >
>> > The goal is to return a time series object for each ID. Alternatively,
>> > one could return a matrix that I can turn into a time series.
>> > The final structure would be something like this (done in matrix form
>> > for illustration):
>> >        1989.0  1989.083  ...  1991 ... 1992 ... 1993 ... 1994 ... 1995
>> > 67543  12      6.0  12  12  12  4.000000  12  12  12  12  12 ... 34 ... 21 ... NA ... NA ... NA
>> > 12345  NA      NA   NA  ................................................ 54  27
>> >
>> > Basically, the time series will have patches at the front, middle, and
>> > end where you may have years of NA.
>> > They must be column-ordered by time and aligned so that averages for all
>> > series can be computed per month.
>> >
>> > Now I have looping code to do this, where I loop through all the IDs and
>> > map the row of data into the correct column, and create column names
>> > based on the data and row names based on the ID, but it's painfully
>> > slow. Any wizardry would help.
>>
>> Your email came out a bit garbled, so it's not clear what you want to
>> get out, but this code will produce a multivariate ts series, i.e. an
>> mts series, with one column for each series:
>>
>> f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
>> do.call(cbind, by(Data, Data$Index, f))
>>
>
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.