Hello!

I have data that contain, among other things, the start and end
dates of a (daily) time series (see the example "mydata" below):

mystring1<-c("String 1", "String 2")
mystring2<-c("String a", "String b")
starts<-c(as.Date("2011-02-01"),as.Date("2011-03-02"))
ends<-c(as.Date("2011-03-15"),as.Date("2011-03-31"))
values<-c(2000,10000)
mydata<-data.frame(starts=starts,ends=ends,values=values,mystring1=mystring1,mystring2=mystring2)
(mydata)

I have to reshape it so that each row of "mydata" becomes a daily
time series that starts on the start date and ends on the end date.
What used to be in the column "values" has to be distributed equally
across those dates; all other columns keep their original values.
My code below does it (see the end result "newdata"). However, to
achieve my goal I am looping through the rows of "mydata", and I am
not sure it will work with my real data set, which already has
thousands of rows and also a lot of other columns with strings. I am
afraid I'll run out of memory. Is there maybe a way of doing it more
efficiently?
Thanks a lot for your pointers!

newdata <- NULL  # start empty so "mydate" keeps its Date class in rbind()
for (i in seq_len(nrow(mydata))) {
        start.date <- mydata[i, "starts"]
        end.date <- mydata[i, "ends"]
        # daily sequence that includes both the start and the end date
        all.dates <- seq(start.date, end.date, by = "day")
        temp.df <- data.frame(mydate = all.dates)
        # spread the row's value equally across its dates
        temp.df$myvalues <- mydata[i, "values"] / length(all.dates)
        # carry the remaining (string) columns over unchanged
        temp.df[names(mydata)[4:5]] <- mydata[i, 4:5]
        newdata <- rbind(newdata, temp.df)
}
(newdata); (mydata)
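
For comparison, here is a rough sketch of how the same reshaping might be
done without a loop. It repeats each row index once per day with rep() and
builds the dates from per-row day offsets. The names n.days, idx, offset and
newdata2 are made up for this illustration, and the sketch assumes that the
end date should be included in each series and that every "ends" is on or
after its "starts". It is one possible approach, not necessarily the most
efficient one.

```r
# Self-contained copy of the example data from the post
mydata <- data.frame(
  starts    = as.Date(c("2011-02-01", "2011-03-02")),
  ends      = as.Date(c("2011-03-15", "2011-03-31")),
  values    = c(2000, 10000),
  mystring1 = c("String 1", "String 2"),
  mystring2 = c("String a", "String b"),
  stringsAsFactors = FALSE
)

# Days covered by each row, counting both the start and the end date
# (assumption: ends >= starts in every row)
n.days <- as.integer(mydata$ends - mydata$starts) + 1L

# Row index repeated once per day of that row's series
idx <- rep(seq_len(nrow(mydata)), times = n.days)

# Day offset within each series: 0, 1, ..., n.days - 1
offset <- sequence(n.days) - 1L

newdata2 <- data.frame(
  mydate   = mydata$starts[idx] + offset,           # Date + integer stays a Date
  myvalues = (mydata$values / n.days)[idx],         # equal share per day
  mydata[idx, c("mystring1", "mystring2")],         # other columns unchanged
  row.names = NULL,
  stringsAsFactors = FALSE
)
```

Here sum(n.days) gives the number of rows of the result before anything is
built, which may help judge whether it fits in memory. The result has that
many rows whichever way it is produced; what the sketch avoids is the
repeated copying caused by growing a data frame with rbind() inside a loop.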

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
