Re: [R] fill data forward in data frame.

Ben quant Thu, 01 Mar 2012 14:18:03 -0800

That is great! Thank you very much.

Ben


On Thu, Mar 1, 2012 at 2:57 PM, Petr Savicky <savi...@cs.cas.cz> wrote:

> On Thu, Mar 01, 2012 at 02:31:01PM -0700, Ben quant wrote:
> > Hello,
> >
> > My direct desire is a good (fast) way to fill values forward until there is
> > another value, then fill that value forward, in the data xx (at the bottom of
> > this email).  For example, rows 1 to 45 should stay NA (no change),
> > but from row 46 to row 136 the value should be 12649, and from row 137 to the
> > next value it should be 13039.00.  The last line of code is all you need for
> > this part.
> >
> > If you are so inclined, my goal is this: I want to create a weekly time
> > series out of some data based on the report date. The report date is 'rd'
> > below, and is the correct date for the time series. My idea (in part seen
> > below) is to align rd and ua via the incorrect date (the time series date),
> > then merge that, using the report date (rd), with a daily series of dates (dt)
> > so I capture all of the dates. That gets the data in the right start
> > period. I've done all of this so far below and it looks fine. Then I plan
> > to roll all of those values forward to the next value (see question above),
> > then I'll do something like this:
> >
> > xx[weekdays(xx[,1]) == "Friday",]
> >
> > ...to get a weekly series of Friday values. I'm thinking someone probably
> > has a faster way of doing this. I have to do this many times, so speed is
> > important. Thanks!
> >
> > Here is what I have done so far:
> >
> > dt <- seq(from =as.Date("2009-06-01"), to = Sys.Date(), by = "day")
> >
> > > nms
> >  [1] "2009-06-30" "2009-09-30" "2009-12-31" "2010-03-31" "2010-06-30"
> >  [6] "2010-09-30" "2010-12-31" "2011-03-31" "2011-06-30" "2011-09-30"
> > [11] "2011-12-31"
> >
> > > rd
> >   2009-06-30   2009-09-30   2009-12-31   2010-03-31   2010-06-30
> > "2009-07-16" "2009-10-15" "2010-01-19" "2010-04-19" "2010-07-19"
> >   2010-09-30   2010-12-31   2011-03-31   2011-06-30   2011-09-30
> > "2010-10-18" "2011-01-18" "2011-04-19" "2011-07-18" "2011-10-17"
> >   2011-12-31
> > "2012-01-19"
> >
> > > ua
> > 2009-06-30 2009-09-30 2009-12-31 2010-03-31 2010-06-30 2010-09-30
> >   12649.00   13039.00   13425.00   13731.00   14014.00   14389.00
> > 2010-12-31 2011-03-31 2011-06-30 2011-09-30 2011-12-31
> >   14833.00   15095.00   15481.43   15846.43   16186.43
> >
> > > x = merge(ua,rd,by='row.names')
> > > names(x) = c('z.date','val','rt_date')
> > > xx = merge(dt,x,by.y= 'rt_date',by.x=1,all.x=T)
> > > xx
> >              x     z.date   val
> > 1   2009-06-01       <NA>    NA
> > 2   2009-06-02       <NA>    NA
> > 3   2009-06-03       <NA>    NA
> > 4   2009-06-04       <NA>    NA
> > 5   2009-06-05       <NA>    NA
> >
> > ...etc....
> >
> > 36  2009-07-06       <NA>    NA
> > 37  2009-07-07       <NA>    NA
> > 38  2009-07-08       <NA>    NA
> > 39  2009-07-09       <NA>    NA
> > 40  2009-07-10       <NA>    NA
> > 41  2009-07-11       <NA>    NA
> > 42  2009-07-12       <NA>    NA
> > 43  2009-07-13       <NA>    NA
> > 44  2009-07-14       <NA>    NA
> > 45  2009-07-15       <NA>    NA
> > 46  2009-07-16 2009-06-30 12649
> > 47  2009-07-17       <NA>    NA
> > 48  2009-07-18       <NA>    NA
> > 49  2009-07-19       <NA>    NA
> > 50  2009-07-20       <NA>    NA
> > 51  2009-07-21       <NA>    NA
> > 52  2009-07-22       <NA>    NA
> > 53  2009-07-23       <NA>    NA
> > 54  2009-07-24       <NA>    NA
> > 55  2009-07-25       <NA>    NA
> > 56  2009-07-26       <NA>    NA
> > 57  2009-07-27       <NA>    NA
> > 58  2009-07-28       <NA>    NA
> >
> > ...etc....
> >
> > 129  2009-10-07       <NA>       NA
> > 130  2009-10-08       <NA>       NA
> > 131  2009-10-09       <NA>       NA
> > 132  2009-10-10       <NA>       NA
> > 133  2009-10-11       <NA>       NA
> > 134  2009-10-12       <NA>       NA
> > 135  2009-10-13       <NA>       NA
> > 136  2009-10-14       <NA>       NA
> > 137  2009-10-15 2009-09-30 13039.00
> > 138  2009-10-16       <NA>       NA
> > 139  2009-10-17       <NA>       NA
> > 140  2009-10-18       <NA>       NA
> > 141  2009-10-19       <NA>       NA
> > 142  2009-10-20       <NA>       NA
> > 143  2009-10-21       <NA>       NA
>
> Hi.
>
> Try first the following simpler version.
>
>  # an input vector
>  x <- rep(NA, times=20)
>  x[4] <- "A"
>  x[9] <- "B"
>  x[17] <- "C"
>
>  # extending the values forward
>  values <- c(NA, x[!is.na(x)])
>  ind <- cumsum(!is.na(x)) + 1
>  y <- values[ind]
>
>  # compare with the original
>  cbind(x, y)
>
>        x   y
>   [1,] NA  NA
>   [2,] NA  NA
>   [3,] NA  NA
>   [4,] "A" "A"
>   [5,] NA  "A"
>   [6,] NA  "A"
>   [7,] NA  "A"
>   [8,] NA  "A"
>   [9,] "B" "B"
>  [10,] NA  "B"
>  [11,] NA  "B"
>  [12,] NA  "B"
>  [13,] NA  "B"
>  [14,] NA  "B"
>  [15,] NA  "B"
>  [16,] NA  "B"
>  [17,] "C" "C"
>  [18,] NA  "C"
>  [19,] NA  "C"
>  [20,] NA  "C"
>
> This could be applied directly to the last two columns of your
> data frame "xx". However, it may be more natural to obtain the
> vector "values" from the input data and not from their sparse
> form, which is the data frame. Also, the logical vector !is.na(x)
> is the same for the last two columns of your data frame, so
> it may be computed only once.
>
> Hope this helps.
>
> Petr Savicky.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
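
The suggestion quoted above, applying the fill to the last two columns of "xx" and computing !is.na(x) only once, might be sketched as follows. This is a minimal sketch and not code from the thread: the helper name fill_forward is hypothetical, the small data frame is a made-up stand-in for "xx", and the helper indexes row positions rather than building a "values" vector so that Date columns keep their class. The Friday test uses as.POSIXlt(...)$wday because weekdays() returns locale-dependent names.

```r
# Hypothetical helper (not from the thread): carry the last non-NA value forward.
# 'keep' marks the rows that hold an observed value.
fill_forward <- function(v, keep = !is.na(v)) {
  # For every row, the position of the most recent observed value;
  # the leading NA covers rows before the first observation.
  last_obs <- c(NA_integer_, which(keep))[cumsum(keep) + 1]
  v[last_obs]
}

# Made-up stand-in for the data frame "xx": 11 daily rows, one report on row 3.
xx <- data.frame(
  x      = seq(as.Date("2009-07-14"), as.Date("2009-07-24"), by = "day"),
  z.date = as.Date(NA),
  val    = NA_real_
)
xx$z.date[3] <- as.Date("2009-06-30")
xx$val[3]    <- 12649

# The NA pattern is the same in both columns, so compute it once.
keep      <- !is.na(xx$val)
xx$val    <- fill_forward(xx$val, keep)
xx$z.date <- fill_forward(xx$z.date, keep)

# Weekly series: keep Fridays only (wday counts from 0 = Sunday, so Friday is 5).
fridays <- xx[as.POSIXlt(xx$x)$wday == 5, ]
```

Because the work is a single cumsum() and one vectorized subscript per column, there is no explicit loop over rows, which should matter when the operation is repeated many times.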
