Thanks. Another question: I have event data with timestamps, and I want to create a sliding window based on those timestamps. Some windows will have a lot of events in them; others won't. Is there a way to get an RDD made up of this kind of variable-length window?
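
To make "variable-length window" concrete, here is a minimal sketch in plain Python (not Spark, and not an RDD API) of the grouping being asked about. It assumes events are (timestamp, value) pairs with numeric timestamps; the function name `time_windows` and the parameters `width` and `step` are hypothetical names chosen for illustration.

```python
from bisect import bisect_left


def time_windows(events, width, step):
    """Group (timestamp, value) events into time-based sliding windows.

    Each window covers the half-open interval [start, start + width) and
    the start advances by `step`, so the number of events per window
    varies with how dense the data is in that interval.
    Returns a list of (window_start, [values]) pairs.
    """
    # Sort by timestamp so the events have a meaningful order.
    ordered = sorted(events, key=lambda e: e[0])
    if not ordered:
        return []
    times = [t for t, _ in ordered]

    windows = []
    start = times[0]
    while start <= times[-1]:
        # Binary search for the slice of events inside this window.
        lo = bisect_left(times, start)
        hi = bisect_left(times, start + width)
        windows.append((start, [v for _, v in ordered[lo:hi]]))
        start += step
    return windows


# Three events close together, then one much later.
sample = [(10, "d"), (0, "a"), (2, "c"), (1, "b")]
for window_start, values in time_windows(sample, width=5, step=5):
    print(window_start, values)
```

With `step` equal to `width` the windows tile the time axis; a smaller `step` makes them overlap. Windows with no events are kept as empty lists here, which is a design choice of this sketch rather than a requirement.
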

On Tue, Jan 6, 2015 at 1:03 PM, Sean Owen <so...@cloudera.com> wrote:

> First you'd need to sort the RDD to give it a meaningful order, but I
> assume you have some kind of timestamp in your data you can sort on.
>
> I think you might be after the sliding() function, a developer API in
> MLlib:
>
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/rdd/RDDFunctions.scala#L43
>
> On Tue, Jan 6, 2015 at 5:25 PM, Asim Jalis <asimja...@gmail.com> wrote:
>
>> Is there an easy way to do a moving average across a single RDD (in a
>> non-streaming app). Here is the use case. I have an RDD made up of stock
>> prices. I want to calculate a moving average using a window size of N.
>>
>> Thanks.
>>
>> Asim