The key idea is to simulate your application time as you feed in the data. For example, you can connect Spark Streaming to a queue and insert data into it spaced by time. Easier said than done :). What parallelism issues are you hitting with your static approach?
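To make the idea concrete, here is a minimal pure-Python sketch of the replay trick (plain `queue`/`threading`, not Spark APIs; the names `replay_into_queue`, `records`, and `speedup` are illustrative). A feeder thread pushes timestamped records into a queue, sleeping so the gaps between records mirror their original spacing; a consumer on the other end can then apply ordinary wall-clock windowing. In Spark Streaming the same pattern would feed a custom receiver or a queue-backed stream instead of a plain `Queue`.

```python
# Sketch: replay timestamped records into a queue, spaced by their
# (scaled) original inter-arrival times, so downstream wall-clock
# windows line up with the simulated application time.
import queue
import threading
import time

def replay_into_queue(records, q, speedup=100.0):
    """Push (timestamp, value) records into q, sleeping between pushes
    so that the gaps reproduce the original spacing divided by speedup."""
    prev_ts = None
    for ts, value in records:
        if prev_ts is not None:
            time.sleep((ts - prev_ts) / speedup)
        q.put((ts, value))
        prev_ts = ts
    q.put(None)  # sentinel marking end of stream

# Example: three records originally 1 second apart, replayed 100x faster.
records = [(0.0, "a"), (1.0, "b"), (2.0, "c")]
q = queue.Queue()
feeder = threading.Thread(target=replay_into_queue, args=(records, q))
feeder.start()

received = []
while True:
    item = q.get()
    if item is None:
        break
    received.append(item)
feeder.join()
print(received)
```

The consumer sees the records arrive in order with realistic spacing, which is what lets time-based windows behave as if the data were live.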
On Friday, July 4, 2014, alessandro finamore <alessandro.finam...@polito.it> wrote:
> Thanks for the replies.
>
> What is not completely clear to me is how time is managed.
> I can create a DStream from a file.
> But if I set the window property, that will be bound to the application
> time, right?
>
> If I got it right, with a receiver I can control the way DStreams are
> created.
> But how can I then apply the windowing already shipped with the framework
> if it is bound to the "application time"?
> I would like to define a window of N files, but the window() function
> requires a duration as input...
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/window-analysis-with-Spark-and-Spark-streaming-tp8806p8824.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
Sent from Gmail Mobile