Hi,

For QueueRDD, have a look here:
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/QueueStream.scala
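For reference, the core idea in that example is feeding a DStream from a mutable queue of RDDs, so you control "application time" yourself by deciding when each RDD is pushed. A minimal sketch along those lines (class name, batch interval, and data are placeholders):

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object QueueStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("QueueStreamSketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // The queue holds the RDDs to be served, one per batch interval.
    val rddQueue = new mutable.Queue[RDD[Int]]()
    val stream = ssc.queueStream(rddQueue)

    // Each queued RDD is treated as one batch, so the usual
    // transformations (and windowing) apply from here on.
    stream.map(x => (x % 10, 1)).reduceByKey(_ + _).print()

    ssc.start()
    // Push one RDD per second to simulate event time from static data.
    for (_ <- 1 to 10) {
      rddQueue.synchronized {
        rddQueue += ssc.sparkContext.makeRDD(1 to 100)
      }
      Thread.sleep(1000)
    }
    ssc.stop()
  }
}
```

Since you load the queue yourself, reading N static files and enqueueing one RDD per file would give you the "window of N files" behaviour by mapping files onto batch intervals.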
Regards,
Laeeq, PhD candidate, KTH, Stockholm.

On Sunday, July 6, 2014 10:20 AM, alessandro finamore <alessandro.finam...@polito.it> wrote:

On 5 July 2014 23:08, Mayur Rustagi [via Apache Spark User List] <[hidden email]> wrote:
> Key idea is to simulate your app time as you enter data. So you can connect
> spark streaming to a queue and insert data in it spaced by time. Easier said
> than done :).

I see. I'll try to implement this solution as well so that I can compare it
with my current Spark implementation. I'm interested in seeing if this is
faster... as I assume it should be :)

> What are the parallelism issues you are hitting with your
> static approach.

In my current Spark implementation, whenever I need to get the aggregated
stats over the window, I re-map all the current bins to the same key so that
they can be reduced together. This means that the data needs to be shipped to
a single reducer. As a result, adding nodes/cores to the application does not
really affect the total time :(

> On Friday, July 4, 2014, alessandro finamore <[hidden email]> wrote:
>>
>> Thanks for the replies.
>>
>> What is not completely clear to me is how time is managed.
>> I can create a DStream from a file.
>> But if I set the window property, that will be bound to the application
>> time, right?
>>
>> If I got it right, with a receiver I can control the way DStreams are
>> created. But how can I then apply the windowing already shipped with the
>> framework if this is bound to the "application time"?
>> I would like to define a window of N files, but the window() function
>> requires a duration as input...
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/window-analysis-with-Spark-and-Spark-streaming-tp8806p8824.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
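On the single-reducer issue: mapping every bin to one shared key forces the whole window through a single reduce task. A sketch of one alternative, assuming the per-bin stats are simple counts keyed by a hypothetical bin id, is to drop the artificial key and reduce the values directly, so partials are combined inside each partition in parallel before a small final merge:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WindowAggregateSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("WindowAggregateSketch").setMaster("local[4]"))

    // Hypothetical per-bin stats for the current window: (binId, count).
    val bins = sc.parallelize((1 to 1000000).map(i => (i % 60, 1L)), 8)

    // Bottleneck pattern: one shared key shuffles everything to one reducer.
    // val total = bins.map { case (_, v) => ("all", v) }.reduceByKey(_ + _)

    // Alternative: reduce the values directly. Each partition computes a
    // partial sum in parallel; only the small partials are merged at the end.
    val total = bins.values.reduce(_ + _)
    println(s"window total = $total")

    // With many partitions, treeReduce merges partials in a tree of stages
    // instead of all at once on the driver.
    val totalTree = bins.values.treeReduce(_ + _, depth = 2)
    println(s"window total (tree) = $totalTree")

    sc.stop()
  }
}
```

This way the heavy combining work scales with the number of partitions, which should let added nodes/cores actually reduce the total time.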
>
> --
> Sent from Gmail Mobile

--
--------------------------------------------------
Alessandro Finamore, PhD
Politecnico di Torino
--
Office: +39 0115644127
Mobile: +39 3280251485
SkypeId: alessandro.finamore
---------------------------------------------------