Re: Union of DStream and RDD
Not specifically; I want to be able to union any form of DStream/RDD in general. I'm working on Apache Beam's Spark runner, and the abstraction there does not distinguish between streaming and batch (somewhat like the Dataset API). Since I wrote my own InputDStream, I will simply stream any "batch source" instead, because I really don't see a way to union both.

On Sun, Feb 12, 2017 at 6:49 AM Egor Pahomov wrote:
> Interestingly, I just faced the same problem. By any chance, do you
> want to process old files in the directory as well as new ones? That's my
> motivation, and checkpointing is my problem as well.
Re: Union of DStream and RDD
Interestingly, I just faced the same problem. By any chance, do you want to process old files in the directory as well as new ones? That's my motivation, and checkpointing is my problem as well.

2017-02-08 22:02 GMT-08:00 Amit Sela:
> Not with checkpointing.

--
Sincerely yours,
Egor Pakhomov
Re: Union of DStream and RDD
Not with checkpointing.

On Thu, Feb 9, 2017, 04:58 Egor Pahomov wrote:
> Just guessing here, but would "Queue of RDDs as a Stream"
> (http://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources)
> work? Basically, create a DStream from your RDD and then union it with
> the other DStream.
Re: Union of DStream and RDD
Just guessing here, but would "Queue of RDDs as a Stream" (http://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources) work? Basically, create a DStream from your RDD and then union it with the other DStream.

2017-02-08 12:32 GMT-08:00 Amit Sela:
> Hi all,
>
> I'm looking to union a DStream and an RDD into a single stream.
> One important note is that the RDD has to be added to the DStream just
> once.
>
> Ideas?
>
> Thanks,
> Amit

--
Sincerely yours,
Egor Pakhomov
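
For readers following the thread, the queue-of-RDDs idea can be modelled in plain Python without Spark. This is only a sketch of the per-batch semantics, not Spark code: `queue_stream` and `union` are hypothetical helpers standing in for `StreamingContext.queueStream` and `DStream.union`, and each "RDD" is just a list. It shows that the queued data lands in exactly one micro-batch. It does not model checkpointing, which is the limitation raised in the thread.

```python
def queue_stream(rdds, one_at_a_time=True):
    """Hypothetical model of a queue-backed stream.

    Yields one micro-batch (a list) per batch interval. Once the queue is
    drained, every later interval produces an empty batch.
    """
    pending = [list(rdd) for rdd in rdds]
    if not one_at_a_time and pending:
        # Drain the whole queue into the first micro-batch.
        yield [item for rdd in pending for item in rdd]
        pending = []
    while pending:
        # Emit one queued "RDD" per interval.
        yield pending.pop(0)
    while True:
        yield []


def union(left, right):
    """Hypothetical model of a stream union: concatenate batch by batch."""
    for left_batch, right_batch in zip(left, right):
        yield left_batch + right_batch


# The "batch" data that must enter the stream exactly once.
historical = ["old-1", "old-2"]

# A finite stand-in for a live stream: one list per batch interval.
live = iter([["new-1"], ["new-2"], ["new-3"]])

batches = list(union(queue_stream([historical]), live))
```

In this model the historical records show up only in the first interval, and later intervals carry live data alone, which matches the "added just once" requirement from the original question.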