Not specifically, I want to be able to union any form of DStream/RDD in general. I'm working on Apache Beam's Spark runner, so the abstraction there does not distinguish between streaming and batch (somewhat like the Dataset API). Since I wrote my own InputDStream, I will simply stream any "batch source" instead, because I really don't see a way to union both.
On Sun, Feb 12, 2017 at 6:49 AM Egor Pahomov <pahomov.e...@gmail.com> wrote:

> Interestingly, I just faced the same problem. By any chance, do you
> want to process old files in the directory as well as new ones? That's my
> motivation, and checkpointing is my problem as well.
>
> 2017-02-08 22:02 GMT-08:00 Amit Sela <amitsel...@gmail.com>:
>
> Not with checkpointing.
>
> On Thu, Feb 9, 2017, 04:58 Egor Pahomov <pahomov.e...@gmail.com> wrote:
>
> Just guessing here, but would
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources
> "*Queue of RDDs as a Stream*" work? Basically, create a DStream from your
> RDD and then union it with the other DStream.
>
> 2017-02-08 12:32 GMT-08:00 Amit Sela <amitsel...@gmail.com>:
>
> Hi all,
>
> I'm looking to union a DStream and an RDD into a single stream.
> One important note is that the RDD has to be added to the DStream just
> once.
>
> Ideas?
>
> Thanks,
> Amit
>
> --
> *Sincerely yours, Egor Pakhomov*
>
> --
> *Sincerely yours, Egor Pakhomov*
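
For reference, a minimal sketch of the queueStream approach Egor suggests, using Spark Streaming's Scala API. The socket source and host/port are hypothetical placeholders for whatever live stream you have; `queueStream` with `oneAtATime = true` emits the RDD in a single batch interval, which matches the "added just once" requirement. As Amit notes above, this does not work with checkpointing, since queue-based streams cannot be recovered from a checkpoint.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object UnionRddWithDStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("union-sketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // The "batch" side: a single RDD that must enter the stream exactly once.
    val batchRdd = ssc.sparkContext.parallelize(Seq("a", "b", "c"))

    // Wrap it in a one-element queue; with oneAtATime = true the RDD is
    // consumed in one batch interval and never repeated.
    val queue = mutable.Queue(batchRdd)
    val batchAsStream = ssc.queueStream(queue, oneAtATime = true)

    // The streaming side -- a socket source here purely for illustration.
    val liveStream = ssc.socketTextStream("localhost", 9999)

    // Union works because both sides are DStream[String].
    val unioned = liveStream.union(batchAsStream)
    unioned.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that once the queue is drained, `batchAsStream` simply produces empty batches, so the union degenerates to the live stream alone, which is the desired behavior here.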