Hi, I'm yet to try it.
I just want to know: does Spark 2.3 with the Kafka 0.10 Spark package support Python yet? I read somewhere that, as of now, only Scala and Java can be used. Please correct me if I am wrong.

Thanks,
Aakash.

On 14-Mar-2018 8:24 PM, "Georg Heiler" <georg.kf.hei...@gmail.com> wrote:

> Did you try spark 2.3 with structured streaming? There watermarking and
> plain sql might be really interesting for you.
> Aakash Basu <aakash.spark....@gmail.com> wrote on Wed, 14 March 2018 at
> 14:57:
>
>> Hi,
>>
>> *Info (Using):Spark Streaming Kafka 0.8 package*
>>
>> *Spark 2.2.1*
>> *Kafka 1.0.1*
>>
>> As of now, I am feeding paragraphs in Kafka console producer and my
>> Spark, which is acting as a receiver is printing the flattened words, which
>> is a complete RDD operation.
>>
>> *My motive is to read two tables continuously (being updated) as two
>> distinct Kafka topics being read as two Spark Dataframes and join them
>> based on a key and produce the output. *(I am from Spark-SQL background,
>> pardon my Spark-SQL-ish writing)
>>
>> *It may happen, the first topic is receiving new data 15 mins prior to
>> the second topic, in that scenario, how to proceed? I should not lose any
>> data.*
>>
>> As of now, I want to simply pass paragraphs, read them as RDD, convert to
>> DF and then join to get the common keys as the output. (Just for R&D).
>>
>> Started using Spark Streaming and Kafka today itself.
>>
>> Please help!
>>
>> Thanks,
>> Aakash.
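
For the scenario in the quoted message, where the first topic may run about 15 minutes ahead of the second, the usual idea is to buffer each side for a bounded time so that a late match can still be joined. Below is a minimal plain-Python sketch of that buffering idea only. It is not Spark code and not how Spark implements it; `BufferedJoin`, `max_delay_min` and `add` are hypothetical names made up for illustration, and event times are assumed to be plain minute counts.

```python
from collections import defaultdict


class BufferedJoin:
    """Toy inner join of two keyed streams that tolerates one side arriving late.

    Hypothetical illustration only: each side is buffered per key, and rows
    older than (latest event time seen - max_delay_min) are dropped.
    """

    def __init__(self, max_delay_min):
        self.max_delay_min = max_delay_min
        self.left = defaultdict(list)    # key -> [(event_time, value), ...]
        self.right = defaultdict(list)   # key -> [(event_time, value), ...]
        self.max_event_time = 0          # latest event time seen on either side

    def _evict(self, buf):
        # Drop buffered rows that are too old to be matched any more.
        cutoff = self.max_event_time - self.max_delay_min
        for key in list(buf):
            kept = [(t, v) for t, v in buf[key] if t >= cutoff]
            if kept:
                buf[key] = kept
            else:
                del buf[key]

    def add(self, side, key, event_time, value):
        """Buffer one record and return the joined rows it completes."""
        if side not in ("left", "right"):
            raise ValueError("side must be 'left' or 'right'")
        own, other = (self.left, self.right) if side == "left" else (self.right, self.left)

        self.max_event_time = max(self.max_event_time, event_time)
        self._evict(self.left)
        self._evict(self.right)
        own[key].append((event_time, value))

        joined = []
        for other_time, other_value in other.get(key, []):
            if abs(other_time - event_time) <= self.max_delay_min:
                if side == "left":
                    joined.append((key, value, other_value))
                else:
                    joined.append((key, other_value, value))
        return joined


if __name__ == "__main__":
    join = BufferedJoin(max_delay_min=20)
    print(join.add("left", "k1", 0, "a"))     # -> []
    print(join.add("right", "k1", 15, "b"))   # -> [('k1', 'a', 'b')]
```

With a 20-minute allowance, the record that shows up 15 minutes late on the second stream is still joined, while anything later than the allowance is dropped from the buffer, which is the trade-off between "not losing data" and unbounded state. In Spark 2.3, Structured Streaming expresses the same idea with `withWatermark` on each streaming DataFrame plus a time-range condition in the join.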