Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-09 Thread chandan prakash
Thanks Amiya/TD for responding. @TD, Thanks for letting us know about this new foreachBatch api, this handle of per batch dataframe should be useful in many cases. @Amiya, The input source will be read twice, entire dag computation will be done twice. Not limitation but resource utilisation and

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-06 Thread Amiya Mishra
Hi Tathagata, Is there any limitation of below code while writing to multiple file ? val inputdf:DataFrame = sparkSession.readStream.schema(schema).format("csv").option("delimiter",",").csv("src/main/streamingInput") query1 =

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-05 Thread Tathagata Das
Hey all, In Spark 2.4.0, there will be a new feature called *foreachBatch* which will expose the output rows of every micro-batch as a dataframe, on which you apply a user-defined function. With that, you can reuse existing batch sources for writing results as well as write results to multiple

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-05 Thread Amiya Mishra
Hi Chandan/Jürgen, I had tried through a native code having single input data frame with multiple sinks as : Spark provides a method called awaitAnyTermination() in StreamingQueryManager.scala which provides all the required details to handle the query processed by spark.By observing

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-05 Thread chandan prakash
Hi Amiya/Jürgen, Did you get any lead on this ? I want to process records post some validation. Correct records should go in sink1 and incorrect records should go in sink2. How to achieve this in single stream ? Regards, Chandan On Wed, Jun 13, 2018 at 2:30 PM Amiya Mishra wrote: > Hi Jürgen,

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-06-13 Thread Amiya Mishra
Hi Jürgen, Have you found any solution or workaround for multiple sinks from single source as we cannot process multiple sinks at a time ? As i also has a scenario in ETL where we are using clone component having multiple sinks with single input stream dataframe. Can you keep posting once you

How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2017-11-07 Thread Jürgen Albersdorfer
Hi, I’m obviously new to Spark Structured Streaming, and I want to 1.) Open one (a single) Connection to a Mqtt broker / topic spewing JSON Objects 2.) Transform JSON to Wide Table 3.) Do several different queries on wide Table What I do: val lines = session.readStream