Thanks Amiya/TD for responding.
@TD,
Thanks for letting us know about the new foreachBatch API; this handle on
the per-batch DataFrame should be useful in many cases.
@Amiya,
The input source will be read twice, and the entire DAG computation will be
done twice. That is not a limitation as such, but a matter of resource
utilisation and
Hi Tathagata,
Is there any limitation of the code below when writing to multiple files?

val inputdf: DataFrame = sparkSession.readStream
  .schema(schema)
  .option("delimiter", ",")
  .csv("src/main/streamingInput")

query1 =
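The query definitions are cut off here; presumably they looked something
like the sketch below, where the output and checkpoint paths are my
assumptions:

val query1 = inputdf.writeStream
  .format("csv")
  .option("path", "src/main/streamingOutput1")   // assumed output path
  .option("checkpointLocation", "/tmp/chk1")     // assumed checkpoint path
  .start()

val query2 = inputdf.writeStream
  .format("csv")
  .option("path", "src/main/streamingOutput2")   // assumed output path
  .option("checkpointLocation", "/tmp/chk2")     // assumed checkpoint path
  .start()

sparkSession.streams.awaitAnyTermination()

Each start() launches an independent streaming query, so with two queries
the source is read and the DAG computed twice per trigger, which is exactly
the cost described above.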
Hey all,
In Spark 2.4.0, there will be a new feature called *foreachBatch* which
will expose the output rows of every micro-batch as a DataFrame, on which
you can apply a user-defined function. With that, you can reuse existing
batch data sources for writing results, as well as write results to
multiple locations.
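A minimal sketch of using foreachBatch to fan one stream out to two sinks,
assuming the inputdf from the earlier mail; the formats and paths are
placeholders:

import org.apache.spark.sql.DataFrame

inputdf.writeStream
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    batchDF.persist()   // materialise once so each sink does not recompute the batch
    batchDF.write.mode("append").parquet("/tmp/sink1")   // placeholder sink 1
    batchDF.write.mode("append").csv("/tmp/sink2")       // placeholder sink 2
    batchDF.unpersist()
  }
  .option("checkpointLocation", "/tmp/chk")              // placeholder checkpoint
  .start()

Since the batch is written with ordinary batch writers, the source is read
only once per trigger, unlike running two separate streaming queries.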
Hi Chandan/Jürgen,
I had tried this through native code having a single input DataFrame with
multiple sinks, as follows:
Spark provides a method called awaitAnyTermination() in
StreamingQueryManager.scala which provides all the required details to
handle the queries processed by Spark. By observing
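For illustration, a small sketch of what that observation through the
StreamingQueryManager can look like:

val manager = sparkSession.streams   // the session's StreamingQueryManager

// Inspect all currently active queries.
manager.active.foreach { q =>
  println(s"name=${q.name} id=${q.id} active=${q.isActive}")
}

// Block until any one of the active queries terminates;
// this rethrows the exception if a query failed.
manager.awaitAnyTermination()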
Hi Amiya/Jürgen,
Did you get any lead on this?
I want to process records after some validation: correct records should go
to sink1 and incorrect records to sink2.
How can I achieve this in a single stream?
Regards,
Chandan
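A sketch of one way to do that with a single read of the source, using the
foreachBatch approach TD describes above; the validation predicate and the
sink paths here are made up for illustration:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

val isValid = col("id").isNotNull   // hypothetical validation rule

inputdf.writeStream
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    batchDF.persist()   // evaluate the batch once, write it to both sinks
    batchDF.filter(isValid).write.mode("append").csv("/tmp/sink1")    // correct records
    batchDF.filter(!isValid).write.mode("append").csv("/tmp/sink2")   // incorrect records
    batchDF.unpersist()
  }
  .option("checkpointLocation", "/tmp/chk-split")   // placeholder checkpoint
  .start()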
Hi Jürgen,
Have you found any solution or workaround for multiple sinks from a single
source, since we cannot process multiple sinks at a time?
I also have a scenario in ETL where we are using a clone component having
multiple sinks with a single input stream DataFrame.
Can you keep posting once you find something?
Hi, I’m obviously new to Spark Structured Streaming, and I want to:
1.) Open one (a single) connection to an MQTT broker/topic spewing JSON objects
2.) Transform the JSON into a wide table
3.) Run several different queries on the wide table
What I do:
val lines = session.readStream
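The code is cut off at this point; below is a minimal sketch of the
pipeline as described. It assumes the Apache Bahir MQTT source, and the
broker URL, topic, and JSON schema are all placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

val session = SparkSession.builder().appName("mqtt-wide-table").getOrCreate()
import session.implicits._

// 1.) a single connection to the broker/topic (Bahir MQTT source assumed)
val lines = session.readStream
  .format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
  .option("topic", "sensors/json")                 // placeholder topic
  .load("tcp://localhost:1883")                    // placeholder broker URL

// 2.) JSON -> wide table (schema is hypothetical)
val schema = new StructType()
  .add("device", StringType)
  .add("temperature", DoubleType)
  .add("humidity", DoubleType)

val wide = lines
  .select(from_json($"value", schema).as("data"))
  .select("data.*")

// 3.) several queries on the wide table; note that every start() creates an
// independent streaming query that reopens the source, which is the problem
// discussed in the rest of this thread
val q1 = wide.groupBy($"device").avg("temperature")
  .writeStream.outputMode("complete").format("console").start()
val q2 = wide.filter($"humidity" > 90)
  .writeStream.format("console").start()

session.streams.awaitAnyTermination()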