I have streams of data coming in from various applications through Kafka. These streams are converted into DataFrames in Spark, and I would like to join these DataFrames on a common ID they all contain.
Since joining streaming DataFrames is currently not supported, what is the recommended way to join two DataFrames in a streaming context? Is the recommended approach to keep writing the streaming DataFrames out to some sink, so they can be read back as static DataFrames and joined? Would that still provide end-to-end exactly-once and fault-tolerance guarantees?

Priyank
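To be concrete, this is the kind of workaround I'm asking about: a minimal PySpark sketch where the broker address, topic names, sink paths, and the join key `id` are all made-up placeholders, and each incoming stream is persisted to a Parquet sink before being read back as a static DataFrame.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join-via-sink").getOrCreate()

def kafka_to_parquet(topic, path, checkpoint):
    """Continuously persist one Kafka topic to a Parquet sink."""
    stream = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
              .option("subscribe", topic)
              .load())
    # Kafka keys/values arrive as binary; here the key is treated as the
    # shared ID for illustration (a real job would parse the payload).
    return (stream.selectExpr("CAST(key AS STRING) AS id",
                              "CAST(value AS STRING) AS value")
            .writeStream
            .format("parquet")
            .option("path", path)
            .option("checkpointLocation", checkpoint)
            .start())

kafka_to_parquet("app_a_events", "/sinks/app_a", "/checkpoints/app_a")
kafka_to_parquet("app_b_events", "/sinks/app_b", "/checkpoints/app_b")

# Later (e.g. in a separate batch job), read the sinks back as static
# DataFrames and join them on the shared ID.
df_a = spark.read.parquet("/sinks/app_a")
df_b = spark.read.parquet("/sinks/app_b")
joined = df_a.join(df_b, on="id")
```

My worry is about the window between the streaming write and the batch read: whether the Parquet sink's checkpointing is enough to keep the whole pipeline exactly-once, or whether the batch read can observe partial results.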