[Spark Streaming]: Save the records that are dropped by watermarking in spark structured streaming

Nandha Kumar Tue, 07 May 2024 19:55:35 -0700

Hi Team,
       We are trying to use *Spark Structured Streaming* for our use case.
We will be joining two streaming sources (read from Kafka) with watermarks.
As time progresses, records older than the watermark timestamp
are removed from the state. For our use case, we want to *store these
dropped records* in a Postgres table or in S3.
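
To make the behaviour we are after concrete, here is a minimal plain-Python
sketch. It is not Spark code and does not use any Spark API. It is only a toy
model, under the simplifying assumptions that the watermark is the maximum
event time seen so far minus a fixed delay and that eviction happens right
after each batch. The names `WatermarkedBuffer`, `add_batch`, and `dropped`
are hypothetical; `dropped` stands in for the Postgres table or S3 location.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Record = Tuple[str, datetime]  # (key, event_time); hypothetical record shape


@dataclass
class WatermarkedBuffer:
    """Toy model of watermarked state; NOT how Spark implements it."""
    delay: timedelta
    max_event_time: Optional[datetime] = None
    state: List[Record] = field(default_factory=list)
    dropped: List[Record] = field(default_factory=list)  # stand-in for the sink

    def add_batch(self, records: List[Record]) -> None:
        records = list(records)
        self.state.extend(records)
        # Advance the maximum event time seen so far.
        for _, event_time in records:
            if self.max_event_time is None or event_time > self.max_event_time:
                self.max_event_time = event_time
        if self.max_event_time is None:
            return
        # Assumed rule: watermark = max event time - delay.
        watermark = self.max_event_time - self.delay
        kept = []
        for rec in self.state:
            # Rows older than the watermark leave the state; we want to keep them.
            (kept if rec[1] >= watermark else self.dropped).append(rec)
        self.state = kept


buf = WatermarkedBuffer(delay=timedelta(minutes=10))
buf.add_batch([("a", datetime(2024, 5, 7, 12, 0)), ("b", datetime(2024, 5, 7, 12, 5))])
buf.add_batch([("c", datetime(2024, 5, 7, 12, 20))])  # watermark moves to 12:10
print([key for key, _ in buf.dropped])
```

In this toy model the rows with keys `a` and `b` fall behind the watermark
after the second batch and end up in `dropped`. That list is the set of
records we would like to be able to capture in the real streaming job.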


When searching, we found a similar question
<https://stackoverflow.com/questions/60418632/how-to-save-the-records-that-are-dropped-by-watermarking-in-spark-structured-str>
on Stack Overflow, which is unanswered.
*We would like to know how to store the records that are dropped because of
the watermark.*
