Re: [Spark Streaming]: Save the records that are dropped by watermarking in spark structured streaming

2024-05-08 Thread Mich Talebzadeh
you may consider - Increase Watermark Retention: Consider increasing the watermark retention duration. This allows keeping records for a longer period before dropping them. However, this might increase processing latency and violate at-least-once semantics if the watermark lags behind real-time.

[Spark Streaming]: Save the records that are dropped by watermarking in spark structured streaming

2024-05-07 Thread Nandha Kumar
Hi Team, We are trying to use *spark structured streaming *for our use case. We will be joining 2 streaming sources(from kafka topic) with watermarks. As time progresses, the records that are prior to the watermark timestamp are removed from the state. For our use case, we want to *store