If I stop and start the query while it is processing a batch, what happens? Will that batch get canceled and then reprocessed when I click start? Does that mean I need to worry about duplicates downstream? Kafka consumers have pause and resume and they work just fine, so I am not sure why Spark doesn't expose that.
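The duplicate concern can be illustrated with a toy sketch (plain Python, not Spark's actual implementation): with offset checkpointing, a stop in the middle of a batch leaves the offset uncommitted, so on restart the whole batch is replayed and a non-idempotent sink sees some records twice. The class and method names here are hypothetical, purely for illustration.

```python
# Toy model of checkpoint-based restart (NOT Spark code): the committed
# offset only advances after a batch completes, so an interrupted batch
# is replayed from the last checkpoint on restart.

class CheckpointedStream:
    def __init__(self, records, batch_size=3):
        self.records = records
        self.batch_size = batch_size
        self.committed_offset = 0   # survives restarts, like a checkpoint dir
        self.sink = []              # downstream output

    def run_one_batch(self, stop_mid_batch=False):
        start = self.committed_offset
        batch = self.records[start:start + self.batch_size]
        if not batch:
            return False
        for i, rec in enumerate(batch):
            if stop_mid_batch and i == 1:
                # "stop" before the batch commits: offset is NOT advanced
                return True
            self.sink.append(rec)
        # batch fully processed: commit the offset
        self.committed_offset = start + len(batch)
        return True

stream = CheckpointedStream(list(range(6)))
stream.run_one_batch()                      # batch [0, 1, 2] commits
stream.run_one_batch(stop_mid_batch=True)   # stopped inside batch [3, 4, 5]
stream.run_one_batch()                      # restart: [3, 4, 5] replayed
print(stream.sink)  # -> [0, 1, 2, 3, 3, 4, 5]; record 3 is duplicated
```

So yes, under this model the answer is: the interrupted batch is reprocessed, and the downstream must be idempotent (or the sink transactional) to avoid duplicates.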
On Mon, Aug 5, 2019 at 10:54 PM Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
> Hi,
>
> exactly my question, I was also looking for ways to gracefully exit spark
> structured streaming.
>
> Regards,
> Gourav
>
> On Tue, Aug 6, 2019 at 3:43 AM kant kodali <kanth...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am trying to see if there is a way to pause a spark stream that process
>> data from Kafka such that my application can take some actions while the
>> stream is paused and resume when the application completes those actions.
>>
>> Thanks!