Have a look at the spark.streaming.backpressure.enabled property.
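With backpressure enabled (available since Spark 1.5), Spark Streaming adjusts the ingestion rate based on how long recent batches took to process. As far as I know, the first batch after a restart has no rate estimate yet, so it is usually worth also setting spark.streaming.kafka.maxRatePerPartition as a hard cap when you use the direct Kafka stream. Below is a rough sketch of how the two settings fit together. The helper names and the numbers (1000 records/sec, 8 partitions, 10 second batches) are made up for illustration, so tune them for your own cluster.

```python
# Sketch only: helper names and numeric values are illustrative assumptions,
# not recommendations and not part of any Spark API.

def backlog_conf(max_rate_per_partition):
    """Spark properties that enable backpressure and cap the Kafka read rate."""
    return {
        # Let Spark Streaming adapt the ingestion rate to batch processing times.
        "spark.streaming.backpressure.enabled": "true",
        # Hard upper bound, in records per second per Kafka partition,
        # applied by the direct Kafka stream.
        "spark.streaming.kafka.maxRatePerPartition": str(max_rate_per_partition),
    }


def max_records_per_batch(max_rate_per_partition, num_partitions, batch_seconds):
    """Largest batch the cap allows: rate * partitions * batch interval."""
    return max_rate_per_partition * num_partitions * batch_seconds


def to_submit_args(conf):
    """Render the properties as spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", "{}={}".format(key, value)]
    return args


conf = backlog_conf(max_rate_per_partition=1000)
print(" ".join(to_submit_args(conf)))
print(max_records_per_batch(1000, num_partitions=8, batch_seconds=10))  # 80000
```

With a cap like this, the 5 hour backlog is drained in bounded batches instead of one huge first batch, and the job returns to normal streaming once it has caught up.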
Regards,
Praveen

On 18 Feb 2016 00:13, "Abhishek Anand" <abhis.anan...@gmail.com> wrote:

> I have a spark streaming application running in production. I am trying to
> find a solution for a particular use case when my application has a
> downtime of say 5 hours and is restarted. Now, when I start my streaming
> application after 5 hours there would be considerable amount of data then
> in the Kafka and my cluster would be unable to repartition and process that.
>
> Is there any workaround so that when my streaming application starts it
> starts taking data for 1-2 hours, process it , then take the data for next
> 1 hour process it. Now when its done processing of previous 5 hours data
> which missed, normal streaming should start with the given slide interval.
>
> Please suggest any ideas and feasibility of this.
>
> Thanks !!
> Abhi