Sorry, rephrasing: Can this issue be resolved by having a smaller block interval?
Regards,
Praveen

On 18 Feb 2016 21:30, "praveen S" <mylogi...@gmail.com> wrote:

> Can having a smaller block interval alone resolve this?
>
> Regards,
> Praveen
>
> On 18 Feb 2016 21:13, "Cody Koeninger" <c...@koeninger.org> wrote:
>
>> Backpressure won't help you with the first batch; you'd need
>> spark.streaming.kafka.maxRatePerPartition for that.
>>
>> On Thu, Feb 18, 2016 at 9:40 AM, praveen S <mylogi...@gmail.com> wrote:
>>
>>> Have a look at the
>>> spark.streaming.backpressure.enabled
>>> property.
>>>
>>> Regards,
>>> Praveen
>>>
>>> On 18 Feb 2016 00:13, "Abhishek Anand" <abhis.anan...@gmail.com> wrote:
>>>
>>>> I have a Spark Streaming application running in production. I am trying
>>>> to find a solution for a particular use case: my application has a
>>>> downtime of, say, 5 hours and is then restarted. When I start my
>>>> streaming application after those 5 hours, there would be a considerable
>>>> amount of data sitting in Kafka, and my cluster would be unable to
>>>> repartition and process it all at once.
>>>>
>>>> Is there any workaround so that when my streaming application starts,
>>>> it first takes data for 1-2 hours and processes it, then takes the data
>>>> for the next hour and processes that? Once it has finished processing
>>>> the 5 hours of data it missed, normal streaming should resume with the
>>>> given slide interval.
>>>>
>>>> Please suggest any ideas and whether this is feasible.
>>>>
>>>> Thanks!!
>>>> Abhi
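To make Cody's suggestion concrete: spark.streaming.kafka.maxRatePerPartition caps how many records each Kafka partition contributes per second of batch interval, so you can estimate how long draining a backlog will take. Here is a back-of-the-envelope sketch (all the numbers, and the helper function, are hypothetical assumptions for illustration; only the two property names are real Spark settings):

```python
# Rough sizing for draining a Kafka backlog after downtime.
# spark.streaming.kafka.maxRatePerPartition limits records per partition
# per second, so the cluster-wide ingest ceiling is:
#   rate_per_partition * num_partitions  (records/second)

def drain_time_hours(backlog_records, rate_per_partition, num_partitions):
    """Hours needed to consume the backlog at the capped ingest rate
    (ignores new data arriving while catching up)."""
    records_per_second = rate_per_partition * num_partitions
    return backlog_records / records_per_second / 3600

# Hypothetical example: 5 hours of downtime at 10,000 records/s produced
# a backlog of 180 million records, spread over 10 partitions.
backlog = 5 * 3600 * 10_000  # 180,000,000 records
hours = drain_time_hours(backlog, rate_per_partition=2_000, num_partitions=10)
print(f"{hours:.1f} hours to drain")  # 9.0 hours at 20,000 records/s
```

Note that in this example the capped rate (20,000 records/s) is only twice the assumed live ingest rate (10,000 records/s), so real catch-up would take longer than the 9 hours computed here; the cap must comfortably exceed the live rate, or the job never catches up. Once caught up, spark.streaming.backpressure.enabled can keep batch sizes stable going forward.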