Sorry, rephrasing: Can this issue be resolved by having a smaller block interval?
Regards,
Praveen

On 18 Feb 2016 21:30, "praveen S" <mylogi...@gmail.com> wrote:

> Can having a smaller block interval alone resolve this?
>
> Regards,
> Praveen
>
> On 18 Feb 2016 21:13, "Cody Koeninger" <c...@koeninger.org> wrote:
>
>> Backpressure won't help you with the first batch; you'd need
>> spark.streaming.kafka.maxRatePerPartition for that.
>>
>> On Thu, Feb 18, 2016 at 9:40 AM, praveen S <mylogi...@gmail.com> wrote:
>>
>>> Have a look at the
>>> spark.streaming.backpressure.enabled
>>> property.
>>>
>>> Regards,
>>> Praveen
>>>
>>> On 18 Feb 2016 00:13, "Abhishek Anand" <abhis.anan...@gmail.com> wrote:
>>>
>>>> I have a Spark Streaming application running in production. I am trying
>>>> to find a solution for a particular use case: my application has a
>>>> downtime of, say, 5 hours and is then restarted. When I start my
>>>> streaming application after those 5 hours, there would be a considerable
>>>> amount of data sitting in Kafka, and my cluster would be unable to
>>>> repartition and process it all at once.
>>>>
>>>> Is there any workaround so that when my streaming application starts,
>>>> it first takes data for 1-2 hours and processes it, then takes the data
>>>> for the next hour and processes that? Once it has finished processing
>>>> the 5 hours of data it missed, normal streaming should resume with the
>>>> given slide interval.
>>>>
>>>> Please suggest any ideas and whether this is feasible.
>>>>
>>>> Thanks!!
>>>> Abhi
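To make Cody's suggestion concrete: spark.streaming.kafka.maxRatePerPartition caps how many records each Kafka partition contributes per second of batch interval, so you can estimate how long draining a backlog will take. Here is a back-of-the-envelope sketch (all the numbers, and the helper function, are hypothetical assumptions for illustration; only the two property names are real Spark settings):

```python
# Rough sizing for draining a Kafka backlog after downtime.
# spark.streaming.kafka.maxRatePerPartition limits records per partition
# per second, so the cluster-wide ingest ceiling is:
#   rate_per_partition * num_partitions  (records/second)

def drain_time_hours(backlog_records, rate_per_partition, num_partitions):
    """Hours needed to consume the backlog at the capped ingest rate
    (ignores new data arriving while catching up)."""
    records_per_second = rate_per_partition * num_partitions
    return backlog_records / records_per_second / 3600

# Hypothetical example: 5 hours of downtime at 10,000 records/s produced
# a backlog of 180 million records, spread over 10 partitions.
backlog = 5 * 3600 * 10_000  # 180,000,000 records
hours = drain_time_hours(backlog, rate_per_partition=2_000, num_partitions=10)
print(f"{hours:.1f} hours to drain")  # 9.0 hours at 20,000 records/s
```

Note that in this example the capped rate (20,000 records/s) is only twice the assumed live ingest rate (10,000 records/s), so real catch-up would take longer than the 9 hours computed here; the cap must comfortably exceed the live rate, or the job never catches up. Once caught up, spark.streaming.backpressure.enabled can keep batch sizes stable going forward.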