You can call foreachRDD(func) on the output of the final stage and check the
time inside it: if the batch falls on the 15th minute of the hour, flush the
output to the DB; otherwise skip it.
Let me know if that approach works.
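
Here is a rough sketch of that check in Python, in case it helps. It assumes
PySpark, where foreachRDD can take a two-argument function and passes the batch
time as a datetime. The names should_flush, make_flusher and write_to_db are
made up for illustration, and batch_interval is assumed to be the interval at
which the final DStream emits (the slide duration if you use a window). To
address Ted's point about missing the exact minute, it flushes when hh:15:00
falls anywhere inside the batch, not only when the batch time equals it.

```python
from datetime import datetime, timedelta

FLUSH_MINUTE = 15  # minute of the hour at which to flush, as in the thread


def should_flush(batch_time, batch_interval):
    """True if hh:15:00 lies in (batch_time - batch_interval, batch_time]."""
    # Most recent hh:15:00 at or before this batch time.
    mark = batch_time.replace(minute=FLUSH_MINUTE, second=0, microsecond=0)
    if mark > batch_time:
        mark -= timedelta(hours=1)
    return batch_time - mark < batch_interval


def make_flusher(batch_interval, write_to_db):
    """Build a (time, rdd) callback suitable for foreachRDD.

    write_to_db is a hypothetical function that persists one RDD.
    """
    def flush_if_due(batch_time, rdd):
        if should_flush(batch_time, batch_interval):
            write_to_db(rdd)
    return flush_if_due


# Assumed wiring, where `reduced` is the DStream from reduceByKeyAndWindow:
# reduced.foreachRDD(make_flusher(timedelta(seconds=10), write_to_db))
```

With batches aligned to the interval, exactly one batch per hour passes the
check, so the data is written once even if no batch lands on 15:00 exactly.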

On Tue, Mar 8, 2016 at 2:10 PM, ayan guha <guha.a...@gmail.com> wrote:

> Yes if it falls within the batch. But if the requirement is flush
> everything till 15th min of the hour, then it should work.
> On 9 Mar 2016 04:01, "Ted Yu" <yuzhih...@gmail.com> wrote:
>
>> That may miss the 15th minute of the hour (with non-trivial deviation),
>> right ?
>>
>> On Tue, Mar 8, 2016 at 8:50 AM, ayan guha <guha.a...@gmail.com> wrote:
>>
>>> Why not compare the current time in every batch and, if it meets a certain
>>> condition, emit the data?
>>> On 9 Mar 2016 00:19, "Abhishek Anand" <abhis.anan...@gmail.com> wrote:
>>>
>>>> I have a spark streaming job where I am aggregating the data by doing
>>>> reduceByKeyAndWindow with inverse function.
>>>>
>>>> I am keeping the data in memory for up to 2 hours, and in order to output
>>>> the reduced data to external storage I need to conditionally write the
>>>> data to the DB, say at the 15th minute of each hour.
>>>>
>>>> How can this be achieved?
>>>>
>>>>
>>>> Regards,
>>>> Abhi
>>>>
>>>
>>
