Re: ProcessFunction Timer

2019-10-18 Thread Navneeth Krishnan
I can use filtering, but I preferred a process function because I don't
have to do a keyBy again for the windowing or reinterpret the stream as a
keyed stream. The problem is this: if I register a timer in the
ProcessFunction and more input records arrive before it fires, how can I
avoid creating more timers and use just one timer to collect the data and
forward it? I was thinking about using a local variable, but that wouldn't
work across keys. The other approach is a ValueState that indicates whether
the timer is already registered. Is that the only way, or is there a better
approach?

Thanks
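
A minimal sketch of the ValueState idea above, written as a plain-Java model
with no Flink dependency so that it runs on its own. The class and method
names (OneTimerPerKey, onElement, onTimer, pendingFireTime) are hypothetical.
The two maps stand in for keyed ValueState and ListState, and onElement and
onTimer play the roles of processElement and onTimer in a keyed process
function. In a real job the registration would presumably go through
ctx.timerService(); that part is assumed here, not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the per-key logic; all names are hypothetical.
class OneTimerPerKey {
    static final long WINDOW_MS = 5_000L;

    // Stands in for keyed ValueState<Long>: fire time of the pending timer,
    // absent when no timer is registered for the key.
    private final Map<String, Long> pendingTimer = new HashMap<>();
    // Stands in for keyed ListState<String>: records buffered for the key.
    private final Map<String, List<String>> buffer = new HashMap<>();
    // Counts registrations so the effect of the guard is visible.
    int timersRegistered = 0;

    // Analogue of processElement(): always buffer, register a timer only
    // when the key has none pending.
    void onElement(String key, String value, long nowMs) {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        if (!pendingTimer.containsKey(key)) {
            pendingTimer.put(key, nowMs + WINDOW_MS);
            timersRegistered++;
        }
    }

    // Analogue of onTimer(): hand back the buffered records and clear both
    // pieces of state so the next record starts a fresh 5-second batch.
    List<String> onTimer(String key) {
        pendingTimer.remove(key);
        List<String> out = buffer.remove(key);
        return out == null ? new ArrayList<>() : out;
    }

    // Fire time of the pending timer for the key, or null if there is none.
    Long pendingFireTime(String key) {
        return pendingTimer.get(key);
    }
}
```

Because the flag is stored per key, it behaves like keyed state rather than
a local variable: records for key "a" never affect the timer for key "b".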



Re: ProcessFunction Timer

2019-10-18 Thread Andrey Zagrebin
Hi Navneeth,

You could also apply filtering on the incoming records before windowing.
This might save you some development effort, but I do not know the full
details of your requirements, so I cannot say whether filtering is
sufficient. In general, you can use timers as you suggested; windowing
itself works in a similar way.

Best,
Andrey



ProcessFunction Timer

2019-10-17 Thread Navneeth Krishnan
Hi All,

I'm currently using a tumbling window of 5 seconds using TumblingTimeWindow,
but due to a change in requirements I no longer need to window every
incoming record. With that said, I'm planning to use a process function to
achieve this selective windowing.

I looked at the example provided in the documentation and I'm not clear on
how I can implement the windowing.
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/process_function.html

Basically, what I want is to keep collecting data until 5 seconds have
passed since the first record arrived for the key, and then forward it to
the next operator. I will use ListState to add the entries and register a
timer when the list is empty. When the timer fires, I will collect all
entries, forward them, and clear the list. Do you think this will suffice,
or does anything else have to be done?

Also, I will have about 1M keys. Would creating that many timers have any
performance impact? I believe timers are automatically removed after they
fire, or should I do anything extra to remove them?

Thanks
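
The plan described above (buffer per key, start a 5-second timer when the
list is empty, flush when the timer fires) can be sketched as a small
plain-Java model. It does not use Flink. The class BufferAndFlush and its
methods are hypothetical, the HashMap stands in for keyed ListState, and the
TreeMap stands in for a timer service. It only illustrates the control flow
and says nothing about the actual cost of timers in a real job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "collect for 5 s per key, then forward".
// All names are hypothetical; this is not Flink API.
class BufferAndFlush {
    static final long WINDOW_MS = 5_000L;

    // Stands in for keyed ListState: buffered records per key.
    private final Map<String, List<String>> buffer = new HashMap<>();
    // Stands in for a timer service: fire time -> keys due at that time.
    private final TreeMap<Long, List<String>> timers = new TreeMap<>();

    // Buffer the record; an empty (absent) list means this is the first
    // record of a new batch, so start the 5-second clock for the key.
    void onElement(String key, String value, long nowMs) {
        List<String> records = buffer.get(key);
        if (records == null) {
            records = new ArrayList<>();
            buffer.put(key, records);
            timers.computeIfAbsent(nowMs + WINDOW_MS, t -> new ArrayList<>())
                  .add(key);
        }
        records.add(value);
    }

    // Fire every timer due at or before nowMs and return the forwarded
    // batches as key -> records. Fired timers and flushed buffers are
    // removed, so nothing is left behind for an idle key.
    Map<String, List<String>> advanceTo(long nowMs) {
        Map<String, List<String>> emitted = new LinkedHashMap<>();
        while (!timers.isEmpty() && timers.firstKey() <= nowMs) {
            for (String key : timers.pollFirstEntry().getValue()) {
                emitted.put(key, buffer.remove(key));
            }
        }
        return emitted;
    }

    // Number of timers currently waiting to fire.
    int pendingTimers() {
        int n = 0;
        for (List<String> keys : timers.values()) {
            n += keys.size();
        }
        return n;
    }
}
```

In this model there is at most one pending timer per key that currently has
buffered data, so the number of timers tracks the number of active keys and
not the number of records.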