Re: ProcessFunction Timer
I can use filtering to do it but I preferred process function because I don't have to do a KeyBy again to do the windowing or use reinterpret. The problem I'm having is if I use the processFunction and registered a timer and before the timer is fired if I have more input records, how can I avoid creating more timers and just use one timer to collect the data and forward it. I was thinking about using a local variable but it wouldn't work across keys. The other approach is to have a value state to indicate if the timer is registered or not but I'm thinking is this the only way or is there a better approach? Thanks On Fri, Oct 18, 2019 at 6:19 AM Andrey Zagrebin wrote: > Hi Navneeth, > > You could also apply filtering on the incoming records before windowing. > This might save you some development effort but I do not know full details > of your requirement whether filtering is sufficient. > In general, you can use timers as you suggested as the windowing itself > works in a similar way. > > Best, > Andrey > > On Thu, Oct 17, 2019 at 11:10 PM Navneeth Krishnan < > reachnavnee...@gmail.com> wrote: > >> Hi All, >> >> I'm currently using a tumbling window of 5 seconds using >> TumblingTimeWindow but due to change in requirements I would not have to >> window every incoming data. With that said I'm planning to use process >> function to achieve this selective windowing. >> >> I looked at the example provided in the documentation and I'm not clear >> on how I can implement the windowing. >> >> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/process_function.html >> >> Basically what I want is keep collecting data until it reaches 5 seconds >> from the time the first data came in for the key and then forward it to the >> next operator. I will be using ListState to add the entries and then >> register a timer when the list is empty. When the timer runs then collect >> all entries and forward it, also remove entries from the list. Do you guys >> think this will suffice or anything else has to be done? >> >> Also I will have about 1M keys, then would there be any performance >> impact in creating these many timers? I believe the timers are >> automatically removed after they are fired or should I do anything extra to >> remove these timers? >> >> Thanks >> >
Re: ProcessFunction Timer
Hi Navneeth, You could also apply filtering on the incoming records before windowing. This might save you some development effort but I do not know full details of your requirement whether filtering is sufficient. In general, you can use timers as you suggested as the windowing itself works in a similar way. Best, Andrey On Thu, Oct 17, 2019 at 11:10 PM Navneeth Krishnan wrote: > Hi All, > > I'm currently using a tumbling window of 5 seconds using > TumblingTimeWindow but due to change in requirements I would not have to > window every incoming data. With that said I'm planning to use process > function to achieve this selective windowing. > > I looked at the example provided in the documentation and I'm not clear on > how I can implement the windowing. > > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/process_function.html > > Basically what I want is keep collecting data until it reaches 5 seconds > from the time the first data came in for the key and then forward it to the > next operator. I will be using ListState to add the entries and then > register a timer when the list is empty. When the timer runs then collect > all entries and forward it, also remove entries from the list. Do you guys > think this will suffice or anything else has to be done? > > Also I will have about 1M keys, then would there be any performance impact > in creating these many timers? I believe the timers are automatically > removed after they are fired or should I do anything extra to remove these > timers? > > Thanks >
ProcessFunction Timer
Hi All, I'm currently using a tumbling window of 5 seconds using TumblingTimeWindow but due to change in requirements I would not have to window every incoming data. With that said I'm planning to use process function to achieve this selective windowing. I looked at the example provided in the documentation and I'm not clear on how I can implement the windowing. https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/process_function.html Basically what I want is keep collecting data until it reaches 5 seconds from the time the first data came in for the key and then forward it to the next operator. I will be using ListState to add the entries and then register a timer when the list is empty. When the timer runs then collect all entries and forward it, also remove entries from the list. Do you guys think this will suffice or anything else has to be done? Also I will have about 1M keys, then would there be any performance impact in creating these many timers? I believe the timers are automatically removed after they are fired or should I do anything extra to remove these timers? Thanks