Hi Johannes,
I think what you can do is register a timer not for every event but for
every key, with a certain granularity. When that timer fires, you check
what you want to clean up for that key and, if needed, register another
timer for the future. This way, the size of your timer state is bounded by
your key cardinality, and I think people have used Flink with
timers/windows with key cardinalities of several hundred million.
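
To make the idea concrete, here is a minimal sketch in plain Java, without
any Flink dependencies. All names in it (CoarseTimers, GRANULARITY_MS, ttlMs,
onEvent, onTimer) are made up for illustration. The two maps stand in for
keyed state, and in a real job the timer would presumably be registered
through the timer service of your process function instead of a map.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: simulates "one coarse timer per key" with plain maps.
class CoarseTimers {
    // Assumed granularity: timers are aligned to full minutes.
    static final long GRANULARITY_MS = 60_000L;

    // Stand-ins for keyed state: last event time and pending timer per key.
    final Map<String, Long> lastSeen = new HashMap<>();
    final Map<String, Long> pendingTimer = new HashMap<>();
    final long ttlMs; // how long state is kept after the last event

    CoarseTimers(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Round a timestamp up to the next granularity boundary.
    static long roundUp(long timestampMs) {
        return ((timestampMs + GRANULARITY_MS - 1) / GRANULARITY_MS) * GRANULARITY_MS;
    }

    // Called for every event: updates state, but registers a timer only
    // if the key does not already have one pending.
    void onEvent(String key, long eventTimeMs) {
        lastSeen.put(key, eventTimeMs);
        pendingTimer.putIfAbsent(key, roundUp(eventTimeMs + ttlMs));
    }

    // Called when the key's timer fires. Returns true if the state was
    // cleaned up, false if a new timer was registered for the future.
    boolean onTimer(String key, long timerTimeMs) {
        pendingTimer.remove(key);
        Long last = lastSeen.get(key);
        if (last == null) {
            return false; // nothing left to clean up
        }
        long expiry = last + ttlMs;
        if (expiry <= timerTimeMs) {
            lastSeen.remove(key); // state is old enough: drop it
            return true;
        }
        pendingTimer.put(key, roundUp(expiry)); // newer events arrived: check again later
        return false;
    }
}
```

With this shape, the number of pending timers never exceeds the number of
keys, no matter how many events arrive per key.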

Best,
Aljoscha

On Wed, Mar 8, 2017, at 14:37, Ufuk Celebi wrote:
> Looping in Aljoscha and Kostas, who are the experts on this. :-)
> 
> On Mon, Mar 6, 2017 at 6:06 PM, Johannes Schulte
> <johannes.schu...@gmail.com> wrote:
> > Hi,
> >
> > I am trying to achieve a stream-to-stream join with big windows and am
> > searching for a way to clean up the state of old keys. I am already using a
> > RichCoProcessFunction.
> >
> > I found that there is already an existing ticket:
> >
> > https://issues.apache.org/jira/browse/FLINK-3089
> >
> > but I doubt that registering a timer for every incoming event is
> > feasible, as the timers seem to reside in an in-memory queue.
> >
> > The task is somewhat similar to the following blog post:
> > http://devblog.mediamath.com/real-time-streaming-attribution-using-apache-flink
> >
> > Is the implementation of a custom window operator necessary to achieve
> > such functionality?
> >
> > Thanks a lot,
> >
> > Johannes
> >
> >
