Thank you Nico.

I *think* I should have one stream per key... the stream I get is pretty fast 
and there may be some corner cases I'm not aware of. However, I really need to 
process as a single window per key.


I am worried about the cardinality of the key ...  I wanted to use a timeout to 
remove the window for a key. If not the memory requirements would grow quickly 
(which I think is what is happening). The stream has 60K unique keys per 5 
minutes window (maybe 1/2 million total unique per day...).


Anyway I'll write a test to investigate further...


Thanks for your thoughts


Steve





________________________________
From: Nico Kruber <n...@data-artisans.com>
Sent: Wednesday, August 16, 2017 3:22:41 AM
To: user@flink.apache.org
Cc: Steve Jerman
Subject: Re: Question about Global Windows.

Hi Steve,
are you sure a GlobalWindows assigner fits your needs? This may be the case if
all your events always come in order and you do not ever have overlapping
sessions since a GlobalWindows assigner simply puts all events (per key) into
a single window (per key). If you have overlapping sessions, you may need your
own window assigner that handles multiple windows (see the
EventTimeSessionWindows assigner for our take on event-time session windows).

Regarding the timer: if you set it via `#registerEventTimeTimer()`, it only
fires if a watermark passes the given timestamp, so you need to make sure your
sources create them (see [1] and its sub-topics). Depending on your further
constraints in your application, it may be ok to use
`registerProcessingTimeTimer()` instead.

Does this help already? If not, we'd need some (minimal) example of how your
using these things to debug further into your memory issues.


Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/
Apache Flink 1.3 Documentation: Application 
Development<https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/>
ci.apache.org
Application Development; Application Development



event_time.html

On Monday, 14 August 2017 06:32:01 CEST Steve Jerman wrote:
> Hi Folks,
>
> I have a question regarding Global Windows.
>
> I have a stream with a large number of records. The records have a key which
> has a very high cardinality. They also have a state ( start, status,
> finish).

> I need to do some processing where I look at the records separated into
> windows using the ‘state’ property.

> From the documentation, I believe I should be using a Global Window with a
> custom trigger to identify the windows….

> I have this implemented.. the Trigger returns ‘CONTINUE” for ‘start', and
> FIRE_AND_PURGE for ‘finish'.

> I also need to avoid running out of memory  since sometimes I don’t get
> ‘finish’ records… so I added a timer to the Trigger which PURGE’s if it
> fires..

> Is this the correct approach?
>
> I say this since I do in fact see a memory leak …  is there anything else I
> need to be aware of?

> Thanks
>
> Steve

Reply via email to