On Thu, 21 Feb 2019 at 14:00, Dawid Wysakowicz
wrote:
> If an event arrived at WindowOperator before the Watermark, then it will
> be accounted for window aggregation and put in state. Once that state gets
> checkpointed this same event won't be processed again. In other words if a
> checkpoint s
If an event arrived at WindowOperator before the Watermark, then it will
be accounted for window aggregation and put in state. Once that state
gets checkpointed this same event won't be processed again. In other
words if a checkpoint succeeds elements that produced corresponding
state won't be proc
On Thu, 21 Feb 2019 at 13:36, Dawid Wysakowicz
wrote:
> It is definitely a solution ;)
>
> You should be aware of the downsides though:
>
>- you might get different results in case of reprocessing
>- you might drop some data as late, due to some delays in processing,
>if the events ar
It is definitely a solution ;)
You should be aware of the downsides though:
* you might get different results in case of reprocessing
* you might drop some data as late, due to some delays in processing,
if the events arrive later then the "ProcessingTime" threshold
Best,
Dawid
On 21/0
Yes, it was the "watermarks for event time when no events for that shard"
problem.
I am now investigating whether we can use a blended watermark of
max(lastEventTimestamp - 1min, System.currentTimeMillis() - 5min) to ensure
idle shards do not cause excessive data retention.
Is that the best solut
Hi Stephen,
Watermark for a single operator is the minimum of Watermarks received
from all inputs, therefore if one of your shards/operators does not have
incoming data it will not produce Watermarks thus the Watermark of
WindowOperator will not progress. So this is sort of an expected behavior.
Hi Stephen
If the window has not been triggered ever, maybe you could investigate the
watermark, maybe the doc[1][2] can be helpful.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/windows.html#interaction-of-watermarks-and-windows
[2]
https://ci.apache.org
>
>
>
> From: Stephen Connolly [mailto:stephen.alan.conno...@gmail.com
> <mailto:stephen.alan.conno...@gmail.com>]
> Sent: Tuesday, February 19, 2019 6:32 AM
> To: user mailto:user@flink.apache.org>>
> Subject: EXT :Re: How to debug difference between Kinesis
urce
> function isn’t getting data you have to watch out for this.
>
>
>
> *From:* Stephen Connolly [mailto:stephen.alan.conno...@gmail.com]
> *Sent:* Tuesday, February 19, 2019 6:32 AM
> *To:* user
> *Subject:* EXT :Re: How to debug difference between Kinesis and Kafka
>
source function isn’t getting
data you have to watch out for this.
From: Stephen Connolly [mailto:stephen.alan.conno...@gmail.com]
Sent: Tuesday, February 19, 2019 6:32 AM
To: user
Subject: EXT :Re: How to debug difference between Kinesis and Kafka
Hmmm my suspicions are now quite high. I created a
Hmmm my suspicions are now quite high. I created a file source that just
replays the events straight then I get more results
On Tue, 19 Feb 2019 at 11:50, Stephen Connolly <
stephen.alan.conno...@gmail.com> wrote:
> Hmmm after expanding the dataset such that there was additional data that
> e
Hmmm after expanding the dataset such that there was additional data that
ended up on shard-0 (everything in my original dataset was coincidentally
landing on shard-1) I am now getting output... should I expect this kind of
behaviour if no data arrives at shard-0 ever?
On Tue, 19 Feb 2019 at 11:14
Hi, I’m having a strange situation and I would like to know where I should
start trying to debug.
I have set up a configurable swap in source, with three implementations:
1. A mock implementation
2. A Kafka consumer implementation
3. A Kinesis consumer implementation
>From injecting a log and no
13 matches
Mail list logo