Hi John,

Thanks for the reply - yep, that was a dumb copy and paste error, which is
what I get for coding while surrounded by kids. >_< I'm deploying a fixed
version of it as we speak. :)

Kind regards,

Liam Clarke



On Mon, 20 Apr. 2020, 2:08 am John Roesler, <[email protected]> wrote:

> Hi Liam,
>
> I took a quick look. On the output side, it looks like you’re adding the
> count to the prior count. Should that just set the outbound value to the new
> count? Maybe I misunderstood the situation.
>
> What I mean is, suppose you get two events for the same window:
>
> Inbound map := 0+1 = 1
> Count = 1
> Outbound map := 0+1 = 1
> (Proposed outbound := 1)
>
> Then,
>
> Inbound map := 1+1 = 2
> Count = 2
> Outbound map := 1+2 = 3
> (Proposed outbound := 2)
>
> Does that make sense?
> -John
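
John's two-event example can be replayed with a minimal Python sketch. This is
not the code from the gist: the function name `replay`, the `accumulate` flag,
and the single hard-coded window key are hypothetical, chosen only to contrast
"add the running count to the prior value" with "set the value to the running
count".

```python
from collections import defaultdict

WINDOW = "2020-04-19T03:00:00Z"  # illustrative window key, not taken from the gist


def replay(running_counts, accumulate):
    """Feed successive running counts for one window into an outbound map."""
    outbound = defaultdict(int)
    for count in running_counts:
        if accumulate:
            # Suspected bug: the running total is added to the prior total.
            outbound[WINDOW] += count
        else:
            # Proposed fix: the outbound value is simply the latest count.
            outbound[WINDOW] = count
    return dict(outbound)


# Two events in the same window produce running counts 1, then 2.
buggy = replay([1, 2], accumulate=True)
fixed = replay([1, 2], accumulate=False)
print("accumulating:", buggy[WINDOW])
print("overwriting: ", fixed[WINDOW])
```

With the accumulating variant the outbound value ends at 3, while the
overwriting variant ends at 2, matching the "Outbound map" and "Proposed
outbound" lines in the example.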
>
> On Sun, Apr 19, 2020, at 03:08, Liam Clarke wrote:
> > Hello all,
> >
> > I have been running this code against production data, and I'm emitting
> > counts/sums for a sentinel record id to stdout so I can observe the
> > behaviour:
> >
> > https://gist.github.com/LiamClarkeNZ/b101ce6a42a2e5e1efddfe3a98c5805f
> >
> > When this code is run, the window duration is 2 minutes, grace period is
> > 20 seconds, and retention time is 20 minutes.
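
As a rough illustration of those settings, the sketch below models how an
epoch-aligned tumbling window of 2 minutes with a 20 second grace period would
assign an event to exactly one window and decide whether a late record is still
accepted. It is a simplified model written for this thread, not Kafka Streams
code; the names `window_start` and `is_accepted` are hypothetical, and
retention is not modelled.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=2)   # window duration from the message above
GRACE = timedelta(seconds=20)   # grace period from the message above
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(event_time):
    """Start of the single tumbling window containing event_time."""
    offset = (event_time - EPOCH) % WINDOW
    return event_time - offset


def is_accepted(event_time, stream_time):
    """Assumed rule: a record counts until its window end plus grace has passed."""
    window_end = window_start(event_time) + WINDOW
    return stream_time < window_end + GRACE


event = datetime(2020, 4, 19, 7, 31, 37, 492000, tzinfo=timezone.utc)
start = window_start(event)
print("window:", start.isoformat(), "to", (start + WINDOW).isoformat())
print("accepted 10 s after window end:", is_accepted(event, start + WINDOW + timedelta(seconds=10)))
print("accepted 20 s after window end:", is_accepted(event, start + WINDOW + timedelta(seconds=20)))
```

Because each event maps to one window start, a tumbling window on its own
should not count an event twice, which is what a sliding or hopping window
could do.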
> >
> > I am endeavouring to use event time as the timestamp basis for this
> > process:
> > https://gist.github.com/LiamClarkeNZ/8265cec02e21f5969e0fedb8281a2180
> >
> > So, my sentinel debugging output shows a surprising behaviour in that the
> > outbound counts for the key always sum higher than the inbound count. For
> > example:
> >
> > Sample: 2020-04-19T07:31:37.492Z
> >
> > Inbound
> > {
> >     2020-04-19T03:00:00Z=4563,
> >     2020-04-19T04:00:00Z=5629,
> >     2020-04-19T05:00:00Z=8489,
> >     2020-04-19T06:00:00Z=13599
> > }
> >
> > Outbound
> > {
> >     2020-04-19T03:00:00Z=4717,
> >     2020-04-19T04:00:00Z=5890,
> >     2020-04-19T05:00:00Z=8826,
> >     2020-04-19T06:00:00Z=13951
> > }
> >
> > This makes me suspect that either I'm not using the window I thought I was
> > (e.g., I'm somehow using a sliding window instead of a tumbling window) or
> > that I have made a rookie error somewhere in my aggregations, or I've just
> > misunderstood something about this. Does it matter that the window size in
> > the persistent window store doesn't match the windowing time + grace time
> > in the windowing clause?
> >
> > Any pointers gratefully welcome.
> >
> > Kind regards,
> >
> > Liam Clarke-Hutchinson
> >
>
