Re: [DISCUSS] Better user experience in the WindowAggregate upon Changelog (contains update message)

2021-07-01 Thread 刘建刚
Thanks for the discussion, JING ZHANG. I like the first proposal since it
is simple and consistent with the DataStream API. It would be helpful to add
more docs about the special late-data case in WindowAggregate. I also look
forward to more flexible emit strategies later.


Re: [DISCUSS] Better user experience in the WindowAggregate upon Changelog (contains update message)

2021-07-01 Thread Jark Wu
Sorry, I made a typo above. I meant that I prefer proposal (1), which only
requires setting `table.exec.emit.allow-lateness` to handle late events.
`table.exec.emit.late-fire.delay` can be optional, defaulting to 0s.
`table.exec.state.ttl` will no longer affect window state, so window state
is still cleaned up accurately based on the watermark.

We don't need to expose `table.exec.emit.late-fire.enabled` in the docs and
can remove it in the next version.

Best,
Jark


Re: [DISCUSS] Better user experience in the WindowAggregate upon Changelog (contains update message)

2021-07-01 Thread Jark Wu
Thanks, Jing, for bringing up this topic.

The emit strategy configs are annotated as Experimental and are not public
in the documentation. However, I see this as a very useful feature that many
users are looking for. I have posted these configs in response to many
questions like "how to handle late events in SQL". Thus, I think it's time to
make the configuration public and document it explicitly. In the long term,
we would like to propose an EMIT syntax for SQL, but until then we can get
more valuable feedback from users while they are using the configs.

Regarding the exposed configuration, I prefer proposal (2).
But it would be better not to expose `table.exec.emit.late-fire.enabled` in
the docs, and we can remove it in the next version.

Best,
Jark


[DISCUSS] Better user experience in the WindowAggregate upon Changelog (contains update message)

2021-06-28 Thread JING ZHANG
When WindowAggregate runs on a changelog that contains update messages,
UPDATE BEFORE (UB) messages may be dropped as late messages. [1]

To handle late UB messages, users need to set *all* of the following 3
parameters:

(1) enable late fire by setting

table.exec.emit.late-fire.enabled : true

(2) set per-record emit behavior for late records by setting

table.exec.emit.late-fire.delay : 0 s

(3) keep window state for extra time after window is fired by setting

table.exec.emit.allow-lateness : 1 h // or table.exec.state.ttl: 1h
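
To make the interaction of the three settings concrete, here is a toy
simulation in plain Python. It is only a sketch under simplifying
assumptions, not Flink code: the function `run_window_sum`, the event
encoding, and the window size are all made up for illustration, and real
WindowAggregate operators differ in many details. It models a tumbling-window
SUM whose state is cleaned up once the watermark passes window end plus the
allowed lateness, and which re-emits a result for every late record when late
fire is enabled (a late-fire delay of 0).

```python
WINDOW_SIZE = 10  # hypothetical tumbling window length in event-time units


def run_window_sum(events, allow_lateness=0, late_fire_enabled=False):
    """Toy tumbling-window SUM over a changelog (illustration only).

    events: ("WM", watermark) or (kind, event_time, value), where kind is
    "+I"/"+U" (add the value) or "-U"/"-D" (retract the value).
    Returns (emitted, dropped).
    """
    sums, fired = {}, set()          # window start -> sum, fired windows
    emitted, dropped = [], []
    watermark = float("-inf")
    for event in events:
        if event[0] == "WM":
            watermark = max(watermark, event[1])
            # On-time fire: emit every window whose end the watermark passed.
            for start in sorted(sums):
                if start not in fired and start + WINDOW_SIZE <= watermark:
                    fired.add(start)
                    emitted.append((start, sums[start]))
            # Clean up state once window end + allowed lateness has passed.
            expired = [s for s in sums
                       if s + WINDOW_SIZE + allow_lateness <= watermark]
            for start in expired:
                del sums[start]
            continue
        kind, ts, value = event
        start = ts - ts % WINDOW_SIZE
        if start + WINDOW_SIZE + allow_lateness <= watermark:
            dropped.append(event)    # state is gone: the record is dropped
            continue
        sign = -1 if kind in ("-U", "-D") else 1
        sums[start] = sums.get(start, 0) + sign * value
        if start in fired and late_fire_enabled:
            # Late fire with delay 0: emit an updated result per late record.
            emitted.append((start, sums[start]))
    return emitted, dropped


EVENTS = [
    ("+I", 3, 5),
    ("+I", 7, 2),
    ("WM", 12),      # window [0, 10) fires
    ("-U", 3, 5),    # late UPDATE BEFORE for the first row
    ("+U", 13, 5),   # the updated row now falls into window [10, 20)
    ("WM", 25),      # window [10, 20) fires
]

default_emitted, default_dropped = run_window_sum(EVENTS)
fixed_emitted, fixed_dropped = run_window_sum(
    EVENTS, allow_lateness=60, late_fire_enabled=True)
print(default_emitted, default_dropped)
print(fixed_emitted, fixed_dropped)
```

With the defaults, the late `-U` record ends up in the dropped list, so the
already emitted result for window [0, 10) still includes the retracted value.
With lateness allowed and late fire enabled, the same record is applied and a
corrected result for that window is emitted.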


The current solution has two disadvantages:

(1) Users may not realize that UB messages can be dropped as late events,
so they will not set the related parameters.

(2) When users look for a way to fix the dropped UB messages, the current
solution is inconvenient because they need to set all 3 parameters.
Besides, some of the configurations overlap in functionality.


There are two proposals to simplify the 3 parameters.

(1) Users only need to set `table.exec.emit.allow-lateness` (just like in the
DataStream API, where users only need to set the allowed lateness); the
framework could automatically set `table.exec.emit.late-fire.enabled` to true
and `table.exec.emit.late-fire.delay` to 0s.

In a later version, we would deprecate `table.exec.emit.late-fire.delay`
and `table.exec.emit.late-fire.enabled`.


(2) Users need to set `table.exec.emit.late-fire.enabled` to true and set
`table.exec.state.ttl`; the framework could automatically set
`table.exec.emit.late-fire.delay` to 0s.

In a later version, we would deprecate `table.exec.emit.late-fire.delay`
and `table.exec.emit.allow-lateness`.
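
The two proposals can be sketched as a derivation of the effective
configuration from what the user sets. This is a hypothetical illustration in
plain Python: the helper names `effective_config_proposal_1` and
`effective_config_proposal_2` are invented here and do not exist in Flink, and
the sketch assumes that an explicitly set `table.exec.emit.late-fire.delay`
should be kept rather than overwritten.

```python
LATE_FIRE_ENABLED = "table.exec.emit.late-fire.enabled"
LATE_FIRE_DELAY = "table.exec.emit.late-fire.delay"
ALLOW_LATENESS = "table.exec.emit.allow-lateness"
STATE_TTL = "table.exec.state.ttl"


def effective_config_proposal_1(user_config):
    """Proposal (1): setting allow-lateness alone turns on late fire."""
    config = dict(user_config)
    if ALLOW_LATENESS in config:
        config[LATE_FIRE_ENABLED] = "true"
        # Assumption: keep a user-provided delay, otherwise default to 0s.
        config.setdefault(LATE_FIRE_DELAY, "0 s")
    return config


def effective_config_proposal_2(user_config):
    """Proposal (2): late-fire.enabled plus state TTL; the delay is implied."""
    config = dict(user_config)
    if config.get(LATE_FIRE_ENABLED) == "true" and STATE_TTL in config:
        config.setdefault(LATE_FIRE_DELAY, "0 s")
    return config


p1 = effective_config_proposal_1({ALLOW_LATENESS: "1 h"})
p2 = effective_config_proposal_2({LATE_FIRE_ENABLED: "true", STATE_TTL: "1 h"})
for key, value in sorted(p1.items()):
    print("proposal 1:", key, "=", value)
for key, value in sorted(p2.items()):
    print("proposal 2:", key, "=", value)
```

Under proposal (1) the user touches one key and the other two are derived;
under proposal (2) the user still sets two keys and only the delay is derived.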


Please let me know what you think about the issue.

Thank you.

[1] https://issues.apache.org/jira/browse/FLINK-22781


Best regards,
JING ZHANG