Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2021-01-26 Thread Rex Fenley
Hello, I still don't have a good understanding of how a UDAF in the Table API handles deletes. If every row aggregated into one groupBy(key) gets a retract, meaning nothing should be grouped by that key, will the state get deleted? Is there a way to delete the state for that row, i.e. forward a retra

Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-11 Thread Rex Fenley
Hi, Does this question make sense or am I missing something? Thanks! On Thu, Dec 10, 2020 at 10:24 AM Rex Fenley wrote: > Ok, that makes sense. > > > You just need to correct the acc state to what it expects to be (say > re-evaluate the acc without the record that needs retraction) when you >

Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-10 Thread Rex Fenley
Ok, that makes sense. > You just need to correct the acc state to what it expects to be (say re-evaluate the acc without the record that needs retraction) when you received the retraction message. So for example, if I just remove all items from acc.groupIdSet on retraction it will know to clear
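
The idea of correcting the accumulator on retraction can be sketched in plain Python, outside of Flink. This is only an illustration under stated assumptions: `GroupIdSetAccumulator`, `accumulate`, `retract`, `get_value` and `is_empty` are hypothetical names modelled on the `acc.groupIdSet` mentioned above, not the actual Table API classes. The sketch keeps a count per group id rather than a bare set, on the assumption that the same id may be accumulated more than once and should only disappear after a matching number of retractions.

```python
from collections import Counter


class GroupIdSetAccumulator:
    """Hypothetical accumulator: group_id -> number of live rows carrying it."""

    def __init__(self):
        self.counts = Counter()


def accumulate(acc, group_id):
    # A new row for this key adds one occurrence of the group id.
    acc.counts[group_id] += 1


def retract(acc, group_id):
    # A retraction undoes one earlier accumulate for the same group id.
    acc.counts[group_id] -= 1
    if acc.counts[group_id] <= 0:
        # No live row carries this id any more, so drop it from the result.
        del acc.counts[group_id]


def get_value(acc):
    # The aggregate result is the set of ids that still have live rows.
    return sorted(acc.counts)


def is_empty(acc):
    # True once every accumulated row has been retracted.
    return not acc.counts


acc = GroupIdSetAccumulator()
accumulate(acc, "group_id_1")
accumulate(acc, "group_id_2")
retract(acc, "group_id_1")
retract(acc, "group_id_2")
print(get_value(acc), is_empty(acc))
```

Whether the surrounding operator then drops its state for the key once the accumulator is empty is the open question in this thread; the sketch only shows the accumulator side.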

Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-09 Thread Danny Chan
No, the group agg, stream-stream join and rank are all stateful operators which need a state backend to keep track of the acc values. But retractions only need to be emitted when the stateful operator A has a downstream operator B that is also stateful, because B needs the retractions to co
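
One way to picture why a stateful downstream operator B depends on retractions is the plain-Python sketch below. It is an assumption-laden illustration, not Flink code: the message tuples `("retract", key, value)` and `("new", key, value)` and the function `downstream_apply` are hypothetical. Here B counts how many keys currently have an aggregate of a given size, so it has to subtract the old result before adding the new one.

```python
from collections import Counter


def downstream_apply(b_state, message):
    """Hypothetical stateful operator B: counts keys per aggregate size."""
    kind, _key, value = message
    bucket = len(value)
    if kind == "retract":
        # Undo the contribution of the previously emitted result.
        b_state[bucket] -= 1
        if b_state[bucket] == 0:
            del b_state[bucket]
    else:
        # Add the contribution of the newly emitted result.
        b_state[bucket] += 1


b_state = Counter()
messages = [
    ("new", "user_id_1", ["group_id_1"]),
    ("retract", "user_id_1", ["group_id_1"]),
    ("new", "user_id_1", ["group_id_1", "group_id_2"]),
]
for message in messages:
    downstream_apply(b_state, message)
print(dict(b_state))
```

If the retract message were skipped in this sketch, `user_id_1` would be counted once under size 1 and once under size 2, even though it only has one current result.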

Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-09 Thread Rex Fenley
So from what I'm understanding, the aggregate itself is not a "stateful operator" but one may follow it? How does the aggregate accumulator keep old values then? It can't all just live in memory. Actually, looking at the savepoints, it looks like there's state associated with our aggregate operator.

Re: How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-09 Thread Danny Chan
Hi, Rex Fenley ~ If there is a stateful operator downstream of the aggregate function, then each time the function receives an update (or delete) for the key, the agg operator emits 2 messages: one retracting the old record and one carrying the new message. For your case, the new message is the
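
The two-message behaviour described here can be sketched in plain Python. This is a simulation under assumptions, not the Flink operator: `apply_change`, the `"+"`/`"-"` operation codes and the `("retract", ...)`/`("new", ...)` message tuples are hypothetical names. The sketch also assumes that when the last group id is removed, the operator simply emits the empty result as the new message; what the real operator emits in that case is what the thread is discussing.

```python
def apply_change(state, key, op, group_id):
    """Simulate one input change for `key`; return the emitted messages.

    state maps key -> set of group ids; op is "+" (insert) or "-" (delete).
    """
    old = state.get(key)
    new = set(old) if old is not None else set()
    if op == "+":
        new.add(group_id)
    else:
        new.discard(group_id)

    emitted = []
    if old is not None:
        # A result was emitted before for this key, so retract it first.
        emitted.append(("retract", key, sorted(old)))
    # Then emit the result computed from the updated accumulator.
    emitted.append(("new", key, sorted(new)))
    state[key] = new
    return emitted


state = {}
for op, group_id in [("+", "group_id_1"), ("+", "group_id_2"), ("-", "group_id_1")]:
    for message in apply_change(state, "user_id_1", op, group_id):
        print(message)
```

The first change emits only a new message, because there is no earlier result to retract; every later change for the same key emits the retract/new pair.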

How User-Defined AggregateFunctions handle deletes of all aggregated rows.

2020-12-08 Thread Rex Fenley
Hello, I'd like to better understand delete behavior of AggregateFunctions. Let's assume there's an aggregate of `user_id` to a set of `group_ids` for groups belonging to that user. `user_id_1 -> [group_id_1, group_id_2, etc.]` Now let's assume sometime later that deletes arrive for all rows which