Subtractor

2018-09-24 Thread Michael Eugene
Can someone explain to me the point of the Subtractor in an aggregator? I have to have one, because there is no concrete default implentation of it, but I am just trying to get a "normal" aggregation working and I don't see why I need a subtractor. Other than of course I n

Re: Subtractor

2018-09-24 Thread Vasily Sulatskov
d as two separate kafka messages. On Mon, Sep 24, 2018 at 10:56 AM Michael Eugene wrote: > > Can someone explain to me the point of the Subtractor in an aggregator? I > have to have one, because there is no concrete default implentation of it, > but I am just trying to get a "n

Re: Subtractor

2018-09-24 Thread Michael Eugene
? I have no interest in mapping to different keys. That's kind of making this exercise more complex. Also one of the confusing points is why in older versions of Kafka did you not need a subtractor? Only in 2.0 am I required to give a subtractor. 1.1 I

Re: Subtractor

2018-09-24 Thread Vasily Sulatskov
Hi, Given that you need a subtractor you are probably calling KGroupedTable.aggregate(). In order to get a KGroupedTable you called (in a general case) KTable.groupBy(). I.e you have an original (pre-groupBy) table stream (changelog): where a message key is say pageId, and value is say number of

Re: Subtractor

2018-09-24 Thread Michael Eugene
Ok thanks for taking the time again to respond. So you are saying that the subtractor actually handles when the key changes for the earlier groupBy? (Maybe not 100% but that is sort of what it is handling). Also my code is below (I took out some of it to avoid clutter) - I am grouping twice

Re: Subtractor

2018-09-24 Thread Vasily Sulatskov
u try to observe how the data flows through your app, that wouldn't show much about real aggregation, as no real aggregation actually takes place. I can't really add much to this. I think the explanation I've given is mostly correct, and what really helped me to understand why a sub

Re: Subtractor

2018-09-24 Thread Michael Eugene
. I am mapping multiple rows of data into one output row. I didn’t want to add all of my test data and output data. I’m not saying your explanation is incorrect. On the contrary, I wanted to summarize it in my own terms. Can you say that my saying “subtractor in the aggregate method actually

Re: Subtractor

2018-09-25 Thread Matthias J. Sax
oved from the result KTable while the new value must be added to the result KTable. The `Subtractor` is used for the former. Example: you count some keys: This result is a KTable Next, you apply a groupBy to the KTable, to build a histogram over the count (ie, you count how many differen

Clarify “the order of execution for the subtractor and adder is not defined”

2021-01-28 Thread Fq Public
a key (e.g., UPDATE), then (1) the subtractor is called with the old value as stored in the table and (2) the adder is called with the new value of the input record that was just received. *The order of execution for the subtractor and adder is not defined.* My interpretation of that last line i

Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-12 Thread Vasily Sulatskov
kafka-streams KGroupedTable.reduce() can call subtractor function with null aggregator value, and if you try to work around that, by interpreting null aggregator value as zero for numeric value you get incorrect aggregation result. I do understand that the proper way of handling this is to do a res

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-01-28 Thread Fq Public
Hi everyone! I posted this same question on stackoverflow <https://stackoverflow.com/questions/65888756/clarify-the-order-of-execution-for-the-subtractor-and-adder-is-not-defined> a few days ago but didn't get any responses. Was hoping someone here might be able to help clarify this

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-01-29 Thread Alexandre Brasil
>From the source code in KGroupedTableImpl, the subtractor is always called before the adder. By not guaranteeing the order, I think the devs meant that it might change on future versions of Kafka Streams (although I'd think it's unlikely to). I have use cases similars with your exam

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-01-29 Thread Matthias J. Sax
as a matter of fact, it does not really matter (detail below). For the three scenarios you mentioned, the 3rd one cannot happen though: We execute an aggregator in a single thread (per shard) and thus we either call the adder or subtractor first. > 1. Seems like an unusual design choice W

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-02-01 Thread Fq Public
y matter (detail below). > > For the three scenarios you mentioned, the 3rd one cannot happen though: > We execute an aggregator in a single thread (per shard) and thus we > either call the adder or subtractor first. > > > > > 1. Seems like an unusual design choice > > Why

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-02-01 Thread Fq Public
could join with the KTable in a state where the subtractor has been already executed but the adder has not yet been executed for the corresponding KTable-update event? Cheers, FQ On Mon, 1 Feb 2021 at 11:53, Fq Public wrote: > Hiya Matthias, Alexandre, > > Thanks for your detailed

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-02-01 Thread Matthias J. Sax
I take it this means that, if I were joining a KStream with this KTable, it >> is entirely possible that a KStream record could join with the KTable in a >> state where the subtractor has been already executed but the adder has not >> yet been executed for the corresponding KTable-

Re: Clarify “the order of execution for the subtractor and adder is not defined”

2021-02-02 Thread Fq Public
gt;> is entirely possible that a KStream record could join with the KTable > in a > >> state where the subtractor has been already executed but the adder has > not > >> yet been executed for the corresponding KTable-update event? > > Yes, that would be possible (even witho

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-12 Thread John Roesler
r what would happen if you > change a streams topology without doing a proper reset. > > I've noticed that from time to time, kafka-streams > KGroupedTable.reduce() can call subtractor function with null > aggregator value, and if you try to work around that, by interpreting &g

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-13 Thread Vasily Sulatskov
Hi John, Thanks for your reply. I am not sure if this behavior I've observed is a bug or not, as I've not been resetting my application properly. On the other hand if the subtractor or adder in the reduce operation are never supposed to be called with null aggregator value, perhaps it

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-13 Thread John Roesler
Value"/"newValue" . (side-note: It's by forwarding both the old and new value that we are able to maintain aggregates using the subtractor/adder pairs) 2. In the full topology, these old/new pairs go through some transformations, but still in some form eventually make the

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-13 Thread Vasily Sulatskov
27;s what I think happened. > > 1. Your reduce node in subtopology1 (KSTREAM-REDUCE-02 / "table" ) > internally emits pairs of "oldValue"/"newValue" . (side-note: It's by > forwarding both the old and new value that we are able to maintain > a

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-13 Thread John Roesler
ted, here's what I think happened. > > > > 1. Your reduce node in subtopology1 (KSTREAM-REDUCE-02 / "table" > ) > > internally emits pairs of "oldValue"/"newValue" . (side-note: It's by > > forwarding both the old and new value

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Vasily Sulatskov
lementation, but can't find who makes these > > Change value. > > On Fri, Jul 13, 2018 at 6:17 PM John Roesler wrote: > > > > > > Hi again Vasily, > > > > > > Ok, it looks to me like this behavior is the result of the un-clean > > > topolog

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Matthias J. Sax
serialized back >>> to kafka, yet somehow this Change values makes it to reducer. I feel >>> like I am missing something here. Could you please clarify this? >>> >>> Can you please point me to a place in kafka-streams sources where a >>> Change of newValue/oldValue is p

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Vasily Sulatskov
uot; topic, it contains regular values, not a pair of > >>> old/new. > >>> > >>> As far as I understand, Change is a purely in-memory representation of > >>> the state for a particular key, and at no point it's serialized back > >>> to kafk

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Matthias J. Sax
UCE-09 (stores: >>>>> [aggregated-table]), where the actual aggregation takes place. What I >>>>> don't get is where this Change value comes from, I mean if it's been >>>>> produced by KSTREAM-REDUCE-02, but it shouldn't matte

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-17 Thread Vasily Sulatskov
e is read from kafka by > >>>>> KSTREAM-SOURCE-08 (topics: [aggregated-table-repartition]), > >>>>> and finally gets to Processor: KTABLE-REDUCE-09 (stores: > >>>>> [aggregated-table]), where the actual aggregation takes place. What I

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-17 Thread Matthias J. Sax
TREAM-REDUCE-02 / "table" goes goes to >>>>>>> KTABLE-SELECT-06 which in turn forwards data to >>>>>>> KSTREAM-SINK-07 (topic: aggregated-table-repartition), and at >>>>>>> this point I assume that dat

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-18 Thread Vasily Sulatskov
>>>> Hi John, > >>>>>>> > >>>>>>> Thanks for your explanation. > >>>>>>> > >>>>>>> I have an answer to the practical question, i.e. a null aggregator > >>>>>>> value shou