Re: KTable aggregations send intermediate results downstream?

2016-08-26 Thread Guozhang Wang
The comment from Kasier Chen on the patch is correct: 1. Objects.equals() depends on the typed class object.equal() function, which may not be safe: for example users can override the equals function to only compare a subset of all the fields, whereas for Kafka Streams we need these two objects

Re: KTable aggregations send intermediate results downstream?

2016-08-24 Thread Mathieu Fenniak
Hi Guozhang, I've been working around this issue by dropping down to the Processor API, but, I was hoping you might be able to point out if there is a flaw is in this proposed change: https://github.com/apache/kafka/compare/trunk...mfenniak:suppress-duplicate-repartition-output This

Re: KTable aggregations send intermediate results downstream?

2016-08-19 Thread Guozhang Wang
Hi Mathieu, If you are only interested in the aggregate result "snapshot" but not its change stream (note that KTable itself is not actually a "table" as in RDBMS, but still a stream), you can try to use the queryable state feature that is available in trunk, which will be available in 0.10.1.0

Re: KTable aggregations send intermediate results downstream?

2016-08-17 Thread Guozhang Wang
The problem is that Kafka Streams need to repartition the streams based on the groupBy keys when doing aggregations. For your case, the original stream may be read from a topic that is partitioned on "K", and you need to first repartition on "category" on an intermediate topic before the

KTable aggregations send intermediate results downstream?

2016-08-17 Thread Mathieu Fenniak
Hello again, kafka-users, When I aggregate a KTable, a future input that updates a KTable's value for a specific key causes the aggregate's subtractor to be invoked, and then its adder. This part is great, completely as-expected. But what I didn't expect is that the intermediate result of the