Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2018-01-16 Thread Dmitry Minkovsky
> Thus, only left/outer KStream-KStream and KStream-KTable joins have some runtime dependencies. For more details about joins, check out this blog post: https://www.confluent.io/blog/crossing-streams-joins-apache-kafka/
So I am trying to reprocess a topology and seem to have encountered this. I

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-10 Thread Dmitry Minkovsky
Matthias, thank you for your detailed response. Yes, of course I can use the record timestamp when copying from topic to topic. For some reason that always slips my mind.
> This will always be computed correctly, even if both records are not in the buffer at the same time :)
This is music to my

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-09 Thread Matthias J. Sax
About timestamps: embedding timestamps in the payload itself is not really necessary IMHO. Each record has a metadata timestamp that provides exactly the same semantics. If you just copy data from one topic to another, the timestamp can be preserved (using plain consumer/producer and setting the
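The idea of carrying the record timestamp across a topic-to-topic copy can be sketched without the Kafka client at all. The `Rec` class and `copy` method below are hypothetical stand-ins, not Kafka API; with the real client the corresponding step would presumably be to read `ConsumerRecord.timestamp()` and pass it to a `ProducerRecord` constructor that accepts a timestamp, instead of letting the producer assign a new one.

```java
import java.util.ArrayList;
import java.util.List;

class TimestampPreservingCopy {
    // Hypothetical stand-in for a Kafka record: key, value, and metadata timestamp.
    static final class Rec {
        final String key;
        final String value;
        final long timestampMs;

        Rec(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // Copies every record and carries the source timestamp forward unchanged,
    // rather than stamping the copy with the current wall-clock time.
    static List<Rec> copy(List<Rec> source) {
        List<Rec> target = new ArrayList<>();
        for (Rec in : source) {
            target.add(new Rec(in.key, in.value, in.timestampMs));
        }
        return target;
    }

    public static void main(String[] args) {
        List<Rec> source = List.of(
                new Rec("a", "1", 1512770000000L),
                new Rec("b", "2", 1512770005000L));
        for (Rec out : copy(source)) {
            System.out.println(out.key + " @ " + out.timestampMs);
        }
    }
}
```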

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-09 Thread Dmitry Minkovsky
> How large is the record buffer? Is it configurable?
I seem to have just discovered the answer to this: buffered.records.per.partition
On Sat, Dec 9, 2017 at 2:48 PM, Dmitry Minkovsky wrote:
> Hi Matthias, yes that definitely helps. A few thoughts inline below.
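For reference, a minimal sketch of how the `buffered.records.per.partition` setting could be placed into the `Properties` object that a Streams application is configured with. Only the property name comes from this thread; the class name, the helper method, and the value 500 are illustrative assumptions, not recommendations.

```java
import java.util.Properties;

class StreamsBufferConfig {
    // Builds a Properties object carrying the buffer setting named in the thread.
    // The rest of a real Streams configuration (application id, bootstrap
    // servers, serdes) is deliberately left out of this fragment.
    static Properties withBufferedRecords(int recordsPerPartition) {
        Properties props = new Properties();
        props.setProperty("buffered.records.per.partition",
                Integer.toString(recordsPerPartition));
        return props;
    }

    public static void main(String[] args) {
        // 500 is an arbitrary example value.
        Properties props = withBufferedRecords(500);
        System.out.println(props.getProperty("buffered.records.per.partition"));
    }
}
```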

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-09 Thread Dmitry Minkovsky
Hi Matthias, yes that definitely helps. A few thoughts inline below. Thank you!
On Fri, Dec 8, 2017 at 4:21 PM, Matthias J. Sax wrote:
> Hard to give a generic answer.
>
> 1. We recommend over-partitioning your input topics to start with (to avoid that you need to add

Re: How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-08 Thread Matthias J. Sax
Hard to give a generic answer.
1. We recommend over-partitioning your input topics to start with (so that you do not need to add new partitions later on); problem avoidance is the best strategy. There will obviously be some overhead for this on the broker side, but it's not too big.
2. Not sure
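One reason adding partitions later is awkward for keyed data is that a hash-based partitioner maps a key to `hash(key) mod partitionCount`, so changing the count moves many keys to a different partition. The sketch below illustrates this with `String.hashCode` as a stand-in hash. That is an assumption for illustration only: Kafka's default partitioner hashes the serialized key with murmur2, so the actual partition numbers would differ. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionMappingDemo {
    // Stand-in for a key-hash partitioner: non-negative hash modulo partition count.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Counts the keys whose partition changes when the partition count changes.
    static long countMoved(List<String> keys, int before, int after) {
        return keys.stream()
                .filter(k -> partitionFor(k, before) != partitionFor(k, after))
                .count();
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            keys.add("user-" + i);
        }
        // Growing a topic from 4 to 6 partitions relocates a share of the keys.
        System.out.println(countMoved(keys, 4, 6) + " of " + keys.size() + " keys moved");
    }
}
```

Records written before the change stay in their old partition, so after the change the same key can be spread over two partitions, which is what over-partitioning up front avoids.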

How can I repartition/rebalance topics processed by a Kafka Streams topology?

2017-12-08 Thread Dmitry Minkovsky
I am about to put a topology into production and I am concerned that I don't know how to repartition/rebalance the topics in the event that I need to add more partitions. My inclination is that I should spin up a new cluster and run some kind of consumer/producer combination that takes data from