Re: Kafka Streams change log behavior

2016-07-07 Thread Guozhang Wang
Hi Manu, I just realized you have a question in the previous email that I missed. Sorry! The local log offset is actually only used for shortening the recovery time upon resuming: it acts as an indicator that the task was cleanly shut down previously, and hence the state stores are in a consistent state ...
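A minimal sketch of the idea, using a plain consumer rather than Kafka Streams internals: after a clean shutdown, the checkpointed offset lets a restore loop replay only the tail of the changelog instead of the whole topic. The topic name, partition, and checkpointed value below are assumptions for illustration.

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class ChangelogRestoreSketch {
        public static void main(String[] args) {
            // Stand-in for a local state store that is rebuilt from the changelog.
            Map<String, String> store = new HashMap<>();

            TopicPartition changelog = new TopicPartition("my-app-my-store-changelog", 0);
            long checkpointed = 0L;  // would be read from the locally persisted checkpoint

            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> restoreConsumer = new KafkaConsumer<>(props)) {
                restoreConsumer.assign(Collections.singletonList(changelog));
                long end = restoreConsumer.endOffsets(Collections.singletonList(changelog)).get(changelog);
                // Replay only the records written after the checkpointed offset.
                restoreConsumer.seek(changelog, checkpointed);
                while (restoreConsumer.position(changelog) < end) {
                    for (ConsumerRecord<String, String> rec : restoreConsumer.poll(100)) {
                        store.put(rec.key(), rec.value());  // latest value per key wins
                    }
                }
            }
        }
    }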

Re: Kafka Streams change log behavior

2016-05-23 Thread Manu Zhang
Well, that's something our user is asking for, so I'd like to learn how Kafka Streams does it. FYI, Gearpump checkpoints the mapping of the event-time clock to the Kafka offset for replay. We are thinking about allowing users to configure a topic to recover from after an application upgrade. Meanwhile, w ...
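This is not how Gearpump does it internally, but as an illustration of mapping an event-time value back to a Kafka offset for replay, later consumer versions expose KafkaConsumer#offsetsForTimes. The topic, partition, and timestamp below are made up:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
    import org.apache.kafka.common.TopicPartition;

    public class ReplayFromTimestamp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            long replayFrom = 1463961600000L;  // event-time value checkpointed earlier
            TopicPartition tp = new TopicPartition("events", 0);

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(tp));
                // Resolve the checkpointed event time to the earliest offset with
                // a timestamp >= replayFrom, then seek there for replay.
                Map<TopicPartition, OffsetAndTimestamp> result =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, replayFrom));
                OffsetAndTimestamp ot = result.get(tp);
                if (ot != null) {
                    consumer.seek(tp, ot.offset());
                }
            }
        }
    }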

Re: Kafka Streams change log behavior

2016-05-23 Thread Guozhang Wang
Hi Manu, That is a good question. Currently, if users change their application logic while upgrading, the intermediate state stores may no longer be valid, and hence neither is the change log. In this case we need to wipe out the invalid internal data and restart from scratch. We are working on a better re ...
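A hedged sketch of the "restart from scratch" step with the 0.10-era API: KafkaStreams#cleanUp() (added in a later release) deletes the application's local state directory so stores are rebuilt, while internal topics on the brokers are handled separately (e.g. by the application reset tool). The application id and topology here are placeholders.

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class ResetAndRestart {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app-v2");      // new logic, new state
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KStreamBuilder builder = new KStreamBuilder();
            builder.stream("input-topic").to("output-topic");                 // placeholder topology

            KafkaStreams streams = new KafkaStreams(builder, props);
            streams.cleanUp();   // delete local state stores so they are rebuilt from scratch
            streams.start();
        }
    }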

Re: Kafka Streams change log behavior

2016-05-23 Thread Manu Zhang
That's one case. Well, if I get it right, the change log's topic name is bound to the applicationId, so what I want to ask is how the change log works for scenarios like an application upgrade (with a new applicationId, I guess). Does the system handle that for the user? On Mon, May 23, 2016 at 9:21 PM Matthias J ...

Re: Kafka Streams change log behavior

2016-05-23 Thread Matthias J. Sax
If I understand correctly, you want to have a non-Kafka-Streams consumer read a change-log topic that was written by a Kafka Streams application? That is certainly possible. Kafka is agnostic to Kafka Streams, i.e., all topics are regular topics and can be read by any consumer. -Matthias On 05/2 ...
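For example, a plain KafkaConsumer can read a changelog topic directly; changelog topics follow the pattern <application.id>-<storeName>-changelog. The concrete topic name and serdes below are assumptions:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ReadChangelog {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "changelog-reader");
            props.put("auto.offset.reset", "earliest");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Assumed changelog topic written by a Streams app with
                // application.id "my-app" and a store named "counts".
                consumer.subscribe(Collections.singletonList("my-app-counts-changelog"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(500);
                    for (ConsumerRecord<String, String> rec : records) {
                        System.out.println(rec.key() + " -> " + rec.value());
                    }
                }
            }
        }
    }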

Re: Kafka Streams change log behavior

2016-05-23 Thread Manu Zhang
Thanks Matthias. Is there a way to allow users to read change logs from a previous application? On Mon, May 23, 2016 at 3:57 PM Matthias J. Sax wrote: > Hi Manu, > > Yes. If a StreamTask recovers, it will write to the same change log's > topic partition. Log compaction is enabled by default for ...

Re: Kafka Streams change log behavior

2016-05-23 Thread Matthias J. Sax
Hi Manu, Yes. If a StreamTask recovers, it will write to the same change log's topic partition. Log compaction is enabled by default for those topics. You might still see some duplicates in your output. Currently, Kafka Streams guarantees at-least-once processing (exactly-once processing is on the ...
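Because processing is at-least-once, a reader of the changelog may see duplicate updates for a key; keeping only the latest value per key (which is also what log compaction converges to) makes a re-read idempotent. A small sketch with an assumed topic name:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class MaterializeLatest {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            Map<String, String> latest = new HashMap<>();
            TopicPartition tp = new TopicPartition("my-app-counts-changelog", 0);

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(tp));
                consumer.seekToBeginning(Collections.singletonList(tp));
                long end = consumer.endOffsets(Collections.singletonList(tp)).get(tp);
                while (consumer.position(tp) < end) {
                    for (ConsumerRecord<String, String> rec : consumer.poll(500)) {
                        if (rec.value() == null) {
                            latest.remove(rec.key());          // tombstone deletes the key
                        } else {
                            latest.put(rec.key(), rec.value()); // duplicates simply overwrite
                        }
                    }
                }
            }
            System.out.println("materialized " + latest.size() + " keys");
        }
    }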

Kafka Streams change log behavior

2016-05-22 Thread Manu Zhang
Hi All, I'm new to Kafka Streams and have a question on the change log. If a StreamTask fails and is restarted, will the change log be written to the old change log's topic partition? Is it possible for some change log topic partitions to have duplicate records, so that log compaction is required? T ...