Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Mahendra Kariya
Thanks! On Thu, Feb 23, 2017 at 10:49 AM, Guozhang Wang wrote: > Mahendra, > > That is right, what I meant is that at the end of each loop in the thread, > it will check against the commit interval and see if it should do so. That > means, commit will only happen after any records have been comp

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Guozhang Wang
Mahendra, That is right. What I meant is that at the end of each loop in the thread, it will check against the commit interval and see if it should commit. That means a commit will only happen after any records have been completely processed in the topology, and that also means that the actual commi
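
The end-of-loop check described in this message can be modeled roughly as follows. This is a minimal sketch, not the actual Kafka Streams source: `CommitTimer`, `maybe_commit`, `run_loop`, and the simulated timestamps are hypothetical names and values chosen for illustration; only the idea of a configured commit interval comes from the thread.

```python
class CommitTimer:
    """Toy model of a time-based commit check run at the end of each loop iteration."""

    def __init__(self, commit_interval_ms, start_ms=0):
        self.commit_interval_ms = commit_interval_ms
        self.last_commit_ms = start_ms
        self.commits = []  # simulated timestamps at which a commit was triggered

    def maybe_commit(self, now_ms):
        """Commit only if at least commit_interval_ms has elapsed since the last commit."""
        if now_ms - self.last_commit_ms >= self.commit_interval_ms:
            self.commits.append(now_ms)
            self.last_commit_ms = now_ms
            return True
        return False


def run_loop(timer, iteration_end_times_ms):
    """Simulate a processing thread; each entry is the time one loop iteration ends."""
    for now_ms in iteration_end_times_ms:
        # (hypothetical) poll and fully process records here, then check the timer
        timer.maybe_commit(now_ms)
    return timer.commits


# Illustrative 30-second interval and made-up loop timings.
timer = CommitTimer(commit_interval_ms=30_000)
commits = run_loop(timer, [10_000, 25_000, 31_000, 45_000, 62_000])
print(commits)
```

In this model the commit is checked only once an iteration has finished, so the actual gap between commits can exceed the configured interval when an iteration runs long.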

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Mahendra Kariya
Hi Guozhang, On Thu, Feb 23, 2017 at 2:48 AM, Guozhang Wang wrote: > With that even if you do > not have any data processed the commit operation will be triggered after > that configured period of time. > The above statement is confusing. As per this thread, offsets are o

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Guozhang Wang
What you said is absolutely right, and sorry I missed that part in the previous email. I think for now it is OK to tune offsets.retention.minutes. As for the long-term fix, there are some discussions on this: the retention of offsets today is not tied to whether the group has active members, and i

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Mathieu Fenniak
Thanks Guozhang, that clarifies the Streams behavior. I'm imagining that a Streams application might only commit partition offsets that have changed, and therefore a partition that is idle for greater than offsets.retention.minutes might lose its offsets when the app restarts. Does that seem plau
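
The hypothesis in this message can be illustrated with a simplified model. This is a sketch under the assumption that a committed offset expires once `offsets.retention.minutes` has elapsed since its last commit, whether or not the group is still active, as discussed elsewhere in the thread. The function `offset_expired` and all timestamps are hypothetical, and the real broker logic has more detail than this.

```python
from datetime import datetime, timedelta


def offset_expired(last_commit, now, retention_minutes):
    """Return True if the offset's last commit is older than the retention window."""
    return now - last_commit > timedelta(minutes=retention_minutes)


# Illustrative value: 1440 minutes = 24 hours.
retention_minutes = 1440
restart_time = datetime(2017, 2, 22, 9, 5)

# Partition that has been idle, so its offset was last committed over 24 hours ago.
idle_partition_commit = datetime(2017, 2, 21, 9, 0)
# Partition with recent traffic, so its offset was committed 5 minutes ago.
busy_partition_commit = datetime(2017, 2, 22, 9, 0)

idle_lost = offset_expired(idle_partition_commit, restart_time, retention_minutes)
busy_lost = offset_expired(busy_partition_commit, restart_time, retention_minutes)
print(idle_lost, busy_lost)
```

Under this assumption, only the idle partition would lose its offset across a restart, even if the application itself was offline for just a few minutes.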

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Guozhang Wang
Hi Mathieu, In Streams the consumer config "enable.auto.commit" is always forced to false, and a separate "commit.interval.ms" is set. With that even if you do not have any data processed the commit operation will be triggered after that configured period of time. Guozhang On Wed, Feb 22, 2017

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Mathieu Fenniak
Hi Eno, Thanks for the quick reply. I think that probably does match the data I'm seeing. This surprises me a bit because my streams app was only offline for a few minutes, but ended up losing its offset. My interpretation is that the source partition had been idle for 24 hours, streams doesn't

Re: Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Eno Thereska
Hi Mathieu, It could be that the offset retention period has expired. See this: http://stackoverflow.com/questions/39131465/how-does-an-offset-expire-for-an-apache-kafka-consumer-group Th

Consumer / Streams causes deletes in __consumer_offsets?

2017-02-22 Thread Mathieu Fenniak
Hey users, What causes delete tombstones (value=null) to be sent to the __consumer_offsets topic? I'm observing that a Kafka Streams application that is restarted after a crash appears to be reprocessing messages from the beginning of a topic. I've dumped the __consumer_offsets topic and found th