It controls the minimum frequency at which the offsets are committed. The StreamThread runs in a loop that looks something like this:
while(true) records = consumer.poll(..) for each record task = findTask(record) task.process(..) end maybeCommit() end This is grossly simplified, but each time through the loop the consumer may receive some records. Those records may take milliseconds, seconds, minutes, to process. At the end of the loop maybeCommit is called - it will only commit the offsets if the value of COMMIT_INTERVAL_MS_CONFIG has passed. On Fri, 3 Feb 2017 at 10:56 Mahendra Kariya <mahendra.kar...@go-jek.com> wrote: > Thanks Damian for this info. > > On Fri, Feb 3, 2017 at 3:29 PM, Damian Guy <damian....@gmail.com> wrote: > > > The commit is done on the same thread as the processing, so only offsets > > that have been fully processed by the topology will be committed. > > > > > I am still not clear about why do we need the COMMIT_INTERVAL_MS_CONFIG > config > for streams. If the offsets are only going to be committed after the > topology processing is over, what does this config exactly do? >