Re: Duplicates consumed on rebalance. No compression, autocommit enabled.
I'd suggest using the new consumer instead of the old one. We've refined the implementation so that, even with auto-commit enabled, you should get at-least-once processing in the worst case (and exactly-once when there are no failures). The 0.10.0.0 release should get all of these semantics right.

-Ewen

On Mon, Jul 11, 2016 at 7:05 AM, Gerard Klijs wrote:

> You could set auto.commit.interval.ms to a lower value; in your example
> it is 10 seconds, which can be a lot of messages. I don't really see how
> it could be prevented any further, since offsets can only be committed by
> a consumer for the partitions it is assigned. I do believe there is some
> work in progress to make the assignment of partitions to consumers
> somewhat sticky. In that case, when a consumer is assigned the same
> partitions after the rebalance as it had before, it should not be
> necessary to consume the same data again in those partitions.
>
> On Mon, Jul 11, 2016 at 3:18 PM Michael Luban wrote:
>
> > Using the 0.8.2.1 client.
> >
> > Is it possible to statistically minimize the possibility of duplication
> > in this scenario or has this behavior been corrected in a later client
> > version? Or is the test flawed?
> >
> > https://gist.github.com/mluban/03a5c0d9221182e6ddbc37189c4d3eb0

--
Thanks,
Ewen
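For reference, here is a rough sketch of what the configuration for the new consumer might look like with auto-commit left on. The property keys are the standard new-consumer settings; the broker address, group name, interval value, and the class and method names are placeholders I made up for illustration, not anything taken from the gist.

```java
import java.util.Properties;

class NewConsumerConfigSketch {
    // Builds settings for the new (0.9+/0.10) Java consumer.
    // All values are illustrative assumptions.
    static Properties newConsumerProps(long commitIntervalMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker address
        props.setProperty("group.id", "example-group");           // hypothetical group name
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", Long.toString(commitIntervalMs));
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // With kafka-clients on the classpath, these properties would be
        // passed to the KafkaConsumer constructor.
        Properties props = newConsumerProps(1_000);
        props.list(System.out);
    }
}
```

Note that the new consumer talks to the brokers directly (bootstrap.servers) rather than to ZooKeeper, and the auto-commit switch is named enable.auto.commit rather than auto.commit.enable.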
Re: Duplicates consumed on rebalance. No compression, autocommit enabled.
You could set auto.commit.interval.ms to a lower value; in your example it is 10 seconds, which can be a lot of messages. I don't really see how it could be prevented any further, since offsets can only be committed by a consumer for the partitions it is assigned. I do believe there is some work in progress to make the assignment of partitions to consumers somewhat sticky. In that case, when a consumer is assigned the same partitions after the rebalance as it had before, it should not be necessary to consume the same data again in those partitions.

On Mon, Jul 11, 2016 at 3:18 PM Michael Luban wrote:

> Using the 0.8.2.1 client.
>
> Is it possible to statistically minimize the possibility of duplication
> in this scenario or has this behavior been corrected in a later client
> version? Or is the test flawed?
>
> https://gist.github.com/mluban/03a5c0d9221182e6ddbc37189c4d3eb0
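To put a rough number on "a lot of messages", here is a small sketch. It builds old-consumer (0.8.x) properties with a shorter commit interval and estimates how many messages could be re-read after a rebalance. It assumes duplicates come only from progress made since the last commit. The message rate, ZooKeeper address, group name, and the class and method names are all made up for illustration.

```java
import java.util.Properties;

class CommitIntervalSketch {
    // Settings for the old (0.8.x) high-level consumer; values are illustrative.
    static Properties oldConsumerProps(long commitIntervalMs) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", "localhost:2181"); // hypothetical address
        props.setProperty("group.id", "example-group");           // hypothetical group name
        props.setProperty("auto.commit.enable", "true");
        props.setProperty("auto.commit.interval.ms", Long.toString(commitIntervalMs));
        return props;
    }

    // Rough estimate of messages consumed but not yet committed when a
    // rebalance hits just before the next auto-commit: rate * interval.
    static long worstCaseDuplicates(long messagesPerSecond, long commitIntervalMs) {
        return messagesPerSecond * commitIntervalMs / 1000;
    }

    public static void main(String[] args) {
        long rate = 5_000; // hypothetical messages per second
        System.out.println("10 s interval: " + worstCaseDuplicates(rate, 10_000));
        System.out.println(" 1 s interval: " + worstCaseDuplicates(rate, 1_000));
        oldConsumerProps(1_000).list(System.out);
    }
}
```

Under these assumptions, cutting the interval from 10 seconds to 1 second shrinks the window of possible duplicates by the same factor, at the cost of more frequent offset commits. It narrows the window but does not close it.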
Duplicates consumed on rebalance. No compression, autocommit enabled.
Using the 0.8.2.1 client.

Is it possible to statistically minimize the possibility of duplication in this scenario or has this behavior been corrected in a later client version? Or is the test flawed?

https://gist.github.com/mluban/03a5c0d9221182e6ddbc37189c4d3eb0