Hi,

My team and I are looking into a problem where the Java high-level consumer
delivers duplicate messages when we turn auto commit off (using version
0.8.2.1 of the server and the Java client).  The expected sequence of
events is (a simplified code sketch follows the list):

1. Start high-level consumer and initialize a KafkaStream to get a
ConsumerIterator
2. Consume n items (could be 10,000, could be 1,000,000) from the iterator
3. Commit the new offsets
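
For reference, here's a simplified sketch of what our consumer code does
(topic, group, and host names are placeholders; error handling omitted):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    Properties props = new Properties();
    props.put("zookeeper.connect", "zk1:2181");  // placeholder host
    props.put("group.id", "our-group");          // placeholder group
    props.put("auto.commit.enable", "false");    // the setting in question

    ConsumerConnector connector =
        kafka.consumer.Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

    // Step 1: one stream for one topic ("our-topic" is a placeholder)
    Map<String, List<KafkaStream<byte[], byte[]>>> streams =
        connector.createMessageStreams(Collections.singletonMap("our-topic", 1));
    ConsumerIterator<byte[], byte[]> it = streams.get("our-topic").get(0).iterator();

    // Step 2: consume n items
    int n = 10000;
    for (int i = 0; i < n && it.hasNext(); i++) {
        MessageAndMetadata<byte[], byte[]> m = it.next();
        // ... handle m.message() ...
    }

    // Step 3: commit the new offsets
    connector.commitOffsets();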

What we are seeing is that during step 2, some of the n messages are
returned by the iterator more than once (in some cases we've consumed
roughly n*5 messages).  The problem goes away if we turn auto commit on
(switching offset storage to Kafka also helped; settings below), but auto
commit conflicts with our offset rollback logic.  The issue shows up more
often in our test environment, which runs on a lower-cost cloud provider.
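
For completeness, these are the consumer properties we've been toggling,
in consumer.properties terms (names as I understand them from the 0.8.2
consumer config docs):

    # duplicates appear when auto commit is off:
    auto.commit.enable=false
    # moving offset commits from ZooKeeper to Kafka seemed to help:
    offsets.storage=kafka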

Diving into the Java and Scala classes, including ConsumerIterator, it's
not obvious what event causes a duplicate offset to be requested or
returned (the class even has a loop that is supposed to skip
already-consumed messages).  I tried turning on trace logging, but my
log4j config isn't producing any Kafka client log output.
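
This is roughly the log4j.properties I've been trying.  My understanding
is that the 0.8.2 Scala consumer logs through log4j 1.x directly, so I'd
expect something like this on the classpath to surface the trace output,
but it isn't:

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
    # everything under the kafka package (ConsumerIterator included) at TRACE
    log4j.logger.kafka=TRACE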

Does anyone have suggestions on where to look, or on how to get this
logging working?

Thanks,
Cliff
