Hi Carl,
"** Disclaimer: I know there's a new consumer API on the way, this mail is

about the currently available API. I also apologise if the below has
already been discussed previously. I did try to check previous discussions
It seems to me that the high-level consumer would be able to support
at-least-once messaging, even if one uses auto-commit, by changing
kafka.consumer.ConsumerIterator.next() to call
currentTopicInfo.resetConsumeOffset(..) _before_ super.next(). This way, a
consumer thread for a KafkaStream could just loop:
while (true) {
    MyMessage message = iterator.next().message();
    process(message);
}
Each call to "iterator.next()" then updates the offset to commit to the end
of the message that was just processed. When offsets are committed for the
ConsumerConnector (either automatically or manually), the commit will not
include offsets of messages that haven't been fully processed.
I've tested the following ConsumerIterator.next(), and it seems to work as
I expect"

What you have proposed seems very reasonable. Essentially we need two sets of
offsets: one that is published to ZooKeeper, and one that is used locally on
the machine to track which message to return next from the iterator (a
prev/current setup).
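
To make that prev/current idea concrete, here is a toy sketch of the
bookkeeping I have in mind (hypothetical names, not the actual
ConsumerIterator internals):

// Toy sketch only: the offset published to ZooKeeper always trails the
// message currently being handed out, so a crash before process() finishes
// cannot cause that message to be skipped on restart.
class OffsetPair {
    private long currentOffset = 0L; // end offset of the message being returned now
    private long commitOffset  = 0L; // offset that is safe to publish to ZooKeeper

    // Mirrors the proposed ordering in next(): first mark the previously
    // returned (and therefore already processed) message as committable,
    // then record the offset of the message about to be returned.
    void onNext(long nextMessageEndOffset) {
        commitOffset  = currentOffset;
        currentOffset = nextMessageEndOffset;
    }

    long offsetToCommit() { return commitOffset; }
}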

In my opinion this is how it should have been implemented in the first place,
rather than forcing everyone to write additional code just to get the
at-least-once guarantee.
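
For reference, the additional code I mean is the usual pattern of disabling
auto-commit and committing only after each message has been processed. A
rough sketch against the 0.8.x high-level consumer (the ZooKeeper address,
topic, group, and process() handler are placeholders):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        props.put("auto.commit.enable", "false");         // commit manually instead

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
            connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
        ConsumerIterator<byte[], byte[]> iterator =
            streams.get("my-topic").get(0).iterator();

        while (iterator.hasNext()) {
            MessageAndMetadata<byte[], byte[]> record = iterator.next();
            process(record.message());   // placeholder application handler
            connector.commitOffsets();   // commit only after successful processing
        }
    }

    private static void process(byte[] message) {
        // application-specific processing goes here
    }
}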

If you push for this in open source, you will have my vote at the very least :)

-Abhishek
