Guozhang, Thank make sense except for the following:
- the ZookeeperConsumerConnector.commitOffsets() method commits the current value of PartitionTopicInfo.consumeOffset for all of the active streams. - the ConsumerIterator in the streams advances the value of PartitionTopicInfo.consumeOffset *before* next() returns, not after the processing on that message is complete. If you have multiple threads consuming, thread A calling commitOffsets() may commit thread B's retrieved but unprocessed message, no? On Wed, Jan 29, 2014 at 12:20 PM, Guozhang Wang <[email protected]> wrote: > Hi Clark, > > In practice, the client app code need to always commit offset after it has > processed the messages, and hence only the second case may happen, leading > to "at least once". > > Guozhang > > > On Wed, Jan 29, 2014 at 11:51 AM, Clark Breyman <[email protected]> wrote: > > > Wrestling through the at-least/most-once semantics of my application and > I > > was hoping for some confirmation of the semantics. I'm not sure I can > > classify the high level consumer as either type. > > > > False ack scenario: > > - Thread A: call next() on the ConsumerIterator, advancing the > > PartitionTopicInfo offset > > - Thread B: commitOffsets() flushed offset of incomplete message to ZK > > - Thread A: fail processing (e.g. kill -9) > > > > False retry scenario: > > - Thread A: call next() & successfully process, kill -9 before > > commitOffsets either in thread or in parallel. > > > > Is this right or am I missing something (likely)? Seems like the > semantics > > are essentially approximately once. > > > > > > -- > -- Guozhang >
