Yeah that makes sense, again I'm just making sure we understand all the scenarios and what to expect.
I agree that, more generally speaking, if a user has only consumed to offset 8 and then calls seek(16) to "jump" to a further position, then she needs to be aware that OORE may be thrown, and she needs to handle it or rely on the reset policy, which should not surprise her.

I'm +1 on the KIP.

Guozhang

On Wed, Jan 24, 2018 at 12:31 AM, Dong Lin <lindon...@gmail.com> wrote:

> Yes, in general we can not prevent OffsetOutOfRangeException if user seeks
> to a wrong offset. The main goal is to prevent OffsetOutOfRangeException if
> user has done things in the right way, e.g. user should know that there is
> message with this offset.
>
> For example, if user calls seek(..) right after construction, the only
> reason I can think of is that user stores offset externally. In this case,
> user currently needs to use the offset which is obtained using position(..)
> from the last run. With this KIP, user needs to get the offset and the
> offsetEpoch using positionAndOffsetEpoch(...) and stores these information
> externally. The next time user starts consumer, he/she needs to call
> seek(..., offset, offsetEpoch) right after construction. Then KIP should be
> able to ensure that we don't throw OffsetOutOfRangeException if there is no
> unclean leader election.
>
> Does this sound OK?
>
> Regards,
> Dong
>
> On Tue, Jan 23, 2018 at 11:44 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > "If consumer wants to consume message with offset 16, then consumer must
> > have already fetched message with offset 15"
> >
> > --> this may not be always true right? What if consumer just call seek(16)
> > after construction and then poll without committed offset ever stored
> > before? Admittedly it is rare but we do not programmably disallow it.
> >
> > Guozhang
> >
> > On Tue, Jan 23, 2018 at 10:42 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hey Guozhang,
> > >
> > > Thanks much for reviewing the KIP!
> > >
> > > In the scenario you described, let's assume that broker A has messages
> > > with offset up to 10, and broker B has messages with offset up to 20. If
> > > consumer wants to consume message with offset 9, it will not receive
> > > OffsetOutOfRangeException from broker A.
> > >
> > > If consumer wants to consume message with offset 16, then consumer must
> > > have already fetched message with offset 15, which can only come from
> > > broker B. Because consumer will fetch from broker B only if leaderEpoch
> > > >= 2, then the current consumer leaderEpoch can not be 1 since this KIP
> > > prevents leaderEpoch rewind. Thus we will not have
> > > OffsetOutOfRangeException in this case.
> > >
> > > Does this address your question, or maybe there is more advanced
> > > scenario that the KIP does not handle?
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Tue, Jan 23, 2018 at 9:43 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > > Thanks Dong, I made a pass over the wiki and it lgtm.
> > > >
> > > > Just a quick question: can we completely eliminate the
> > > > OffsetOutOfRangeException with this approach? Say if there is
> > > > consecutive leader changes such that the cached metadata's partition
> > > > epoch is 1, and the metadata fetch response returns with partition
> > > > epoch 2 pointing to leader broker A, while the actual up-to-date
> > > > metadata has partition epoch 3 whose leader is now broker B, the
> > > > metadata refresh will still succeed and the follow-up fetch request
> > > > may still see OORE?
> > > >
> > > > Guozhang
> > > >
> > > > On Tue, Jan 23, 2018 at 3:47 PM, Dong Lin <lindon...@gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I would like to start the voting process for KIP-232:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-232%3A+Detect+outdated+metadata+using+leaderEpoch+and+partitionEpoch
> > > > >
> > > > > The KIP will help fix a concurrency issue in Kafka which currently
> > > > > can cause message loss or message duplication in consumer.
> > > > >
> > > > > Regards,
> > > > > Dong
> > > >
> > > > --
> > > > -- Guozhang
> >
> > --
> > -- Guozhang

--
-- Guozhang
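
The "no leaderEpoch rewind" argument in the thread above can be sketched roughly as follows. This is only an illustrative model, not Kafka client code: the class `LeaderEpochGuard`, its methods, and the partition name used in the usage note are hypothetical names invented here. The only rule it models is the one Dong describes, namely that a consumer which has already seen leaderEpoch 2 does not go back to metadata carrying leaderEpoch 1.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not part of the Kafka client API): remembers the
 * highest leader epoch seen per partition and refuses metadata that would
 * rewind it.
 */
final class LeaderEpochGuard {
    // Highest leader epoch accepted so far, keyed by a partition name.
    private final Map<String, Integer> highestEpoch = new HashMap<>();

    /**
     * Offers metadata with the given leader epoch for a partition.
     * Returns true if it is accepted, false if it is older than what
     * has already been seen and is therefore ignored.
     */
    boolean maybeUpdate(String partition, int leaderEpoch) {
        Integer seen = highestEpoch.get(partition);
        if (seen != null && leaderEpoch < seen) {
            return false; // stale metadata: accepting it would rewind the epoch
        }
        highestEpoch.put(partition, leaderEpoch);
        return true;
    }

    /** Returns the highest accepted epoch, or -1 if none has been seen. */
    int currentEpoch(String partition) {
        return highestEpoch.getOrDefault(partition, -1);
    }
}
```

For example, after `maybeUpdate("topic-0", 1)` and `maybeUpdate("topic-0", 2)` are both accepted, a later `maybeUpdate("topic-0", 1)` is rejected and `currentEpoch("topic-0")` stays at 2. Note that the guard alone does not tell the consumer that a newer epoch 3 exists; as discussed in the thread, a user who seeks to an arbitrary offset still has to handle OffsetOutOfRangeException or rely on the reset policy.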