Yes, in general we cannot prevent OffsetOutOfRangeException if the user seeks
to a wrong offset. The main goal is to prevent OffsetOutOfRangeException when
the user has done things the right way, e.g. the user knows that a message
with this offset exists.

For example, if the user calls seek(..) right after construction, the only
reason I can think of is that the user stores the offset externally. In this
case, the user currently needs to use the offset obtained via position(..)
from the last run. With this KIP, the user needs to get the offset and the
offsetEpoch using positionAndOffsetEpoch(...) and store this information
externally. The next time the user starts the consumer, they need to call
seek(..., offset, offsetEpoch) right after construction. Then the KIP should
be able to ensure that we don't throw OffsetOutOfRangeException if there is no
unclean leader election.
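
To make this concrete, here is a rough sketch of the pattern. Note that the
method positionAndOffsetEpoch(...), the OffsetAndEpoch type, the
seek(..., offset, offsetEpoch) overload and the externalStore helper are
assumptions for illustration based on this discussion, not the final API:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekWithEpochSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    TopicPartition tp = new TopicPartition("my-topic", 0);
    consumer.assign(Collections.singletonList(tp));

    // End of the previous run (proposed API, names assumed): persist the
    // offset together with its offsetEpoch instead of the offset alone.
    // OffsetAndEpoch p = consumer.positionAndOffsetEpoch(tp);
    // externalStore.save(tp, p.offset(), p.offsetEpoch());

    // Start of the next run (proposed API, names assumed): restore both
    // values right after construction, before the first poll().
    // consumer.seek(tp, externalStore.offset(tp), externalStore.offsetEpoch(tp));

    consumer.close();
  }
}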

Does this sound OK?

Regards,
Dong


On Tue, Jan 23, 2018 at 11:44 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> "If consumer wants to consume message with offset 16, then consumer must
> have
> already fetched message with offset 15"
>
> --> this may not always be true, right? What if the consumer just calls
> seek(16) after construction and then polls without any committed offset ever
> stored before? Admittedly it is rare, but we do not programmatically disallow
> it.
>
>
> Guozhang
>
> On Tue, Jan 23, 2018 at 10:42 PM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hey Guozhang,
> >
> > Thanks much for reviewing the KIP!
> >
> > In the scenario you described, let's assume that broker A has messages
> > with offset up to 10, and broker B has messages with offset up to 20. If
> > the consumer wants to consume the message with offset 9, it will not
> > receive OffsetOutOfRangeException from broker A.
> >
> > If the consumer wants to consume the message with offset 16, then the
> > consumer must have already fetched the message with offset 15, which can
> > only come from broker B. Because the consumer will fetch from broker B
> > only if leaderEpoch >= 2, the current consumer leaderEpoch cannot be 1,
> > since this KIP prevents leaderEpoch rewind. Thus we will not have
> > OffsetOutOfRangeException in this case.
> >
> > Does this address your question, or maybe there is more advanced scenario
> > that the KIP does not handle?
> >
> > Thanks,
> > Dong
> >
> > On Tue, Jan 23, 2018 at 9:43 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Thanks Dong, I made a pass over the wiki and it lgtm.
> > >
> > > Just a quick question: can we completely eliminate the
> > > OffsetOutOfRangeException with this approach? Say there are consecutive
> > > leader changes such that the cached metadata's partition epoch is 1, and
> > > the metadata fetch response returns with partition epoch 2 pointing to
> > > leader broker A, while the actual up-to-date metadata has partition
> > > epoch 3 whose leader is now broker B. Won't the metadata refresh still
> > > succeed, and might the follow-up fetch request still see OORE?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, Jan 23, 2018 at 3:47 PM, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start the voting process for KIP-232:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-232%3A+Detect+outdated+metadata+using+leaderEpoch+and+partitionEpoch
> > > >
> > > > The KIP will help fix a concurrency issue in Kafka which can
> > > > currently cause message loss or message duplication in the consumer.
> > > >
> > > > Regards,
> > > > Dong
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
