I also prefer option 1.

Ismael

On Fri, Dec 16, 2016 at 7:11 PM, Jason Gustafson <ja...@confluent.io> wrote:

> Thanks Vahid. To clarify the impact of this issue, since we have no way to
> send an error code in the OffsetFetchResponse when requesting all offsets,
> we cannot detect when the coordinator has moved to another broker or when
> it is still in the process of loading the offsets. This means we cannot
> tell if there was an error or if there were just no offsets stored for
> the group. We've considered a few options:
>
> 1. Include an error code at the top level of the response. This seems like
> the cleanest approach. The downside is that clients need to look for errors
> in two locations in the response. One small benefit is that many
> OffsetFetch errors are group-level, so in that case we can avoid returning
> an entry for each of the requested partitions.
> 2. Sort of hacky, but we could insert a "dummy" partition into the response
> so that we have somewhere to return an error code.
> 3. Include no error code, but use a null array in the response to indicate
> that there was some error. If there was no error, and the group simply had
> no partitions, then we return an empty array. I guess in this case, if the
> client receives a null array in the response, it should assume the worst
> and rediscover the coordinator and try again.
>
> My preference is the first one. Not sure if there are any other ideas?
>
> -Jason
>
> On Thu, Dec 15, 2016 at 3:02 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi all,
> >
> > Even though KIP-88 was recently approved, due to a limitation that comes
> > with the proposed protocol change in KIP-88 I'll have to re-open it to
> > address the problem.
> > I'd like to thank Jason Gustafson for catching this issue.
> >
> > I'll explain this in the KIP as well, but to summarize, KIP-88 suggests
> > adding the option of passing a "null" array in the OffsetFetch request to
> > query all existing offsets for a consumer group. It does not suggest any
> > modification to the OffsetFetch response.
> >
> > In the existing protocol, group or coordinator related errors are reported
> > along with each partition in the OffsetFetch response.
> >
> > If there are partitions in the request, they are guaranteed to appear in
> > the response (there could be an error code associated with each). So if
> > there is an error, it is reported back by being attached to some partition
> > in the request.
> > If an empty array is passed, no error is reported (no matter what the
> > group or coordinator status is). The response comes back with an empty
> > list.
> >
> > With the proposed change in KIP-88 we could have a scenario in which a
> > null array is sent in the OffsetFetch request, and due to some error (for
> > example, if the coordinator just started and hasn't yet caught up with the
> > offset topic), an empty list is returned in the OffsetFetch response (the
> > group may or may not actually be empty). The issue is that in situations like
> > this no error can be returned in the response because there is no
> > partition to attach the error to.
> >
> > I'll update the KIP with more details and propose to add to OffsetFetch
> > response schema an "error_code" at the top level that can be used to
> > report group related errors (instead of reporting those errors with each
> > individual partition).
> >
> > I apologize if this causes any inconvenience.
> >
> > Feedback and comments are always welcome.
> >
> > Thanks.
> > --Vahid
> >
> >
>
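
For illustration, here is roughly how option 1 (a top-level error code) could look from the client side when all offsets are requested. This is only a minimal Python sketch, not the Kafka client API: the names GroupError, OffsetFetchResult and handle_fetch_all, and the returned action strings, are all hypothetical. The point is that an empty offset map becomes unambiguous once group-level errors have their own field.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Tuple


class GroupError(Enum):
    """Hypothetical group-level error conditions (not real Kafka codes)."""
    NONE = auto()
    COORDINATOR_MOVED = auto()   # coordinator is now on another broker
    OFFSETS_LOADING = auto()     # coordinator is still loading offsets


TopicPartition = Tuple[str, int]


@dataclass
class OffsetFetchResult:
    """Hypothetical response shape with a top-level error (option 1)."""
    error: GroupError = GroupError.NONE
    offsets: Dict[TopicPartition, int] = field(default_factory=dict)


def handle_fetch_all(response: OffsetFetchResult) -> str:
    """Decide what the client should do after requesting all offsets."""
    # Check the group-level error first; it applies to the whole response.
    if response.error is GroupError.COORDINATOR_MOVED:
        return "rediscover-coordinator"
    if response.error is GroupError.OFFSETS_LOADING:
        return "retry-later"
    # With no error, an empty map really does mean "no offsets stored".
    if not response.offsets:
        return "no-offsets"
    return "use-offsets"
```

Under option 3, by contrast, the client could only tell "null array" from "empty array" and would have to assume the worst on null, without knowing which of the two error cases it hit.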
