I think one of the hard parts is that while we agree that a call to
ApiVersions is required, I don't think there is agreement on how to
use the response.

In kafka-python, for example, to support backwards compatibility we
set a single "api" configuration value during the client constructor.
This value is either set by the user explicitly via configuration or,
if unset by the user, inferred from the first bootstrap server that
responds. In either case we are choosing 1 "api" to apply to all
broker connections. We then use this configuration throughout the
client code to choose different feature paths (Zookeeper v. Kafka
offsets, for example; or whether to use group coordination apis v.
assign all partitions locally on 'subscribe'; etc).

We do not short-circuit requests that appear unsupported by a
particular broker based on this api configuration. We send all
requests normally. If the broker doesn't understand a request, we
expect it to fail normally. Of course, "normally" here means no
response and an abruptly closed socket, which I think could be
improved.

So KAFKA-3600. At its core, this revision implements a check on each
connection guaranteeing support for a configured api vector. If any
connection does not support that vector, the client raises a
KafkaException in the response handler.

I think this is definitely an improvement. Compatibility errors are
completely opaque right now and can be very difficult to diagnose for
users that do not have direct access to their kafka cluster / brokers,
and/or do not have a firm grasp of the internals of the kafka client
and kafka protocol. But I also don't think this error handling is core
to compatibility. While supporting backwards compatibility, we
continue to rely on standard error handling (indeed, we have no choice
since we support backwards compatibility to 0.8.0).

So I think the revision is attempting to add a few things that are
assumed to be building blocks for a compatibility layer:
(1) ApiVersions request / response handler;
(2) a new connection state,
(3) a hashmap of nodeApiVersions {node_id -> [api_versions]} .

I'm confident that #1 is necessary. Is a new connection state
necessary? Is a full map of all nodes -> supported api version
necessary? These aren't used in the kafka-python implementation, so
from my perspective the answer is no. But maybe the java community has
some interesting ideas for improvement where these are key
ingredients. In any case, I think it would probably help substantially
to get some agreement on where the goalposts are for compatibility to
help push this forward.

-Dana

On Fri, Aug 12, 2016 at 7:37 AM, Ashish Singh <asi...@cloudera.com> wrote:
> Most of the work in the patch is to enhance NetworkClient to maintain api
> versions supported by brokers it has connections to. When a broker
> disconnects, its api versions info is removed and when it reconnects that
> info is fetched again. In short, with these changes Network Client at any
> given point knows what are the various brokers it is connected to and what
> are the various apis they support. This information is required to enable/
> select features based on brokers the client is talking to. I want to do the
> following in the order mentioned.
>
>    - Enable clients to detect, and so inform applications, that they are
>    not compatible with brokers they are trying to talk to, rather than getting
>    connection dropped or similar not so sure response. This has been ready for
>    some time, so we can definitely work to include in next release.
>    - Use the api versions info from NetworkClient to enable/ disable
>    feature sets, something similar to what librdkafka already has. This will
>    need much broader testing.
>
> Note that both of the aforementioned steps have been discussed as part of
> KIP-35 discussions, but only first step was voted in. We will have to go
> through another KIP for second step.
>
> Any suggestions on how to make progress here will be helpful. Dana has left
> some comments and I will address them soon. However, I would appreciate if
> I can get a committer willing to help merging this in.
>
>
> On Thu, Aug 11, 2016 at 4:15 PM, Gwen Shapira <g...@confluent.io> wrote:
>
>> I am 100% pro smart Java clients that support KIP-35 and can use it to
>> work with newer brokers. If this JIRA makes sense as a step in that
>> direction, I think its great and remove my objection.  I didn't see
>> anything that looked like a plan toward full forward-backward
>> compatibility, which is why I responded as I did...
>>
>> Verification is good, but it looked like there was much complexity
>> added toward very little benefits.
>>
>> On Thu, Aug 11, 2016 at 3:37 PM, Ashish Singh <asi...@cloudera.com> wrote:
>> > Hey Gwen,
>> >
>> > I think this was more than a verification step, it was a building step
>> > towards a backwards compatible clients or for clients that can select
>> > feature based on brokers it is talking to. Are we now against the idea of
>> > having smarter clients? This adds complexity to enable clients to inform
>> > applications of incompatible broker versions, I think that is some value
>> > add. In future, the api versions info can be used to take smarter
>> decisions.
>> >
>> > On Wed, Aug 10, 2016 at 11:04 PM, Gwen Shapira <g...@confluent.io>
>> wrote:
>> >
>> >> I hate doing this, because Ashish has really been good about following
>> >> up on the PR, but I'm questioning the usefulness of this patch.
>> >>
>> >> It adds non-trivial complexity to the client... with not much return
>> >> on the investment, as far as I can see?
>> >> When I first suggested it, it was before KIP-35 was merged and
>> >> released and the intent was to validate KIP-35 (since I have low
>> >> opinion of protocols that aren't used). Since then KIP-35 was already
>> >> released, the followup turned more complex than we expected, I think.
>> >> And I'm wondering if it is worth it.
>> >>
>> >> The work and followup from Ashish is still super appreciated, but I
>> >> think we need more than appreciation - adding complexity to already
>> >> complex clients need to have functional justification...
>> >>
>> >> Anyway, I was out of the loop for ages, so feel free to yell at me for
>> >> missing the obvious.
>> >>
>> >> Gwen
>> >>
>> >> On Tue, Aug 9, 2016 at 8:47 AM, Ashish Singh <asi...@cloudera.com>
>> wrote:
>> >> > Provided wrong link to PR, here is the PR
>> >> > <https://github.com/apache/kafka/pull/1251> for KAFKA-3600.
>> >> >
>> >> > On Tue, Aug 9, 2016 at 9:45 AM, Ashish Singh <asi...@cloudera.com>
>> >> wrote:
>> >> >
>> >> >> Hey Guys,
>> >> >>
>> >> >> KAFKA-3600 <https://issues.apache.org/jira/browse/KAFKA-3600> was
>> part
>> >> of
>> >> >> KIP-35's proposal. KAFKA-3307
>> >> >> <https://issues.apache.org/jira/browse/KAFKA-3307>,
>> >> >> adding ApiVersionsRequest/Response, was committed to 0.10.0.0, but
>> >> >> KAFKA-3600, enhancing java clients, is still under review. Here is
>> the
>> >> PR
>> >> >> <https://github.com/apache/kafka/pull/986>
>> >> >>
>> >> >> I have addressed all review comments and have been waiting for
>> further
>> >> >> reviews/ this to go in for quite some time. I will really appreciate
>> if
>> >> a
>> >> >> committer can help with making progress on this.
>> >> >>
>> >> >> --
>> >> >>
>> >> >> Regards,
>> >> >> Ashish
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > Regards,
>> >> > Ashish
>> >>
>> >>
>> >>
>> >> --
>> >> Gwen Shapira
>> >> Product Manager | Confluent
>> >> 650.450.2760 | @gwenshap
>> >> Follow us: Twitter | blog
>> >>
>> >
>> >
>> >
>> > --
>> >
>> > Regards,
>> > Ashish
>>
>>
>>
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
>>
>
>
>
> --
>
> Regards,
> Ashish

Reply via email to