Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-04 Thread Jay Kreps
Hey Ashish,

Both good points.

I think the issue with the general metadata request is the same as the
issue with a version-specific metadata request from the other
proposal--basically it's a chicken and egg problem, to find out anything
about the cluster you have to be able to communicate something in a format
the server can understand without knowing a priori what version it's on. I
guess the question is how can you continue to evolve the metadata request
(whether it is the existing metadata or a protocol-version specific
metadata request) given that you need this information to bootstrap you
have to be more careful in how that request evolves.

I think deprecation/removal may be okay. Ultimately clients will always use
the highest possible version of the protocol the server supports so if
we've already deprecated and removed your highest version then you are
screwed and you're going to get an error no matter what, right? Basically
there is nothing dynamic you can do in that case.

-Jay

On Fri, Mar 4, 2016 at 4:05 PM, Ashish Singh  wrote:

> Hello Jay,
>
> The overall approach sounds good. I do realize that this discussion has
> gotten too lengthy and is starting to shoot tangents. Maybe a KIP call will
> help us getting to a decision faster. I do have a few questions though.
>
> On Fri, Mar 4, 2016 at 9:52 AM, Jay Kreps  wrote:
>
> > Yeah here is my summary of my take:
> >
> > 1. Negotiating a per-connection protocol actually does add a lot of
> > complexity to clients (many more failure states to get right).
> >
> > 2. Having the client configure the protocol version manually is doable
> now
> > but probably a worse state. I suspect this will lead to more not less
> > confusion.
> >
> > 3. I don't think the current state is actually that bad. Integrators
> pick a
> > conservative version and build against that. There is a tradeoff between
> > getting the new features and being compatible with old Kafka versions.
> But
> > a large part of this tradeoff is essential since new features aren't
> going
> > to magically appear on old servers, so even if you upgrade your client
> you
> > likely aren't going to get the new stuff (since we will end up
> dynamically
> > turning it off). Having client features that are there but don't work
> > because you're on an old cluster may actually be a worse experience if
> not
> > handled very carefully..
> >
> > 4. The problems Dana brought up are totally orthogonal to the problem of
> > having per-api versions or overall versions. The problem was that we
> > changed behavior subtly without changing the version. This will be an
> issue
> > regardless of whether the version is global or not.
> >
> > 5. Using the broker release as the version is strictly worse than using a
> > global protocol version (0, 1, 2, ...) that increments any time any api
> > changes but doesn't increment just because non-protocol code is changed.
> > The problem with using the broker release version is we want to be able
> to
> > keep Kafka releasable from any commit which means there isn't as clear a
> > sequencing of releases as you would think.
> >
> > 6. We need to consider the case of mixed version clusters during the time
> > period when you are upgrading Kafka.
> >
> > So overall I think this is not a critical thing to do right now, but if
> we
> > are going to do it we should do it in a way that actually improves
> things.
> >
> > Here would be one proposal for that:
> > a. Add a global protocol version that increments with any api version
> > update. Move the documentation so that the docs are by version. This is
> > basically just a short-hand for a complete set of supported api versions.
> > b. Include a field in the metadata response for each broker that adds the
> > protocol version.
> >
> There might be an issue here where the metadata request version sent by
> client is not supported by broker, an older broker. However, if we are
> clearly stating that a client is not guaranteed to work with an older
> broker then this becomes expected. This will potentially limit us in terms
> of supporting downgrades though, if we ever want to.
>
> > c. To maintain the protocol version this information will have to get
> > propagated with the rest of the broker metadata like host, port, id, etc.
> >
> > The instructions to clients would be:
> > - By default you build against a single conservative Kafka protocol
> version
> > and we carry that support forward, as today
> >
> If I am getting this correct, this will mean we will never deprecate/remove
> any protocol version in future. Having some way to deprecate/remove older
> protocol versions will probably be a good idea. It is possible with the
> global protocol version approach, it could be as simple as marking a
> protocol deprecated in protocol doc before removing it. Just want to make
> sure deprecation is still on the table.
>
> > - If you want to get fancy you can use the protocol version field in 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-04 Thread Ashish Singh
Hello Jay,

The overall approach sounds good. I do realize that this discussion has
gotten too lengthy and is starting to shoot tangents. Maybe a KIP call will
help us getting to a decision faster. I do have a few questions though.

On Fri, Mar 4, 2016 at 9:52 AM, Jay Kreps  wrote:

> Yeah here is my summary of my take:
>
> 1. Negotiating a per-connection protocol actually does add a lot of
> complexity to clients (many more failure states to get right).
>
> 2. Having the client configure the protocol version manually is doable now
> but probably a worse state. I suspect this will lead to more not less
> confusion.
>
> 3. I don't think the current state is actually that bad. Integrators pick a
> conservative version and build against that. There is a tradeoff between
> getting the new features and being compatible with old Kafka versions. But
> a large part of this tradeoff is essential since new features aren't going
> to magically appear on old servers, so even if you upgrade your client you
> likely aren't going to get the new stuff (since we will end up dynamically
> turning it off). Having client features that are there but don't work
> because you're on an old cluster may actually be a worse experience if not
> handled very carefully..
>
> 4. The problems Dana brought up are totally orthogonal to the problem of
> having per-api versions or overall versions. The problem was that we
> changed behavior subtly without changing the version. This will be an issue
> regardless of whether the version is global or not.
>
> 5. Using the broker release as the version is strictly worse than using a
> global protocol version (0, 1, 2, ...) that increments any time any api
> changes but doesn't increment just because non-protocol code is changed.
> The problem with using the broker release version is we want to be able to
> keep Kafka releasable from any commit which means there isn't as clear a
> sequencing of releases as you would think.
>
> 6. We need to consider the case of mixed version clusters during the time
> period when you are upgrading Kafka.
>
> So overall I think this is not a critical thing to do right now, but if we
> are going to do it we should do it in a way that actually improves things.
>
> Here would be one proposal for that:
> a. Add a global protocol version that increments with any api version
> update. Move the documentation so that the docs are by version. This is
> basically just a short-hand for a complete set of supported api versions.
> b. Include a field in the metadata response for each broker that adds the
> protocol version.
>
There might be an issue here where the metadata request version sent by
client is not supported by broker, an older broker. However, if we are
clearly stating that a client is not guaranteed to work with an older
broker then this becomes expected. This will potentially limit us in terms
of supporting downgrades though, if we ever want to.

> c. To maintain the protocol version this information will have to get
> propagated with the rest of the broker metadata like host, port, id, etc.
>
> The instructions to clients would be:
> - By default you build against a single conservative Kafka protocol version
> and we carry that support forward, as today
>
If I am getting this correct, this will mean we will never deprecate/remove
any protocol version in future. Having some way to deprecate/remove older
protocol versions will probably be a good idea. It is possible with the
global protocol version approach, it could be as simple as marking a
protocol deprecated in protocol doc before removing it. Just want to make
sure deprecation is still on the table.

> - If you want to get fancy you can use the protocol version field in the
> metadata request to more dynamically chose what features are available and
> select api versions appropriately. This is purely optional.
>
> -Jay
>
> On Thu, Mar 3, 2016 at 9:38 PM, Jason Gustafson 
> wrote:
>
> > I talked with Jay about this KIP briefly this morning, so let me try to
> > summarize the discussion (I'm sure he'll jump in if I get anything
> wrong).
> > Apologies in advance for the length.
> >
> > I think we both share some skepticism that a request with all the
> supported
> > versions of all the request APIs is going to be a useful primitive to try
> > and build client compatibility around. In practice I think people would
> end
> > up checking for particular request versions in order to determine if the
> > broker is 0.8 or 0.9 or whatever, and then change behavior accordingly.
> I'm
> > wondering if there's a reasonable way to handle the version responses
> that
> > doesn't amount to that. Maybe you could try to capture feature
> > compatibility by checking the versions for a subset of request types? For
> > example, to ensure that you can use the new consumer API, you check that
> > the group coordinator request is present, the offset commit request
> version
> > is greater than 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-04 Thread Jay Kreps
Yeah here is my summary of my take:

1. Negotiating a per-connection protocol actually does add a lot of
complexity to clients (many more failure states to get right).

2. Having the client configure the protocol version manually is doable now
but probably a worse state. I suspect this will lead to more not less
confusion.

3. I don't think the current state is actually that bad. Integrators pick a
conservative version and build against that. There is a tradeoff between
getting the new features and being compatible with old Kafka versions. But
a large part of this tradeoff is essential since new features aren't going
to magically appear on old servers, so even if you upgrade your client you
likely aren't going to get the new stuff (since we will end up dynamically
turning it off). Having client features that are there but don't work
because you're on an old cluster may actually be a worse experience if not
handled very carefully..

4. The problems Dana brought up are totally orthogonal to the problem of
having per-api versions or overall versions. The problem was that we
changed behavior subtly without changing the version. This will be an issue
regardless of whether the version is global or not.

5. Using the broker release as the version is strictly worse than using a
global protocol version (0, 1, 2, ...) that increments any time any api
changes but doesn't increment just because non-protocol code is changed.
The problem with using the broker release version is we want to be able to
keep Kafka releasable from any commit which means there isn't as clear a
sequencing of releases as you would think.

6. We need to consider the case of mixed version clusters during the time
period when you are upgrading Kafka.

So overall I think this is not a critical thing to do right now, but if we
are going to do it we should do it in a way that actually improves things.

Here would be one proposal for that:
a. Add a global protocol version that increments with any api version
update. Move the documentation so that the docs are by version. This is
basically just a short-hand for a complete set of supported api versions.
b. Include a field in the metadata response for each broker that adds the
protocol version.
c. To maintain the protocol version this information will have to get
propagated with the rest of the broker metadata like host, port, id, etc.

The instructions to clients would be:
- By default you build against a single conservative Kafka protocol version
and we carry that support forward, as today
- If you want to get fancy you can use the protocol version field in the
metadata request to more dynamically chose what features are available and
select api versions appropriately. This is purely optional.

-Jay

On Thu, Mar 3, 2016 at 9:38 PM, Jason Gustafson  wrote:

> I talked with Jay about this KIP briefly this morning, so let me try to
> summarize the discussion (I'm sure he'll jump in if I get anything wrong).
> Apologies in advance for the length.
>
> I think we both share some skepticism that a request with all the supported
> versions of all the request APIs is going to be a useful primitive to try
> and build client compatibility around. In practice I think people would end
> up checking for particular request versions in order to determine if the
> broker is 0.8 or 0.9 or whatever, and then change behavior accordingly. I'm
> wondering if there's a reasonable way to handle the version responses that
> doesn't amount to that. Maybe you could try to capture feature
> compatibility by checking the versions for a subset of request types? For
> example, to ensure that you can use the new consumer API, you check that
> the group coordinator request is present, the offset commit request version
> is greater than 2, the offset fetch request is greater than 1, and the join
> group request is present. And to ensure compatibility with KIP-32, maybe
> you only need to check the appropriate versions of the fetch and produce
> requests. That sounds kind of complicated to keep track of and you probably
> end up trying to handle combinations which aren't even possible in
> practice.
>
> The alternative is to use a single API version. It could be the Kafka
> release version, but then you need to figure out how to handle users who
> are running off of trunk since multiple API versions will typically change
> between releases. Perhaps it makes more sense to keep a separate API
> version number which is incremented every time any one of the API versions
> increases? This also decouples the protocol from the Kafka distribution.
>
> As far as whether there should be a separate request or not, I get Becket's
> point that you would only need to do the version check once when a
> connection is established, but another round trip still complicates the
> picture quite a bit. Before you just need to send a metadata request to
> bootstrap yourself to the cluster, but now you need to do version
> negotiation 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-03 Thread Jason Gustafson
I talked with Jay about this KIP briefly this morning, so let me try to
summarize the discussion (I'm sure he'll jump in if I get anything wrong).
Apologies in advance for the length.

I think we both share some skepticism that a request with all the supported
versions of all the request APIs is going to be a useful primitive to try
and build client compatibility around. In practice I think people would end
up checking for particular request versions in order to determine if the
broker is 0.8 or 0.9 or whatever, and then change behavior accordingly. I'm
wondering if there's a reasonable way to handle the version responses that
doesn't amount to that. Maybe you could try to capture feature
compatibility by checking the versions for a subset of request types? For
example, to ensure that you can use the new consumer API, you check that
the group coordinator request is present, the offset commit request version
is greater than 2, the offset fetch request is greater than 1, and the join
group request is present. And to ensure compatibility with KIP-32, maybe
you only need to check the appropriate versions of the fetch and produce
requests. That sounds kind of complicated to keep track of and you probably
end up trying to handle combinations which aren't even possible in practice.

The alternative is to use a single API version. It could be the Kafka
release version, but then you need to figure out how to handle users who
are running off of trunk since multiple API versions will typically change
between releases. Perhaps it makes more sense to keep a separate API
version number which is incremented every time any one of the API versions
increases? This also decouples the protocol from the Kafka distribution.

As far as whether there should be a separate request or not, I get Becket's
point that you would only need to do the version check once when a
connection is established, but another round trip still complicates the
picture quite a bit. Before you just need to send a metadata request to
bootstrap yourself to the cluster, but now you need to do version
negotiation before you can even do that, and then you need to try adapt to
the versions reported. Jay brought up the point that you probably wouldn't
design a protocol from scratch to work this way. Using the metadata request
would be better if it's possible, but you need a way to handle the fact
that a broker's version might be stale by the time you connect to it. And
even then you're going to have to deal internally with the complexity
involved in trying to upgrade/downgrade dynamically, which sounds to me
like it would have a ton of edge cases.

Taking a bit of a step back, any solution is probably going to be painful
since the Kafka protocol was not designed for this use case. Currently what
that means for clients that /want/ to support compatibility across broker
versions is that they need to have the user tell them the broker version
through configuration (e.g. librdkafka has a "protocol.version" field for
this purpose). The only real problem with this in my mind is that we don't
have a graceful way to detect request incompatibility, which is why there
are so many questions on the user list which basically amount to the client
hanging because the broker refuses to respond to a request it doesn't
understand. If you solve this problem, then depending on configuration
seems totally reasonable and we can skip trying to implement request
version negotiation. Magnus's solution in this KIP may seem a little hacky,
but it also seems like the only way to do it without changing the header.

The Spark problem mentioned above is interesting and I agree that it sucks
for frameworks that need to ship the kafka client library since they have
to figure out how to bundle multiple versions. Ultimately if we want to
solve this problem, then it sounds like we need to commit to maintaining
compatibility with older versions of Kafka in the client going forward.
That's a lot bigger decision and it matters less whether the broker version
is found through configuration, topic metadata, or a new request type.

-Jason


On Thu, Mar 3, 2016 at 3:59 PM, Becket Qin  wrote:

> Hi Ashish,
>
> In approach (1), the clients will still be able to talked to multiple
> versions of Kafka brokers as long as the clients version is not higher than
> the broker version, right?
>
> From Spark's point of view, it seems the difference is whether Spark can
> independently update their Kafka clients dependency or not. More
> specifically, consider the following three scenarios:
> A. Spark has some new features that do not rely on clients or brokers in a
> new Kafka release.
> B. Spark has some new features that only rely on the clients in a new Kafka
> release, but not rely on the brokers in a new Kafka release. e.g. New
> client provides a listTopic() method.
> C. Spark has some new features that rely on both the clients and brokers in
> a new Kafka release. e.g timestamp field.

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-03 Thread Ashish Singh
> I think using Kafka release version makes sense. More particularly, we can
use the ApiVersion and this will cover all the interval version as well. In
KAFKA-3025, we added the ApiVersion to message format version mapping, We
can add the ApiKey to version mapping to ApiVersion as well. We can move
ApiVersion class to o.a.k.c package and use it for both server and clients.

Just a thought here, wouldn't this require the clients to be aware of Kafka
versions, which changes more frequently than the protocol versions. Also,
client might not be interested in all the protocols.

On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:

> I think using Kafka release version makes sense. More particularly, we can
> use the ApiVersion and this will cover all the interval version as well. In
> KAFKA-3025, we added the ApiVersion to message format version mapping, We
> can add the ApiKey to version mapping to ApiVersion as well. We can move
> ApiVersion class to o.a.k.c package and use it for both server and clients.
>
> @Jason, if we cache the release info in metadata and not re-validate the
> release on reconnect, would it still work if we do a rolling downgrade?
>
> On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
> wrote:
>
> > I think Dana's suggestion to include the Kafka release version makes a
> lot
> > of sense. I'm actually wondering why you would need the individual API
> > versions if you have that? It sounds like keeping track of all the api
> > version information would add a lot of complexity to clients since
> they'll
> > have to try to handle different version permutations which are not
> actually
> > possible in practice. Wouldn't it be simpler to know that you're talking
> to
> > an 0.9 broker than that you're talking to a broker which supports
> version 2
> > of the group coordinator request, version 1 of fetch request, etc? Also,
> > the release version could be included in the broker information in the
> > topic metadata request which would save the need for the additional round
> > trip on every reconnect.
> >
> > -Jason
> >
> > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
> wrote:
> >
> > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
> wrote:
> > >
> > > > One more thing, the KIP actually had 3 parts:
> > > > 1. The version protocol
> > > > 2. New response on messages of wrong API key or wrong version
> > > > 3. Protocol documentation
> > > >
> > > There is a WIP patch for adding protocol docs,
> > > https://github.com/apache/kafka/pull/970 . By protocol documentation,
> > you
> > > mean updating this, right?
> > >
> > > >
> > > > I understand that you are offering to only implement part 1?
> > > > But the KIP discussion and vote should still cover all three parts,
> > > > they will just be implemented in separate JIRA?
> > > >
> > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> > covers
> > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include
> all
> > > the three points you mentioned while discussing or voting for KIP-35.
> > >
> > > >
> > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > > wrote:
> > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > >> I don't see a use for the name - clients should be able to
> translate
> > > > >> ApiKey to name for any API they support, and I'm not sure why
> would
> > a
> > > > >> client need to log anything about APIs it does not support. Am I
> > > > >> missing something?
> > > > >>
> > > > > Yea, it is a fair assumption that client would know about APIs it
> > > > supports.
> > > > > It could have been helpful for client users to see new APIs though,
> > > > however
> > > > > users can always refer to protocol doc of new version to find
> > > > corresponding
> > > > > names of the new APIs.
> > > > >
> > > > >>
> > > > >> On a related note, Magnus is currently on vacation, but he should
> be
> > > > >> back at the end of next week. I'd like to hold off on the vote
> until
> > > > >> he gets back since his experience in implementing clients  and his
> > > > >> opinions will be very valuable for this discussion.
> > > > >>
> > > > > That is great. It will be valuable to have his feedback. I will
> hold
> > > off
> > > > on
> > > > > removing "api_name" and "api_deprecated_versions" or adding release
> > > > version.
> > > > >
> > > > >>
> > > > >> Gwen
> > > > >>
> > > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh  >
> > > > wrote:
> > > > >> > Works with me. I will update PR to remove this.
> > > > >> >
> > > > >> > Also, "api_name" have been pointed out as a concern. However, it
> > can
> > > > be
> > > > >> > handy for logging and similar purposes. Any take on that?
> > > > >> >
> > > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  >
> > > > wrote:
> > > > >> >
> > > > >> >> Jay also mentioned:
> > > > 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-03 Thread Ashish Singh
> Also, the release version could be included in the broker information in
the
topic metadata request which would save the need for the additional round
trip on every reconnect.

Wouldn't this cause a catch-22 situation, where you need to know protocol
version of metadata request to get supported protocol versions?

On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson  wrote:

> I think Dana's suggestion to include the Kafka release version makes a lot
> of sense. I'm actually wondering why you would need the individual API
> versions if you have that? It sounds like keeping track of all the api
> version information would add a lot of complexity to clients since they'll
> have to try to handle different version permutations which are not actually
> possible in practice. Wouldn't it be simpler to know that you're talking to
> an 0.9 broker than that you're talking to a broker which supports version 2
> of the group coordinator request, version 1 of fetch request, etc? Also,
> the release version could be included in the broker information in the
> topic metadata request which would save the need for the additional round
> trip on every reconnect.
>
> -Jason
>
> On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh  wrote:
>
> > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira  wrote:
> >
> > > One more thing, the KIP actually had 3 parts:
> > > 1. The version protocol
> > > 2. New response on messages of wrong API key or wrong version
> > > 3. Protocol documentation
> > >
> > There is a WIP patch for adding protocol docs,
> > https://github.com/apache/kafka/pull/970 . By protocol documentation,
> you
> > mean updating this, right?
> >
> > >
> > > I understand that you are offering to only implement part 1?
> > > But the KIP discussion and vote should still cover all three parts,
> > > they will just be implemented in separate JIRA?
> > >
> > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> covers
> > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include all
> > the three points you mentioned while discussing or voting for KIP-35.
> >
> > >
> > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > wrote:
> > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > wrote:
> > > >
> > > >> I don't see a use for the name - clients should be able to translate
> > > >> ApiKey to name for any API they support, and I'm not sure why would
> a
> > > >> client need to log anything about APIs it does not support. Am I
> > > >> missing something?
> > > >>
> > > > Yea, it is a fair assumption that client would know about APIs it
> > > supports.
> > > > It could have been helpful for client users to see new APIs though,
> > > however
> > > > users can always refer to protocol doc of new version to find
> > > corresponding
> > > > names of the new APIs.
> > > >
> > > >>
> > > >> On a related note, Magnus is currently on vacation, but he should be
> > > >> back at the end of next week. I'd like to hold off on the vote until
> > > >> he gets back since his experience in implementing clients  and his
> > > >> opinions will be very valuable for this discussion.
> > > >>
> > > > That is great. It will be valuable to have his feedback. I will hold
> > off
> > > on
> > > > removing "api_name" and "api_deprecated_versions" or adding release
> > > version.
> > > >
> > > >>
> > > >> Gwen
> > > >>
> > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh 
> > > wrote:
> > > >> > Works with me. I will update PR to remove this.
> > > >> >
> > > >> > Also, "api_name" have been pointed out as a concern. However, it
> can
> > > be
> > > >> > handy for logging and similar purposes. Any take on that?
> > > >> >
> > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira 
> > > wrote:
> > > >> >
> > > >> >> Jay also mentioned:
> > > >> >> "Or, alternately, since deprecation has no functional impact and
> is
> > > >> >> just a message
> > > >> >> to developers, we could just leave it out of the protocol and
> just
> > > have
> > > >> it
> > > >> >> in release notes etc."
> > > >> >>
> > > >> >> I'm in favor of leaving it out of the protocol. I can't really
> see
> > a
> > > >> >> use-case.
> > > >> >>
> > > >> >> Gwen
> > > >> >>
> > > >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh <
> asi...@cloudera.com
> > >
> > > >> wrote:
> > > >> >>
> > > >> >> > I hope it is OK for me to make some progress here. I have made
> > the
> > > >> >> > following changes.
> > > >> >> >
> > > >> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining
> > > separate
> > > >> list
> > > >> >> > of deprecated versions, instead of using a version of -1.
> > > >> >> > 2. Added information on required permissions, Describe action
> on
> > > >> Cluster
> > > >> >> > resource, to be able to retrieve protocol versions from a auth
> > > enabled
> > > >> >> > Kafka cluster.
> > > >> >> >
> > > >> >> > Created 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Becket Qin
Hi Jason,

I was thinking that every time when we connect to a broker, we first send
the version check request. (The version check request itself should be very
simple and never changes across all server releases.) This does add an
additional round trip, but given reconnect is rare, it is probably fine. On
the client side, the client will always send request using the lowest
supported version across all brokers. That means if a Kafka cluster is
downgrading, we will use the downgraded protocol as soon as the client
connected to an older broker.

@Ashish,
Can you help me understand the pain points from other open source projects
that you mentioned a little more? There are two different levels of
requirements:
1. User wants to know if the client is compatible with the broker or not.
2. User wants the client and the broker to negotiate the protocol on their
own.

Currently in Kafka the principle we are following is to let clients stick
to a certain version and server will adapt to the clients accordingly.
If this KIP doesn't want to break this rule, it seems we should simply let
the clients send the ApiVersion it is using to the brokers and the brokers
will decide whether to accept or reject the clients. This means user have
to upgrade broker before they upgrade clients. This satisfies (1) so that a
newer client will know it does not compatible with an older server
immediately.
If this KIP will change that to let the newer clients adapt to the older
brokers,  compatibility wise it is a good thing to have. With this now
users are able to upgrade clients before they upgrade Kafka brokers. This
means user can upgrade clients even before upgrade servers. This satisfies
(2) as the newer clients can also talk to the older servers.

If we decide to go with (2). The benefit is that a newer client won't break
when talking to an older broker. But functionality wise, it might be the
same as an older clients.
In the downgrading case, we probably still have to notify all the users.
For example, if application is sending messages with timestamp and the
broker got downgraded to an older version that does not support timestamp.
The clients will suddenly start to throw away timestamps. This might affect
the application logic. In this case even if we have clients automatically
adapted to a lower version broker, the applications might still break.
Hence we still need to notify the users about the case when the clients is
newer than the brokers. This is the same for both (1) and (2).
Supporting (2) will introduce more complication on the client side. And we
may also have to communicate with users about what function is supported in
the new clients and what is not supported after the protocol negotiation
finishes.

Thanks,

Jiangjie (Becket) Qin




On Wed, Mar 2, 2016 at 5:58 PM, Dana Powers  wrote:

> In kafka-python we've been doing something like:
>
> if version >= (0, 9):
>   Do cool new stuff
> elif version >= (0, 8, 2):
>   Do some older stuff
> 
> else:
>   raise UnsupportedVersionError
>
> This will break if / when the new 0.9 apis are completely removed from some
> future release, but should handle intermediate broker upgrades. Because we
> can't add support for future apis a priori, I think the best we could do
> here is throw an error that request protocol version X is not supported.
> For now that comes through as a broken socket connection, so there is an
> error - just not a super helpful one.
>
> For that reason I'm also in favor of a generic error response when a
> protocol req is not recognized.
>
> -Dana
> On Mar 2, 2016 5:38 PM, "Jay Kreps"  wrote:
>
> > But won't it be the case that what clients end up doing would be
> something
> > like
> >if(version != 0.8.1)
> >   throw new UnsupportedVersionException()
> > which then means the client is broken as soon as we release a new server
> > version even though the protocol didn't change. I'm actually not sure how
> > you could use that information in a forward compatible way since you
> can't
> > know a priori if you will work with the next release until you know if
> the
> > protocol changed.
> >
> > -Jay
> >
> > On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson 
> > wrote:
> >
> > > Hey Jay,
> > >
> > > Yeah, I wasn't suggesting that we eliminate request API versions.
> They're
> > > definitely needed on the broker to support compatibility. I was just
> > saying
> > > that if a client wants to support multiple broker versions (e.g. 0.8
> and
> > > 0.9), then it makes more sense to me to make the kafka release version
> > > available in order to determine which version of the request API should
> > be
> > > used rather than adding a new request type which exposes all of the
> > > different supported versions for all of the request types. Request API
> > > versions all change in lockstep with Kafka releases anyway.
> > >
> > > -Jason
> > >
> > > On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Dana Powers
In kafka-python we've been doing something like:

if version >= (0, 9):
  Do cool new stuff
elif version >= (0, 8, 2):
  Do some older stuff

else:
  raise UnsupportedVersionError

This will break if / when the new 0.9 apis are completely removed from some
future release, but should handle intermediate broker upgrades. Because we
can't add support for future apis a priori, I think the best we could do
here is throw an error that request protocol version X is not supported.
For now that comes through as a broken socket connection, so there is an
error - just not a super helpful one.

For that reason I'm also in favor of a generic error response when a
protocol req is not recognized.

-Dana
On Mar 2, 2016 5:38 PM, "Jay Kreps"  wrote:

> But won't it be the case that what clients end up doing would be something
> like
>if(version != 0.8.1)
>   throw new UnsupportedVersionException()
> which then means the client is broken as soon as we release a new server
> version even though the protocol didn't change. I'm actually not sure how
> you could use that information in a forward compatible way since you can't
> know a priori if you will work with the next release until you know if the
> protocol changed.
>
> -Jay
>
> On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson 
> wrote:
>
> > Hey Jay,
> >
> > Yeah, I wasn't suggesting that we eliminate request API versions. They're
> > definitely needed on the broker to support compatibility. I was just
> saying
> > that if a client wants to support multiple broker versions (e.g. 0.8 and
> > 0.9), then it makes more sense to me to make the kafka release version
> > available in order to determine which version of the request API should
> be
> > used rather than adding a new request type which exposes all of the
> > different supported versions for all of the request types. Request API
> > versions all change in lockstep with Kafka releases anyway.
> >
> > -Jason
> >
> > On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:
> >
> > > I think using Kafka release version makes sense. More particularly, we
> > can
> > > use the ApiVersion and this will cover all the interval version as
> well.
> > In
> > > KAFKA-3025, we added the ApiVersion to message format version mapping,
> We
> > > can add the ApiKey to version mapping to ApiVersion as well. We can
> move
> > > ApiVersion class to o.a.k.c package and use it for both server and
> > clients.
> > >
> > > @Jason, if we cache the release info in metadata and not re-validate
> the
> > > release on reconnect, would it still work if we do a rolling downgrade?
> > >
> > > On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > I think Dana's suggestion to include the Kafka release version makes
> a
> > > lot
> > > > of sense. I'm actually wondering why you would need the individual
> API
> > > > versions if you have that? It sounds like keeping track of all the
> api
> > > > version information would add a lot of complexity to clients since
> > > they'll
> > > > have to try to handle different version permutations which are not
> > > actually
> > > > possible in practice. Wouldn't it be simpler to know that you're
> > talking
> > > to
> > > > an 0.9 broker than that you're talking to a broker which supports
> > > version 2
> > > > of the group coordinator request, version 1 of fetch request, etc?
> > Also,
> > > > the release version could be included in the broker information in
> the
> > > > topic metadata request which would save the need for the additional
> > round
> > > > trip on every reconnect.
> > > >
> > > > -Jason
> > > >
> > > > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
> > > wrote:
> > > >
> > > > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > One more thing, the KIP actually had 3 parts:
> > > > > > 1. The version protocol
> > > > > > 2. New response on messages of wrong API key or wrong version
> > > > > > 3. Protocol documentation
> > > > > >
> > > > > There is a WIP patch for adding protocol docs,
> > > > > https://github.com/apache/kafka/pull/970 . By protocol
> > documentation,
> > > > you
> > > > > mean updating this, right?
> > > > >
> > > > > >
> > > > > > I understand that you are offering to only implement part 1?
> > > > > > But the KIP discussion and vote should still cover all three
> parts,
> > > > > > they will just be implemented in separate JIRA?
> > > > > >
> > > > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986
> ,
> > > > covers
> > > > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should
> include
> > > all
> > > > > the three points you mentioned while discussing or voting for
> KIP-35.
> > > > >
> > > > > >
> > > > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh <
> asi...@cloudera.com>
> > > > > wrote:
> > > > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira <
> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jason Gustafson
Hey Becket,

I was thinking about that too. I guess there are two scenarios: upgrades
and downgrades. For upgrades, the client may continue using the old API
version until the next metadata refresh, which seems ok. For downgrades, we
still need some way to signal the client that an API version is not
supported by the broker (that's one part of this KIP that we should
definitely get in even if we reject the rest). In that case, the client can
refetch topic metadata from a different broker and update versions
accordingly, though there's clearly some trickiness if you have to revert
to an older version of the topic metadata request. What do you think?

-Jason

On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson  wrote:

> Hey Jay,
>
> Yeah, I wasn't suggesting that we eliminate request API versions. They're
> definitely needed on the broker to support compatibility. I was just saying
> that if a client wants to support multiple broker versions (e.g. 0.8 and
> 0.9), then it makes more sense to me to make the kafka release version
> available in order to determine which version of the request API should be
> used rather than adding a new request type which exposes all of the
> different supported versions for all of the request types. Request API
> versions all change in lockstep with Kafka releases anyway.
>
> -Jason
>
> On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:
>
>> I think using Kafka release version makes sense. More particularly, we can
>> use the ApiVersion and this will cover all the interval version as well.
>> In
>> KAFKA-3025, we added the ApiVersion to message format version mapping, We
>> can add the ApiKey to version mapping to ApiVersion as well. We can move
>> ApiVersion class to o.a.k.c package and use it for both server and
>> clients.
>>
>> @Jason, if we cache the release info in metadata and not re-validate the
>> release on reconnect, would it still work if we do a rolling downgrade?
>>
>> On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
>> wrote:
>>
>> > I think Dana's suggestion to include the Kafka release version makes a
>> lot
>> > of sense. I'm actually wondering why you would need the individual API
>> > versions if you have that? It sounds like keeping track of all the api
>> > version information would add a lot of complexity to clients since
>> they'll
>> > have to try to handle different version permutations which are not
>> actually
>> > possible in practice. Wouldn't it be simpler to know that you're
>> talking to
>> > an 0.9 broker than that you're talking to a broker which supports
>> version 2
>> > of the group coordinator request, version 1 of fetch request, etc? Also,
>> > the release version could be included in the broker information in the
>> > topic metadata request which would save the need for the additional
>> round
>> > trip on every reconnect.
>> >
>> > -Jason
>> >
>> > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
>> wrote:
>> >
>> > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
>> wrote:
>> > >
>> > > > One more thing, the KIP actually had 3 parts:
>> > > > 1. The version protocol
>> > > > 2. New response on messages of wrong API key or wrong version
>> > > > 3. Protocol documentation
>> > > >
>> > > There is a WIP patch for adding protocol docs,
>> > > https://github.com/apache/kafka/pull/970 . By protocol documentation,
>> > you
>> > > mean updating this, right?
>> > >
>> > > >
>> > > > I understand that you are offering to only implement part 1?
>> > > > But the KIP discussion and vote should still cover all three parts,
>> > > > they will just be implemented in separate JIRA?
>> > > >
>> > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
>> > covers
>> > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include
>> all
>> > > the three points you mentioned while discussing or voting for KIP-35.
>> > >
>> > > >
>> > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
>> > > wrote:
>> > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
>> > > wrote:
>> > > > >
>> > > > >> I don't see a use for the name - clients should be able to
>> translate
>> > > > >> ApiKey to name for any API they support, and I'm not sure why
>> would
>> > a
>> > > > >> client need to log anything about APIs it does not support. Am I
>> > > > >> missing something?
>> > > > >>
>> > > > > Yea, it is a fair assumption that client would know about APIs it
>> > > > supports.
>> > > > > It could have been helpful for client users to see new APIs
>> though,
>> > > > however
>> > > > > users can always refer to protocol doc of new version to find
>> > > > corresponding
>> > > > > names of the new APIs.
>> > > > >
>> > > > >>
>> > > > >> On a related note, Magnus is currently on vacation, but he
>> should be
>> > > > >> back at the end of next week. I'd like to hold off on the vote
>> until
>> > > > >> he gets back 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jason Gustafson
I'm assuming that the broker would continue to support older versions of
the requests, so the check would probably be more like this:

if (version > 0.9)
// send 0.9 requests
else if (version > 0.8)
// send 0.8 requests

Does that not work? This version can also can be used by clients to tell
whether the broker supports major new features (such as the group
coordinator protocol or the admin APIs).

-Jason



On Wed, Mar 2, 2016 at 5:38 PM, Jay Kreps  wrote:

> But won't it be the case that what clients end up doing would be something
> like
>if(version != 0.8.1)
>   throw new UnsupportedVersionException()
> which then means the client is broken as soon as we release a new server
> version even though the protocol didn't change. I'm actually not sure how
> you could use that information in a forward compatible way since you can't
> know a priori if you will work with the next release until you know if the
> protocol changed.
>
> -Jay
>
> On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson 
> wrote:
>
> > Hey Jay,
> >
> > Yeah, I wasn't suggesting that we eliminate request API versions. They're
> > definitely needed on the broker to support compatibility. I was just
> saying
> > that if a client wants to support multiple broker versions (e.g. 0.8 and
> > 0.9), then it makes more sense to me to make the kafka release version
> > available in order to determine which version of the request API should
> be
> > used rather than adding a new request type which exposes all of the
> > different supported versions for all of the request types. Request API
> > versions all change in lockstep with Kafka releases anyway.
> >
> > -Jason
> >
> > On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:
> >
> > > I think using Kafka release version makes sense. More particularly, we
> > can
> > > use the ApiVersion and this will cover all the interval version as
> well.
> > In
> > > KAFKA-3025, we added the ApiVersion to message format version mapping,
> We
> > > can add the ApiKey to version mapping to ApiVersion as well. We can
> move
> > > ApiVersion class to o.a.k.c package and use it for both server and
> > clients.
> > >
> > > @Jason, if we cache the release info in metadata and not re-validate
> the
> > > release on reconnect, would it still work if we do a rolling downgrade?
> > >
> > > On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > I think Dana's suggestion to include the Kafka release version makes
> a
> > > lot
> > > > of sense. I'm actually wondering why you would need the individual
> API
> > > > versions if you have that? It sounds like keeping track of all the
> api
> > > > version information would add a lot of complexity to clients since
> > > they'll
> > > > have to try to handle different version permutations which are not
> > > actually
> > > > possible in practice. Wouldn't it be simpler to know that you're
> > talking
> > > to
> > > > an 0.9 broker than that you're talking to a broker which supports
> > > version 2
> > > > of the group coordinator request, version 1 of fetch request, etc?
> > Also,
> > > > the release version could be included in the broker information in
> the
> > > > topic metadata request which would save the need for the additional
> > round
> > > > trip on every reconnect.
> > > >
> > > > -Jason
> > > >
> > > > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
> > > wrote:
> > > >
> > > > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > One more thing, the KIP actually had 3 parts:
> > > > > > 1. The version protocol
> > > > > > 2. New response on messages of wrong API key or wrong version
> > > > > > 3. Protocol documentation
> > > > > >
> > > > > There is a WIP patch for adding protocol docs,
> > > > > https://github.com/apache/kafka/pull/970 . By protocol
> > documentation,
> > > > you
> > > > > mean updating this, right?
> > > > >
> > > > > >
> > > > > > I understand that you are offering to only implement part 1?
> > > > > > But the KIP discussion and vote should still cover all three
> parts,
> > > > > > they will just be implemented in separate JIRA?
> > > > > >
> > > > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986
> ,
> > > > covers
> > > > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should
> include
> > > all
> > > > > the three points you mentioned while discussing or voting for
> KIP-35.
> > > > >
> > > > > >
> > > > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh <
> asi...@cloudera.com>
> > > > > wrote:
> > > > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira <
> g...@confluent.io>
> > > > > wrote:
> > > > > > >
> > > > > > >> I don't see a use for the name - clients should be able to
> > > translate
> > > > > > >> ApiKey to name for any API they support, and I'm not sure why
> > > would
> > > > a
> > > > > > >> client need to log anything about APIs it 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jay Kreps
But won't it be the case that what clients end up doing would be something
like
   if(version != 0.8.1)
  throw new UnsupportedVersionException()
which then means the client is broken as soon as we release a new server
version even though the protocol didn't change. I'm actually not sure how
you could use that information in a forward compatible way since you can't
know a priori if you will work with the next release until you know if the
protocol changed.

-Jay

On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson  wrote:

> Hey Jay,
>
> Yeah, I wasn't suggesting that we eliminate request API versions. They're
> definitely needed on the broker to support compatibility. I was just saying
> that if a client wants to support multiple broker versions (e.g. 0.8 and
> 0.9), then it makes more sense to me to make the kafka release version
> available in order to determine which version of the request API should be
> used rather than adding a new request type which exposes all of the
> different supported versions for all of the request types. Request API
> versions all change in lockstep with Kafka releases anyway.
>
> -Jason
>
> On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:
>
> > I think using Kafka release version makes sense. More particularly, we
> can
> > use the ApiVersion and this will cover all the interval version as well.
> In
> > KAFKA-3025, we added the ApiVersion to message format version mapping, We
> > can add the ApiKey to version mapping to ApiVersion as well. We can move
> > ApiVersion class to o.a.k.c package and use it for both server and
> clients.
> >
> > @Jason, if we cache the release info in metadata and not re-validate the
> > release on reconnect, would it still work if we do a rolling downgrade?
> >
> > On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
> > wrote:
> >
> > > I think Dana's suggestion to include the Kafka release version makes a
> > lot
> > > of sense. I'm actually wondering why you would need the individual API
> > > versions if you have that? It sounds like keeping track of all the api
> > > version information would add a lot of complexity to clients since
> > they'll
> > > have to try to handle different version permutations which are not
> > actually
> > > possible in practice. Wouldn't it be simpler to know that you're
> talking
> > to
> > > an 0.9 broker than that you're talking to a broker which supports
> > version 2
> > > of the group coordinator request, version 1 of fetch request, etc?
> Also,
> > > the release version could be included in the broker information in the
> > > topic metadata request which would save the need for the additional
> round
> > > trip on every reconnect.
> > >
> > > -Jason
> > >
> > > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
> > wrote:
> > >
> > > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
> > wrote:
> > > >
> > > > > One more thing, the KIP actually had 3 parts:
> > > > > 1. The version protocol
> > > > > 2. New response on messages of wrong API key or wrong version
> > > > > 3. Protocol documentation
> > > > >
> > > > There is a WIP patch for adding protocol docs,
> > > > https://github.com/apache/kafka/pull/970 . By protocol
> documentation,
> > > you
> > > > mean updating this, right?
> > > >
> > > > >
> > > > > I understand that you are offering to only implement part 1?
> > > > > But the KIP discussion and vote should still cover all three parts,
> > > > > they will just be implemented in separate JIRA?
> > > > >
> > > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> > > covers
> > > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include
> > all
> > > > the three points you mentioned while discussing or voting for KIP-35.
> > > >
> > > > >
> > > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > > > wrote:
> > > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > > > wrote:
> > > > > >
> > > > > >> I don't see a use for the name - clients should be able to
> > translate
> > > > > >> ApiKey to name for any API they support, and I'm not sure why
> > would
> > > a
> > > > > >> client need to log anything about APIs it does not support. Am I
> > > > > >> missing something?
> > > > > >>
> > > > > > Yea, it is a fair assumption that client would know about APIs it
> > > > > supports.
> > > > > > It could have been helpful for client users to see new APIs
> though,
> > > > > however
> > > > > > users can always refer to protocol doc of new version to find
> > > > > corresponding
> > > > > > names of the new APIs.
> > > > > >
> > > > > >>
> > > > > >> On a related note, Magnus is currently on vacation, but he
> should
> > be
> > > > > >> back at the end of next week. I'd like to hold off on the vote
> > until
> > > > > >> he gets back since his experience in implementing clients  and
> his
> > > > > >> opinions will be very valuable for 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jason Gustafson
Hey Jay,

Yeah, I wasn't suggesting that we eliminate request API versions. They're
definitely needed on the broker to support compatibility. I was just saying
that if a client wants to support multiple broker versions (e.g. 0.8 and
0.9), then it makes more sense to me to make the kafka release version
available in order to determine which version of the request API should be
used rather than adding a new request type which exposes all of the
different supported versions for all of the request types. Request API
versions all change in lockstep with Kafka releases anyway.

-Jason

On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin  wrote:

> I think using Kafka release version makes sense. More particularly, we can
> use the ApiVersion and this will cover all the interval version as well. In
> KAFKA-3025, we added the ApiVersion to message format version mapping, We
> can add the ApiKey to version mapping to ApiVersion as well. We can move
> ApiVersion class to o.a.k.c package and use it for both server and clients.
>
> @Jason, if we cache the release info in metadata and not re-validate the
> release on reconnect, would it still work if we do a rolling downgrade?
>
> On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson 
> wrote:
>
> > I think Dana's suggestion to include the Kafka release version makes a
> lot
> > of sense. I'm actually wondering why you would need the individual API
> > versions if you have that? It sounds like keeping track of all the api
> > version information would add a lot of complexity to clients since
> they'll
> > have to try to handle different version permutations which are not
> actually
> > possible in practice. Wouldn't it be simpler to know that you're talking
> to
> > an 0.9 broker than that you're talking to a broker which supports
> version 2
> > of the group coordinator request, version 1 of fetch request, etc? Also,
> > the release version could be included in the broker information in the
> > topic metadata request which would save the need for the additional round
> > trip on every reconnect.
> >
> > -Jason
> >
> > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh 
> wrote:
> >
> > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira 
> wrote:
> > >
> > > > One more thing, the KIP actually had 3 parts:
> > > > 1. The version protocol
> > > > 2. New response on messages of wrong API key or wrong version
> > > > 3. Protocol documentation
> > > >
> > > There is a WIP patch for adding protocol docs,
> > > https://github.com/apache/kafka/pull/970 . By protocol documentation,
> > you
> > > mean updating this, right?
> > >
> > > >
> > > > I understand that you are offering to only implement part 1?
> > > > But the KIP discussion and vote should still cover all three parts,
> > > > they will just be implemented in separate JIRA?
> > > >
> > > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> > covers
> > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include
> all
> > > the three points you mentioned while discussing or voting for KIP-35.
> > >
> > > >
> > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > > wrote:
> > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > >> I don't see a use for the name - clients should be able to
> translate
> > > > >> ApiKey to name for any API they support, and I'm not sure why
> would
> > a
> > > > >> client need to log anything about APIs it does not support. Am I
> > > > >> missing something?
> > > > >>
> > > > > Yea, it is a fair assumption that client would know about APIs it
> > > > supports.
> > > > > It could have been helpful for client users to see new APIs though,
> > > > however
> > > > > users can always refer to protocol doc of new version to find
> > > > corresponding
> > > > > names of the new APIs.
> > > > >
> > > > >>
> > > > >> On a related note, Magnus is currently on vacation, but he should
> be
> > > > >> back at the end of next week. I'd like to hold off on the vote
> until
> > > > >> he gets back since his experience in implementing clients  and his
> > > > >> opinions will be very valuable for this discussion.
> > > > >>
> > > > > That is great. It will be valuable to have his feedback. I will
> hold
> > > off
> > > > on
> > > > > removing "api_name" and "api_deprecated_versions" or adding release
> > > > version.
> > > > >
> > > > >>
> > > > >> Gwen
> > > > >>
> > > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh  >
> > > > wrote:
> > > > >> > Works with me. I will update PR to remove this.
> > > > >> >
> > > > >> > Also, "api_name" have been pointed out as a concern. However, it
> > can
> > > > be
> > > > >> > handy for logging and similar purposes. Any take on that?
> > > > >> >
> > > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  >
> > > > wrote:
> > > > >> >
> > > > >> >> Jay also mentioned:
> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Becket Qin
I think using Kafka release version makes sense. More particularly, we can
use the ApiVersion and this will cover all the interval version as well. In
KAFKA-3025, we added the ApiVersion to message format version mapping, We
can add the ApiKey to version mapping to ApiVersion as well. We can move
ApiVersion class to o.a.k.c package and use it for both server and clients.

@Jason, if we cache the release info in metadata and not re-validate the
release on reconnect, would it still work if we do a rolling downgrade?

On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson  wrote:

> I think Dana's suggestion to include the Kafka release version makes a lot
> of sense. I'm actually wondering why you would need the individual API
> versions if you have that? It sounds like keeping track of all the api
> version information would add a lot of complexity to clients since they'll
> have to try to handle different version permutations which are not actually
> possible in practice. Wouldn't it be simpler to know that you're talking to
> an 0.9 broker than that you're talking to a broker which supports version 2
> of the group coordinator request, version 1 of fetch request, etc? Also,
> the release version could be included in the broker information in the
> topic metadata request which would save the need for the additional round
> trip on every reconnect.
>
> -Jason
>
> On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh  wrote:
>
> > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira  wrote:
> >
> > > One more thing, the KIP actually had 3 parts:
> > > 1. The version protocol
> > > 2. New response on messages of wrong API key or wrong version
> > > 3. Protocol documentation
> > >
> > There is a WIP patch for adding protocol docs,
> > https://github.com/apache/kafka/pull/970 . By protocol documentation,
> you
> > mean updating this, right?
> >
> > >
> > > I understand that you are offering to only implement part 1?
> > > But the KIP discussion and vote should still cover all three parts,
> > > they will just be implemented in separate JIRA?
> > >
> > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> covers
> > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include all
> > the three points you mentioned while discussing or voting for KIP-35.
> >
> > >
> > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > wrote:
> > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > wrote:
> > > >
> > > >> I don't see a use for the name - clients should be able to translate
> > > >> ApiKey to name for any API they support, and I'm not sure why would
> a
> > > >> client need to log anything about APIs it does not support. Am I
> > > >> missing something?
> > > >>
> > > > Yea, it is a fair assumption that client would know about APIs it
> > > supports.
> > > > It could have been helpful for client users to see new APIs though,
> > > however
> > > > users can always refer to protocol doc of new version to find
> > > corresponding
> > > > names of the new APIs.
> > > >
> > > >>
> > > >> On a related note, Magnus is currently on vacation, but he should be
> > > >> back at the end of next week. I'd like to hold off on the vote until
> > > >> he gets back since his experience in implementing clients  and his
> > > >> opinions will be very valuable for this discussion.
> > > >>
> > > > That is great. It will be valuable to have his feedback. I will hold
> > off
> > > on
> > > > removing "api_name" and "api_deprecated_versions" or adding release
> > > version.
> > > >
> > > >>
> > > >> Gwen
> > > >>
> > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh 
> > > wrote:
> > > >> > Works with me. I will update PR to remove this.
> > > >> >
> > > >> > Also, "api_name" have been pointed out as a concern. However, it
> can
> > > be
> > > >> > handy for logging and similar purposes. Any take on that?
> > > >> >
> > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira 
> > > wrote:
> > > >> >
> > > >> >> Jay also mentioned:
> > > >> >> "Or, alternately, since deprecation has no functional impact and
> is
> > > >> >> just a message
> > > >> >> to developers, we could just leave it out of the protocol and
> just
> > > have
> > > >> it
> > > >> >> in release notes etc."
> > > >> >>
> > > >> >> I'm in favor of leaving it out of the protocol. I can't really
> see
> > a
> > > >> >> use-case.
> > > >> >>
> > > >> >> Gwen
> > > >> >>
> > > >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh <
> asi...@cloudera.com
> > >
> > > >> wrote:
> > > >> >>
> > > >> >> > I hope it is OK for me to make some progress here. I have made
> > the
> > > >> >> > following changes.
> > > >> >> >
> > > >> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining
> > > separate
> > > >> list
> > > >> >> > of deprecated versions, instead of using a version of -1.
> > > >> >> > 2. Added information on required 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jay Kreps
Hey guys,

I think we want the clients to depend not on the broker version but rather
on the protocol version. If you build against protocol version x it is
valid with any Kafka version that supports x even as Kafka releases
progress.

We discussed originally whether to version the protocol as a whole or the
individual apis. The decision to version the apis individually was based on
the fact that there are multiple categories of apis: internal, producer,
consumer, admin, etc, and these could be exposed in different combinations.
If you version the protocol as a whole it bumps up every time anything
changes. However clients often only use a small fraction of these apis. For
example, practically, the largest number of clients are producers which are
generally just the metadata request and the produce request. A client built
against a particular version of these really doesn't care the version of
any other apis and shouldn't need be tested against anything that didn't
change these two things.

You could argue pros and cons of this, but changing the versioning scheme
is pretty hard nowl.

I'd argue that rather than introduce a new broker versioning scheme let's
just make correct use of what we have. The complaint was that we changed
the meaning of some of these things without changing the version--I'm not
sure adding another version field fixes this in general though you could
definitely invent heuristics that fix the specific cases we had. But this
pushes the complexity onto the client. I think maybe a better solution
would just be not to make those kinds of changes within a single protocol
version.

-Jay

On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson  wrote:

> I think Dana's suggestion to include the Kafka release version makes a lot
> of sense. I'm actually wondering why you would need the individual API
> versions if you have that? It sounds like keeping track of all the api
> version information would add a lot of complexity to clients since they'll
> have to try to handle different version permutations which are not actually
> possible in practice. Wouldn't it be simpler to know that you're talking to
> an 0.9 broker than that you're talking to a broker which supports version 2
> of the group coordinator request, version 1 of fetch request, etc? Also,
> the release version could be included in the broker information in the
> topic metadata request which would save the need for the additional round
> trip on every reconnect.
>
> -Jason
>
> On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh  wrote:
>
> > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira  wrote:
> >
> > > One more thing, the KIP actually had 3 parts:
> > > 1. The version protocol
> > > 2. New response on messages of wrong API key or wrong version
> > > 3. Protocol documentation
> > >
> > There is a WIP patch for adding protocol docs,
> > https://github.com/apache/kafka/pull/970 . By protocol documentation,
> you
> > mean updating this, right?
> >
> > >
> > > I understand that you are offering to only implement part 1?
> > > But the KIP discussion and vote should still cover all three parts,
> > > they will just be implemented in separate JIRA?
> > >
> > The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986,
> covers
> > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include all
> > the three points you mentioned while discussing or voting for KIP-35.
> >
> > >
> > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> > wrote:
> > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> > wrote:
> > > >
> > > >> I don't see a use for the name - clients should be able to translate
> > > >> ApiKey to name for any API they support, and I'm not sure why would
> a
> > > >> client need to log anything about APIs it does not support. Am I
> > > >> missing something?
> > > >>
> > > > Yea, it is a fair assumption that client would know about APIs it
> > > supports.
> > > > It could have been helpful for client users to see new APIs though,
> > > however
> > > > users can always refer to protocol doc of new version to find
> > > corresponding
> > > > names of the new APIs.
> > > >
> > > >>
> > > >> On a related note, Magnus is currently on vacation, but he should be
> > > >> back at the end of next week. I'd like to hold off on the vote until
> > > >> he gets back since his experience in implementing clients  and his
> > > >> opinions will be very valuable for this discussion.
> > > >>
> > > > That is great. It will be valuable to have his feedback. I will hold
> > off
> > > on
> > > > removing "api_name" and "api_deprecated_versions" or adding release
> > > version.
> > > >
> > > >>
> > > >> Gwen
> > > >>
> > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh 
> > > wrote:
> > > >> > Works with me. I will update PR to remove this.
> > > >> >
> > > >> > Also, "api_name" have been pointed out as a concern. However, it
> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-02 Thread Jason Gustafson
I think Dana's suggestion to include the Kafka release version makes a lot
of sense. I'm actually wondering why you would need the individual API
versions if you have that? It sounds like keeping track of all the api
version information would add a lot of complexity to clients since they'll
have to try to handle different version permutations which are not actually
possible in practice. Wouldn't it be simpler to know that you're talking to
an 0.9 broker than that you're talking to a broker which supports version 2
of the group coordinator request, version 1 of fetch request, etc? Also,
the release version could be included in the broker information in the
topic metadata request which would save the need for the additional round
trip on every reconnect.

-Jason

On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh  wrote:

> On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira  wrote:
>
> > One more thing, the KIP actually had 3 parts:
> > 1. The version protocol
> > 2. New response on messages of wrong API key or wrong version
> > 3. Protocol documentation
> >
> There is a WIP patch for adding protocol docs,
> https://github.com/apache/kafka/pull/970 . By protocol documentation, you
> mean updating this, right?
>
> >
> > I understand that you are offering to only implement part 1?
> > But the KIP discussion and vote should still cover all three parts,
> > they will just be implemented in separate JIRA?
> >
> The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986, covers
> 1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include all
> the three points you mentioned while discussing or voting for KIP-35.
>
> >
> > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh 
> wrote:
> > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira 
> wrote:
> > >
> > >> I don't see a use for the name - clients should be able to translate
> > >> ApiKey to name for any API they support, and I'm not sure why would a
> > >> client need to log anything about APIs it does not support. Am I
> > >> missing something?
> > >>
> > > Yea, it is a fair assumption that client would know about APIs it
> > supports.
> > > It could have been helpful for client users to see new APIs though,
> > however
> > > users can always refer to protocol doc of new version to find
> > corresponding
> > > names of the new APIs.
> > >
> > >>
> > >> On a related note, Magnus is currently on vacation, but he should be
> > >> back at the end of next week. I'd like to hold off on the vote until
> > >> he gets back since his experience in implementing clients  and his
> > >> opinions will be very valuable for this discussion.
> > >>
> > > That is great. It will be valuable to have his feedback. I will hold
> off
> > on
> > > removing "api_name" and "api_deprecated_versions" or adding release
> > version.
> > >
> > >>
> > >> Gwen
> > >>
> > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh 
> > wrote:
> > >> > Works with me. I will update PR to remove this.
> > >> >
> > >> > Also, "api_name" have been pointed out as a concern. However, it can
> > be
> > >> > handy for logging and similar purposes. Any take on that?
> > >> >
> > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira 
> > wrote:
> > >> >
> > >> >> Jay also mentioned:
> > >> >> "Or, alternately, since deprecation has no functional impact and is
> > >> >> just a message
> > >> >> to developers, we could just leave it out of the protocol and just
> > have
> > >> it
> > >> >> in release notes etc."
> > >> >>
> > >> >> I'm in favor of leaving it out of the protocol. I can't really see
> a
> > >> >> use-case.
> > >> >>
> > >> >> Gwen
> > >> >>
> > >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  >
> > >> wrote:
> > >> >>
> > >> >> > I hope it is OK for me to make some progress here. I have made
> the
> > >> >> > following changes.
> > >> >> >
> > >> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining
> > separate
> > >> list
> > >> >> > of deprecated versions, instead of using a version of -1.
> > >> >> > 2. Added information on required permissions, Describe action on
> > >> Cluster
> > >> >> > resource, to be able to retrieve protocol versions from a auth
> > enabled
> > >> >> > Kafka cluster.
> > >> >> >
> > >> >> > Created https://issues.apache.org/jira/browse/KAFKA-3304.
> Primary
> > >> patch
> > >> >> is
> > >> >> > available to review, https://github.com/apache/kafka/pull/986
> > >> >> >
> > >> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh <
> asi...@cloudera.com
> > >
> > >> >> wrote:
> > >> >> >
> > >> >> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have
> found
> > it
> > >> >> > really
> > >> >> > > difficult to cope up with Kafka releases as they want to
> support
> > >> users
> > >> >> on
> > >> >> > > different Kafka versions. Capability to retrieve protocol
> version
> > >> will
> > >> >> > go a
> > >> >> > > long way to 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Ashish Singh
On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira  wrote:

> One more thing, the KIP actually had 3 parts:
> 1. The version protocol
> 2. New response on messages of wrong API key or wrong version
> 3. Protocol documentation
>
There is a WIP patch for adding protocol docs,
https://github.com/apache/kafka/pull/970 . By protocol documentation, you
mean updating this, right?

>
> I understand that you are offering to only implement part 1?
> But the KIP discussion and vote should still cover all three parts,
> they will just be implemented in separate JIRA?
>
The patch for KAFKA-3307, https://github.com/apache/kafka/pull/986, covers
1 and 2. KAFKA-3309 tracks documentation part. Yes, we should include all
the three points you mentioned while discussing or voting for KIP-35.

>
> On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh  wrote:
> > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira  wrote:
> >
> >> I don't see a use for the name - clients should be able to translate
> >> ApiKey to name for any API they support, and I'm not sure why would a
> >> client need to log anything about APIs it does not support. Am I
> >> missing something?
> >>
> > Yea, it is a fair assumption that client would know about APIs it
> supports.
> > It could have been helpful for client users to see new APIs though,
> however
> > users can always refer to protocol doc of new version to find
> corresponding
> > names of the new APIs.
> >
> >>
> >> On a related note, Magnus is currently on vacation, but he should be
> >> back at the end of next week. I'd like to hold off on the vote until
> >> he gets back since his experience in implementing clients  and his
> >> opinions will be very valuable for this discussion.
> >>
> > That is great. It will be valuable to have his feedback. I will hold off
> on
> > removing "api_name" and "api_deprecated_versions" or adding release
> version.
> >
> >>
> >> Gwen
> >>
> >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh 
> wrote:
> >> > Works with me. I will update PR to remove this.
> >> >
> >> > Also, "api_name" have been pointed out as a concern. However, it can
> be
> >> > handy for logging and similar purposes. Any take on that?
> >> >
> >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira 
> wrote:
> >> >
> >> >> Jay also mentioned:
> >> >> "Or, alternately, since deprecation has no functional impact and is
> >> >> just a message
> >> >> to developers, we could just leave it out of the protocol and just
> have
> >> it
> >> >> in release notes etc."
> >> >>
> >> >> I'm in favor of leaving it out of the protocol. I can't really see a
> >> >> use-case.
> >> >>
> >> >> Gwen
> >> >>
> >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh 
> >> wrote:
> >> >>
> >> >> > I hope it is OK for me to make some progress here. I have made the
> >> >> > following changes.
> >> >> >
> >> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining
> separate
> >> list
> >> >> > of deprecated versions, instead of using a version of -1.
> >> >> > 2. Added information on required permissions, Describe action on
> >> Cluster
> >> >> > resource, to be able to retrieve protocol versions from a auth
> enabled
> >> >> > Kafka cluster.
> >> >> >
> >> >> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary
> >> patch
> >> >> is
> >> >> > available to review, https://github.com/apache/kafka/pull/986
> >> >> >
> >> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh  >
> >> >> wrote:
> >> >> >
> >> >> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found
> it
> >> >> > really
> >> >> > > difficult to cope up with Kafka releases as they want to support
> >> users
> >> >> on
> >> >> > > different Kafka versions. Capability to retrieve protocol version
> >> will
> >> >> > go a
> >> >> > > long way to ease out those pain points. I will be happy to help
> out
> >> >> with
> >> >> > > the work on this KIP. @Magnus, thanks for driving this, is it OK
> if
> >> I
> >> >> > carry
> >> >> > > forward the work from here. It will be ideal to have this in
> >> 0.10.0.0.
> >> >> > >
> >> >> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps 
> >> wrote:
> >> >> > >
> >> >> > >> I wonder if we need to solve the error problem? I think this KIP
> >> >> gives a
> >> >> > >> descent work around.
> >> >> > >>
> >> >> > >> Probably we should have included an error in the response
> header,
> >> but
> >> >> we
> >> >> > >> debated it at the time decided not to and now it is pretty hard
> to
> >> add
> >> >> > >> because the headers aren't versioned (d'oh).
> >> >> > >>
> >> >> > >> It seems like any other solution is going to be kind of a hack,
> >> right?
> >> >> > >> Sending malformed responses back seems like not a clean
> solution...
> >> >> > >>
> >> >> > >> (Not sure if I was pro- having a top-level error or not, but in
> any
> >> >> case
> >> >> > >> the rationale for 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Gwen Shapira
One more thing, the KIP actually had 3 parts:
1. The version protocol
2. New response on messages of wrong API key or wrong version
3. Protocol documentation

I understand that you are offering to only implement part 1?
But the KIP discussion and vote should still cover all three parts,
they will just be implemented in separate JIRA?

On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh  wrote:
> On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira  wrote:
>
>> I don't see a use for the name - clients should be able to translate
>> ApiKey to name for any API they support, and I'm not sure why would a
>> client need to log anything about APIs it does not support. Am I
>> missing something?
>>
> Yea, it is a fair assumption that client would know about APIs it supports.
> It could have been helpful for client users to see new APIs though, however
> users can always refer to protocol doc of new version to find corresponding
> names of the new APIs.
>
>>
>> On a related note, Magnus is currently on vacation, but he should be
>> back at the end of next week. I'd like to hold off on the vote until
>> he gets back since his experience in implementing clients  and his
>> opinions will be very valuable for this discussion.
>>
> That is great. It will be valuable to have his feedback. I will hold off on
> removing "api_name" and "api_deprecated_versions" or adding release version.
>
>>
>> Gwen
>>
>> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh  wrote:
>> > Works with me. I will update PR to remove this.
>> >
>> > Also, "api_name" have been pointed out as a concern. However, it can be
>> > handy for logging and similar purposes. Any take on that?
>> >
>> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  wrote:
>> >
>> >> Jay also mentioned:
>> >> "Or, alternately, since deprecation has no functional impact and is
>> >> just a message
>> >> to developers, we could just leave it out of the protocol and just have
>> it
>> >> in release notes etc."
>> >>
>> >> I'm in favor of leaving it out of the protocol. I can't really see a
>> >> use-case.
>> >>
>> >> Gwen
>> >>
>> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh 
>> wrote:
>> >>
>> >> > I hope it is OK for me to make some progress here. I have made the
>> >> > following changes.
>> >> >
>> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate
>> list
>> >> > of deprecated versions, instead of using a version of -1.
>> >> > 2. Added information on required permissions, Describe action on
>> Cluster
>> >> > resource, to be able to retrieve protocol versions from a auth enabled
>> >> > Kafka cluster.
>> >> >
>> >> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary
>> patch
>> >> is
>> >> > available to review, https://github.com/apache/kafka/pull/986
>> >> >
>> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh 
>> >> wrote:
>> >> >
>> >> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
>> >> > really
>> >> > > difficult to cope up with Kafka releases as they want to support
>> users
>> >> on
>> >> > > different Kafka versions. Capability to retrieve protocol version
>> will
>> >> > go a
>> >> > > long way to ease out those pain points. I will be happy to help out
>> >> with
>> >> > > the work on this KIP. @Magnus, thanks for driving this, is it OK if
>> I
>> >> > carry
>> >> > > forward the work from here. It will be ideal to have this in
>> 0.10.0.0.
>> >> > >
>> >> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps 
>> wrote:
>> >> > >
>> >> > >> I wonder if we need to solve the error problem? I think this KIP
>> >> gives a
>> >> > >> descent work around.
>> >> > >>
>> >> > >> Probably we should have included an error in the response header,
>> but
>> >> we
>> >> > >> debated it at the time decided not to and now it is pretty hard to
>> add
>> >> > >> because the headers aren't versioned (d'oh).
>> >> > >>
>> >> > >> It seems like any other solution is going to be kind of a hack,
>> right?
>> >> > >> Sending malformed responses back seems like not a clean solution...
>> >> > >>
>> >> > >> (Not sure if I was pro- having a top-level error or not, but in any
>> >> case
>> >> > >> the rationale for the decision was that so many of the requests
>> were
>> >> > >> per-partition or per-topic or whatever and hence fail or succeed at
>> >> that
>> >> > >> level and this makes it hard to know what the right top-level error
>> >> code
>> >> > >> is
>> >> > >> and hard for the client to figure out what to do with the top level
>> >> > error
>> >> > >> if some of the partitions succeed but there is a top-level error).
>> >> > >>
>> >> > >> I think actually this new API actually gives a way to handle this
>> >> > >> gracefully on the client side by just having clients that want to
>> be
>> >> > >> graceful check for support for their version. Clients that do that
>> >> will
>> >> > >> have a graceful 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Ashish Singh
On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira  wrote:

> I don't see a use for the name - clients should be able to translate
> ApiKey to name for any API they support, and I'm not sure why would a
> client need to log anything about APIs it does not support. Am I
> missing something?
>
Yea, it is a fair assumption that client would know about APIs it supports.
It could have been helpful for client users to see new APIs though, however
users can always refer to protocol doc of new version to find corresponding
names of the new APIs.

>
> On a related note, Magnus is currently on vacation, but he should be
> back at the end of next week. I'd like to hold off on the vote until
> he gets back since his experience in implementing clients  and his
> opinions will be very valuable for this discussion.
>
That is great. It will be valuable to have his feedback. I will hold off on
removing "api_name" and "api_deprecated_versions" or adding release version.

>
> Gwen
>
> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh  wrote:
> > Works with me. I will update PR to remove this.
> >
> > Also, "api_name" have been pointed out as a concern. However, it can be
> > handy for logging and similar purposes. Any take on that?
> >
> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  wrote:
> >
> >> Jay also mentioned:
> >> "Or, alternately, since deprecation has no functional impact and is
> >> just a message
> >> to developers, we could just leave it out of the protocol and just have
> it
> >> in release notes etc."
> >>
> >> I'm in favor of leaving it out of the protocol. I can't really see a
> >> use-case.
> >>
> >> Gwen
> >>
> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh 
> wrote:
> >>
> >> > I hope it is OK for me to make some progress here. I have made the
> >> > following changes.
> >> >
> >> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate
> list
> >> > of deprecated versions, instead of using a version of -1.
> >> > 2. Added information on required permissions, Describe action on
> Cluster
> >> > resource, to be able to retrieve protocol versions from a auth enabled
> >> > Kafka cluster.
> >> >
> >> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary
> patch
> >> is
> >> > available to review, https://github.com/apache/kafka/pull/986
> >> >
> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh 
> >> wrote:
> >> >
> >> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
> >> > really
> >> > > difficult to cope up with Kafka releases as they want to support
> users
> >> on
> >> > > different Kafka versions. Capability to retrieve protocol version
> will
> >> > go a
> >> > > long way to ease out those pain points. I will be happy to help out
> >> with
> >> > > the work on this KIP. @Magnus, thanks for driving this, is it OK if
> I
> >> > carry
> >> > > forward the work from here. It will be ideal to have this in
> 0.10.0.0.
> >> > >
> >> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps 
> wrote:
> >> > >
> >> > >> I wonder if we need to solve the error problem? I think this KIP
> >> gives a
> >> > >> descent work around.
> >> > >>
> >> > >> Probably we should have included an error in the response header,
> but
> >> we
> >> > >> debated it at the time decided not to and now it is pretty hard to
> add
> >> > >> because the headers aren't versioned (d'oh).
> >> > >>
> >> > >> It seems like any other solution is going to be kind of a hack,
> right?
> >> > >> Sending malformed responses back seems like not a clean solution...
> >> > >>
> >> > >> (Not sure if I was pro- having a top-level error or not, but in any
> >> case
> >> > >> the rationale for the decision was that so many of the requests
> were
> >> > >> per-partition or per-topic or whatever and hence fail or succeed at
> >> that
> >> > >> level and this makes it hard to know what the right top-level error
> >> code
> >> > >> is
> >> > >> and hard for the client to figure out what to do with the top level
> >> > error
> >> > >> if some of the partitions succeed but there is a top-level error).
> >> > >>
> >> > >> I think actually this new API actually gives a way to handle this
> >> > >> gracefully on the client side by just having clients that want to
> be
> >> > >> graceful check for support for their version. Clients that do that
> >> will
> >> > >> have a graceful message.
> >> > >>
> >> > >> At some point if we're ever reworking the headers we should really
> >> > >> consider
> >> > >> (a) versioning them and (b) adding a top-level error code in the
> >> > response.
> >> > >> But given this would be a big breaking change and this is really
> just
> >> to
> >> > >> give a nicer error message seems like it probably isn't worth it to
> >> try
> >> > to
> >> > >> do something now.
> >> > >>
> >> > >> -Jay
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin
> >> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Gwen Shapira
I don't see a use for the name - clients should be able to translate
ApiKey to name for any API they support, and I'm not sure why would a
client need to log anything about APIs it does not support. Am I
missing something?

On a related note, Magnus is currently on vacation, but he should be
back at the end of next week. I'd like to hold off on the vote until
he gets back since his experience in implementing clients  and his
opinions will be very valuable for this discussion.

Gwen

On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh  wrote:
> Works with me. I will update PR to remove this.
>
> Also, "api_name" have been pointed out as a concern. However, it can be
> handy for logging and similar purposes. Any take on that?
>
> On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  wrote:
>
>> Jay also mentioned:
>> "Or, alternately, since deprecation has no functional impact and is
>> just a message
>> to developers, we could just leave it out of the protocol and just have it
>> in release notes etc."
>>
>> I'm in favor of leaving it out of the protocol. I can't really see a
>> use-case.
>>
>> Gwen
>>
>> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  wrote:
>>
>> > I hope it is OK for me to make some progress here. I have made the
>> > following changes.
>> >
>> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
>> > of deprecated versions, instead of using a version of -1.
>> > 2. Added information on required permissions, Describe action on Cluster
>> > resource, to be able to retrieve protocol versions from a auth enabled
>> > Kafka cluster.
>> >
>> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch
>> is
>> > available to review, https://github.com/apache/kafka/pull/986
>> >
>> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh 
>> wrote:
>> >
>> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
>> > really
>> > > difficult to cope up with Kafka releases as they want to support users
>> on
>> > > different Kafka versions. Capability to retrieve protocol version will
>> > go a
>> > > long way to ease out those pain points. I will be happy to help out
>> with
>> > > the work on this KIP. @Magnus, thanks for driving this, is it OK if I
>> > carry
>> > > forward the work from here. It will be ideal to have this in 0.10.0.0.
>> > >
>> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
>> > >
>> > >> I wonder if we need to solve the error problem? I think this KIP
>> gives a
>> > >> descent work around.
>> > >>
>> > >> Probably we should have included an error in the response header, but
>> we
>> > >> debated it at the time decided not to and now it is pretty hard to add
>> > >> because the headers aren't versioned (d'oh).
>> > >>
>> > >> It seems like any other solution is going to be kind of a hack, right?
>> > >> Sending malformed responses back seems like not a clean solution...
>> > >>
>> > >> (Not sure if I was pro- having a top-level error or not, but in any
>> case
>> > >> the rationale for the decision was that so many of the requests were
>> > >> per-partition or per-topic or whatever and hence fail or succeed at
>> that
>> > >> level and this makes it hard to know what the right top-level error
>> code
>> > >> is
>> > >> and hard for the client to figure out what to do with the top level
>> > error
>> > >> if some of the partitions succeed but there is a top-level error).
>> > >>
>> > >> I think actually this new API actually gives a way to handle this
>> > >> gracefully on the client side by just having clients that want to be
>> > >> graceful check for support for their version. Clients that do that
>> will
>> > >> have a graceful message.
>> > >>
>> > >> At some point if we're ever reworking the headers we should really
>> > >> consider
>> > >> (a) versioning them and (b) adding a top-level error code in the
>> > response.
>> > >> But given this would be a big breaking change and this is really just
>> to
>> > >> give a nicer error message seems like it probably isn't worth it to
>> try
>> > to
>> > >> do something now.
>> > >>
>> > >> -Jay
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin
>> > > >
>> > >> wrote:
>> > >>
>> > >> > I am thinking instead of returning an empty response, it would be
>> > >> better to
>> > >> > return an explicit UnsupportedVersionException code.
>> > >> >
>> > >> > Today KafkaApis handles the error in the following way:
>> > >> > 1. For requests/responses using old Scala classes, KafkaApis uses
>> > >> > RequestOrResponse.handleError() to return an error response.
>> > >> > 2. For requests/response using Java classes (only JoinGroupRequest
>> and
>> > >> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse()
>> to
>> > >> > return an error response.
>> > >> >
>> > >> > In KAFKA-2512, I am returning an UnsupportedVersionException for
>> case

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Ashish Singh
Works with me. I will update PR to remove this.

Also, "api_name" have been pointed out as a concern. However, it can be
handy for logging and similar purposes. Any take on that?

On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira  wrote:

> Jay also mentioned:
> "Or, alternately, since deprecation has no functional impact and is
> just a message
> to developers, we could just leave it out of the protocol and just have it
> in release notes etc."
>
> I'm in favor of leaving it out of the protocol. I can't really see a
> use-case.
>
> Gwen
>
> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  wrote:
>
> > I hope it is OK for me to make some progress here. I have made the
> > following changes.
> >
> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
> > of deprecated versions, instead of using a version of -1.
> > 2. Added information on required permissions, Describe action on Cluster
> > resource, to be able to retrieve protocol versions from a auth enabled
> > Kafka cluster.
> >
> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch
> is
> > available to review, https://github.com/apache/kafka/pull/986
> >
> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh 
> wrote:
> >
> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
> > really
> > > difficult to cope up with Kafka releases as they want to support users
> on
> > > different Kafka versions. Capability to retrieve protocol version will
> > go a
> > > long way to ease out those pain points. I will be happy to help out
> with
> > > the work on this KIP. @Magnus, thanks for driving this, is it OK if I
> > carry
> > > forward the work from here. It will be ideal to have this in 0.10.0.0.
> > >
> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
> > >
> > >> I wonder if we need to solve the error problem? I think this KIP
> gives a
> > >> descent work around.
> > >>
> > >> Probably we should have included an error in the response header, but
> we
> > >> debated it at the time decided not to and now it is pretty hard to add
> > >> because the headers aren't versioned (d'oh).
> > >>
> > >> It seems like any other solution is going to be kind of a hack, right?
> > >> Sending malformed responses back seems like not a clean solution...
> > >>
> > >> (Not sure if I was pro- having a top-level error or not, but in any
> case
> > >> the rationale for the decision was that so many of the requests were
> > >> per-partition or per-topic or whatever and hence fail or succeed at
> that
> > >> level and this makes it hard to know what the right top-level error
> code
> > >> is
> > >> and hard for the client to figure out what to do with the top level
> > error
> > >> if some of the partitions succeed but there is a top-level error).
> > >>
> > >> I think actually this new API actually gives a way to handle this
> > >> gracefully on the client side by just having clients that want to be
> > >> graceful check for support for their version. Clients that do that
> will
> > >> have a graceful message.
> > >>
> > >> At some point if we're ever reworking the headers we should really
> > >> consider
> > >> (a) versioning them and (b) adding a top-level error code in the
> > response.
> > >> But given this would be a big breaking change and this is really just
> to
> > >> give a nicer error message seems like it probably isn't worth it to
> try
> > to
> > >> do something now.
> > >>
> > >> -Jay
> > >>
> > >>
> > >>
> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin
>  > >
> > >> wrote:
> > >>
> > >> > I am thinking instead of returning an empty response, it would be
> > >> better to
> > >> > return an explicit UnsupportedVersionException code.
> > >> >
> > >> > Today KafkaApis handles the error in the following way:
> > >> > 1. For requests/responses using old Scala classes, KafkaApis uses
> > >> > RequestOrResponse.handleError() to return an error response.
> > >> > 2. For requests/response using Java classes (only JoinGroupRequest
> and
> > >> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse()
> to
> > >> > return an error response.
> > >> >
> > >> > In KAFKA-2512, I am returning an UnsupportedVersionException for
> case
> > >> [1]
> > >> > when see an unsupported version. This will put the error code per
> > topic
> > >> or
> > >> > partition for most of the requests, but might not work all the time.
> > >> e.g.
> > >> > TopicMetadataRequest with an empty topic set.
> > >> >
> > >> > Case [2] does not quite work for unsupported version, because we
> will
> > >> > thrown an uncaught exception when version is not recognized (BTW
> this
> > >> is a
> > >> > bug). Part of the reason is that for some response types, error code
> > is
> > >> not
> > >> > part of the response level field.
> > >> >
> > >> > Maybe it worth checking how each response is dealing with error code
> > >> today.
> > >> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Gwen Shapira
Jay also mentioned:
"Or, alternately, since deprecation has no functional impact and is
just a message
to developers, we could just leave it out of the protocol and just have it
in release notes etc."

I'm in favor of leaving it out of the protocol. I can't really see a
use-case.

Gwen

On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  wrote:

> I hope it is OK for me to make some progress here. I have made the
> following changes.
>
> 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
> of deprecated versions, instead of using a version of -1.
> 2. Added information on required permissions, Describe action on Cluster
> resource, to be able to retrieve protocol versions from a auth enabled
> Kafka cluster.
>
> Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch is
> available to review, https://github.com/apache/kafka/pull/986
>
> On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh  wrote:
>
> > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
> really
> > difficult to cope up with Kafka releases as they want to support users on
> > different Kafka versions. Capability to retrieve protocol version will
> go a
> > long way to ease out those pain points. I will be happy to help out with
> > the work on this KIP. @Magnus, thanks for driving this, is it OK if I
> carry
> > forward the work from here. It will be ideal to have this in 0.10.0.0.
> >
> > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
> >
> >> I wonder if we need to solve the error problem? I think this KIP gives a
> >> descent work around.
> >>
> >> Probably we should have included an error in the response header, but we
> >> debated it at the time decided not to and now it is pretty hard to add
> >> because the headers aren't versioned (d'oh).
> >>
> >> It seems like any other solution is going to be kind of a hack, right?
> >> Sending malformed responses back seems like not a clean solution...
> >>
> >> (Not sure if I was pro- having a top-level error or not, but in any case
> >> the rationale for the decision was that so many of the requests were
> >> per-partition or per-topic or whatever and hence fail or succeed at that
> >> level and this makes it hard to know what the right top-level error code
> >> is
> >> and hard for the client to figure out what to do with the top level
> error
> >> if some of the partitions succeed but there is a top-level error).
> >>
> >> I think actually this new API actually gives a way to handle this
> >> gracefully on the client side by just having clients that want to be
> >> graceful check for support for their version. Clients that do that will
> >> have a graceful message.
> >>
> >> At some point if we're ever reworking the headers we should really
> >> consider
> >> (a) versioning them and (b) adding a top-level error code in the
> response.
> >> But given this would be a big breaking change and this is really just to
> >> give a nicer error message seems like it probably isn't worth it to try
> to
> >> do something now.
> >>
> >> -Jay
> >>
> >>
> >>
> >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin  >
> >> wrote:
> >>
> >> > I am thinking instead of returning an empty response, it would be
> >> better to
> >> > return an explicit UnsupportedVersionException code.
> >> >
> >> > Today KafkaApis handles the error in the following way:
> >> > 1. For requests/responses using old Scala classes, KafkaApis uses
> >> > RequestOrResponse.handleError() to return an error response.
> >> > 2. For requests/response using Java classes (only JoinGroupRequest and
> >> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
> >> > return an error response.
> >> >
> >> > In KAFKA-2512, I am returning an UnsupportedVersionException for case
> >> [1]
> >> > when see an unsupported version. This will put the error code per
> topic
> >> or
> >> > partition for most of the requests, but might not work all the time.
> >> e.g.
> >> > TopicMetadataRequest with an empty topic set.
> >> >
> >> > Case [2] does not quite work for unsupported version, because we will
> >> > thrown an uncaught exception when version is not recognized (BTW this
> >> is a
> >> > bug). Part of the reason is that for some response types, error code
> is
> >> not
> >> > part of the response level field.
> >> >
> >> > Maybe it worth checking how each response is dealing with error code
> >> today.
> >> > A scan of the response formats gives the following result:
> >> > 1. TopicMetadataResponse - per topic error code, does not work when
> the
> >> > topic set is empty in the request.
> >> > 2. ProduceResonse - per partition error code.
> >> > 3. OffsetCommitResponse - per partition.
> >> > 4. OffsetFetchResponse - per partition.
> >> > 5. OffsetResponse - per partition.
> >> > 6. FetchResponse - per partition
> >> > 7. ConsumerMetadataResponse - response level
> >> > 8. ControlledShutdownResponse - 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-03-01 Thread Ashish Singh
Hello Dana,

On Mon, Feb 29, 2016 at 4:11 PM, Dana Powers  wrote:

> This is a fantastic and much-needed KIP. All third-party clients have had
> to deal with this issue. In my experience most clients are either declaring
> they only support version broker version X, or are spending a lot of time
> hacking around the issue. I think the community of non-java drivers would
> see significant benefit from this proposal.
>
> My specific thought is that for kafka-python it has been easier to manage
> compatibility using broker release version to gate various features by
> api-protocol version. For example, only enable group coordination apis if
> >= (0, 9), kafka-backed offsets >= (0, 8, 2), etc. As an example, here are
> some backwards compatibility issues that I think are difficult to capture
> w/ just the protocol versions:
>
> - LZ4 compression only supported in brokers >= 0.8.2, but no protocol
> change.
> - kafka-backed offset storage, in additional to requiring new offset
> commit/fetch protocol versions, also requires adding support for tracking
> the group coordinator.
> - 0.8.2-beta OffsetCommit api [different than 0.8.X release]
>
>
> Could release version be added to the api response in this proposal?
> Perhaps include a KafkaRelease string in the Response before the array of
> api versions?
>
I think that will be useful, however we should discuss this in next KIP
meeting to check if others see similar value. There are a few questions
that are raised on PR, we can briefly touch upon them as well.

>
> Thanks for the great KIP, Magnus. And thanks for restarting the discussion,
> Ashish. I also would like to see this addressed in 0.10
>
> -Dana
>
>
> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  wrote:
>
> > I hope it is OK for me to make some progress here. I have made the
> > following changes.
> >
> > 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
> > of deprecated versions, instead of using a version of -1.
> > 2. Added information on required permissions, Describe action on Cluster
> > resource, to be able to retrieve protocol versions from a auth enabled
> > Kafka cluster.
> >
> > Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch
> is
> > available to review, https://github.com/apache/kafka/pull/986
> >
> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh 
> wrote:
> >
> > > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
> > really
> > > difficult to cope up with Kafka releases as they want to support users
> on
> > > different Kafka versions. Capability to retrieve protocol version will
> > go a
> > > long way to ease out those pain points. I will be happy to help out
> with
> > > the work on this KIP. @Magnus, thanks for driving this, is it OK if I
> > carry
> > > forward the work from here. It will be ideal to have this in 0.10.0.0.
> > >
> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
> > >
> > >> I wonder if we need to solve the error problem? I think this KIP
> gives a
> > >> descent work around.
> > >>
> > >> Probably we should have included an error in the response header, but
> we
> > >> debated it at the time decided not to and now it is pretty hard to add
> > >> because the headers aren't versioned (d'oh).
> > >>
> > >> It seems like any other solution is going to be kind of a hack, right?
> > >> Sending malformed responses back seems like not a clean solution...
> > >>
> > >> (Not sure if I was pro- having a top-level error or not, but in any
> case
> > >> the rationale for the decision was that so many of the requests were
> > >> per-partition or per-topic or whatever and hence fail or succeed at
> that
> > >> level and this makes it hard to know what the right top-level error
> code
> > >> is
> > >> and hard for the client to figure out what to do with the top level
> > error
> > >> if some of the partitions succeed but there is a top-level error).
> > >>
> > >> I think actually this new API actually gives a way to handle this
> > >> gracefully on the client side by just having clients that want to be
> > >> graceful check for support for their version. Clients that do that
> will
> > >> have a graceful message.
> > >>
> > >> At some point if we're ever reworking the headers we should really
> > >> consider
> > >> (a) versioning them and (b) adding a top-level error code in the
> > response.
> > >> But given this would be a big breaking change and this is really just
> to
> > >> give a nicer error message seems like it probably isn't worth it to
> try
> > to
> > >> do something now.
> > >>
> > >> -Jay
> > >>
> > >>
> > >>
> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin
>  > >
> > >> wrote:
> > >>
> > >> > I am thinking instead of returning an empty response, it would be
> > >> better to
> > >> > return an explicit UnsupportedVersionException code.
> > >> >
> > >> > Today KafkaApis handles the 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-02-29 Thread Dana Powers
This is a fantastic and much-needed KIP. All third-party clients have had
to deal with this issue. In my experience most clients are either declaring
they only support version broker version X, or are spending a lot of time
hacking around the issue. I think the community of non-java drivers would
see significant benefit from this proposal.

My specific thought is that for kafka-python it has been easier to manage
compatibility using broker release version to gate various features by
api-protocol version. For example, only enable group coordination apis if
>= (0, 9), kafka-backed offsets >= (0, 8, 2), etc. As an example, here are
some backwards compatibility issues that I think are difficult to capture
w/ just the protocol versions:

- LZ4 compression only supported in brokers >= 0.8.2, but no protocol
change.
- kafka-backed offset storage, in additional to requiring new offset
commit/fetch protocol versions, also requires adding support for tracking
the group coordinator.
- 0.8.2-beta OffsetCommit api [different than 0.8.X release]


Could release version be added to the api response in this proposal?
Perhaps include a KafkaRelease string in the Response before the array of
api versions?

Thanks for the great KIP, Magnus. And thanks for restarting the discussion,
Ashish. I also would like to see this addressed in 0.10

-Dana


On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh  wrote:

> I hope it is OK for me to make some progress here. I have made the
> following changes.
>
> 1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
> of deprecated versions, instead of using a version of -1.
> 2. Added information on required permissions, Describe action on Cluster
> resource, to be able to retrieve protocol versions from a auth enabled
> Kafka cluster.
>
> Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch is
> available to review, https://github.com/apache/kafka/pull/986
>
> On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh  wrote:
>
> > Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it
> really
> > difficult to cope up with Kafka releases as they want to support users on
> > different Kafka versions. Capability to retrieve protocol version will
> go a
> > long way to ease out those pain points. I will be happy to help out with
> > the work on this KIP. @Magnus, thanks for driving this, is it OK if I
> carry
> > forward the work from here. It will be ideal to have this in 0.10.0.0.
> >
> > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
> >
> >> I wonder if we need to solve the error problem? I think this KIP gives a
> >> descent work around.
> >>
> >> Probably we should have included an error in the response header, but we
> >> debated it at the time decided not to and now it is pretty hard to add
> >> because the headers aren't versioned (d'oh).
> >>
> >> It seems like any other solution is going to be kind of a hack, right?
> >> Sending malformed responses back seems like not a clean solution...
> >>
> >> (Not sure if I was pro- having a top-level error or not, but in any case
> >> the rationale for the decision was that so many of the requests were
> >> per-partition or per-topic or whatever and hence fail or succeed at that
> >> level and this makes it hard to know what the right top-level error code
> >> is
> >> and hard for the client to figure out what to do with the top level
> error
> >> if some of the partitions succeed but there is a top-level error).
> >>
> >> I think actually this new API actually gives a way to handle this
> >> gracefully on the client side by just having clients that want to be
> >> graceful check for support for their version. Clients that do that will
> >> have a graceful message.
> >>
> >> At some point if we're ever reworking the headers we should really
> >> consider
> >> (a) versioning them and (b) adding a top-level error code in the
> response.
> >> But given this would be a big breaking change and this is really just to
> >> give a nicer error message seems like it probably isn't worth it to try
> to
> >> do something now.
> >>
> >> -Jay
> >>
> >>
> >>
> >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin  >
> >> wrote:
> >>
> >> > I am thinking instead of returning an empty response, it would be
> >> better to
> >> > return an explicit UnsupportedVersionException code.
> >> >
> >> > Today KafkaApis handles the error in the following way:
> >> > 1. For requests/responses using old Scala classes, KafkaApis uses
> >> > RequestOrResponse.handleError() to return an error response.
> >> > 2. For requests/response using Java classes (only JoinGroupRequest and
> >> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
> >> > return an error response.
> >> >
> >> > In KAFKA-2512, I am returning an UnsupportedVersionException for case
> >> [1]
> >> > when see an unsupported version. This will put the error code per
> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-02-29 Thread Ashish Singh
I hope it is OK for me to make some progress here. I have made the
following changes.

1. Updated KIP-35, to adopt Jay's suggestion on maintaining separate list
of deprecated versions, instead of using a version of -1.
2. Added information on required permissions, Describe action on Cluster
resource, to be able to retrieve protocol versions from a auth enabled
Kafka cluster.

Created https://issues.apache.org/jira/browse/KAFKA-3304. Primary patch is
available to review, https://github.com/apache/kafka/pull/986

On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh  wrote:

> Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it really
> difficult to cope up with Kafka releases as they want to support users on
> different Kafka versions. Capability to retrieve protocol version will go a
> long way to ease out those pain points. I will be happy to help out with
> the work on this KIP. @Magnus, thanks for driving this, is it OK if I carry
> forward the work from here. It will be ideal to have this in 0.10.0.0.
>
> On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:
>
>> I wonder if we need to solve the error problem? I think this KIP gives a
>> descent work around.
>>
>> Probably we should have included an error in the response header, but we
>> debated it at the time decided not to and now it is pretty hard to add
>> because the headers aren't versioned (d'oh).
>>
>> It seems like any other solution is going to be kind of a hack, right?
>> Sending malformed responses back seems like not a clean solution...
>>
>> (Not sure if I was pro- having a top-level error or not, but in any case
>> the rationale for the decision was that so many of the requests were
>> per-partition or per-topic or whatever and hence fail or succeed at that
>> level and this makes it hard to know what the right top-level error code
>> is
>> and hard for the client to figure out what to do with the top level error
>> if some of the partitions succeed but there is a top-level error).
>>
>> I think actually this new API actually gives a way to handle this
>> gracefully on the client side by just having clients that want to be
>> graceful check for support for their version. Clients that do that will
>> have a graceful message.
>>
>> At some point if we're ever reworking the headers we should really
>> consider
>> (a) versioning them and (b) adding a top-level error code in the response.
>> But given this would be a big breaking change and this is really just to
>> give a nicer error message seems like it probably isn't worth it to try to
>> do something now.
>>
>> -Jay
>>
>>
>>
>> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin 
>> wrote:
>>
>> > I am thinking instead of returning an empty response, it would be
>> better to
>> > return an explicit UnsupportedVersionException code.
>> >
>> > Today KafkaApis handles the error in the following way:
>> > 1. For requests/responses using old Scala classes, KafkaApis uses
>> > RequestOrResponse.handleError() to return an error response.
>> > 2. For requests/response using Java classes (only JoinGroupRequest and
>> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
>> > return an error response.
>> >
>> > In KAFKA-2512, I am returning an UnsupportedVersionException for case
>> [1]
>> > when see an unsupported version. This will put the error code per topic
>> or
>> > partition for most of the requests, but might not work all the time.
>> e.g.
>> > TopicMetadataRequest with an empty topic set.
>> >
>> > Case [2] does not quite work for unsupported version, because we will
>> > thrown an uncaught exception when version is not recognized (BTW this
>> is a
>> > bug). Part of the reason is that for some response types, error code is
>> not
>> > part of the response level field.
>> >
>> > Maybe it worth checking how each response is dealing with error code
>> today.
>> > A scan of the response formats gives the following result:
>> > 1. TopicMetadataResponse - per topic error code, does not work when the
>> > topic set is empty in the request.
>> > 2. ProduceResonse - per partition error code.
>> > 3. OffsetCommitResponse - per partition.
>> > 4. OffsetFetchResponse - per partition.
>> > 5. OffsetResponse - per partition.
>> > 6. FetchResponse - per partition
>> > 7. ConsumerMetadataResponse - response level
>> > 8. ControlledShutdownResponse - response level
>> > 9. JoinGroupResponse - response level
>> > 10. HearbeatResponse - response level
>> > 11. LeaderAndIsrResponse - response level
>> > 12. StopReplicaResponse - response level
>> > 13. UpdateMetadataResponse - response level
>> >
>> > So from the list above it looks for each response we are actually able
>> to
>> > return an error code, as long as we make sure the topic or partition
>> won't
>> > be empty when the error code is at topic or partition level. Luckily in
>> the
>> > above list we only need to worry about TopicMetadataResponse.
>> >
>> > Maybe 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2016-02-25 Thread Ashish Singh
Kafka clients in Hadoop ecosystem, Flume, Spark, etc, have found it really
difficult to cope up with Kafka releases as they want to support users on
different Kafka versions. Capability to retrieve protocol version will go a
long way to ease out those pain points. I will be happy to help out with
the work on this KIP. @Magnus, thanks for driving this, is it OK if I carry
forward the work from here. It will be ideal to have this in 0.10.0.0.

On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps  wrote:

> I wonder if we need to solve the error problem? I think this KIP gives a
> descent work around.
>
> Probably we should have included an error in the response header, but we
> debated it at the time decided not to and now it is pretty hard to add
> because the headers aren't versioned (d'oh).
>
> It seems like any other solution is going to be kind of a hack, right?
> Sending malformed responses back seems like not a clean solution...
>
> (Not sure if I was pro- having a top-level error or not, but in any case
> the rationale for the decision was that so many of the requests were
> per-partition or per-topic or whatever and hence fail or succeed at that
> level and this makes it hard to know what the right top-level error code is
> and hard for the client to figure out what to do with the top level error
> if some of the partitions succeed but there is a top-level error).
>
> I think actually this new API actually gives a way to handle this
> gracefully on the client side by just having clients that want to be
> graceful check for support for their version. Clients that do that will
> have a graceful message.
>
> At some point if we're ever reworking the headers we should really consider
> (a) versioning them and (b) adding a top-level error code in the response.
> But given this would be a big breaking change and this is really just to
> give a nicer error message seems like it probably isn't worth it to try to
> do something now.
>
> -Jay
>
>
>
> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin 
> wrote:
>
> > I am thinking instead of returning an empty response, it would be better
> to
> > return an explicit UnsupportedVersionException code.
> >
> > Today KafkaApis handles the error in the following way:
> > 1. For requests/responses using old Scala classes, KafkaApis uses
> > RequestOrResponse.handleError() to return an error response.
> > 2. For requests/response using Java classes (only JoinGroupRequest and
> > Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
> > return an error response.
> >
> > In KAFKA-2512, I am returning an UnsupportedVersionException for case [1]
> > when see an unsupported version. This will put the error code per topic
> or
> > partition for most of the requests, but might not work all the time. e.g.
> > TopicMetadataRequest with an empty topic set.
> >
> > Case [2] does not quite work for unsupported version, because we will
> > thrown an uncaught exception when version is not recognized (BTW this is
> a
> > bug). Part of the reason is that for some response types, error code is
> not
> > part of the response level field.
> >
> > Maybe it worth checking how each response is dealing with error code
> today.
> > A scan of the response formats gives the following result:
> > 1. TopicMetadataResponse - per topic error code, does not work when the
> > topic set is empty in the request.
> > 2. ProduceResonse - per partition error code.
> > 3. OffsetCommitResponse - per partition.
> > 4. OffsetFetchResponse - per partition.
> > 5. OffsetResponse - per partition.
> > 6. FetchResponse - per partition
> > 7. ConsumerMetadataResponse - response level
> > 8. ControlledShutdownResponse - response level
> > 9. JoinGroupResponse - response level
> > 10. HearbeatResponse - response level
> > 11. LeaderAndIsrResponse - response level
> > 12. StopReplicaResponse - response level
> > 13. UpdateMetadataResponse - response level
> >
> > So from the list above it looks for each response we are actually able to
> > return an error code, as long as we make sure the topic or partition
> won't
> > be empty when the error code is at topic or partition level. Luckily in
> the
> > above list we only need to worry about TopicMetadataResponse.
> >
> > Maybe error handling is out of the scope of this KIP, but I prefer we
> think
> > through how to deal with error code for the requests, because there are
> > more request types to be added in KAFKA-2464 and future patches.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Oct 12, 2015 at 6:04 PM, Jay Kreps  wrote:
> >
> > > Two quick pieces of feedback:
> > > 1. The use of a version of -1 as magical entry dividing between
> > deprecated
> > > versions is a bit hacky. What about instead having an array of
> supported
> > > versions and a separate array of deprecated versions. The deprecated
> > > versions would always be a subset of the supported versions. Or,
> > > 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-12 Thread Jay Kreps
Two quick pieces of feedback:
1. The use of a version of -1 as magical entry dividing between deprecated
versions is a bit hacky. What about instead having an array of supported
versions and a separate array of deprecated versions. The deprecated
versions would always be a subset of the supported versions. Or,
alternately, since deprecation has no functional impact and is just a
message to developers, we could just leave it out of the protocol and just
have it in release notes etc.
2. I think including the api name may cause some problems. Currently the
api key is the primary key that we keep consistent but we have actually
evolved the english description of the apis as they have changed. The only
use I can think of for the name would be if people used the logical name
and tried to resolve the api key, but that would be wrong. Not sure if we
actually need the english name, if there is a use case I guess we'll just
have to be very clear that the name is just documentation and can change
any time.

-Jay

On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill  wrote:

> Good evening,
>
> KIP-35 was created to address current and future broker-client
> compatibility.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Summary:
>  * allow clients to retrieve the broker's protocol version
>  * make broker handle unknown protocol requests gracefully
>
> Feedback and comments welcome!
>
> Regards,
> Magnus
>


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-12 Thread Jiangjie Qin
I am thinking instead of returning an empty response, it would be better to
return an explicit UnsupportedVersionException code.

Today KafkaApis handles the error in the following way:
1. For requests/responses using old Scala classes, KafkaApis uses
RequestOrResponse.handleError() to return an error response.
2. For requests/response using Java classes (only JoinGroupRequest and
Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
return an error response.

In KAFKA-2512, I am returning an UnsupportedVersionException for case [1]
when see an unsupported version. This will put the error code per topic or
partition for most of the requests, but might not work all the time. e.g.
TopicMetadataRequest with an empty topic set.

Case [2] does not quite work for unsupported version, because we will
thrown an uncaught exception when version is not recognized (BTW this is a
bug). Part of the reason is that for some response types, error code is not
part of the response level field.

Maybe it worth checking how each response is dealing with error code today.
A scan of the response formats gives the following result:
1. TopicMetadataResponse - per topic error code, does not work when the
topic set is empty in the request.
2. ProduceResonse - per partition error code.
3. OffsetCommitResponse - per partition.
4. OffsetFetchResponse - per partition.
5. OffsetResponse - per partition.
6. FetchResponse - per partition
7. ConsumerMetadataResponse - response level
8. ControlledShutdownResponse - response level
9. JoinGroupResponse - response level
10. HearbeatResponse - response level
11. LeaderAndIsrResponse - response level
12. StopReplicaResponse - response level
13. UpdateMetadataResponse - response level

So from the list above it looks for each response we are actually able to
return an error code, as long as we make sure the topic or partition won't
be empty when the error code is at topic or partition level. Luckily in the
above list we only need to worry about TopicMetadataResponse.

Maybe error handling is out of the scope of this KIP, but I prefer we think
through how to deal with error code for the requests, because there are
more request types to be added in KAFKA-2464 and future patches.

Thanks,

Jiangjie (Becket) Qin

On Mon, Oct 12, 2015 at 6:04 PM, Jay Kreps  wrote:

> Two quick pieces of feedback:
> 1. The use of a version of -1 as magical entry dividing between deprecated
> versions is a bit hacky. What about instead having an array of supported
> versions and a separate array of deprecated versions. The deprecated
> versions would always be a subset of the supported versions. Or,
> alternately, since deprecation has no functional impact and is just a
> message to developers, we could just leave it out of the protocol and just
> have it in release notes etc.
> 2. I think including the api name may cause some problems. Currently the
> api key is the primary key that we keep consistent but we have actually
> evolved the english description of the apis as they have changed. The only
> use I can think of for the name would be if people used the logical name
> and tried to resolve the api key, but that would be wrong. Not sure if we
> actually need the english name, if there is a use case I guess we'll just
> have to be very clear that the name is just documentation and can change
> any time.
>
> -Jay
>
> On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill 
> wrote:
>
> > Good evening,
> >
> > KIP-35 was created to address current and future broker-client
> > compatibility.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > Summary:
> >  * allow clients to retrieve the broker's protocol version
> >  * make broker handle unknown protocol requests gracefully
> >
> > Feedback and comments welcome!
> >
> > Regards,
> > Magnus
> >
>


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-12 Thread Jay Kreps
I wonder if we need to solve the error problem? I think this KIP gives a
descent work around.

Probably we should have included an error in the response header, but we
debated it at the time decided not to and now it is pretty hard to add
because the headers aren't versioned (d'oh).

It seems like any other solution is going to be kind of a hack, right?
Sending malformed responses back seems like not a clean solution...

(Not sure if I was pro- having a top-level error or not, but in any case
the rationale for the decision was that so many of the requests were
per-partition or per-topic or whatever and hence fail or succeed at that
level and this makes it hard to know what the right top-level error code is
and hard for the client to figure out what to do with the top level error
if some of the partitions succeed but there is a top-level error).

I think actually this new API actually gives a way to handle this
gracefully on the client side by just having clients that want to be
graceful check for support for their version. Clients that do that will
have a graceful message.

At some point if we're ever reworking the headers we should really consider
(a) versioning them and (b) adding a top-level error code in the response.
But given this would be a big breaking change and this is really just to
give a nicer error message seems like it probably isn't worth it to try to
do something now.

-Jay



On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie Qin 
wrote:

> I am thinking instead of returning an empty response, it would be better to
> return an explicit UnsupportedVersionException code.
>
> Today KafkaApis handles the error in the following way:
> 1. For requests/responses using old Scala classes, KafkaApis uses
> RequestOrResponse.handleError() to return an error response.
> 2. For requests/response using Java classes (only JoinGroupRequest and
> Heartbeat now), KafkaApis calls AbstractRequest.getErrorResponse() to
> return an error response.
>
> In KAFKA-2512, I am returning an UnsupportedVersionException for case [1]
> when see an unsupported version. This will put the error code per topic or
> partition for most of the requests, but might not work all the time. e.g.
> TopicMetadataRequest with an empty topic set.
>
> Case [2] does not quite work for unsupported version, because we will
> thrown an uncaught exception when version is not recognized (BTW this is a
> bug). Part of the reason is that for some response types, error code is not
> part of the response level field.
>
> Maybe it worth checking how each response is dealing with error code today.
> A scan of the response formats gives the following result:
> 1. TopicMetadataResponse - per topic error code, does not work when the
> topic set is empty in the request.
> 2. ProduceResonse - per partition error code.
> 3. OffsetCommitResponse - per partition.
> 4. OffsetFetchResponse - per partition.
> 5. OffsetResponse - per partition.
> 6. FetchResponse - per partition
> 7. ConsumerMetadataResponse - response level
> 8. ControlledShutdownResponse - response level
> 9. JoinGroupResponse - response level
> 10. HearbeatResponse - response level
> 11. LeaderAndIsrResponse - response level
> 12. StopReplicaResponse - response level
> 13. UpdateMetadataResponse - response level
>
> So from the list above it looks for each response we are actually able to
> return an error code, as long as we make sure the topic or partition won't
> be empty when the error code is at topic or partition level. Luckily in the
> above list we only need to worry about TopicMetadataResponse.
>
> Maybe error handling is out of the scope of this KIP, but I prefer we think
> through how to deal with error code for the requests, because there are
> more request types to be added in KAFKA-2464 and future patches.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Oct 12, 2015 at 6:04 PM, Jay Kreps  wrote:
>
> > Two quick pieces of feedback:
> > 1. The use of a version of -1 as magical entry dividing between
> deprecated
> > versions is a bit hacky. What about instead having an array of supported
> > versions and a separate array of deprecated versions. The deprecated
> > versions would always be a subset of the supported versions. Or,
> > alternately, since deprecation has no functional impact and is just a
> > message to developers, we could just leave it out of the protocol and
> just
> > have it in release notes etc.
> > 2. I think including the api name may cause some problems. Currently the
> > api key is the primary key that we keep consistent but we have actually
> > evolved the english description of the apis as they have changed. The
> only
> > use I can think of for the name would be if people used the logical name
> > and tried to resolve the api key, but that would be wrong. Not sure if we
> > actually need the english name, if there is a use case I guess we'll just
> > have to be very clear that the name is just documentation and can 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-07 Thread Dong Lin
I agree with Jason we should aim to have clients to properly detect and
handle unknown-protocol error. The hack of empty response should be shorten
solution.

After reading the earlier discussion and wiki, I have two questions:

- In the future, do we expect all clients to send ProtocolVersionRequest by
default when it connects to a new broker?

- Going forward, when broker receives an unsupported request, can broker
respond with "protocol version not supported"  + version information that
the broker would otherwise provide in ProtocolVersionResponse? This would
allow client to to send request with preferred version without having to
query protocol version in advance.

The advantage of this is that client can save a little network overhead by
avoiding extra ProtocolVersionRequest after error. The problem with this
approach is that broker will respond to ProduceRequest with
ProtocolVersionResponse
-- but we can treat ProtocolVersionResponse in similar ways as we plan to
do with empty response. What do you think?



On Tue, Oct 6, 2015 at 3:07 PM, Jason Gustafson  wrote:

> It would be nice to get the unknown api workaround into 0.9, even if we
> can't have ProtocolVersionRequest. It should be a small change and it
> allows clients going forward to detect when they have connected to an old
> broker, which lets them surface a helpful error to the user. This is much
> better than retrying indefinitely which is probably how most current
> clients handle this case.
>
> -Jason
>
> On Tue, Oct 6, 2015 at 2:25 PM, Magnus Edenhill 
> wrote:
>
> > After today's KIP call we decided on the following regarding KIP-35:
> >  * limit scope to just propagate supported API versions (no key-value
> tags,
> > broker info, etc)
> >  * let the new API return the full list of broker's supported ApiKeys and
> > ApiVersions, rather than an aggregated global version
> >  * ApiVersions are sorted in order of preference
> >  * rename API from BrokerMetadataRequest to ProtocolVersionRequest
> >  * workaround for disconnect-on-unknown-api-request remains valid.
> >
> > The wiki page has been updated accordingly:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > Thanks,
> > Magnus
> >
> >
> > 2015-10-06 17:28 GMT+02:00 Joel Koshy :
> >
> > > Thanks for the write-up and discussion. This would have been super
> > > useful in our last round of deployments at LinkedIn for which we ended
> > > up having to hack around a number of incompatibilities. I could list
> > > all of the compatibility issues that have hit us, but some are
> > > irrelevant to this specific KIP (e.g., ZooKeeper registration
> > > versions). So I should perhaps just list two that I think are
> > > relevant:
> > >
> > > First, is the way that our metrics collection works. We have a special
> > > metrics producer on every service that emits to a separate metrics
> > > cluster. Kafka brokers also use this producer to emit to the
> > > (separate) metrics cluster. So when we upgrade our test clusters, the
> > > metrics producer in those clusters end up sending the latest produce
> > > request version to the yet to be upgraded metrics cluster. This caused
> > > an issue for us in the last round of deployments which bumped up the
> > > protocol version for the quota-related throttle-time response field.
> > > We got around that by just setting the metrics producer requiredAcks
> > > to zero (since the error occurs on parsing the response - and the old
> > > broker fortunately did not check the request version).
> > >
> > > Second, the inter-broker protocol versioning scheme works fine across
> > > official Apache releases but we picked up intermediate versions that
> > > contained some request version bumps, and then follow-up versions that
> > > picked up some more request bumps. For people deploying off trunk,
> > > protocol version lookup would help.
> > >
> > > General comments on the discussion and KIP:
> > >
> > > I like Grant’s suggestion on using this to avoid the explicit
> > > inter-broker-protocol-version - this will not only help address the
> > > second compatibility issue above, but I’m all for anything that
> > > eliminates an hour of config deployment (our deployments can take that
> > > long!)
> > >
> > > +1 on explicit response fields vs. key-value pairs - I don’t see this
> > > reflected on the wiki though.
> > >
> > > Aggregate protocol version vs specific request version: so you are
> > > associating an increasing aggregate version (for each request version
> > > bump). It may be useful to allow look up of the supported version (or
> > > version range) for each request type. The BrokerMetadataResponse could
> > > alternately return a vector of supported version ranges for each
> > > request type.
> > >
> > > Error response for unrecognized request versions: One option raised in
> > > the discussion was to always include the highest supported 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-07 Thread Magnus Edenhill
2015-10-07 9:49 GMT+02:00 Dong Lin :

> I agree with Jason we should aim to have clients to properly detect and
> handle unknown-protocol error. The hack of empty response should be shorten
> solution.
>

Yes, in the long run we want to modify the common protocol response header
to
include a request-wide error code, but since this is currently not possible
to do
without breaking every client out there we need this interim solution for
the time being.




> After reading the earlier discussion and wiki, I have two questions:
>
> - In the future, do we expect all clients to send ProtocolVersionRequest by
> default when it connects to a new broker?
>

No, this API is optional and is typically only required by clients that
support
a wide range of broker versions.


>
> - Going forward, when broker receives an unsupported request, can broker
> respond with "protocol version not supported"  + version information that
> the broker would otherwise provide in ProtocolVersionResponse? This would
> allow client to to send request with preferred version without having to
> query protocol version in advance.
>

This requires the major protocol change mentioned above.


>
> The advantage of this is that client can save a little network overhead by
> avoiding extra ProtocolVersionRequest after error. The problem with this
> approach is that broker will respond to ProduceRequest with
> ProtocolVersionResponse
> -- but we can treat ProtocolVersionResponse in similar ways as we plan to
> do with empty response. What do you think?
>

Since the client will be expecting a ProduceResponse it will parse the
returned
ProtocolVersionResponse as a ProduceResponse which hopefully leads to
a parse error, or in worst case is succesfully parsed.
There is no indication of schema in the response, all the client has to
work with
is a binary blob and an expected response format (based on the request).



>
>
>
> On Tue, Oct 6, 2015 at 3:07 PM, Jason Gustafson 
> wrote:
>
> > It would be nice to get the unknown api workaround into 0.9, even if we
> > can't have ProtocolVersionRequest. It should be a small change and it
> > allows clients going forward to detect when they have connected to an old
> > broker, which lets them surface a helpful error to the user. This is much
> > better than retrying indefinitely which is probably how most current
> > clients handle this case.
> >
> > -Jason
> >
> > On Tue, Oct 6, 2015 at 2:25 PM, Magnus Edenhill 
> > wrote:
> >
> > > After today's KIP call we decided on the following regarding KIP-35:
> > >  * limit scope to just propagate supported API versions (no key-value
> > tags,
> > > broker info, etc)
> > >  * let the new API return the full list of broker's supported ApiKeys
> and
> > > ApiVersions, rather than an aggregated global version
> > >  * ApiVersions are sorted in order of preference
> > >  * rename API from BrokerMetadataRequest to ProtocolVersionRequest
> > >  * workaround for disconnect-on-unknown-api-request remains valid.
> > >
> > > The wiki page has been updated accordingly:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> > >
> > > Thanks,
> > > Magnus
> > >
> > >
> > > 2015-10-06 17:28 GMT+02:00 Joel Koshy :
> > >
> > > > Thanks for the write-up and discussion. This would have been super
> > > > useful in our last round of deployments at LinkedIn for which we
> ended
> > > > up having to hack around a number of incompatibilities. I could list
> > > > all of the compatibility issues that have hit us, but some are
> > > > irrelevant to this specific KIP (e.g., ZooKeeper registration
> > > > versions). So I should perhaps just list two that I think are
> > > > relevant:
> > > >
> > > > First, is the way that our metrics collection works. We have a
> special
> > > > metrics producer on every service that emits to a separate metrics
> > > > cluster. Kafka brokers also use this producer to emit to the
> > > > (separate) metrics cluster. So when we upgrade our test clusters, the
> > > > metrics producer in those clusters end up sending the latest produce
> > > > request version to the yet to be upgraded metrics cluster. This
> caused
> > > > an issue for us in the last round of deployments which bumped up the
> > > > protocol version for the quota-related throttle-time response field.
> > > > We got around that by just setting the metrics producer requiredAcks
> > > > to zero (since the error occurs on parsing the response - and the old
> > > > broker fortunately did not check the request version).
> > > >
> > > > Second, the inter-broker protocol versioning scheme works fine across
> > > > official Apache releases but we picked up intermediate versions that
> > > > contained some request version bumps, and then follow-up versions
> that
> > > > picked up some more request bumps. For people deploying off trunk,
> > > > protocol version lookup would 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-06 Thread Joel Koshy
Thanks for the write-up and discussion. This would have been super
useful in our last round of deployments at LinkedIn for which we ended
up having to hack around a number of incompatibilities. I could list
all of the compatibility issues that have hit us, but some are
irrelevant to this specific KIP (e.g., ZooKeeper registration
versions). So I should perhaps just list two that I think are
relevant:

First, is the way that our metrics collection works. We have a special
metrics producer on every service that emits to a separate metrics
cluster. Kafka brokers also use this producer to emit to the
(separate) metrics cluster. So when we upgrade our test clusters, the
metrics producer in those clusters end up sending the latest produce
request version to the yet to be upgraded metrics cluster. This caused
an issue for us in the last round of deployments which bumped up the
protocol version for the quota-related throttle-time response field.
We got around that by just setting the metrics producer requiredAcks
to zero (since the error occurs on parsing the response - and the old
broker fortunately did not check the request version).

Second, the inter-broker protocol versioning scheme works fine across
official Apache releases but we picked up intermediate versions that
contained some request version bumps, and then follow-up versions that
picked up some more request bumps. For people deploying off trunk,
protocol version lookup would help.

General comments on the discussion and KIP:

I like Grant’s suggestion on using this to avoid the explicit
inter-broker-protocol-version - this will not only help address the
second compatibility issue above, but I’m all for anything that
eliminates an hour of config deployment (our deployments can take that
long!)

+1 on explicit response fields vs. key-value pairs - I don’t see this
reflected on the wiki though.

Aggregate protocol version vs specific request version: so you are
associating an increasing aggregate version (for each request version
bump). It may be useful to allow look up of the supported version (or
version range) for each request type. The BrokerMetadataResponse could
alternately return a vector of supported version ranges for each
request type.

Error response for unrecognized request versions: One option raised in
the discussion was to always include the highest supported version of
that request type in the response, but it may be worthwhile avoiding
that (since it is irrelevant most of the time) and fold that into the
BrokerMetadataRequest instead.

Max-message size/compression-codec: I actually prefer having this only
in TopicMetadataResponse and leave it out of the
BrokerMetadataRequest/Response (even for the defaults) since these are
really topic-specific fields. Rack-info on the other hand should
probably be there (at some point) in the BrokerMetadataResponse, and
this should perhaps be just a raw string that would need some
pluggable (deployment-specific) parsing.

Thanks,

Joel


On Wed, Sep 30, 2015 at 3:18 PM, Magnus Edenhill  wrote:
> Everyone, thanks for your comments and input this far, here
> follows an update of the proposal based on the discussions.
>
>
>  BrokerMetadataRequest => [NodeId]
>NodeId => int32   // Request Metadata for these brokers only.
>  // Empty array: retrieve for all brokers.
>  // Use -1 for current broker only.
>
>  BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin
> ProtocolVersionMax [Key Value]]
>   NodeId => int32  // Broker NodeId
>   Host => string   // Broker Host
>   Port => int32// Broker Port
>   ProtocolVersionMin => int32  // Broker's minimum supported protocol
> version
>   ProtocolVersionMax => int32  // Broker's maximum supported protocol
> version
>   Key => string// Tag name
>   Value => stirng  // Tag value
>
>
> Builtin tags:
>  "broker.id"  = "9"
>  "broker.version" = "0.9.0.0-SNAPSHOT-d12ca4f"
>  "broker.version.int" = "0x0009"
>  "compression.codecs" = "gzip,snappy,lz4"
>  "message.max.bytes"  = "100"
>  "message.formats"= "v1,v2"  // KIP-31
>  "endpoints"  = "plaintext://host:9092,ssl://host:9192"
>
> These are all documented, including their value format and how to parse it.
>
> The "broker.id" has multiple purposes:
>  * allows upgrading the bootstrap broker connection to a proper one since
> the
>broker_id is initially not known, but would be with this.
>  * verifying that the broker connected to is actually the broker id that
> was learnt
>through TopicMetadata.
>
>
> The BrokerMetadata may be used in broker-broker communication during
> upgrades
> to decide on a common protocol version between brokers with different
> versions.
>
>
>
> User-provided tags (server.properties), examples:
>  "aws.zone"   = "eu-central-1"
>  "rack"   = "r8a9"
>  "cluster"= "kafka3"
>
> User 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-06 Thread Magnus Edenhill
After today's KIP call we decided on the following regarding KIP-35:
 * limit scope to just propagate supported API versions (no key-value tags,
broker info, etc)
 * let the new API return the full list of broker's supported ApiKeys and
ApiVersions, rather than an aggregated global version
 * ApiVersions are sorted in order of preference
 * rename API from BrokerMetadataRequest to ProtocolVersionRequest
 * workaround for disconnect-on-unknown-api-request remains valid.

The wiki page has been updated accordingly:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version

Thanks,
Magnus


2015-10-06 17:28 GMT+02:00 Joel Koshy :

> Thanks for the write-up and discussion. This would have been super
> useful in our last round of deployments at LinkedIn for which we ended
> up having to hack around a number of incompatibilities. I could list
> all of the compatibility issues that have hit us, but some are
> irrelevant to this specific KIP (e.g., ZooKeeper registration
> versions). So I should perhaps just list two that I think are
> relevant:
>
> First, is the way that our metrics collection works. We have a special
> metrics producer on every service that emits to a separate metrics
> cluster. Kafka brokers also use this producer to emit to the
> (separate) metrics cluster. So when we upgrade our test clusters, the
> metrics producer in those clusters end up sending the latest produce
> request version to the yet to be upgraded metrics cluster. This caused
> an issue for us in the last round of deployments which bumped up the
> protocol version for the quota-related throttle-time response field.
> We got around that by just setting the metrics producer requiredAcks
> to zero (since the error occurs on parsing the response - and the old
> broker fortunately did not check the request version).
>
> Second, the inter-broker protocol versioning scheme works fine across
> official Apache releases but we picked up intermediate versions that
> contained some request version bumps, and then follow-up versions that
> picked up some more request bumps. For people deploying off trunk,
> protocol version lookup would help.
>
> General comments on the discussion and KIP:
>
> I like Grant’s suggestion on using this to avoid the explicit
> inter-broker-protocol-version - this will not only help address the
> second compatibility issue above, but I’m all for anything that
> eliminates an hour of config deployment (our deployments can take that
> long!)
>
> +1 on explicit response fields vs. key-value pairs - I don’t see this
> reflected on the wiki though.
>
> Aggregate protocol version vs specific request version: so you are
> associating an increasing aggregate version (for each request version
> bump). It may be useful to allow look up of the supported version (or
> version range) for each request type. The BrokerMetadataResponse could
> alternately return a vector of supported version ranges for each
> request type.
>
> Error response for unrecognized request versions: One option raised in
> the discussion was to always include the highest supported version of
> that request type in the response, but it may be worthwhile avoiding
> that (since it is irrelevant most of the time) and fold that into the
> BrokerMetadataRequest instead.
>
> Max-message size/compression-codec: I actually prefer having this only
> in TopicMetadataResponse and leave it out of the
> BrokerMetadataRequest/Response (even for the defaults) since these are
> really topic-specific fields. Rack-info on the other hand should
> probably be there (at some point) in the BrokerMetadataResponse, and
> this should perhaps be just a raw string that would need some
> pluggable (deployment-specific) parsing.
>
> Thanks,
>
> Joel
>
>
> On Wed, Sep 30, 2015 at 3:18 PM, Magnus Edenhill 
> wrote:
> > Everyone, thanks for your comments and input this far, here
> > follows an update of the proposal based on the discussions.
> >
> >
> >  BrokerMetadataRequest => [NodeId]
> >NodeId => int32   // Request Metadata for these brokers only.
> >  // Empty array: retrieve for all brokers.
> >  // Use -1 for current broker only.
> >
> >  BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin
> > ProtocolVersionMax [Key Value]]
> >   NodeId => int32  // Broker NodeId
> >   Host => string   // Broker Host
> >   Port => int32// Broker Port
> >   ProtocolVersionMin => int32  // Broker's minimum supported protocol
> > version
> >   ProtocolVersionMax => int32  // Broker's maximum supported protocol
> > version
> >   Key => string// Tag name
> >   Value => stirng  // Tag value
> >
> >
> > Builtin tags:
> >  "broker.id"  = "9"
> >  "broker.version" = "0.9.0.0-SNAPSHOT-d12ca4f"
> >  "broker.version.int" = "0x0009"
> >  "compression.codecs" = "gzip,snappy,lz4"
> >  

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-10-06 Thread Jason Gustafson
It would be nice to get the unknown api workaround into 0.9, even if we
can't have ProtocolVersionRequest. It should be a small change and it
allows clients going forward to detect when they have connected to an old
broker, which lets them surface a helpful error to the user. This is much
better than retrying indefinitely which is probably how most current
clients handle this case.

-Jason

On Tue, Oct 6, 2015 at 2:25 PM, Magnus Edenhill  wrote:

> After today's KIP call we decided on the following regarding KIP-35:
>  * limit scope to just propagate supported API versions (no key-value tags,
> broker info, etc)
>  * let the new API return the full list of broker's supported ApiKeys and
> ApiVersions, rather than an aggregated global version
>  * ApiVersions are sorted in order of preference
>  * rename API from BrokerMetadataRequest to ProtocolVersionRequest
>  * workaround for disconnect-on-unknown-api-request remains valid.
>
> The wiki page has been updated accordingly:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Thanks,
> Magnus
>
>
> 2015-10-06 17:28 GMT+02:00 Joel Koshy :
>
> > Thanks for the write-up and discussion. This would have been super
> > useful in our last round of deployments at LinkedIn for which we ended
> > up having to hack around a number of incompatibilities. I could list
> > all of the compatibility issues that have hit us, but some are
> > irrelevant to this specific KIP (e.g., ZooKeeper registration
> > versions). So I should perhaps just list two that I think are
> > relevant:
> >
> > First, is the way that our metrics collection works. We have a special
> > metrics producer on every service that emits to a separate metrics
> > cluster. Kafka brokers also use this producer to emit to the
> > (separate) metrics cluster. So when we upgrade our test clusters, the
> > metrics producer in those clusters end up sending the latest produce
> > request version to the yet to be upgraded metrics cluster. This caused
> > an issue for us in the last round of deployments which bumped up the
> > protocol version for the quota-related throttle-time response field.
> > We got around that by just setting the metrics producer requiredAcks
> > to zero (since the error occurs on parsing the response - and the old
> > broker fortunately did not check the request version).
> >
> > Second, the inter-broker protocol versioning scheme works fine across
> > official Apache releases but we picked up intermediate versions that
> > contained some request version bumps, and then follow-up versions that
> > picked up some more request bumps. For people deploying off trunk,
> > protocol version lookup would help.
> >
> > General comments on the discussion and KIP:
> >
> > I like Grant’s suggestion on using this to avoid the explicit
> > inter-broker-protocol-version - this will not only help address the
> > second compatibility issue above, but I’m all for anything that
> > eliminates an hour of config deployment (our deployments can take that
> > long!)
> >
> > +1 on explicit response fields vs. key-value pairs - I don’t see this
> > reflected on the wiki though.
> >
> > Aggregate protocol version vs specific request version: so you are
> > associating an increasing aggregate version (for each request version
> > bump). It may be useful to allow look up of the supported version (or
> > version range) for each request type. The BrokerMetadataResponse could
> > alternately return a vector of supported version ranges for each
> > request type.
> >
> > Error response for unrecognized request versions: One option raised in
> > the discussion was to always include the highest supported version of
> > that request type in the response, but it may be worthwhile avoiding
> > that (since it is irrelevant most of the time) and fold that into the
> > BrokerMetadataRequest instead.
> >
> > Max-message size/compression-codec: I actually prefer having this only
> > in TopicMetadataResponse and leave it out of the
> > BrokerMetadataRequest/Response (even for the defaults) since these are
> > really topic-specific fields. Rack-info on the other hand should
> > probably be there (at some point) in the BrokerMetadataResponse, and
> > this should perhaps be just a raw string that would need some
> > pluggable (deployment-specific) parsing.
> >
> > Thanks,
> >
> > Joel
> >
> >
> > On Wed, Sep 30, 2015 at 3:18 PM, Magnus Edenhill 
> > wrote:
> > > Everyone, thanks for your comments and input this far, here
> > > follows an update of the proposal based on the discussions.
> > >
> > >
> > >  BrokerMetadataRequest => [NodeId]
> > >NodeId => int32   // Request Metadata for these brokers only.
> > >  // Empty array: retrieve for all brokers.
> > >  // Use -1 for current broker only.
> > >
> > >  BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin
> > > 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-29 3:36 GMT+02:00 Jiangjie Qin :

> Thanks for the writeup. I also think having a specific protocol for
> client-broker version negotiation is better.
>
> I'm wondering is it better to let the broker to decide the version to use?
> It might have some value If brokers have preference for a particular
> version.
>

I dont think we want to put the decision on what common protocol version to
use
on the broker side. A broker only needs to support the APIs and versions it
provides,
while a client can support many more versions (older and newer than the
current
broker's), so I strongly suggest that this logic should be on the client
side.
This is also important for the Kafka eco-system where various Kafka tools
and clients
will end up in various OS distros - these clients will need to support a
broad range
of broker versions to be of value as a (lets say Ubuntu) package.

Client-side decision also allows for client-side workarounds for broker
issues.



Using a global version is a good approach. For the client-broker
> negotiation, I am thinking about something like:
>
> ProtocolSyncRequest => ClientType [ProtocolVersion]
> ClientType => int8
> ProtocolVersion => int16
>
> ProtocolSyncResponse => PreferredVersion
> PreferredVersion => int16
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
>
> > I agree with Ewen that having the keys explicitly specified in the
> response
> > is better.
> >
> > In addition to the supported protocol version, there are other
> interesting
> > metadata at the broker level that could be of interest to things like
> admin
> > tools (e.g., used disk space, remaining disk space, etc). I am wondering
> if
> > we should separate those into different requests. For inquiring the
> > protocol version, we can have a separate BrokerProtocolRequest. The
> > response will just include the broker version and perhaps a list of
> > supported requests and versions?
> >
> > As for sending an empty response for unrecognized requests, do you how is
> > that handled in other similar systems?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> > wrote:
> >
> > > Having the version API can make clients more robust, so I'm in favor.
> One
> > > note on the addition of the "rack" field. Since this is a
> broker-specific
> > > setting, the client would have to query BrokerMetadata for every new
> > broker
> > > it connects to (unless we also expose rack in TopicMetadata). This is
> > also
> > > kind of unfortunate for admin utilities leveraging this API. It might
> be
> > > more convenient to allow this API to return broker metadata for the
> full
> > > cluster, assuming all of it could be made available in Zookeeper.
> > >
> > > As for using the empty response to indicate an incompatible API, it
> seems
> > > like that could work. I think some of the clients might catch response
> > > parsing exceptions and retry anyway, but that's no worse than retrying
> > > because of a disconnect in the same case.
> > >
> > > -Jason
> > >
> > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > > > The basic functionality is definitely useful here. I'm generally in
> > favor
> > > > of exposing some info about broker versions to clients.
> > > >
> > > > I'd prefer to keep the key/values explicit. Making them extensible
> > > > string/string pairs is good for avoiding unnecessary version changes
> in
> > > the
> > > > protocol, but I think we should explicitly define the valid key/value
> > > > formats in each version of the protocol. New keys can safely be
> > ignored,
> > > > but actually specifying the format of the values is important if we
> > ever
> > > > need to evolve those formats.
> > > >
> > > > I like some of the examples you've provided for returned key/value
> > pairs
> > > > and I think we should provide some of them even when the values
> should
> > be
> > > > obvious from the broker version.
> > > >
> > > > * broker.version - are we definitely standardizing on this versioning
> > > > format? 4 digits, with each level indicating the intuitive levels of
> > > > compatibility? Is there any chance we'll have a 0.10.0.0? This might
> > seem
> > > > like a trivial consideration, but after fighting versioning in
> > different
> > > > packaging systems, I'm a bit more sensitive to the annoying effects
> > that
> > > > not considering this carefully can have. We're at a particularly
> > > sensitive
> > > > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > > > * supported.compression.codecs - nit, but I'd like to figure out a
> good
> > > way
> > > > to keep these as close to the actual config name as possible. in this
> > > case,
> > > > the setting is compression.codec, so just "compression.codecs" seems
> > > ideal
> > > > to me.
> > > > * rack: I think there's a ton of demand for something like 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
Very good points, Todd, totally agree.


2015-09-30 1:04 GMT+02:00 Todd Palino :

> We should also consider what else should be negotiated between the broker
> and the client as this comes together. The version is definitely first, but
> there are other things, such as the max message size, that should not need
> to be replicated on both the broker and the client. Granted, max message
> size has per-topic overrides as well, but that should also be considered
> (possibly as an addition to the topic metadata response).
>
> Ideally you never want a requirement that is enforced by the broker to be a
> surprise to the client, whether that's a supported version or a
> configuration parameter. The client should not have to know it in advance
> (except for the most basic of connection parameters), and even if it does
> have it as a configuration option, it should be able to know before it even
> starts running that what it has configured is in conflict with the server.
>
> -Todd
>
>
> On Tue, Sep 29, 2015 at 11:08 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Right. But there should be a max old version that the broker should
> support
> > to avoid these incompatibility issues.
> > For example, if the broker is at version X, it should be able to support
> > the versions (clients and interbroker) till X-2. In case we have brokers
> > and clients older than that it can send a response warning them to
> upgrade
> > till X-2 minimum.
> > The backward compatibility limit can be discussed further. This will help
> > for rolling upgrades.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke 
> wrote:
> >
> > > If we create a protocol version negotiation api for clients, can we use
> > it
> > > to replace or improve the ease of upgrades that break inter-broker
> > > messaging?
> > >
> > > Currently upgrades that break the wire protocol take 2 rolling
> restarts.
> > > The first restart we set inter.broker.protocol.version telling all
> > brokers
> > > to communicate on the old version, and then we restart again removing
> the
> > > inter.broker.protocol.version. With this api the brokers could agree
> on a
> > > version to communicate with, and when bounced re-negotiate to the new
> > > version.
> > >
> > >
> > > On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com> wrote:
> > >
> > > > Nice write-up.
> > > >
> > > > Just had a question, instead of returning an empty response back to
> the
> > > > client, would it be better for the broker to return a response that
> > gives
> > > > some more info to the client regarding the min version they need to
> > > upgrade
> > > > to in order to communicate with the broker.
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin
> >  > > >
> > > > wrote:
> > > >
> > > > > Thanks for the writeup. I also think having a specific protocol for
> > > > > client-broker version negotiation is better.
> > > > >
> > > > > I'm wondering is it better to let the broker to decide the version
> to
> > > > use?
> > > > > It might have some value If brokers have preference for a
> particular
> > > > > version.
> > > > > Using a global version is a good approach. For the client-broker
> > > > > negotiation, I am thinking about something like:
> > > > >
> > > > > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > > > > ClientType => int8
> > > > > ProtocolVersion => int16
> > > > >
> > > > > ProtocolSyncResponse => PreferredVersion
> > > > > PreferredVersion => int16
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao 
> wrote:
> > > > >
> > > > > > I agree with Ewen that having the keys explicitly specified in
> the
> > > > > response
> > > > > > is better.
> > > > > >
> > > > > > In addition to the supported protocol version, there are other
> > > > > interesting
> > > > > > metadata at the broker level that could be of interest to things
> > like
> > > > > admin
> > > > > > tools (e.g., used disk space, remaining disk space, etc). I am
> > > > wondering
> > > > > if
> > > > > > we should separate those into different requests. For inquiring
> the
> > > > > > protocol version, we can have a separate BrokerProtocolRequest.
> The
> > > > > > response will just include the broker version and perhaps a list
> of
> > > > > > supported requests and versions?
> > > > > >
> > > > > > As for sending an empty response for unrecognized requests, do
> you
> > > how
> > > > is
> > > > > > that handled in other similar systems?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <
> > > ja...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Having the version API can make clients more robust, so 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-26 6:34 GMT+02:00 Ewen Cheslack-Postava :

> The basic functionality is definitely useful here. I'm generally in favor
> of exposing some info about broker versions to clients.
>
> I'd prefer to keep the key/values explicit. Making them extensible
> string/string pairs is good for avoiding unnecessary version changes in the
> protocol, but I think we should explicitly define the valid key/value
> formats in each version of the protocol. New keys can safely be ignored,
> but actually specifying the format of the values is important if we ever
> need to evolve those formats.
>

I agree that the "builtin" key-values populated by the broker itself
should be strictly documented, formatted, and strive to be
as close to any corresponding configuration properties as possible,
both by name and value.


> I like some of the examples you've provided for returned key/value pairs
> and I think we should provide some of them even when the values should be
> obvious from the broker version.
>
> * broker.version - are we definitely standardizing on this versioning
> format? 4 digits, with each level indicating the intuitive levels of
> compatibility? Is there any chance we'll have a 0.10.0.0? This might seem
> like a trivial consideration, but after fighting versioning in different
> packaging systems, I'm a bit more sensitive to the annoying effects that
> not considering this carefully can have. We're at a particularly sensitive
> point as we hit .9 -> .10 or .9 -> 1.0 ().
>

We should specify exactly how to translate the string in to an integer that
can
be used in numeric comparisons, or better yet, provide broker.version
as the human readable representation but also provide a broker.version.int
for
comparison use?



* supported.compression.codecs - nit, but I'd like to figure out a good way
> to keep these as close to the actual config name as possible. in this case,
> the setting is compression.codec, so just "compression.codecs" seems ideal
> to me.
>

We should probably decide on a format for propagating things that directly
map to
a (client) configuration property.
Prefixed with "config.*"?  "config.message.max.bytes"="100",
or in a separate [key value] list in the response?



* rack: I think there's a ton of demand for something like this, but I'd
> really like to think through it more before exposing anything. "rack" is
> *very* specific to a deployment scenario. I think we're comfortable
> adapting the terminology, but the meaning can change drastically, even
> under seemingly similar deployments. For example, if you deploy in EC2, you
> might create instances in multiple AZs. Within an AZ, you might consider
> all the nodes on the same "rack". But there are also placement groups
> within each AZ that provide better guarantees on network performance. Are
> any nodes in the same AZ considered on the same rack or do you need special
> guarantees to be on the same "rack". In general, I don't think there's a
> generic "rack" identifier we can expose -- we'll need to do something
> specialized depending on different environments. This is a case where
> extensibility in the format is probably really useful.
>

Yeah, I dont think this should be builtin tag at all, but rather be
user configurable key-value through server.properties.
e.g.:
   broker.tag.datacenter = amsterdam1
   broker.tag.rack = c12r17
   broker.tag.aw.zone = ...




> * Properties like supported.compression.codecs can presumably be determined
> just via the broker version. Is there a good reason for including them?
> Perhaps explicit info >> implicit info?
>

We dont want client implementations to need to track this, it might also
be the case that some features are manually disabled on the broker (e.g.,
disable a broken compression algorithm until next version is released, hi
snappy!).





> * "All existing clients should be able to handle this gracefully without
> any alterations to the code" - which clients does this refer to? For
> example, will the Java clients continue to behave the same way they do
> today? I'm curious because empty responses seem *very* different from
> closing connections.
>
>
I'm assuming here (and I will verify this), but all clients should have
some sort
of error handling when parsing malformed responses (buffer underflow in
this case).



>
> -Ewen
>
> On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill 
> wrote:
>
> > Good evening,
> >
> > KIP-35 was created to address current and future broker-client
> > compatibility.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > Summary:
> >  * allow clients to retrieve the broker's protocol version
> >  * make broker handle unknown protocol requests gracefully
> >
> > Feedback and comments welcome!
> >
> > Regards,
> > Magnus
> >
>
>
>
> --
> Thanks,
> Ewen
>


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-29 17:25 GMT+02:00 Grant Henke :

> If we create a protocol version negotiation api for clients, can we use it
> to replace or improve the ease of upgrades that break inter-broker
> messaging?
>
> Currently upgrades that break the wire protocol take 2 rolling restarts.
> The first restart we set inter.broker.protocol.version telling all brokers
> to communicate on the old version, and then we restart again removing the
> inter.broker.protocol.version. With this api the brokers could agree on a
> version to communicate with, and when bounced re-negotiate to the new
> version.
>


Interesting point. I'm still not convinced we want a stateful
ProtocolVersionSync
handshake, it adds complexity, and since only one side of a Kafka
connection is
issuing requests only that side needs to decide what protocol version to
use -
that side is the client (or replica?).

But yeah, BrokerMetadataRequest could be used in broker-broker
communications too.



>
> On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Nice write-up.
> >
> > Just had a question, instead of returning an empty response back to the
> > client, would it be better for the broker to return a response that gives
> > some more info to the client regarding the min version they need to
> upgrade
> > to in order to communicate with the broker.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin  >
> > wrote:
> >
> > > Thanks for the writeup. I also think having a specific protocol for
> > > client-broker version negotiation is better.
> > >
> > > I'm wondering is it better to let the broker to decide the version to
> > use?
> > > It might have some value If brokers have preference for a particular
> > > version.
> > > Using a global version is a good approach. For the client-broker
> > > negotiation, I am thinking about something like:
> > >
> > > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > > ClientType => int8
> > > ProtocolVersion => int16
> > >
> > > ProtocolSyncResponse => PreferredVersion
> > > PreferredVersion => int16
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
> > >
> > > > I agree with Ewen that having the keys explicitly specified in the
> > > response
> > > > is better.
> > > >
> > > > In addition to the supported protocol version, there are other
> > > interesting
> > > > metadata at the broker level that could be of interest to things like
> > > admin
> > > > tools (e.g., used disk space, remaining disk space, etc). I am
> > wondering
> > > if
> > > > we should separate those into different requests. For inquiring the
> > > > protocol version, we can have a separate BrokerProtocolRequest. The
> > > > response will just include the broker version and perhaps a list of
> > > > supported requests and versions?
> > > >
> > > > As for sending an empty response for unrecognized requests, do you
> how
> > is
> > > > that handled in other similar systems?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <
> ja...@confluent.io>
> > > > wrote:
> > > >
> > > > > Having the version API can make clients more robust, so I'm in
> favor.
> > > One
> > > > > note on the addition of the "rack" field. Since this is a
> > > broker-specific
> > > > > setting, the client would have to query BrokerMetadata for every
> new
> > > > broker
> > > > > it connects to (unless we also expose rack in TopicMetadata). This
> is
> > > > also
> > > > > kind of unfortunate for admin utilities leveraging this API. It
> might
> > > be
> > > > > more convenient to allow this API to return broker metadata for the
> > > full
> > > > > cluster, assuming all of it could be made available in Zookeeper.
> > > > >
> > > > > As for using the empty response to indicate an incompatible API, it
> > > seems
> > > > > like that could work. I think some of the clients might catch
> > response
> > > > > parsing exceptions and retry anyway, but that's no worse than
> > retrying
> > > > > because of a disconnect in the same case.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > > > e...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > The basic functionality is definitely useful here. I'm generally
> in
> > > > favor
> > > > > > of exposing some info about broker versions to clients.
> > > > > >
> > > > > > I'd prefer to keep the key/values explicit. Making them
> extensible
> > > > > > string/string pairs is good for avoiding unnecessary version
> > changes
> > > in
> > > > > the
> > > > > > protocol, but I think we should explicitly define the valid
> > key/value
> > > > > > formats in each version of the protocol. New keys can safely be
> > > > ignored,
> > > > > > but actually specifying the format of the values is important 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
Everyone, thanks for your comments and input this far, here
follows an update of the proposal based on the discussions.


 BrokerMetadataRequest => [NodeId]
   NodeId => int32   // Request Metadata for these brokers only.
 // Empty array: retrieve for all brokers.
 // Use -1 for current broker only.

 BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin
ProtocolVersionMax [Key Value]]
  NodeId => int32  // Broker NodeId
  Host => string   // Broker Host
  Port => int32// Broker Port
  ProtocolVersionMin => int32  // Broker's minimum supported protocol
version
  ProtocolVersionMax => int32  // Broker's maximum supported protocol
version
  Key => string// Tag name
  Value => stirng  // Tag value


Builtin tags:
 "broker.id"  = "9"
 "broker.version" = "0.9.0.0-SNAPSHOT-d12ca4f"
 "broker.version.int" = "0x0009"
 "compression.codecs" = "gzip,snappy,lz4"
 "message.max.bytes"  = "100"
 "message.formats"= "v1,v2"  // KIP-31
 "endpoints"  = "plaintext://host:9092,ssl://host:9192"

These are all documented, including their value format and how to parse it.

The "broker.id" has multiple purposes:
 * allows upgrading the bootstrap broker connection to a proper one since
the
   broker_id is initially not known, but would be with this.
 * verifying that the broker connected to is actually the broker id that
was learnt
   through TopicMetadata.


The BrokerMetadata may be used in broker-broker communication during
upgrades
to decide on a common protocol version between brokers with different
versions.



User-provided tags (server.properties), examples:
 "aws.zone"   = "eu-central-1"
 "rack"   = "r8a9"
 "cluster"= "kafka3"

User provided tags are configured through the broker configuration file,
server.properties, e.g.:

   tag.aws.zone = eu-central-1
   tag.rack = r8a9
   tag.cluster = kafka3




/Magnus


2015-10-01 0:11 GMT+02:00 Magnus Edenhill :

>
> Very good points, Todd, totally agree.
>
>
> 2015-09-30 1:04 GMT+02:00 Todd Palino :
>
>> We should also consider what else should be negotiated between the broker
>> and the client as this comes together. The version is definitely first,
>> but
>> there are other things, such as the max message size, that should not need
>> to be replicated on both the broker and the client. Granted, max message
>> size has per-topic overrides as well, but that should also be considered
>> (possibly as an addition to the topic metadata response).
>>
>> Ideally you never want a requirement that is enforced by the broker to be
>> a
>> surprise to the client, whether that's a supported version or a
>> configuration parameter. The client should not have to know it in advance
>> (except for the most basic of connection parameters), and even if it does
>> have it as a configuration option, it should be able to know before it
>> even
>> starts running that what it has configured is in conflict with the server.
>>
>> -Todd
>>
>>
>> On Tue, Sep 29, 2015 at 11:08 AM, Mayuresh Gharat <
>> gharatmayures...@gmail.com> wrote:
>>
>> > Right. But there should be a max old version that the broker should
>> support
>> > to avoid these incompatibility issues.
>> > For example, if the broker is at version X, it should be able to support
>> > the versions (clients and interbroker) till X-2. In case we have brokers
>> > and clients older than that it can send a response warning them to
>> upgrade
>> > till X-2 minimum.
>> > The backward compatibility limit can be discussed further. This will
>> help
>> > for rolling upgrades.
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> > On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke 
>> wrote:
>> >
>> > > If we create a protocol version negotiation api for clients, can we
>> use
>> > it
>> > > to replace or improve the ease of upgrades that break inter-broker
>> > > messaging?
>> > >
>> > > Currently upgrades that break the wire protocol take 2 rolling
>> restarts.
>> > > The first restart we set inter.broker.protocol.version telling all
>> > brokers
>> > > to communicate on the old version, and then we restart again removing
>> the
>> > > inter.broker.protocol.version. With this api the brokers could agree
>> on a
>> > > version to communicate with, and when bounced re-negotiate to the new
>> > > version.
>> > >
>> > >
>> > > On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
>> > > gharatmayures...@gmail.com> wrote:
>> > >
>> > > > Nice write-up.
>> > > >
>> > > > Just had a question, instead of returning an empty response back to
>> the
>> > > > client, would it be better for the broker to return a response that
>> > gives
>> > > > some more info to the client regarding the min version they need to
>> > > upgrade
>> > > > to in order to communicate with the broker.
>> > > >
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mayuresh
>> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-28 19:47 GMT+02:00 Jason Gustafson :

> Having the version API can make clients more robust, so I'm in favor. One
> note on the addition of the "rack" field. Since this is a broker-specific
> setting, the client would have to query BrokerMetadata for every new broker
> it connects to (unless we also expose rack in TopicMetadata). This is also
> kind of unfortunate for admin utilities leveraging this API. It might be
> more convenient to allow this API to return broker metadata for the full
> cluster, assuming all of it could be made available in Zookeeper.
>

That's a great idea.



>
> As for using the empty response to indicate an incompatible API, it seems
> like that could work. I think some of the clients might catch response
> parsing exceptions and retry anyway, but that's no worse than retrying
> because of a disconnect in the same case.
>

Yep, catching response parsing errors in the client is the general idea
with sending an empty reply,
it will give the application some sort of contextual feedback what went
wrong (parse error for XyzRequest), which a disconnect wont.



>
> -Jason
>
> On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava 
> wrote:
>
> > The basic functionality is definitely useful here. I'm generally in favor
> > of exposing some info about broker versions to clients.
> >
> > I'd prefer to keep the key/values explicit. Making them extensible
> > string/string pairs is good for avoiding unnecessary version changes in
> the
> > protocol, but I think we should explicitly define the valid key/value
> > formats in each version of the protocol. New keys can safely be ignored,
> > but actually specifying the format of the values is important if we ever
> > need to evolve those formats.
> >
> > I like some of the examples you've provided for returned key/value pairs
> > and I think we should provide some of them even when the values should be
> > obvious from the broker version.
> >
> > * broker.version - are we definitely standardizing on this versioning
> > format? 4 digits, with each level indicating the intuitive levels of
> > compatibility? Is there any chance we'll have a 0.10.0.0? This might seem
> > like a trivial consideration, but after fighting versioning in different
> > packaging systems, I'm a bit more sensitive to the annoying effects that
> > not considering this carefully can have. We're at a particularly
> sensitive
> > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > * supported.compression.codecs - nit, but I'd like to figure out a good
> way
> > to keep these as close to the actual config name as possible. in this
> case,
> > the setting is compression.codec, so just "compression.codecs" seems
> ideal
> > to me.
> > * rack: I think there's a ton of demand for something like this, but I'd
> > really like to think through it more before exposing anything. "rack" is
> > *very* specific to a deployment scenario. I think we're comfortable
> > adapting the terminology, but the meaning can change drastically, even
> > under seemingly similar deployments. For example, if you deploy in EC2,
> you
> > might create instances in multiple AZs. Within an AZ, you might consider
> > all the nodes on the same "rack". But there are also placement groups
> > within each AZ that provide better guarantees on network performance. Are
> > any nodes in the same AZ considered on the same rack or do you need
> special
> > guarantees to be on the same "rack". In general, I don't think there's a
> > generic "rack" identifier we can expose -- we'll need to do something
> > specialized depending on different environments. This is a case where
> > extensibility in the format is probably really useful.
> > * Properties like supported.compression.codecs can presumably be
> determined
> > just via the broker version. Is there a good reason for including them?
> > Perhaps explicit info >> implicit info?
> > * "All existing clients should be able to handle this gracefully without
> > any alterations to the code" - which clients does this refer to? For
> > example, will the Java clients continue to behave the same way they do
> > today? I'm curious because empty responses seem *very* different from
> > closing connections.
> >
> >
> > -Ewen
> >
> > On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill 
> > wrote:
> >
> > > Good evening,
> > >
> > > KIP-35 was created to address current and future broker-client
> > > compatibility.
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> > >
> > > Summary:
> > >  * allow clients to retrieve the broker's protocol version
> > >  * make broker handle unknown protocol requests gracefully
> > >
> > > Feedback and comments welcome!
> > >
> > > Regards,
> > > Magnus
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
>


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-28 20:59 GMT+02:00 Jun Rao :

> I agree with Ewen that having the keys explicitly specified in the response
> is better.
>
> In addition to the supported protocol version, there are other interesting
> metadata at the broker level that could be of interest to things like admin
> tools (e.g., used disk space, remaining disk space, etc). I am wondering if
> we should separate those into different requests. For inquiring the
> protocol version, we can have a separate BrokerProtocolRequest. The
> response will just include the broker version and perhaps a list of
> supported requests and versions?
>

My idea with this is that upon initial connect to a broker the client
would send a BrokerMetadataRequest to understand what the broker supports
(protocol versions, codecs, message format versions, config values, etc),
and as
such the information in the response should not change during the lifetime
of a connection.
So I dont think we want to put volatile stats-like data in there, we should
use
specific APIs for that when the need arise.



> As for sending an empty response for unrecognized requests, do you how is
> that handled in other similar systems?
>

Not sure how this applies to other systems, but this workaround is to
overcome
the lack of a generic request-wide error field in the common protocol
header.
While it is a bit of a hack to trigger a parsing error on the client, that
parsing error
will be in the context of the failed request, which will hopefully(!) emit
an at least somewhat
useful error message to the application/user.




>
> Thanks,
>
> Jun
>
> On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> wrote:
>
> > Having the version API can make clients more robust, so I'm in favor. One
> > note on the addition of the "rack" field. Since this is a broker-specific
> > setting, the client would have to query BrokerMetadata for every new
> broker
> > it connects to (unless we also expose rack in TopicMetadata). This is
> also
> > kind of unfortunate for admin utilities leveraging this API. It might be
> > more convenient to allow this API to return broker metadata for the full
> > cluster, assuming all of it could be made available in Zookeeper.
> >
> > As for using the empty response to indicate an incompatible API, it seems
> > like that could work. I think some of the clients might catch response
> > parsing exceptions and retry anyway, but that's no worse than retrying
> > because of a disconnect in the same case.
> >
> > -Jason
> >
> > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > The basic functionality is definitely useful here. I'm generally in
> favor
> > > of exposing some info about broker versions to clients.
> > >
> > > I'd prefer to keep the key/values explicit. Making them extensible
> > > string/string pairs is good for avoiding unnecessary version changes in
> > the
> > > protocol, but I think we should explicitly define the valid key/value
> > > formats in each version of the protocol. New keys can safely be
> ignored,
> > > but actually specifying the format of the values is important if we
> ever
> > > need to evolve those formats.
> > >
> > > I like some of the examples you've provided for returned key/value
> pairs
> > > and I think we should provide some of them even when the values should
> be
> > > obvious from the broker version.
> > >
> > > * broker.version - are we definitely standardizing on this versioning
> > > format? 4 digits, with each level indicating the intuitive levels of
> > > compatibility? Is there any chance we'll have a 0.10.0.0? This might
> seem
> > > like a trivial consideration, but after fighting versioning in
> different
> > > packaging systems, I'm a bit more sensitive to the annoying effects
> that
> > > not considering this carefully can have. We're at a particularly
> > sensitive
> > > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > > * supported.compression.codecs - nit, but I'd like to figure out a good
> > way
> > > to keep these as close to the actual config name as possible. in this
> > case,
> > > the setting is compression.codec, so just "compression.codecs" seems
> > ideal
> > > to me.
> > > * rack: I think there's a ton of demand for something like this, but
> I'd
> > > really like to think through it more before exposing anything. "rack"
> is
> > > *very* specific to a deployment scenario. I think we're comfortable
> > > adapting the terminology, but the meaning can change drastically, even
> > > under seemingly similar deployments. For example, if you deploy in EC2,
> > you
> > > might create instances in multiple AZs. Within an AZ, you might
> consider
> > > all the nodes on the same "rack". But there are also placement groups
> > > within each AZ that provide better guarantees on network performance.
> Are
> > > any nodes in the same AZ considered on the same rack or do you need
> > special
> > > guarantees to be on the same "rack". 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-30 Thread Magnus Edenhill
2015-09-29 5:26 GMT+02:00 Mayuresh Gharat :

> Nice write-up.
>
> Just had a question, instead of returning an empty response back to the
> client, would it be better for the broker to return a response that gives
> some more info to the client regarding the min version they need to upgrade
> to in order to communicate with the broker.
>

Yes, that would be the proper fix; insert a request-wide ErrorCode field in
the common protocol header.
But this unfortunately breaks the protocol for all existing clients so it
is really
a no-go at this point.
What we could do is add some sort of ProtocolUpgradeRequest that upgrades
the
protocol, including the common headers, for the current connection.

In the meantime we need something like the protocol-compatible
empty-response workaround.


I do like the idea of the broker providing a range of supportd protocol
versions,
which allows brokers to gracefully remove old legacy request types.



>
> Thanks,
>
> Mayuresh
>
> On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin 
> wrote:
>
> > Thanks for the writeup. I also think having a specific protocol for
> > client-broker version negotiation is better.
> >
> > I'm wondering is it better to let the broker to decide the version to
> use?
> > It might have some value If brokers have preference for a particular
> > version.
> > Using a global version is a good approach. For the client-broker
> > negotiation, I am thinking about something like:
> >
> > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > ClientType => int8
> > ProtocolVersion => int16
> >
> > ProtocolSyncResponse => PreferredVersion
> > PreferredVersion => int16
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
> >
> > > I agree with Ewen that having the keys explicitly specified in the
> > response
> > > is better.
> > >
> > > In addition to the supported protocol version, there are other
> > interesting
> > > metadata at the broker level that could be of interest to things like
> > admin
> > > tools (e.g., used disk space, remaining disk space, etc). I am
> wondering
> > if
> > > we should separate those into different requests. For inquiring the
> > > protocol version, we can have a separate BrokerProtocolRequest. The
> > > response will just include the broker version and perhaps a list of
> > > supported requests and versions?
> > >
> > > As for sending an empty response for unrecognized requests, do you how
> is
> > > that handled in other similar systems?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> > > wrote:
> > >
> > > > Having the version API can make clients more robust, so I'm in favor.
> > One
> > > > note on the addition of the "rack" field. Since this is a
> > broker-specific
> > > > setting, the client would have to query BrokerMetadata for every new
> > > broker
> > > > it connects to (unless we also expose rack in TopicMetadata). This is
> > > also
> > > > kind of unfortunate for admin utilities leveraging this API. It might
> > be
> > > > more convenient to allow this API to return broker metadata for the
> > full
> > > > cluster, assuming all of it could be made available in Zookeeper.
> > > >
> > > > As for using the empty response to indicate an incompatible API, it
> > seems
> > > > like that could work. I think some of the clients might catch
> response
> > > > parsing exceptions and retry anyway, but that's no worse than
> retrying
> > > > because of a disconnect in the same case.
> > > >
> > > > -Jason
> > > >
> > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io>
> > > > wrote:
> > > >
> > > > > The basic functionality is definitely useful here. I'm generally in
> > > favor
> > > > > of exposing some info about broker versions to clients.
> > > > >
> > > > > I'd prefer to keep the key/values explicit. Making them extensible
> > > > > string/string pairs is good for avoiding unnecessary version
> changes
> > in
> > > > the
> > > > > protocol, but I think we should explicitly define the valid
> key/value
> > > > > formats in each version of the protocol. New keys can safely be
> > > ignored,
> > > > > but actually specifying the format of the values is important if we
> > > ever
> > > > > need to evolve those formats.
> > > > >
> > > > > I like some of the examples you've provided for returned key/value
> > > pairs
> > > > > and I think we should provide some of them even when the values
> > should
> > > be
> > > > > obvious from the broker version.
> > > > >
> > > > > * broker.version - are we definitely standardizing on this
> versioning
> > > > > format? 4 digits, with each level indicating the intuitive levels
> of
> > > > > compatibility? Is there any chance we'll have a 0.10.0.0? This
> might
> > > seem
> > > > > like a trivial consideration, but after fighting versioning in
> > > 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-29 Thread Grant Henke
If we create a protocol version negotiation api for clients, can we use it
to replace or improve the ease of upgrades that break inter-broker
messaging?

Currently upgrades that break the wire protocol take 2 rolling restarts.
The first restart we set inter.broker.protocol.version telling all brokers
to communicate on the old version, and then we restart again removing the
inter.broker.protocol.version. With this api the brokers could agree on a
version to communicate with, and when bounced re-negotiate to the new
version.


On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> Nice write-up.
>
> Just had a question, instead of returning an empty response back to the
> client, would it be better for the broker to return a response that gives
> some more info to the client regarding the min version they need to upgrade
> to in order to communicate with the broker.
>
>
> Thanks,
>
> Mayuresh
>
> On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin 
> wrote:
>
> > Thanks for the writeup. I also think having a specific protocol for
> > client-broker version negotiation is better.
> >
> > I'm wondering is it better to let the broker to decide the version to
> use?
> > It might have some value If brokers have preference for a particular
> > version.
> > Using a global version is a good approach. For the client-broker
> > negotiation, I am thinking about something like:
> >
> > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > ClientType => int8
> > ProtocolVersion => int16
> >
> > ProtocolSyncResponse => PreferredVersion
> > PreferredVersion => int16
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
> >
> > > I agree with Ewen that having the keys explicitly specified in the
> > response
> > > is better.
> > >
> > > In addition to the supported protocol version, there are other
> > interesting
> > > metadata at the broker level that could be of interest to things like
> > admin
> > > tools (e.g., used disk space, remaining disk space, etc). I am
> wondering
> > if
> > > we should separate those into different requests. For inquiring the
> > > protocol version, we can have a separate BrokerProtocolRequest. The
> > > response will just include the broker version and perhaps a list of
> > > supported requests and versions?
> > >
> > > As for sending an empty response for unrecognized requests, do you how
> is
> > > that handled in other similar systems?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> > > wrote:
> > >
> > > > Having the version API can make clients more robust, so I'm in favor.
> > One
> > > > note on the addition of the "rack" field. Since this is a
> > broker-specific
> > > > setting, the client would have to query BrokerMetadata for every new
> > > broker
> > > > it connects to (unless we also expose rack in TopicMetadata). This is
> > > also
> > > > kind of unfortunate for admin utilities leveraging this API. It might
> > be
> > > > more convenient to allow this API to return broker metadata for the
> > full
> > > > cluster, assuming all of it could be made available in Zookeeper.
> > > >
> > > > As for using the empty response to indicate an incompatible API, it
> > seems
> > > > like that could work. I think some of the clients might catch
> response
> > > > parsing exceptions and retry anyway, but that's no worse than
> retrying
> > > > because of a disconnect in the same case.
> > > >
> > > > -Jason
> > > >
> > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io>
> > > > wrote:
> > > >
> > > > > The basic functionality is definitely useful here. I'm generally in
> > > favor
> > > > > of exposing some info about broker versions to clients.
> > > > >
> > > > > I'd prefer to keep the key/values explicit. Making them extensible
> > > > > string/string pairs is good for avoiding unnecessary version
> changes
> > in
> > > > the
> > > > > protocol, but I think we should explicitly define the valid
> key/value
> > > > > formats in each version of the protocol. New keys can safely be
> > > ignored,
> > > > > but actually specifying the format of the values is important if we
> > > ever
> > > > > need to evolve those formats.
> > > > >
> > > > > I like some of the examples you've provided for returned key/value
> > > pairs
> > > > > and I think we should provide some of them even when the values
> > should
> > > be
> > > > > obvious from the broker version.
> > > > >
> > > > > * broker.version - are we definitely standardizing on this
> versioning
> > > > > format? 4 digits, with each level indicating the intuitive levels
> of
> > > > > compatibility? Is there any chance we'll have a 0.10.0.0? This
> might
> > > seem
> > > > > like a trivial consideration, but after fighting versioning in
> > > different
> > > > > packaging systems, I'm a bit more 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-29 Thread Todd Palino
We should also consider what else should be negotiated between the broker
and the client as this comes together. The version is definitely first, but
there are other things, such as the max message size, that should not need
to be replicated on both the broker and the client. Granted, max message
size has per-topic overrides as well, but that should also be considered
(possibly as an addition to the topic metadata response).

Ideally you never want a requirement that is enforced by the broker to be a
surprise to the client, whether that's a supported version or a
configuration parameter. The client should not have to know it in advance
(except for the most basic of connection parameters), and even if it does
have it as a configuration option, it should be able to know before it even
starts running that what it has configured is in conflict with the server.

-Todd


On Tue, Sep 29, 2015 at 11:08 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> Right. But there should be a max old version that the broker should support
> to avoid these incompatibility issues.
> For example, if the broker is at version X, it should be able to support
> the versions (clients and interbroker) till X-2. In case we have brokers
> and clients older than that it can send a response warning them to upgrade
> till X-2 minimum.
> The backward compatibility limit can be discussed further. This will help
> for rolling upgrades.
>
> Thanks,
>
> Mayuresh
>
> On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke  wrote:
>
> > If we create a protocol version negotiation api for clients, can we use
> it
> > to replace or improve the ease of upgrades that break inter-broker
> > messaging?
> >
> > Currently upgrades that break the wire protocol take 2 rolling restarts.
> > The first restart we set inter.broker.protocol.version telling all
> brokers
> > to communicate on the old version, and then we restart again removing the
> > inter.broker.protocol.version. With this api the brokers could agree on a
> > version to communicate with, and when bounced re-negotiate to the new
> > version.
> >
> >
> > On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > Nice write-up.
> > >
> > > Just had a question, instead of returning an empty response back to the
> > > client, would it be better for the broker to return a response that
> gives
> > > some more info to the client regarding the min version they need to
> > upgrade
> > > to in order to communicate with the broker.
> > >
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin
>  > >
> > > wrote:
> > >
> > > > Thanks for the writeup. I also think having a specific protocol for
> > > > client-broker version negotiation is better.
> > > >
> > > > I'm wondering is it better to let the broker to decide the version to
> > > use?
> > > > It might have some value If brokers have preference for a particular
> > > > version.
> > > > Using a global version is a good approach. For the client-broker
> > > > negotiation, I am thinking about something like:
> > > >
> > > > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > > > ClientType => int8
> > > > ProtocolVersion => int16
> > > >
> > > > ProtocolSyncResponse => PreferredVersion
> > > > PreferredVersion => int16
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
> > > >
> > > > > I agree with Ewen that having the keys explicitly specified in the
> > > > response
> > > > > is better.
> > > > >
> > > > > In addition to the supported protocol version, there are other
> > > > interesting
> > > > > metadata at the broker level that could be of interest to things
> like
> > > > admin
> > > > > tools (e.g., used disk space, remaining disk space, etc). I am
> > > wondering
> > > > if
> > > > > we should separate those into different requests. For inquiring the
> > > > > protocol version, we can have a separate BrokerProtocolRequest. The
> > > > > response will just include the broker version and perhaps a list of
> > > > > supported requests and versions?
> > > > >
> > > > > As for sending an empty response for unrecognized requests, do you
> > how
> > > is
> > > > > that handled in other similar systems?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <
> > ja...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > Having the version API can make clients more robust, so I'm in
> > favor.
> > > > One
> > > > > > note on the addition of the "rack" field. Since this is a
> > > > broker-specific
> > > > > > setting, the client would have to query BrokerMetadata for every
> > new
> > > > > broker
> > > > > > it connects to (unless we also expose rack in TopicMetadata).
> This
> > is
> > > > > also
> > > > > > kind of unfortunate for admin 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-29 Thread Mayuresh Gharat
Right. But there should be a max old version that the broker should support
to avoid these incompatibility issues.
For example, if the broker is at version X, it should be able to support
the versions (clients and interbroker) till X-2. In case we have brokers
and clients older than that it can send a response warning them to upgrade
till X-2 minimum.
The backward compatibility limit can be discussed further. This will help
for rolling upgrades.

Thanks,

Mayuresh

On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke  wrote:

> If we create a protocol version negotiation api for clients, can we use it
> to replace or improve the ease of upgrades that break inter-broker
> messaging?
>
> Currently upgrades that break the wire protocol take 2 rolling restarts.
> The first restart we set inter.broker.protocol.version telling all brokers
> to communicate on the old version, and then we restart again removing the
> inter.broker.protocol.version. With this api the brokers could agree on a
> version to communicate with, and when bounced re-negotiate to the new
> version.
>
>
> On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Nice write-up.
> >
> > Just had a question, instead of returning an empty response back to the
> > client, would it be better for the broker to return a response that gives
> > some more info to the client regarding the min version they need to
> upgrade
> > to in order to communicate with the broker.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin  >
> > wrote:
> >
> > > Thanks for the writeup. I also think having a specific protocol for
> > > client-broker version negotiation is better.
> > >
> > > I'm wondering is it better to let the broker to decide the version to
> > use?
> > > It might have some value If brokers have preference for a particular
> > > version.
> > > Using a global version is a good approach. For the client-broker
> > > negotiation, I am thinking about something like:
> > >
> > > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > > ClientType => int8
> > > ProtocolVersion => int16
> > >
> > > ProtocolSyncResponse => PreferredVersion
> > > PreferredVersion => int16
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
> > >
> > > > I agree with Ewen that having the keys explicitly specified in the
> > > response
> > > > is better.
> > > >
> > > > In addition to the supported protocol version, there are other
> > > interesting
> > > > metadata at the broker level that could be of interest to things like
> > > admin
> > > > tools (e.g., used disk space, remaining disk space, etc). I am
> > wondering
> > > if
> > > > we should separate those into different requests. For inquiring the
> > > > protocol version, we can have a separate BrokerProtocolRequest. The
> > > > response will just include the broker version and perhaps a list of
> > > > supported requests and versions?
> > > >
> > > > As for sending an empty response for unrecognized requests, do you
> how
> > is
> > > > that handled in other similar systems?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <
> ja...@confluent.io>
> > > > wrote:
> > > >
> > > > > Having the version API can make clients more robust, so I'm in
> favor.
> > > One
> > > > > note on the addition of the "rack" field. Since this is a
> > > broker-specific
> > > > > setting, the client would have to query BrokerMetadata for every
> new
> > > > broker
> > > > > it connects to (unless we also expose rack in TopicMetadata). This
> is
> > > > also
> > > > > kind of unfortunate for admin utilities leveraging this API. It
> might
> > > be
> > > > > more convenient to allow this API to return broker metadata for the
> > > full
> > > > > cluster, assuming all of it could be made available in Zookeeper.
> > > > >
> > > > > As for using the empty response to indicate an incompatible API, it
> > > seems
> > > > > like that could work. I think some of the clients might catch
> > response
> > > > > parsing exceptions and retry anyway, but that's no worse than
> > retrying
> > > > > because of a disconnect in the same case.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > > > e...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > > The basic functionality is definitely useful here. I'm generally
> in
> > > > favor
> > > > > > of exposing some info about broker versions to clients.
> > > > > >
> > > > > > I'd prefer to keep the key/values explicit. Making them
> extensible
> > > > > > string/string pairs is good for avoiding unnecessary version
> > changes
> > > in
> > > > > the
> > > > > > protocol, but I think we should explicitly define the valid
> > key/value
> > > > > > formats in each version of the protocol. 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-28 Thread Mayuresh Gharat
Nice write-up.

Just had a question, instead of returning an empty response back to the
client, would it be better for the broker to return a response that gives
some more info to the client regarding the min version they need to upgrade
to in order to communicate with the broker.


Thanks,

Mayuresh

On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin 
wrote:

> Thanks for the writeup. I also think having a specific protocol for
> client-broker version negotiation is better.
>
> I'm wondering is it better to let the broker to decide the version to use?
> It might have some value If brokers have preference for a particular
> version.
> Using a global version is a good approach. For the client-broker
> negotiation, I am thinking about something like:
>
> ProtocolSyncRequest => ClientType [ProtocolVersion]
> ClientType => int8
> ProtocolVersion => int16
>
> ProtocolSyncResponse => PreferredVersion
> PreferredVersion => int16
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:
>
> > I agree with Ewen that having the keys explicitly specified in the
> response
> > is better.
> >
> > In addition to the supported protocol version, there are other
> interesting
> > metadata at the broker level that could be of interest to things like
> admin
> > tools (e.g., used disk space, remaining disk space, etc). I am wondering
> if
> > we should separate those into different requests. For inquiring the
> > protocol version, we can have a separate BrokerProtocolRequest. The
> > response will just include the broker version and perhaps a list of
> > supported requests and versions?
> >
> > As for sending an empty response for unrecognized requests, do you how is
> > that handled in other similar systems?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> > wrote:
> >
> > > Having the version API can make clients more robust, so I'm in favor.
> One
> > > note on the addition of the "rack" field. Since this is a
> broker-specific
> > > setting, the client would have to query BrokerMetadata for every new
> > broker
> > > it connects to (unless we also expose rack in TopicMetadata). This is
> > also
> > > kind of unfortunate for admin utilities leveraging this API. It might
> be
> > > more convenient to allow this API to return broker metadata for the
> full
> > > cluster, assuming all of it could be made available in Zookeeper.
> > >
> > > As for using the empty response to indicate an incompatible API, it
> seems
> > > like that could work. I think some of the clients might catch response
> > > parsing exceptions and retry anyway, but that's no worse than retrying
> > > because of a disconnect in the same case.
> > >
> > > -Jason
> > >
> > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> > e...@confluent.io>
> > > wrote:
> > >
> > > > The basic functionality is definitely useful here. I'm generally in
> > favor
> > > > of exposing some info about broker versions to clients.
> > > >
> > > > I'd prefer to keep the key/values explicit. Making them extensible
> > > > string/string pairs is good for avoiding unnecessary version changes
> in
> > > the
> > > > protocol, but I think we should explicitly define the valid key/value
> > > > formats in each version of the protocol. New keys can safely be
> > ignored,
> > > > but actually specifying the format of the values is important if we
> > ever
> > > > need to evolve those formats.
> > > >
> > > > I like some of the examples you've provided for returned key/value
> > pairs
> > > > and I think we should provide some of them even when the values
> should
> > be
> > > > obvious from the broker version.
> > > >
> > > > * broker.version - are we definitely standardizing on this versioning
> > > > format? 4 digits, with each level indicating the intuitive levels of
> > > > compatibility? Is there any chance we'll have a 0.10.0.0? This might
> > seem
> > > > like a trivial consideration, but after fighting versioning in
> > different
> > > > packaging systems, I'm a bit more sensitive to the annoying effects
> > that
> > > > not considering this carefully can have. We're at a particularly
> > > sensitive
> > > > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > > > * supported.compression.codecs - nit, but I'd like to figure out a
> good
> > > way
> > > > to keep these as close to the actual config name as possible. in this
> > > case,
> > > > the setting is compression.codec, so just "compression.codecs" seems
> > > ideal
> > > > to me.
> > > > * rack: I think there's a ton of demand for something like this, but
> > I'd
> > > > really like to think through it more before exposing anything. "rack"
> > is
> > > > *very* specific to a deployment scenario. I think we're comfortable
> > > > adapting the terminology, but the meaning can change drastically,
> even
> > > > under seemingly similar deployments. For example, if you deploy 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-28 Thread Jun Rao
I agree with Ewen that having the keys explicitly specified in the response
is better.

In addition to the supported protocol version, there are other interesting
metadata at the broker level that could be of interest to things like admin
tools (e.g., used disk space, remaining disk space, etc). I am wondering if
we should separate those into different requests. For inquiring the
protocol version, we can have a separate BrokerProtocolRequest. The
response will just include the broker version and perhaps a list of
supported requests and versions?

As for sending an empty response for unrecognized requests, do you how is
that handled in other similar systems?

Thanks,

Jun

On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
wrote:

> Having the version API can make clients more robust, so I'm in favor. One
> note on the addition of the "rack" field. Since this is a broker-specific
> setting, the client would have to query BrokerMetadata for every new broker
> it connects to (unless we also expose rack in TopicMetadata). This is also
> kind of unfortunate for admin utilities leveraging this API. It might be
> more convenient to allow this API to return broker metadata for the full
> cluster, assuming all of it could be made available in Zookeeper.
>
> As for using the empty response to indicate an incompatible API, it seems
> like that could work. I think some of the clients might catch response
> parsing exceptions and retry anyway, but that's no worse than retrying
> because of a disconnect in the same case.
>
> -Jason
>
> On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava 
> wrote:
>
> > The basic functionality is definitely useful here. I'm generally in favor
> > of exposing some info about broker versions to clients.
> >
> > I'd prefer to keep the key/values explicit. Making them extensible
> > string/string pairs is good for avoiding unnecessary version changes in
> the
> > protocol, but I think we should explicitly define the valid key/value
> > formats in each version of the protocol. New keys can safely be ignored,
> > but actually specifying the format of the values is important if we ever
> > need to evolve those formats.
> >
> > I like some of the examples you've provided for returned key/value pairs
> > and I think we should provide some of them even when the values should be
> > obvious from the broker version.
> >
> > * broker.version - are we definitely standardizing on this versioning
> > format? 4 digits, with each level indicating the intuitive levels of
> > compatibility? Is there any chance we'll have a 0.10.0.0? This might seem
> > like a trivial consideration, but after fighting versioning in different
> > packaging systems, I'm a bit more sensitive to the annoying effects that
> > not considering this carefully can have. We're at a particularly
> sensitive
> > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > * supported.compression.codecs - nit, but I'd like to figure out a good
> way
> > to keep these as close to the actual config name as possible. in this
> case,
> > the setting is compression.codec, so just "compression.codecs" seems
> ideal
> > to me.
> > * rack: I think there's a ton of demand for something like this, but I'd
> > really like to think through it more before exposing anything. "rack" is
> > *very* specific to a deployment scenario. I think we're comfortable
> > adapting the terminology, but the meaning can change drastically, even
> > under seemingly similar deployments. For example, if you deploy in EC2,
> you
> > might create instances in multiple AZs. Within an AZ, you might consider
> > all the nodes on the same "rack". But there are also placement groups
> > within each AZ that provide better guarantees on network performance. Are
> > any nodes in the same AZ considered on the same rack or do you need
> special
> > guarantees to be on the same "rack". In general, I don't think there's a
> > generic "rack" identifier we can expose -- we'll need to do something
> > specialized depending on different environments. This is a case where
> > extensibility in the format is probably really useful.
> > * Properties like supported.compression.codecs can presumably be
> determined
> > just via the broker version. Is there a good reason for including them?
> > Perhaps explicit info >> implicit info?
> > * "All existing clients should be able to handle this gracefully without
> > any alterations to the code" - which clients does this refer to? For
> > example, will the Java clients continue to behave the same way they do
> > today? I'm curious because empty responses seem *very* different from
> > closing connections.
> >
> >
> > -Ewen
> >
> > On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill 
> > wrote:
> >
> > > Good evening,
> > >
> > > KIP-35 was created to address current and future broker-client
> > > compatibility.
> > >
> > >
> > >
> >
> 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-28 Thread Jason Gustafson
Having the version API can make clients more robust, so I'm in favor. One
note on the addition of the "rack" field. Since this is a broker-specific
setting, the client would have to query BrokerMetadata for every new broker
it connects to (unless we also expose rack in TopicMetadata). This is also
kind of unfortunate for admin utilities leveraging this API. It might be
more convenient to allow this API to return broker metadata for the full
cluster, assuming all of it could be made available in Zookeeper.

As for using the empty response to indicate an incompatible API, it seems
like that could work. I think some of the clients might catch response
parsing exceptions and retry anyway, but that's no worse than retrying
because of a disconnect in the same case.

-Jason

On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava 
wrote:

> The basic functionality is definitely useful here. I'm generally in favor
> of exposing some info about broker versions to clients.
>
> I'd prefer to keep the key/values explicit. Making them extensible
> string/string pairs is good for avoiding unnecessary version changes in the
> protocol, but I think we should explicitly define the valid key/value
> formats in each version of the protocol. New keys can safely be ignored,
> but actually specifying the format of the values is important if we ever
> need to evolve those formats.
>
> I like some of the examples you've provided for returned key/value pairs
> and I think we should provide some of them even when the values should be
> obvious from the broker version.
>
> * broker.version - are we definitely standardizing on this versioning
> format? 4 digits, with each level indicating the intuitive levels of
> compatibility? Is there any chance we'll have a 0.10.0.0? This might seem
> like a trivial consideration, but after fighting versioning in different
> packaging systems, I'm a bit more sensitive to the annoying effects that
> not considering this carefully can have. We're at a particularly sensitive
> point as we hit .9 -> .10 or .9 -> 1.0 ().
> * supported.compression.codecs - nit, but I'd like to figure out a good way
> to keep these as close to the actual config name as possible. in this case,
> the setting is compression.codec, so just "compression.codecs" seems ideal
> to me.
> * rack: I think there's a ton of demand for something like this, but I'd
> really like to think through it more before exposing anything. "rack" is
> *very* specific to a deployment scenario. I think we're comfortable
> adapting the terminology, but the meaning can change drastically, even
> under seemingly similar deployments. For example, if you deploy in EC2, you
> might create instances in multiple AZs. Within an AZ, you might consider
> all the nodes on the same "rack". But there are also placement groups
> within each AZ that provide better guarantees on network performance. Are
> any nodes in the same AZ considered on the same rack or do you need special
> guarantees to be on the same "rack". In general, I don't think there's a
> generic "rack" identifier we can expose -- we'll need to do something
> specialized depending on different environments. This is a case where
> extensibility in the format is probably really useful.
> * Properties like supported.compression.codecs can presumably be determined
> just via the broker version. Is there a good reason for including them?
> Perhaps explicit info >> implicit info?
> * "All existing clients should be able to handle this gracefully without
> any alterations to the code" - which clients does this refer to? For
> example, will the Java clients continue to behave the same way they do
> today? I'm curious because empty responses seem *very* different from
> closing connections.
>
>
> -Ewen
>
> On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill 
> wrote:
>
> > Good evening,
> >
> > KIP-35 was created to address current and future broker-client
> > compatibility.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> >
> > Summary:
> >  * allow clients to retrieve the broker's protocol version
> >  * make broker handle unknown protocol requests gracefully
> >
> > Feedback and comments welcome!
> >
> > Regards,
> > Magnus
> >
>
>
>
> --
> Thanks,
> Ewen
>


Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-28 Thread Jiangjie Qin
Thanks for the writeup. I also think having a specific protocol for
client-broker version negotiation is better.

I'm wondering is it better to let the broker to decide the version to use?
It might have some value If brokers have preference for a particular
version.
Using a global version is a good approach. For the client-broker
negotiation, I am thinking about something like:

ProtocolSyncRequest => ClientType [ProtocolVersion]
ClientType => int8
ProtocolVersion => int16

ProtocolSyncResponse => PreferredVersion
PreferredVersion => int16

Thanks,

Jiangjie (Becket) Qin

On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao  wrote:

> I agree with Ewen that having the keys explicitly specified in the response
> is better.
>
> In addition to the supported protocol version, there are other interesting
> metadata at the broker level that could be of interest to things like admin
> tools (e.g., used disk space, remaining disk space, etc). I am wondering if
> we should separate those into different requests. For inquiring the
> protocol version, we can have a separate BrokerProtocolRequest. The
> response will just include the broker version and perhaps a list of
> supported requests and versions?
>
> As for sending an empty response for unrecognized requests, do you how is
> that handled in other similar systems?
>
> Thanks,
>
> Jun
>
> On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson 
> wrote:
>
> > Having the version API can make clients more robust, so I'm in favor. One
> > note on the addition of the "rack" field. Since this is a broker-specific
> > setting, the client would have to query BrokerMetadata for every new
> broker
> > it connects to (unless we also expose rack in TopicMetadata). This is
> also
> > kind of unfortunate for admin utilities leveraging this API. It might be
> > more convenient to allow this API to return broker metadata for the full
> > cluster, assuming all of it could be made available in Zookeeper.
> >
> > As for using the empty response to indicate an incompatible API, it seems
> > like that could work. I think some of the clients might catch response
> > parsing exceptions and retry anyway, but that's no worse than retrying
> > because of a disconnect in the same case.
> >
> > -Jason
> >
> > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > The basic functionality is definitely useful here. I'm generally in
> favor
> > > of exposing some info about broker versions to clients.
> > >
> > > I'd prefer to keep the key/values explicit. Making them extensible
> > > string/string pairs is good for avoiding unnecessary version changes in
> > the
> > > protocol, but I think we should explicitly define the valid key/value
> > > formats in each version of the protocol. New keys can safely be
> ignored,
> > > but actually specifying the format of the values is important if we
> ever
> > > need to evolve those formats.
> > >
> > > I like some of the examples you've provided for returned key/value
> pairs
> > > and I think we should provide some of them even when the values should
> be
> > > obvious from the broker version.
> > >
> > > * broker.version - are we definitely standardizing on this versioning
> > > format? 4 digits, with each level indicating the intuitive levels of
> > > compatibility? Is there any chance we'll have a 0.10.0.0? This might
> seem
> > > like a trivial consideration, but after fighting versioning in
> different
> > > packaging systems, I'm a bit more sensitive to the annoying effects
> that
> > > not considering this carefully can have. We're at a particularly
> > sensitive
> > > point as we hit .9 -> .10 or .9 -> 1.0 ().
> > > * supported.compression.codecs - nit, but I'd like to figure out a good
> > way
> > > to keep these as close to the actual config name as possible. in this
> > case,
> > > the setting is compression.codec, so just "compression.codecs" seems
> > ideal
> > > to me.
> > > * rack: I think there's a ton of demand for something like this, but
> I'd
> > > really like to think through it more before exposing anything. "rack"
> is
> > > *very* specific to a deployment scenario. I think we're comfortable
> > > adapting the terminology, but the meaning can change drastically, even
> > > under seemingly similar deployments. For example, if you deploy in EC2,
> > you
> > > might create instances in multiple AZs. Within an AZ, you might
> consider
> > > all the nodes on the same "rack". But there are also placement groups
> > > within each AZ that provide better guarantees on network performance.
> Are
> > > any nodes in the same AZ considered on the same rack or do you need
> > special
> > > guarantees to be on the same "rack". In general, I don't think there's
> a
> > > generic "rack" identifier we can expose -- we'll need to do something
> > > specialized depending on different environments. This is a case where
> > > extensibility in the format is probably 

Re: [DISCUSS] KIP-35 - Retrieve protocol version

2015-09-25 Thread Ewen Cheslack-Postava
The basic functionality is definitely useful here. I'm generally in favor
of exposing some info about broker versions to clients.

I'd prefer to keep the key/values explicit. Making them extensible
string/string pairs is good for avoiding unnecessary version changes in the
protocol, but I think we should explicitly define the valid key/value
formats in each version of the protocol. New keys can safely be ignored,
but actually specifying the format of the values is important if we ever
need to evolve those formats.

I like some of the examples you've provided for returned key/value pairs
and I think we should provide some of them even when the values should be
obvious from the broker version.

* broker.version - are we definitely standardizing on this versioning
format? 4 digits, with each level indicating the intuitive levels of
compatibility? Is there any chance we'll have a 0.10.0.0? This might seem
like a trivial consideration, but after fighting versioning in different
packaging systems, I'm a bit more sensitive to the annoying effects that
not considering this carefully can have. We're at a particularly sensitive
point as we hit .9 -> .10 or .9 -> 1.0 ().
* supported.compression.codecs - nit, but I'd like to figure out a good way
to keep these as close to the actual config name as possible. in this case,
the setting is compression.codec, so just "compression.codecs" seems ideal
to me.
* rack: I think there's a ton of demand for something like this, but I'd
really like to think through it more before exposing anything. "rack" is
*very* specific to a deployment scenario. I think we're comfortable
adapting the terminology, but the meaning can change drastically, even
under seemingly similar deployments. For example, if you deploy in EC2, you
might create instances in multiple AZs. Within an AZ, you might consider
all the nodes on the same "rack". But there are also placement groups
within each AZ that provide better guarantees on network performance. Are
any nodes in the same AZ considered on the same rack or do you need special
guarantees to be on the same "rack". In general, I don't think there's a
generic "rack" identifier we can expose -- we'll need to do something
specialized depending on different environments. This is a case where
extensibility in the format is probably really useful.
* Properties like supported.compression.codecs can presumably be determined
just via the broker version. Is there a good reason for including them?
Perhaps explicit info >> implicit info?
* "All existing clients should be able to handle this gracefully without
any alterations to the code" - which clients does this refer to? For
example, will the Java clients continue to behave the same way they do
today? I'm curious because empty responses seem *very* different from
closing connections.


-Ewen

On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill  wrote:

> Good evening,
>
> KIP-35 was created to address current and future broker-client
> compatibility.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Summary:
>  * allow clients to retrieve the broker's protocol version
>  * make broker handle unknown protocol requests gracefully
>
> Feedback and comments welcome!
>
> Regards,
> Magnus
>



-- 
Thanks,
Ewen


<    1   2