Hello Jay, The overall approach sounds good. I do realize that this discussion has gotten too lengthy and is starting to shoot tangents. Maybe a KIP call will help us getting to a decision faster. I do have a few questions though.
On Fri, Mar 4, 2016 at 9:52 AM, Jay Kreps <j...@confluent.io> wrote: > Yeah here is my summary of my take: > > 1. Negotiating a per-connection protocol actually does add a lot of > complexity to clients (many more failure states to get right). > > 2. Having the client configure the protocol version manually is doable now > but probably a worse state. I suspect this will lead to more not less > confusion. > > 3. I don't think the current state is actually that bad. Integrators pick a > conservative version and build against that. There is a tradeoff between > getting the new features and being compatible with old Kafka versions. But > a large part of this tradeoff is essential since new features aren't going > to magically appear on old servers, so even if you upgrade your client you > likely aren't going to get the new stuff (since we will end up dynamically > turning it off). Having client features that are there but don't work > because you're on an old cluster may actually be a worse experience if not > handled very carefully.. > > 4. The problems Dana brought up are totally orthogonal to the problem of > having per-api versions or overall versions. The problem was that we > changed behavior subtly without changing the version. This will be an issue > regardless of whether the version is global or not. > > 5. Using the broker release as the version is strictly worse than using a > global protocol version (0, 1, 2, ...) that increments any time any api > changes but doesn't increment just because non-protocol code is changed. > The problem with using the broker release version is we want to be able to > keep Kafka releasable from any commit which means there isn't as clear a > sequencing of releases as you would think. > > 6. We need to consider the case of mixed version clusters during the time > period when you are upgrading Kafka. > > So overall I think this is not a critical thing to do right now, but if we > are going to do it we should do it in a way that actually improves things. > > Here would be one proposal for that: > a. Add a global protocol version that increments with any api version > update. Move the documentation so that the docs are by version. This is > basically just a short-hand for a complete set of supported api versions. > b. Include a field in the metadata response for each broker that adds the > protocol version. > There might be an issue here where the metadata request version sent by client is not supported by broker, an older broker. However, if we are clearly stating that a client is not guaranteed to work with an older broker then this becomes expected. This will potentially limit us in terms of supporting downgrades though, if we ever want to. > c. To maintain the protocol version this information will have to get > propagated with the rest of the broker metadata like host, port, id, etc. > > The instructions to clients would be: > - By default you build against a single conservative Kafka protocol version > and we carry that support forward, as today > If I am getting this correct, this will mean we will never deprecate/remove any protocol version in future. Having some way to deprecate/remove older protocol versions will probably be a good idea. It is possible with the global protocol version approach, it could be as simple as marking a protocol deprecated in protocol doc before removing it. Just want to make sure deprecation is still on the table. > - If you want to get fancy you can use the protocol version field in the > metadata request to more dynamically chose what features are available and > select api versions appropriately. This is purely optional. > > -Jay > > On Thu, Mar 3, 2016 at 9:38 PM, Jason Gustafson <ja...@confluent.io> > wrote: > > > I talked with Jay about this KIP briefly this morning, so let me try to > > summarize the discussion (I'm sure he'll jump in if I get anything > wrong). > > Apologies in advance for the length. > > > > I think we both share some skepticism that a request with all the > supported > > versions of all the request APIs is going to be a useful primitive to try > > and build client compatibility around. In practice I think people would > end > > up checking for particular request versions in order to determine if the > > broker is 0.8 or 0.9 or whatever, and then change behavior accordingly. > I'm > > wondering if there's a reasonable way to handle the version responses > that > > doesn't amount to that. Maybe you could try to capture feature > > compatibility by checking the versions for a subset of request types? For > > example, to ensure that you can use the new consumer API, you check that > > the group coordinator request is present, the offset commit request > version > > is greater than 2, the offset fetch request is greater than 1, and the > join > > group request is present. And to ensure compatibility with KIP-32, maybe > > you only need to check the appropriate versions of the fetch and produce > > requests. That sounds kind of complicated to keep track of and you > probably > > end up trying to handle combinations which aren't even possible in > > practice. > > > > The alternative is to use a single API version. It could be the Kafka > > release version, but then you need to figure out how to handle users who > > are running off of trunk since multiple API versions will typically > change > > between releases. Perhaps it makes more sense to keep a separate API > > version number which is incremented every time any one of the API > versions > > increases? This also decouples the protocol from the Kafka distribution. > > > > As far as whether there should be a separate request or not, I get > Becket's > > point that you would only need to do the version check once when a > > connection is established, but another round trip still complicates the > > picture quite a bit. Before you just need to send a metadata request to > > bootstrap yourself to the cluster, but now you need to do version > > negotiation before you can even do that, and then you need to try adapt > to > > the versions reported. Jay brought up the point that you probably > wouldn't > > design a protocol from scratch to work this way. Using the metadata > request > > would be better if it's possible, but you need a way to handle the fact > > that a broker's version might be stale by the time you connect to it. And > > even then you're going to have to deal internally with the complexity > > involved in trying to upgrade/downgrade dynamically, which sounds to me > > like it would have a ton of edge cases. > > > > Taking a bit of a step back, any solution is probably going to be painful > > since the Kafka protocol was not designed for this use case. Currently > what > > that means for clients that /want/ to support compatibility across broker > > versions is that they need to have the user tell them the broker version > > through configuration (e.g. librdkafka has a "protocol.version" field for > > this purpose). The only real problem with this in my mind is that we > don't > > have a graceful way to detect request incompatibility, which is why there > > are so many questions on the user list which basically amount to the > client > > hanging because the broker refuses to respond to a request it doesn't > > understand. If you solve this problem, then depending on configuration > > seems totally reasonable and we can skip trying to implement request > > version negotiation. Magnus's solution in this KIP may seem a little > hacky, > > but it also seems like the only way to do it without changing the header. > > > > The Spark problem mentioned above is interesting and I agree that it > sucks > > for frameworks that need to ship the kafka client library since they have > > to figure out how to bundle multiple versions. Ultimately if we want to > > solve this problem, then it sounds like we need to commit to maintaining > > compatibility with older versions of Kafka in the client going forward. > > That's a lot bigger decision and it matters less whether the broker > version > > is found through configuration, topic metadata, or a new request type. > > > > -Jason > > > > > > On Thu, Mar 3, 2016 at 3:59 PM, Becket Qin <becket....@gmail.com> wrote: > > > > > Hi Ashish, > > > > > > In approach (1), the clients will still be able to talked to multiple > > > versions of Kafka brokers as long as the clients version is not higher > > than > > > the broker version, right? > > > > > > From Spark's point of view, it seems the difference is whether Spark > can > > > independently update their Kafka clients dependency or not. More > > > specifically, consider the following three scenarios: > > > A. Spark has some new features that do not rely on clients or brokers > in > > a > > > new Kafka release. > > > B. Spark has some new features that only rely on the clients in a new > > Kafka > > > release, but not rely on the brokers in a new Kafka release. e.g. New > > > client provides a listTopic() method. > > > C. Spark has some new features that rely on both the clients and > brokers > > in > > > a new Kafka release. e.g timestamp field. > > > > > > For A, Spark does not need to update the Kafka dependency because there > > is > > > no need and the old clients can talk to both new and old Kafka brokers. > > > For C, Spark has to wait for broker upgrade anyways. > > > So in the above two scenarios, there is not much difference between > > > approach (1) and (2). > > > > > > B is a tricky scenario. Because it is possible that we introduce both > > > listTopic() and the timestamp field in the same Kafka release, and we > > don't > > > know if Spark needs both or only uses listTopic(). > > > This indicates the client should work fine if a method is supported and > > > should throw exception when a method is not supported. I think we can > do > > > the following: > > > 0. Clients always use its highest request version. The clients keeps a > > > static final map recording the minimum required ApiVersion for each > > > request. > > > 1. When connect to a broker, the clients always send an > ApiVersionRequest > > > to the broker. > > > 2. The broker replies with the its highest supported ApiVersion. > > > 3. Before sending a request, the clients checks the minimum required > > > ApiVersion for that request. If the broker returned ApiVersion is > higher > > > than this minimum required ApiVersion, then we can proceed. Otherwise > we > > > throw something like NotSupportedOperationException. > > > > > > With this approach, scenario B will also work unless Spark calls some > > > function that is not supported by the Kafka broker, which makes it > become > > > scenario C. > > > > > > Thoughts? > > > > > > Thanks, > > > > > > Jiangjie (Becket) Qin > > > > > > On Thu, Mar 3, 2016 at 11:24 AM, Ashish Singh <asi...@cloudera.com> > > wrote: > > > > > > > On Wed, Mar 2, 2016 at 8:29 PM, Becket Qin <becket....@gmail.com> > > wrote: > > > > > > > > > Hi Jason, > > > > > > > > > > I was thinking that every time when we connect to a broker, we > first > > > send > > > > > the version check request. (The version check request itself should > > be > > > > very > > > > > simple and never changes across all server releases.) This does add > > an > > > > > additional round trip, but given reconnect is rare, it is probably > > > fine. > > > > On > > > > > the client side, the client will always send request using the > lowest > > > > > supported version across all brokers. That means if a Kafka cluster > > is > > > > > downgrading, we will use the downgraded protocol as soon as the > > client > > > > > connected to an older broker. > > > > > > > > > This sounds interesting and very similar to current suggestion. > > However, > > > > just to make sure I am getting it right, you are suggesting send a > > > separate > > > > request only for release version? > > > > > > > > > > > > > > @Ashish, > > > > > Can you help me understand the pain points from other open source > > > > projects > > > > > that you mentioned a little more? There are two different levels of > > > > > requirements: > > > > > 1. User wants to know if the client is compatible with the broker > or > > > not. > > > > > 2. User wants the client and the broker to negotiate the protocol > on > > > > their > > > > > own. > > > > > > > > > Not sure which category it falls in, but below is the excerpt from > > Mark, > > > a > > > > spark dev, who has been trying to upgrade spark kafka integration to > > use > > > > 0.9 clients. > > > > > > > > Based on what I understand, users of Kafka need to upgrade their > > brokers > > > to > > > > Kafka 0.9.x first, before they upgrade their clients to Kafka 0.9.x. > > > > > > > > > However, that presents a problem to other projects that integrate > > with > > > > Kafka (Spark, Flume, Storm, etc.). From here on, I will speak for > > Spark + > > > > Kafka, since that's the one I am most familiar with. > > > > In the light of compatibility (or the lack thereof) between 0.8.x and > > > > 0.9.x, Spark is faced with a problem of what version(s) of Kafka to > be > > > > compatible with, and has 2 options (discussed in this PR > > > > <https://github.com/apache/spark/pull/11143>): > > > > 1. We either upgrade to Kafka 0.9, dropping support for 0.8. Storm > and > > > > Flume are already on this path. > > > > 2. We introduce complexity in our code to support both 0.8 and 0.9 > for > > > the > > > > entire duration of our next major release (Apache Spark 2.x). > > > > I'd love to hear your thoughts on which option, you recommend. > > > > Long term, I'd really appreciate if Kafka could do something that > > doesn't > > > > make Spark having to support two, or even more versions of Kafka. > And, > > if > > > > there is something that I, personally, and Spark project can do in > your > > > > next release candidate phase to make things easier, please do let us > > > know. > > > > > > > > This issue has made other projects worry about how they are going to > > keep > > > > up with Kafka releases. Last I heard, take this with a pinch of salt, > > > Spark > > > > folks are discussing about using Maven profiles to build against > > multiple > > > > Kafka versions at compile time, etc. Also, there are clients who are > > > > relying on class-loading tricks with custom implementation of OSGi to > > > solve > > > > such issues. Don't quote me on the stuff I just mentioned, as this is > > > what > > > > I have heard during casual discussions. The point I am trying to make > > is > > > > that Kafka clients are worried about being able to support multiple > > Kafka > > > > broker versions. I am sure we all agree on that. > > > > > > > > I think the second requirement makes more sense from a client > > > perspective. > > > > First req will just tell them that there is a problem, but no way to > > work > > > > around it. > > > > > > > > > > > > > > Currently in Kafka the principle we are following is to let clients > > > stick > > > > > to a certain version and server will adapt to the clients > > accordingly. > > > > > If this KIP doesn't want to break this rule, it seems we should > > simply > > > > let > > > > > the clients send the ApiVersion it is using to the brokers and the > > > > brokers > > > > > will decide whether to accept or reject the clients. This means > user > > > have > > > > > to upgrade broker before they upgrade clients. This satisfies (1) > so > > > > that a > > > > > newer client will know it does not compatible with an older server > > > > > immediately. > > > > > If this KIP will change that to let the newer clients adapt to the > > > older > > > > > brokers, compatibility wise it is a good thing to have. With this > > now > > > > > users are able to upgrade clients before they upgrade Kafka > brokers. > > > This > > > > > means user can upgrade clients even before upgrade servers. This > > > > satisfies > > > > > (2) as the newer clients can also talk to the older servers. > > > > > > > > > More importantly, this will allow a client to talk to multiple > versions > > > of > > > > Kafka. > > > > > > > > > > > > > > > > If we decide to go with (2). The benefit is that a newer client > won't > > > > break > > > > > when talking to an older broker. But functionality wise, it might > be > > > the > > > > > same as an older clients. > > > > > In the downgrading case, we probably still have to notify all the > > > users. > > > > > For example, if application is sending messages with timestamp and > > the > > > > > broker got downgraded to an older version that does not support > > > > timestamp. > > > > > The clients will suddenly start to throw away timestamps. This > might > > > > affect > > > > > the application logic. In this case even if we have clients > > > automatically > > > > > adapted to a lower version broker, the applications might still > > break. > > > > > Hence we still need to notify the users about the case when the > > clients > > > > is > > > > > newer than the brokers. This is the same for both (1) and (2). > > > > > Supporting (2) will introduce more complication on the client side. > > And > > > > we > > > > > may also have to communicate with users about what function is > > > supported > > > > in > > > > > the new clients and what is not supported after the protocol > > > negotiation > > > > > finishes. > > > > > > > > > Totally agreed, however only if clients want to support multiple > broker > > > > versions. If they want to, then I am sure they are willing to add > some > > > > logic on their end. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:58 PM, Dana Powers <dana.pow...@gmail.com > > > > > > wrote: > > > > > > > > > > > In kafka-python we've been doing something like: > > > > > > > > > > > > if version >= (0, 9): > > > > > > Do cool new stuff > > > > > > elif version >= (0, 8, 2): > > > > > > Do some older stuff > > > > > > .... > > > > > > else: > > > > > > raise UnsupportedVersionError > > > > > > > > > > > > This will break if / when the new 0.9 apis are completely removed > > > from > > > > > some > > > > > > future release, but should handle intermediate broker upgrades. > > > Because > > > > > we > > > > > > can't add support for future apis a priori, I think the best we > > could > > > > do > > > > > > here is throw an error that request protocol version X is not > > > > supported. > > > > > > For now that comes through as a broken socket connection, so > there > > is > > > > an > > > > > > error - just not a super helpful one. > > > > > > > > > > > > For that reason I'm also in favor of a generic error response > when > > a > > > > > > protocol req is not recognized. > > > > > > > > > > > > -Dana > > > > > > On Mar 2, 2016 5:38 PM, "Jay Kreps" <j...@confluent.io> wrote: > > > > > > > > > > > > > But won't it be the case that what clients end up doing would > be > > > > > > something > > > > > > > like > > > > > > > if(version != 0.8.1) > > > > > > > throw new UnsupportedVersionException() > > > > > > > which then means the client is broken as soon as we release a > new > > > > > server > > > > > > > version even though the protocol didn't change. I'm actually > not > > > sure > > > > > how > > > > > > > you could use that information in a forward compatible way > since > > > you > > > > > > can't > > > > > > > know a priori if you will work with the next release until you > > know > > > > if > > > > > > the > > > > > > > protocol changed. > > > > > > > > > > > > > > -Jay > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson < > > > ja...@confluent.io> > > > > > > > wrote: > > > > > > > > > > > > > > > Hey Jay, > > > > > > > > > > > > > > > > Yeah, I wasn't suggesting that we eliminate request API > > versions. > > > > > > They're > > > > > > > > definitely needed on the broker to support compatibility. I > was > > > > just > > > > > > > saying > > > > > > > > that if a client wants to support multiple broker versions > > (e.g. > > > > 0.8 > > > > > > and > > > > > > > > 0.9), then it makes more sense to me to make the kafka > release > > > > > version > > > > > > > > available in order to determine which version of the request > > API > > > > > should > > > > > > > be > > > > > > > > used rather than adding a new request type which exposes all > of > > > the > > > > > > > > different supported versions for all of the request types. > > > Request > > > > > API > > > > > > > > versions all change in lockstep with Kafka releases anyway. > > > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin < > > becket....@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > I think using Kafka release version makes sense. More > > > > particularly, > > > > > > we > > > > > > > > can > > > > > > > > > use the ApiVersion and this will cover all the interval > > version > > > > as > > > > > > > well. > > > > > > > > In > > > > > > > > > KAFKA-3025, we added the ApiVersion to message format > version > > > > > > mapping, > > > > > > > We > > > > > > > > > can add the ApiKey to version mapping to ApiVersion as > well. > > We > > > > can > > > > > > > move > > > > > > > > > ApiVersion class to o.a.k.c package and use it for both > > server > > > > and > > > > > > > > clients. > > > > > > > > > > > > > > > > > > @Jason, if we cache the release info in metadata and not > > > > > re-validate > > > > > > > the > > > > > > > > > release on reconnect, would it still work if we do a > rolling > > > > > > downgrade? > > > > > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson < > > > > > ja...@confluent.io> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > I think Dana's suggestion to include the Kafka release > > > version > > > > > > makes > > > > > > > a > > > > > > > > > lot > > > > > > > > > > of sense. I'm actually wondering why you would need the > > > > > individual > > > > > > > API > > > > > > > > > > versions if you have that? It sounds like keeping track > of > > > all > > > > > the > > > > > > > api > > > > > > > > > > version information would add a lot of complexity to > > clients > > > > > since > > > > > > > > > they'll > > > > > > > > > > have to try to handle different version permutations > which > > > are > > > > > not > > > > > > > > > actually > > > > > > > > > > possible in practice. Wouldn't it be simpler to know that > > > > you're > > > > > > > > talking > > > > > > > > > to > > > > > > > > > > an 0.9 broker than that you're talking to a broker which > > > > supports > > > > > > > > > version 2 > > > > > > > > > > of the group coordinator request, version 1 of fetch > > request, > > > > > etc? > > > > > > > > Also, > > > > > > > > > > the release version could be included in the broker > > > information > > > > > in > > > > > > > the > > > > > > > > > > topic metadata request which would save the need for the > > > > > additional > > > > > > > > round > > > > > > > > > > trip on every reconnect. > > > > > > > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh < > > > > > asi...@cloudera.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira < > > > > > g...@confluent.io> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > One more thing, the KIP actually had 3 parts: > > > > > > > > > > > > 1. The version protocol > > > > > > > > > > > > 2. New response on messages of wrong API key or wrong > > > > version > > > > > > > > > > > > 3. Protocol documentation > > > > > > > > > > > > > > > > > > > > > > > There is a WIP patch for adding protocol docs, > > > > > > > > > > > https://github.com/apache/kafka/pull/970 . By protocol > > > > > > > > documentation, > > > > > > > > > > you > > > > > > > > > > > mean updating this, right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I understand that you are offering to only implement > > part > > > > 1? > > > > > > > > > > > > But the KIP discussion and vote should still cover > all > > > > three > > > > > > > parts, > > > > > > > > > > > > they will just be implemented in separate JIRA? > > > > > > > > > > > > > > > > > > > > > > > The patch for KAFKA-3307, > > > > > > https://github.com/apache/kafka/pull/986 > > > > > > > , > > > > > > > > > > covers > > > > > > > > > > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we > > > should > > > > > > > include > > > > > > > > > all > > > > > > > > > > > the three points you mentioned while discussing or > voting > > > for > > > > > > > KIP-35. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh < > > > > > > > asi...@cloudera.com> > > > > > > > > > > > wrote: > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira < > > > > > > > g...@confluent.io> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > >> I don't see a use for the name - clients should be > > > able > > > > to > > > > > > > > > translate > > > > > > > > > > > > >> ApiKey to name for any API they support, and I'm > not > > > > sure > > > > > > why > > > > > > > > > would > > > > > > > > > > a > > > > > > > > > > > > >> client need to log anything about APIs it does not > > > > > support. > > > > > > > Am I > > > > > > > > > > > > >> missing something? > > > > > > > > > > > > >> > > > > > > > > > > > > > Yea, it is a fair assumption that client would know > > > about > > > > > > APIs > > > > > > > it > > > > > > > > > > > > supports. > > > > > > > > > > > > > It could have been helpful for client users to see > > new > > > > APIs > > > > > > > > though, > > > > > > > > > > > > however > > > > > > > > > > > > > users can always refer to protocol doc of new > version > > > to > > > > > find > > > > > > > > > > > > corresponding > > > > > > > > > > > > > names of the new APIs. > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> On a related note, Magnus is currently on > vacation, > > > but > > > > he > > > > > > > > should > > > > > > > > > be > > > > > > > > > > > > >> back at the end of next week. I'd like to hold off > > on > > > > the > > > > > > vote > > > > > > > > > until > > > > > > > > > > > > >> he gets back since his experience in implementing > > > > clients > > > > > > and > > > > > > > > his > > > > > > > > > > > > >> opinions will be very valuable for this > discussion. > > > > > > > > > > > > >> > > > > > > > > > > > > > That is great. It will be valuable to have his > > > feedback. > > > > I > > > > > > will > > > > > > > > > hold > > > > > > > > > > > off > > > > > > > > > > > > on > > > > > > > > > > > > > removing "api_name" and "api_deprecated_versions" > or > > > > adding > > > > > > > > release > > > > > > > > > > > > version. > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> Gwen > > > > > > > > > > > > >> > > > > > > > > > > > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh < > > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> > Works with me. I will update PR to remove this. > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > Also, "api_name" have been pointed out as a > > concern. > > > > > > > However, > > > > > > > > it > > > > > > > > > > can > > > > > > > > > > > > be > > > > > > > > > > > > >> > handy for logging and similar purposes. Any take > > on > > > > > that? > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira < > > > > > > > > g...@confluent.io > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> > > > > > > > > > > > > > >> >> Jay also mentioned: > > > > > > > > > > > > >> >> "Or, alternately, since deprecation has no > > > functional > > > > > > > impact > > > > > > > > > and > > > > > > > > > > is > > > > > > > > > > > > >> >> just a message > > > > > > > > > > > > >> >> to developers, we could just leave it out of > the > > > > > protocol > > > > > > > and > > > > > > > > > > just > > > > > > > > > > > > have > > > > > > > > > > > > >> it > > > > > > > > > > > > >> >> in release notes etc." > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> I'm in favor of leaving it out of the > protocol. I > > > > can't > > > > > > > > really > > > > > > > > > > see > > > > > > > > > > > a > > > > > > > > > > > > >> >> use-case. > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> Gwen > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh < > > > > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > I hope it is OK for me to make some progress > > > here. > > > > I > > > > > > have > > > > > > > > > made > > > > > > > > > > > the > > > > > > > > > > > > >> >> > following changes. > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > 1. Updated KIP-35, to adopt Jay's suggestion > on > > > > > > > maintaining > > > > > > > > > > > > separate > > > > > > > > > > > > >> list > > > > > > > > > > > > >> >> > of deprecated versions, instead of using a > > > version > > > > of > > > > > > -1. > > > > > > > > > > > > >> >> > 2. Added information on required permissions, > > > > > Describe > > > > > > > > action > > > > > > > > > > on > > > > > > > > > > > > >> Cluster > > > > > > > > > > > > >> >> > resource, to be able to retrieve protocol > > > versions > > > > > > from a > > > > > > > > > auth > > > > > > > > > > > > enabled > > > > > > > > > > > > >> >> > Kafka cluster. > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > Created > > > > > > https://issues.apache.org/jira/browse/KAFKA-3304 > > > > > > > . > > > > > > > > > > > Primary > > > > > > > > > > > > >> patch > > > > > > > > > > > > >> >> is > > > > > > > > > > > > >> >> > available to review, > > > > > > > > > https://github.com/apache/kafka/pull/986 > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish > Singh < > > > > > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > > > > > > >> >> wrote: > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > Kafka clients in Hadoop ecosystem, Flume, > > > Spark, > > > > > etc, > > > > > > > > have > > > > > > > > > > > found > > > > > > > > > > > > it > > > > > > > > > > > > >> >> > really > > > > > > > > > > > > >> >> > > difficult to cope up with Kafka releases as > > > they > > > > > want > > > > > > > to > > > > > > > > > > > support > > > > > > > > > > > > >> users > > > > > > > > > > > > >> >> on > > > > > > > > > > > > >> >> > > different Kafka versions. Capability to > > > retrieve > > > > > > > protocol > > > > > > > > > > > version > > > > > > > > > > > > >> will > > > > > > > > > > > > >> >> > go a > > > > > > > > > > > > >> >> > > long way to ease out those pain points. I > > will > > > be > > > > > > happy > > > > > > > > to > > > > > > > > > > help > > > > > > > > > > > > out > > > > > > > > > > > > >> >> with > > > > > > > > > > > > >> >> > > the work on this KIP. @Magnus, thanks for > > > driving > > > > > > this, > > > > > > > > is > > > > > > > > > it > > > > > > > > > > > OK > > > > > > > > > > > > if > > > > > > > > > > > > >> I > > > > > > > > > > > > >> >> > carry > > > > > > > > > > > > >> >> > > forward the work from here. It will be > ideal > > to > > > > > have > > > > > > > this > > > > > > > > > in > > > > > > > > > > > > >> 0.10.0.0. > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps > < > > > > > > > > > j...@confluent.io > > > > > > > > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > >> I wonder if we need to solve the error > > > problem? > > > > I > > > > > > > think > > > > > > > > > this > > > > > > > > > > > KIP > > > > > > > > > > > > >> >> gives a > > > > > > > > > > > > >> >> > >> descent work around. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> Probably we should have included an error > in > > > the > > > > > > > > response > > > > > > > > > > > > header, > > > > > > > > > > > > >> but > > > > > > > > > > > > >> >> we > > > > > > > > > > > > >> >> > >> debated it at the time decided not to and > > now > > > it > > > > > is > > > > > > > > pretty > > > > > > > > > > > hard > > > > > > > > > > > > to > > > > > > > > > > > > >> add > > > > > > > > > > > > >> >> > >> because the headers aren't versioned > (d'oh). > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> It seems like any other solution is going > to > > > be > > > > > kind > > > > > > > of > > > > > > > > a > > > > > > > > > > > hack, > > > > > > > > > > > > >> right? > > > > > > > > > > > > >> >> > >> Sending malformed responses back seems > like > > > not > > > > a > > > > > > > clean > > > > > > > > > > > > solution... > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> (Not sure if I was pro- having a top-level > > > error > > > > > or > > > > > > > not, > > > > > > > > > but > > > > > > > > > > > in > > > > > > > > > > > > any > > > > > > > > > > > > >> >> case > > > > > > > > > > > > >> >> > >> the rationale for the decision was that so > > > many > > > > of > > > > > > the > > > > > > > > > > > requests > > > > > > > > > > > > >> were > > > > > > > > > > > > >> >> > >> per-partition or per-topic or whatever and > > > hence > > > > > > fail > > > > > > > or > > > > > > > > > > > > succeed at > > > > > > > > > > > > >> >> that > > > > > > > > > > > > >> >> > >> level and this makes it hard to know what > > the > > > > > right > > > > > > > > > > top-level > > > > > > > > > > > > error > > > > > > > > > > > > >> >> code > > > > > > > > > > > > >> >> > >> is > > > > > > > > > > > > >> >> > >> and hard for the client to figure out what > > to > > > do > > > > > > with > > > > > > > > the > > > > > > > > > > top > > > > > > > > > > > > level > > > > > > > > > > > > >> >> > error > > > > > > > > > > > > >> >> > >> if some of the partitions succeed but > there > > > is a > > > > > > > > top-level > > > > > > > > > > > > error). > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> I think actually this new API actually > > gives a > > > > way > > > > > > to > > > > > > > > > handle > > > > > > > > > > > > this > > > > > > > > > > > > >> >> > >> gracefully on the client side by just > having > > > > > clients > > > > > > > > that > > > > > > > > > > want > > > > > > > > > > > > to > > > > > > > > > > > > >> be > > > > > > > > > > > > >> >> > >> graceful check for support for their > > version. > > > > > > Clients > > > > > > > > that > > > > > > > > > > do > > > > > > > > > > > > that > > > > > > > > > > > > >> >> will > > > > > > > > > > > > >> >> > >> have a graceful message. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> At some point if we're ever reworking the > > > > headers > > > > > we > > > > > > > > > should > > > > > > > > > > > > really > > > > > > > > > > > > >> >> > >> consider > > > > > > > > > > > > >> >> > >> (a) versioning them and (b) adding a > > top-level > > > > > error > > > > > > > > code > > > > > > > > > in > > > > > > > > > > > the > > > > > > > > > > > > >> >> > response. > > > > > > > > > > > > >> >> > >> But given this would be a big breaking > > change > > > > and > > > > > > this > > > > > > > > is > > > > > > > > > > > really > > > > > > > > > > > > >> just > > > > > > > > > > > > >> >> to > > > > > > > > > > > > >> >> > >> give a nicer error message seems like it > > > > probably > > > > > > > isn't > > > > > > > > > > worth > > > > > > > > > > > > it to > > > > > > > > > > > > >> >> try > > > > > > > > > > > > >> >> > to > > > > > > > > > > > > >> >> > >> do something now. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> -Jay > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie > > Qin > > > > > > > > > > > > >> >> <j...@linkedin.com.invalid > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > >> wrote: > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > I am thinking instead of returning an > > empty > > > > > > > response, > > > > > > > > it > > > > > > > > > > > > would be > > > > > > > > > > > > >> >> > >> better to > > > > > > > > > > > > >> >> > >> > return an explicit > > > UnsupportedVersionException > > > > > > code. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Today KafkaApis handles the error in the > > > > > following > > > > > > > > way: > > > > > > > > > > > > >> >> > >> > 1. For requests/responses using old > Scala > > > > > classes, > > > > > > > > > > KafkaApis > > > > > > > > > > > > uses > > > > > > > > > > > > >> >> > >> > RequestOrResponse.handleError() to > return > > an > > > > > error > > > > > > > > > > response. > > > > > > > > > > > > >> >> > >> > 2. For requests/response using Java > > classes > > > > > (only > > > > > > > > > > > > >> JoinGroupRequest > > > > > > > > > > > > >> >> and > > > > > > > > > > > > >> >> > >> > Heartbeat now), KafkaApis calls > > > > > > > > > > > > >> AbstractRequest.getErrorResponse() > > > > > > > > > > > > >> >> to > > > > > > > > > > > > >> >> > >> > return an error response. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > In KAFKA-2512, I am returning an > > > > > > > > > > UnsupportedVersionException > > > > > > > > > > > > for > > > > > > > > > > > > >> >> case > > > > > > > > > > > > >> >> > >> [1] > > > > > > > > > > > > >> >> > >> > when see an unsupported version. This > will > > > put > > > > > the > > > > > > > > error > > > > > > > > > > > code > > > > > > > > > > > > per > > > > > > > > > > > > >> >> > topic > > > > > > > > > > > > >> >> > >> or > > > > > > > > > > > > >> >> > >> > partition for most of the requests, but > > > might > > > > > not > > > > > > > work > > > > > > > > > all > > > > > > > > > > > the > > > > > > > > > > > > >> time. > > > > > > > > > > > > >> >> > >> e.g. > > > > > > > > > > > > >> >> > >> > TopicMetadataRequest with an empty topic > > > set. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Case [2] does not quite work for > > unsupported > > > > > > > version, > > > > > > > > > > > because > > > > > > > > > > > > we > > > > > > > > > > > > >> >> will > > > > > > > > > > > > >> >> > >> > thrown an uncaught exception when > version > > is > > > > not > > > > > > > > > > recognized > > > > > > > > > > > > (BTW > > > > > > > > > > > > >> >> this > > > > > > > > > > > > >> >> > >> is a > > > > > > > > > > > > >> >> > >> > bug). Part of the reason is that for > some > > > > > response > > > > > > > > > types, > > > > > > > > > > > > error > > > > > > > > > > > > >> code > > > > > > > > > > > > >> >> > is > > > > > > > > > > > > >> >> > >> not > > > > > > > > > > > > >> >> > >> > part of the response level field. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Maybe it worth checking how each > response > > is > > > > > > dealing > > > > > > > > > with > > > > > > > > > > > > error > > > > > > > > > > > > >> code > > > > > > > > > > > > >> >> > >> today. > > > > > > > > > > > > >> >> > >> > A scan of the response formats gives the > > > > > following > > > > > > > > > result: > > > > > > > > > > > > >> >> > >> > 1. TopicMetadataResponse - per topic > error > > > > code, > > > > > > > does > > > > > > > > > not > > > > > > > > > > > work > > > > > > > > > > > > >> when > > > > > > > > > > > > >> >> > the > > > > > > > > > > > > >> >> > >> > topic set is empty in the request. > > > > > > > > > > > > >> >> > >> > 2. ProduceResonse - per partition error > > > code. > > > > > > > > > > > > >> >> > >> > 3. OffsetCommitResponse - per partition. > > > > > > > > > > > > >> >> > >> > 4. OffsetFetchResponse - per partition. > > > > > > > > > > > > >> >> > >> > 5. OffsetResponse - per partition. > > > > > > > > > > > > >> >> > >> > 6. FetchResponse - per partition > > > > > > > > > > > > >> >> > >> > 7. ConsumerMetadataResponse - response > > level > > > > > > > > > > > > >> >> > >> > 8. ControlledShutdownResponse - response > > > level > > > > > > > > > > > > >> >> > >> > 9. JoinGroupResponse - response level > > > > > > > > > > > > >> >> > >> > 10. HearbeatResponse - response level > > > > > > > > > > > > >> >> > >> > 11. LeaderAndIsrResponse - response > level > > > > > > > > > > > > >> >> > >> > 12. StopReplicaResponse - response level > > > > > > > > > > > > >> >> > >> > 13. UpdateMetadataResponse - response > > level > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > So from the list above it looks for each > > > > > response > > > > > > we > > > > > > > > are > > > > > > > > > > > > actually > > > > > > > > > > > > >> >> able > > > > > > > > > > > > >> >> > >> to > > > > > > > > > > > > >> >> > >> > return an error code, as long as we make > > > sure > > > > > the > > > > > > > > topic > > > > > > > > > or > > > > > > > > > > > > >> partition > > > > > > > > > > > > >> >> > >> won't > > > > > > > > > > > > >> >> > >> > be empty when the error code is at topic > > or > > > > > > > partition > > > > > > > > > > level. > > > > > > > > > > > > >> Luckily > > > > > > > > > > > > >> >> > in > > > > > > > > > > > > >> >> > >> the > > > > > > > > > > > > >> >> > >> > above list we only need to worry about > > > > > > > > > > > TopicMetadataResponse. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Maybe error handling is out of the scope > > of > > > > this > > > > > > > KIP, > > > > > > > > > but > > > > > > > > > > I > > > > > > > > > > > > >> prefer > > > > > > > > > > > > >> >> we > > > > > > > > > > > > >> >> > >> think > > > > > > > > > > > > >> >> > >> > through how to deal with error code for > > the > > > > > > > requests, > > > > > > > > > > > because > > > > > > > > > > > > >> there > > > > > > > > > > > > >> >> > are > > > > > > > > > > > > >> >> > >> > more request types to be added in > > KAFKA-2464 > > > > and > > > > > > > > future > > > > > > > > > > > > patches. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Thanks, > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > Jiangjie (Becket) Qin > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > On Mon, Oct 12, 2015 at 6:04 PM, Jay > > Kreps < > > > > > > > > > > > j...@confluent.io> > > > > > > > > > > > > >> >> wrote: > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > Two quick pieces of feedback: > > > > > > > > > > > > >> >> > >> > > 1. The use of a version of -1 as > magical > > > > entry > > > > > > > > > dividing > > > > > > > > > > > > between > > > > > > > > > > > > >> >> > >> > deprecated > > > > > > > > > > > > >> >> > >> > > versions is a bit hacky. What about > > > instead > > > > > > having > > > > > > > > an > > > > > > > > > > > array > > > > > > > > > > > > of > > > > > > > > > > > > >> >> > >> supported > > > > > > > > > > > > >> >> > >> > > versions and a separate array of > > > deprecated > > > > > > > > versions. > > > > > > > > > > The > > > > > > > > > > > > >> >> deprecated > > > > > > > > > > > > >> >> > >> > > versions would always be a subset of > the > > > > > > supported > > > > > > > > > > > versions. > > > > > > > > > > > > >> Or, > > > > > > > > > > > > >> >> > >> > > alternately, since deprecation has no > > > > > functional > > > > > > > > > impact > > > > > > > > > > > and > > > > > > > > > > > > is > > > > > > > > > > > > >> >> just > > > > > > > > > > > > >> >> > a > > > > > > > > > > > > >> >> > >> > > message to developers, we could just > > leave > > > > it > > > > > > out > > > > > > > of > > > > > > > > > the > > > > > > > > > > > > >> protocol > > > > > > > > > > > > >> >> > and > > > > > > > > > > > > >> >> > >> > just > > > > > > > > > > > > >> >> > >> > > have it in release notes etc. > > > > > > > > > > > > >> >> > >> > > 2. I think including the api name may > > > cause > > > > > some > > > > > > > > > > problems. > > > > > > > > > > > > >> >> Currently > > > > > > > > > > > > >> >> > >> the > > > > > > > > > > > > >> >> > >> > > api key is the primary key that we > keep > > > > > > consistent > > > > > > > > but > > > > > > > > > > we > > > > > > > > > > > > have > > > > > > > > > > > > >> >> > >> actually > > > > > > > > > > > > >> >> > >> > > evolved the english description of the > > > apis > > > > as > > > > > > > they > > > > > > > > > have > > > > > > > > > > > > >> changed. > > > > > > > > > > > > >> >> > The > > > > > > > > > > > > >> >> > >> > only > > > > > > > > > > > > >> >> > >> > > use I can think of for the name would > be > > > if > > > > > > people > > > > > > > > > used > > > > > > > > > > > the > > > > > > > > > > > > >> >> logical > > > > > > > > > > > > >> >> > >> name > > > > > > > > > > > > >> >> > >> > > and tried to resolve the api key, but > > that > > > > > would > > > > > > > be > > > > > > > > > > wrong. > > > > > > > > > > > > Not > > > > > > > > > > > > >> >> sure > > > > > > > > > > > > >> >> > >> if we > > > > > > > > > > > > >> >> > >> > > actually need the english name, if > there > > > is > > > > a > > > > > > use > > > > > > > > > case I > > > > > > > > > > > > guess > > > > > > > > > > > > >> >> we'll > > > > > > > > > > > > >> >> > >> just > > > > > > > > > > > > >> >> > >> > > have to be very clear that the name is > > > just > > > > > > > > > > documentation > > > > > > > > > > > > and > > > > > > > > > > > > >> can > > > > > > > > > > > > >> >> > >> change > > > > > > > > > > > > >> >> > >> > > any time. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > -Jay > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > On Fri, Sep 25, 2015 at 2:53 PM, > Magnus > > > > > > Edenhill < > > > > > > > > > > > > >> >> > mag...@edenhill.se> > > > > > > > > > > > > >> >> > >> > > wrote: > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > Good evening, > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > KIP-35 was created to address > current > > > and > > > > > > future > > > > > > > > > > > > >> broker-client > > > > > > > > > > > > >> >> > >> > > > compatibility. > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > Summary: > > > > > > > > > > > > >> >> > >> > > > * allow clients to retrieve the > > > broker's > > > > > > > protocol > > > > > > > > > > > version > > > > > > > > > > > > >> >> > >> > > > * make broker handle unknown > protocol > > > > > > requests > > > > > > > > > > > gracefully > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > Feedback and comments welcome! > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > Regards, > > > > > > > > > > > > >> >> > >> > > > Magnus > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > -- > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > Regards, > > > > > > > > > > > > >> >> > > Ashish > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > -- > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > Regards, > > > > > > > > > > > > >> >> > Ashish > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > -- > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > Regards, > > > > > > > > > > > > >> > Ashish > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > Ashish > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > Ashish > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Regards, > > > > Ashish > > > > > > > > > > -- Regards, Ashish