Yeah here is my summary of my take: 1. Negotiating a per-connection protocol actually does add a lot of complexity to clients (many more failure states to get right).
2. Having the client configure the protocol version manually is doable now but probably a worse state. I suspect this will lead to more not less confusion. 3. I don't think the current state is actually that bad. Integrators pick a conservative version and build against that. There is a tradeoff between getting the new features and being compatible with old Kafka versions. But a large part of this tradeoff is essential since new features aren't going to magically appear on old servers, so even if you upgrade your client you likely aren't going to get the new stuff (since we will end up dynamically turning it off). Having client features that are there but don't work because you're on an old cluster may actually be a worse experience if not handled very carefully.. 4. The problems Dana brought up are totally orthogonal to the problem of having per-api versions or overall versions. The problem was that we changed behavior subtly without changing the version. This will be an issue regardless of whether the version is global or not. 5. Using the broker release as the version is strictly worse than using a global protocol version (0, 1, 2, ...) that increments any time any api changes but doesn't increment just because non-protocol code is changed. The problem with using the broker release version is we want to be able to keep Kafka releasable from any commit which means there isn't as clear a sequencing of releases as you would think. 6. We need to consider the case of mixed version clusters during the time period when you are upgrading Kafka. So overall I think this is not a critical thing to do right now, but if we are going to do it we should do it in a way that actually improves things. Here would be one proposal for that: a. Add a global protocol version that increments with any api version update. Move the documentation so that the docs are by version. This is basically just a short-hand for a complete set of supported api versions. b. Include a field in the metadata response for each broker that adds the protocol version. c. To maintain the protocol version this information will have to get propagated with the rest of the broker metadata like host, port, id, etc. The instructions to clients would be: - By default you build against a single conservative Kafka protocol version and we carry that support forward, as today - If you want to get fancy you can use the protocol version field in the metadata request to more dynamically chose what features are available and select api versions appropriately. This is purely optional. -Jay On Thu, Mar 3, 2016 at 9:38 PM, Jason Gustafson <ja...@confluent.io> wrote: > I talked with Jay about this KIP briefly this morning, so let me try to > summarize the discussion (I'm sure he'll jump in if I get anything wrong). > Apologies in advance for the length. > > I think we both share some skepticism that a request with all the supported > versions of all the request APIs is going to be a useful primitive to try > and build client compatibility around. In practice I think people would end > up checking for particular request versions in order to determine if the > broker is 0.8 or 0.9 or whatever, and then change behavior accordingly. I'm > wondering if there's a reasonable way to handle the version responses that > doesn't amount to that. Maybe you could try to capture feature > compatibility by checking the versions for a subset of request types? For > example, to ensure that you can use the new consumer API, you check that > the group coordinator request is present, the offset commit request version > is greater than 2, the offset fetch request is greater than 1, and the join > group request is present. And to ensure compatibility with KIP-32, maybe > you only need to check the appropriate versions of the fetch and produce > requests. That sounds kind of complicated to keep track of and you probably > end up trying to handle combinations which aren't even possible in > practice. > > The alternative is to use a single API version. It could be the Kafka > release version, but then you need to figure out how to handle users who > are running off of trunk since multiple API versions will typically change > between releases. Perhaps it makes more sense to keep a separate API > version number which is incremented every time any one of the API versions > increases? This also decouples the protocol from the Kafka distribution. > > As far as whether there should be a separate request or not, I get Becket's > point that you would only need to do the version check once when a > connection is established, but another round trip still complicates the > picture quite a bit. Before you just need to send a metadata request to > bootstrap yourself to the cluster, but now you need to do version > negotiation before you can even do that, and then you need to try adapt to > the versions reported. Jay brought up the point that you probably wouldn't > design a protocol from scratch to work this way. Using the metadata request > would be better if it's possible, but you need a way to handle the fact > that a broker's version might be stale by the time you connect to it. And > even then you're going to have to deal internally with the complexity > involved in trying to upgrade/downgrade dynamically, which sounds to me > like it would have a ton of edge cases. > > Taking a bit of a step back, any solution is probably going to be painful > since the Kafka protocol was not designed for this use case. Currently what > that means for clients that /want/ to support compatibility across broker > versions is that they need to have the user tell them the broker version > through configuration (e.g. librdkafka has a "protocol.version" field for > this purpose). The only real problem with this in my mind is that we don't > have a graceful way to detect request incompatibility, which is why there > are so many questions on the user list which basically amount to the client > hanging because the broker refuses to respond to a request it doesn't > understand. If you solve this problem, then depending on configuration > seems totally reasonable and we can skip trying to implement request > version negotiation. Magnus's solution in this KIP may seem a little hacky, > but it also seems like the only way to do it without changing the header. > > The Spark problem mentioned above is interesting and I agree that it sucks > for frameworks that need to ship the kafka client library since they have > to figure out how to bundle multiple versions. Ultimately if we want to > solve this problem, then it sounds like we need to commit to maintaining > compatibility with older versions of Kafka in the client going forward. > That's a lot bigger decision and it matters less whether the broker version > is found through configuration, topic metadata, or a new request type. > > -Jason > > > On Thu, Mar 3, 2016 at 3:59 PM, Becket Qin <becket....@gmail.com> wrote: > > > Hi Ashish, > > > > In approach (1), the clients will still be able to talked to multiple > > versions of Kafka brokers as long as the clients version is not higher > than > > the broker version, right? > > > > From Spark's point of view, it seems the difference is whether Spark can > > independently update their Kafka clients dependency or not. More > > specifically, consider the following three scenarios: > > A. Spark has some new features that do not rely on clients or brokers in > a > > new Kafka release. > > B. Spark has some new features that only rely on the clients in a new > Kafka > > release, but not rely on the brokers in a new Kafka release. e.g. New > > client provides a listTopic() method. > > C. Spark has some new features that rely on both the clients and brokers > in > > a new Kafka release. e.g timestamp field. > > > > For A, Spark does not need to update the Kafka dependency because there > is > > no need and the old clients can talk to both new and old Kafka brokers. > > For C, Spark has to wait for broker upgrade anyways. > > So in the above two scenarios, there is not much difference between > > approach (1) and (2). > > > > B is a tricky scenario. Because it is possible that we introduce both > > listTopic() and the timestamp field in the same Kafka release, and we > don't > > know if Spark needs both or only uses listTopic(). > > This indicates the client should work fine if a method is supported and > > should throw exception when a method is not supported. I think we can do > > the following: > > 0. Clients always use its highest request version. The clients keeps a > > static final map recording the minimum required ApiVersion for each > > request. > > 1. When connect to a broker, the clients always send an ApiVersionRequest > > to the broker. > > 2. The broker replies with the its highest supported ApiVersion. > > 3. Before sending a request, the clients checks the minimum required > > ApiVersion for that request. If the broker returned ApiVersion is higher > > than this minimum required ApiVersion, then we can proceed. Otherwise we > > throw something like NotSupportedOperationException. > > > > With this approach, scenario B will also work unless Spark calls some > > function that is not supported by the Kafka broker, which makes it become > > scenario C. > > > > Thoughts? > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > On Thu, Mar 3, 2016 at 11:24 AM, Ashish Singh <asi...@cloudera.com> > wrote: > > > > > On Wed, Mar 2, 2016 at 8:29 PM, Becket Qin <becket....@gmail.com> > wrote: > > > > > > > Hi Jason, > > > > > > > > I was thinking that every time when we connect to a broker, we first > > send > > > > the version check request. (The version check request itself should > be > > > very > > > > simple and never changes across all server releases.) This does add > an > > > > additional round trip, but given reconnect is rare, it is probably > > fine. > > > On > > > > the client side, the client will always send request using the lowest > > > > supported version across all brokers. That means if a Kafka cluster > is > > > > downgrading, we will use the downgraded protocol as soon as the > client > > > > connected to an older broker. > > > > > > > This sounds interesting and very similar to current suggestion. > However, > > > just to make sure I am getting it right, you are suggesting send a > > separate > > > request only for release version? > > > > > > > > > > > @Ashish, > > > > Can you help me understand the pain points from other open source > > > projects > > > > that you mentioned a little more? There are two different levels of > > > > requirements: > > > > 1. User wants to know if the client is compatible with the broker or > > not. > > > > 2. User wants the client and the broker to negotiate the protocol on > > > their > > > > own. > > > > > > > Not sure which category it falls in, but below is the excerpt from > Mark, > > a > > > spark dev, who has been trying to upgrade spark kafka integration to > use > > > 0.9 clients. > > > > > > Based on what I understand, users of Kafka need to upgrade their > brokers > > to > > > Kafka 0.9.x first, before they upgrade their clients to Kafka 0.9.x. > > > > > > > However, that presents a problem to other projects that integrate > with > > > Kafka (Spark, Flume, Storm, etc.). From here on, I will speak for > Spark + > > > Kafka, since that's the one I am most familiar with. > > > In the light of compatibility (or the lack thereof) between 0.8.x and > > > 0.9.x, Spark is faced with a problem of what version(s) of Kafka to be > > > compatible with, and has 2 options (discussed in this PR > > > <https://github.com/apache/spark/pull/11143>): > > > 1. We either upgrade to Kafka 0.9, dropping support for 0.8. Storm and > > > Flume are already on this path. > > > 2. We introduce complexity in our code to support both 0.8 and 0.9 for > > the > > > entire duration of our next major release (Apache Spark 2.x). > > > I'd love to hear your thoughts on which option, you recommend. > > > Long term, I'd really appreciate if Kafka could do something that > doesn't > > > make Spark having to support two, or even more versions of Kafka. And, > if > > > there is something that I, personally, and Spark project can do in your > > > next release candidate phase to make things easier, please do let us > > know. > > > > > > This issue has made other projects worry about how they are going to > keep > > > up with Kafka releases. Last I heard, take this with a pinch of salt, > > Spark > > > folks are discussing about using Maven profiles to build against > multiple > > > Kafka versions at compile time, etc. Also, there are clients who are > > > relying on class-loading tricks with custom implementation of OSGi to > > solve > > > such issues. Don't quote me on the stuff I just mentioned, as this is > > what > > > I have heard during casual discussions. The point I am trying to make > is > > > that Kafka clients are worried about being able to support multiple > Kafka > > > broker versions. I am sure we all agree on that. > > > > > > I think the second requirement makes more sense from a client > > perspective. > > > First req will just tell them that there is a problem, but no way to > work > > > around it. > > > > > > > > > > > Currently in Kafka the principle we are following is to let clients > > stick > > > > to a certain version and server will adapt to the clients > accordingly. > > > > If this KIP doesn't want to break this rule, it seems we should > simply > > > let > > > > the clients send the ApiVersion it is using to the brokers and the > > > brokers > > > > will decide whether to accept or reject the clients. This means user > > have > > > > to upgrade broker before they upgrade clients. This satisfies (1) so > > > that a > > > > newer client will know it does not compatible with an older server > > > > immediately. > > > > If this KIP will change that to let the newer clients adapt to the > > older > > > > brokers, compatibility wise it is a good thing to have. With this > now > > > > users are able to upgrade clients before they upgrade Kafka brokers. > > This > > > > means user can upgrade clients even before upgrade servers. This > > > satisfies > > > > (2) as the newer clients can also talk to the older servers. > > > > > > > More importantly, this will allow a client to talk to multiple versions > > of > > > Kafka. > > > > > > > > > > > > If we decide to go with (2). The benefit is that a newer client won't > > > break > > > > when talking to an older broker. But functionality wise, it might be > > the > > > > same as an older clients. > > > > In the downgrading case, we probably still have to notify all the > > users. > > > > For example, if application is sending messages with timestamp and > the > > > > broker got downgraded to an older version that does not support > > > timestamp. > > > > The clients will suddenly start to throw away timestamps. This might > > > affect > > > > the application logic. In this case even if we have clients > > automatically > > > > adapted to a lower version broker, the applications might still > break. > > > > Hence we still need to notify the users about the case when the > clients > > > is > > > > newer than the brokers. This is the same for both (1) and (2). > > > > Supporting (2) will introduce more complication on the client side. > And > > > we > > > > may also have to communicate with users about what function is > > supported > > > in > > > > the new clients and what is not supported after the protocol > > negotiation > > > > finishes. > > > > > > > Totally agreed, however only if clients want to support multiple broker > > > versions. If they want to, then I am sure they are willing to add some > > > logic on their end. > > > > > > > > > > > Thanks, > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:58 PM, Dana Powers <dana.pow...@gmail.com> > > > wrote: > > > > > > > > > In kafka-python we've been doing something like: > > > > > > > > > > if version >= (0, 9): > > > > > Do cool new stuff > > > > > elif version >= (0, 8, 2): > > > > > Do some older stuff > > > > > .... > > > > > else: > > > > > raise UnsupportedVersionError > > > > > > > > > > This will break if / when the new 0.9 apis are completely removed > > from > > > > some > > > > > future release, but should handle intermediate broker upgrades. > > Because > > > > we > > > > > can't add support for future apis a priori, I think the best we > could > > > do > > > > > here is throw an error that request protocol version X is not > > > supported. > > > > > For now that comes through as a broken socket connection, so there > is > > > an > > > > > error - just not a super helpful one. > > > > > > > > > > For that reason I'm also in favor of a generic error response when > a > > > > > protocol req is not recognized. > > > > > > > > > > -Dana > > > > > On Mar 2, 2016 5:38 PM, "Jay Kreps" <j...@confluent.io> wrote: > > > > > > > > > > > But won't it be the case that what clients end up doing would be > > > > > something > > > > > > like > > > > > > if(version != 0.8.1) > > > > > > throw new UnsupportedVersionException() > > > > > > which then means the client is broken as soon as we release a new > > > > server > > > > > > version even though the protocol didn't change. I'm actually not > > sure > > > > how > > > > > > you could use that information in a forward compatible way since > > you > > > > > can't > > > > > > know a priori if you will work with the next release until you > know > > > if > > > > > the > > > > > > protocol changed. > > > > > > > > > > > > -Jay > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:28 PM, Jason Gustafson < > > ja...@confluent.io> > > > > > > wrote: > > > > > > > > > > > > > Hey Jay, > > > > > > > > > > > > > > Yeah, I wasn't suggesting that we eliminate request API > versions. > > > > > They're > > > > > > > definitely needed on the broker to support compatibility. I was > > > just > > > > > > saying > > > > > > > that if a client wants to support multiple broker versions > (e.g. > > > 0.8 > > > > > and > > > > > > > 0.9), then it makes more sense to me to make the kafka release > > > > version > > > > > > > available in order to determine which version of the request > API > > > > should > > > > > > be > > > > > > > used rather than adding a new request type which exposes all of > > the > > > > > > > different supported versions for all of the request types. > > Request > > > > API > > > > > > > versions all change in lockstep with Kafka releases anyway. > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 5:15 PM, Becket Qin < > becket....@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > I think using Kafka release version makes sense. More > > > particularly, > > > > > we > > > > > > > can > > > > > > > > use the ApiVersion and this will cover all the interval > version > > > as > > > > > > well. > > > > > > > In > > > > > > > > KAFKA-3025, we added the ApiVersion to message format version > > > > > mapping, > > > > > > We > > > > > > > > can add the ApiKey to version mapping to ApiVersion as well. > We > > > can > > > > > > move > > > > > > > > ApiVersion class to o.a.k.c package and use it for both > server > > > and > > > > > > > clients. > > > > > > > > > > > > > > > > @Jason, if we cache the release info in metadata and not > > > > re-validate > > > > > > the > > > > > > > > release on reconnect, would it still work if we do a rolling > > > > > downgrade? > > > > > > > > > > > > > > > > On Wed, Mar 2, 2016 at 3:16 PM, Jason Gustafson < > > > > ja...@confluent.io> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > I think Dana's suggestion to include the Kafka release > > version > > > > > makes > > > > > > a > > > > > > > > lot > > > > > > > > > of sense. I'm actually wondering why you would need the > > > > individual > > > > > > API > > > > > > > > > versions if you have that? It sounds like keeping track of > > all > > > > the > > > > > > api > > > > > > > > > version information would add a lot of complexity to > clients > > > > since > > > > > > > > they'll > > > > > > > > > have to try to handle different version permutations which > > are > > > > not > > > > > > > > actually > > > > > > > > > possible in practice. Wouldn't it be simpler to know that > > > you're > > > > > > > talking > > > > > > > > to > > > > > > > > > an 0.9 broker than that you're talking to a broker which > > > supports > > > > > > > > version 2 > > > > > > > > > of the group coordinator request, version 1 of fetch > request, > > > > etc? > > > > > > > Also, > > > > > > > > > the release version could be included in the broker > > information > > > > in > > > > > > the > > > > > > > > > topic metadata request which would save the need for the > > > > additional > > > > > > > round > > > > > > > > > trip on every reconnect. > > > > > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 7:59 PM, Ashish Singh < > > > > asi...@cloudera.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 6:30 PM, Gwen Shapira < > > > > g...@confluent.io> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > One more thing, the KIP actually had 3 parts: > > > > > > > > > > > 1. The version protocol > > > > > > > > > > > 2. New response on messages of wrong API key or wrong > > > version > > > > > > > > > > > 3. Protocol documentation > > > > > > > > > > > > > > > > > > > > > There is a WIP patch for adding protocol docs, > > > > > > > > > > https://github.com/apache/kafka/pull/970 . By protocol > > > > > > > documentation, > > > > > > > > > you > > > > > > > > > > mean updating this, right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I understand that you are offering to only implement > part > > > 1? > > > > > > > > > > > But the KIP discussion and vote should still cover all > > > three > > > > > > parts, > > > > > > > > > > > they will just be implemented in separate JIRA? > > > > > > > > > > > > > > > > > > > > > The patch for KAFKA-3307, > > > > > https://github.com/apache/kafka/pull/986 > > > > > > , > > > > > > > > > covers > > > > > > > > > > 1 and 2. KAFKA-3309 tracks documentation part. Yes, we > > should > > > > > > include > > > > > > > > all > > > > > > > > > > the three points you mentioned while discussing or voting > > for > > > > > > KIP-35. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 1, 2016 at 5:25 PM, Ashish Singh < > > > > > > asi...@cloudera.com> > > > > > > > > > > wrote: > > > > > > > > > > > > On Tue, Mar 1, 2016 at 5:11 PM, Gwen Shapira < > > > > > > g...@confluent.io> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > >> I don't see a use for the name - clients should be > > able > > > to > > > > > > > > translate > > > > > > > > > > > >> ApiKey to name for any API they support, and I'm not > > > sure > > > > > why > > > > > > > > would > > > > > > > > > a > > > > > > > > > > > >> client need to log anything about APIs it does not > > > > support. > > > > > > Am I > > > > > > > > > > > >> missing something? > > > > > > > > > > > >> > > > > > > > > > > > > Yea, it is a fair assumption that client would know > > about > > > > > APIs > > > > > > it > > > > > > > > > > > supports. > > > > > > > > > > > > It could have been helpful for client users to see > new > > > APIs > > > > > > > though, > > > > > > > > > > > however > > > > > > > > > > > > users can always refer to protocol doc of new version > > to > > > > find > > > > > > > > > > > corresponding > > > > > > > > > > > > names of the new APIs. > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > >> On a related note, Magnus is currently on vacation, > > but > > > he > > > > > > > should > > > > > > > > be > > > > > > > > > > > >> back at the end of next week. I'd like to hold off > on > > > the > > > > > vote > > > > > > > > until > > > > > > > > > > > >> he gets back since his experience in implementing > > > clients > > > > > and > > > > > > > his > > > > > > > > > > > >> opinions will be very valuable for this discussion. > > > > > > > > > > > >> > > > > > > > > > > > > That is great. It will be valuable to have his > > feedback. > > > I > > > > > will > > > > > > > > hold > > > > > > > > > > off > > > > > > > > > > > on > > > > > > > > > > > > removing "api_name" and "api_deprecated_versions" or > > > adding > > > > > > > release > > > > > > > > > > > version. > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > >> Gwen > > > > > > > > > > > >> > > > > > > > > > > > >> On Tue, Mar 1, 2016 at 4:24 PM, Ashish Singh < > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > >> > Works with me. I will update PR to remove this. > > > > > > > > > > > >> > > > > > > > > > > > > >> > Also, "api_name" have been pointed out as a > concern. > > > > > > However, > > > > > > > it > > > > > > > > > can > > > > > > > > > > > be > > > > > > > > > > > >> > handy for logging and similar purposes. Any take > on > > > > that? > > > > > > > > > > > >> > > > > > > > > > > > > >> > On Tue, Mar 1, 2016 at 3:46 PM, Gwen Shapira < > > > > > > > g...@confluent.io > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > >> > > > > > > > > > > > > >> >> Jay also mentioned: > > > > > > > > > > > >> >> "Or, alternately, since deprecation has no > > functional > > > > > > impact > > > > > > > > and > > > > > > > > > is > > > > > > > > > > > >> >> just a message > > > > > > > > > > > >> >> to developers, we could just leave it out of the > > > > protocol > > > > > > and > > > > > > > > > just > > > > > > > > > > > have > > > > > > > > > > > >> it > > > > > > > > > > > >> >> in release notes etc." > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> I'm in favor of leaving it out of the protocol. I > > > can't > > > > > > > really > > > > > > > > > see > > > > > > > > > > a > > > > > > > > > > > >> >> use-case. > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> Gwen > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> On Mon, Feb 29, 2016 at 3:55 PM, Ashish Singh < > > > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > I hope it is OK for me to make some progress > > here. > > > I > > > > > have > > > > > > > > made > > > > > > > > > > the > > > > > > > > > > > >> >> > following changes. > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > 1. Updated KIP-35, to adopt Jay's suggestion on > > > > > > maintaining > > > > > > > > > > > separate > > > > > > > > > > > >> list > > > > > > > > > > > >> >> > of deprecated versions, instead of using a > > version > > > of > > > > > -1. > > > > > > > > > > > >> >> > 2. Added information on required permissions, > > > > Describe > > > > > > > action > > > > > > > > > on > > > > > > > > > > > >> Cluster > > > > > > > > > > > >> >> > resource, to be able to retrieve protocol > > versions > > > > > from a > > > > > > > > auth > > > > > > > > > > > enabled > > > > > > > > > > > >> >> > Kafka cluster. > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > Created > > > > > https://issues.apache.org/jira/browse/KAFKA-3304 > > > > > > . > > > > > > > > > > Primary > > > > > > > > > > > >> patch > > > > > > > > > > > >> >> is > > > > > > > > > > > >> >> > available to review, > > > > > > > > https://github.com/apache/kafka/pull/986 > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > On Thu, Feb 25, 2016 at 1:27 PM, Ashish Singh < > > > > > > > > > > asi...@cloudera.com > > > > > > > > > > > > > > > > > > > > > > > >> >> wrote: > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > Kafka clients in Hadoop ecosystem, Flume, > > Spark, > > > > etc, > > > > > > > have > > > > > > > > > > found > > > > > > > > > > > it > > > > > > > > > > > >> >> > really > > > > > > > > > > > >> >> > > difficult to cope up with Kafka releases as > > they > > > > want > > > > > > to > > > > > > > > > > support > > > > > > > > > > > >> users > > > > > > > > > > > >> >> on > > > > > > > > > > > >> >> > > different Kafka versions. Capability to > > retrieve > > > > > > protocol > > > > > > > > > > version > > > > > > > > > > > >> will > > > > > > > > > > > >> >> > go a > > > > > > > > > > > >> >> > > long way to ease out those pain points. I > will > > be > > > > > happy > > > > > > > to > > > > > > > > > help > > > > > > > > > > > out > > > > > > > > > > > >> >> with > > > > > > > > > > > >> >> > > the work on this KIP. @Magnus, thanks for > > driving > > > > > this, > > > > > > > is > > > > > > > > it > > > > > > > > > > OK > > > > > > > > > > > if > > > > > > > > > > > >> I > > > > > > > > > > > >> >> > carry > > > > > > > > > > > >> >> > > forward the work from here. It will be ideal > to > > > > have > > > > > > this > > > > > > > > in > > > > > > > > > > > >> 0.10.0.0. > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > On Mon, Oct 12, 2015 at 9:29 PM, Jay Kreps < > > > > > > > > j...@confluent.io > > > > > > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > >> I wonder if we need to solve the error > > problem? > > > I > > > > > > think > > > > > > > > this > > > > > > > > > > KIP > > > > > > > > > > > >> >> gives a > > > > > > > > > > > >> >> > >> descent work around. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> Probably we should have included an error in > > the > > > > > > > response > > > > > > > > > > > header, > > > > > > > > > > > >> but > > > > > > > > > > > >> >> we > > > > > > > > > > > >> >> > >> debated it at the time decided not to and > now > > it > > > > is > > > > > > > pretty > > > > > > > > > > hard > > > > > > > > > > > to > > > > > > > > > > > >> add > > > > > > > > > > > >> >> > >> because the headers aren't versioned (d'oh). > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> It seems like any other solution is going to > > be > > > > kind > > > > > > of > > > > > > > a > > > > > > > > > > hack, > > > > > > > > > > > >> right? > > > > > > > > > > > >> >> > >> Sending malformed responses back seems like > > not > > > a > > > > > > clean > > > > > > > > > > > solution... > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> (Not sure if I was pro- having a top-level > > error > > > > or > > > > > > not, > > > > > > > > but > > > > > > > > > > in > > > > > > > > > > > any > > > > > > > > > > > >> >> case > > > > > > > > > > > >> >> > >> the rationale for the decision was that so > > many > > > of > > > > > the > > > > > > > > > > requests > > > > > > > > > > > >> were > > > > > > > > > > > >> >> > >> per-partition or per-topic or whatever and > > hence > > > > > fail > > > > > > or > > > > > > > > > > > succeed at > > > > > > > > > > > >> >> that > > > > > > > > > > > >> >> > >> level and this makes it hard to know what > the > > > > right > > > > > > > > > top-level > > > > > > > > > > > error > > > > > > > > > > > >> >> code > > > > > > > > > > > >> >> > >> is > > > > > > > > > > > >> >> > >> and hard for the client to figure out what > to > > do > > > > > with > > > > > > > the > > > > > > > > > top > > > > > > > > > > > level > > > > > > > > > > > >> >> > error > > > > > > > > > > > >> >> > >> if some of the partitions succeed but there > > is a > > > > > > > top-level > > > > > > > > > > > error). > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> I think actually this new API actually > gives a > > > way > > > > > to > > > > > > > > handle > > > > > > > > > > > this > > > > > > > > > > > >> >> > >> gracefully on the client side by just having > > > > clients > > > > > > > that > > > > > > > > > want > > > > > > > > > > > to > > > > > > > > > > > >> be > > > > > > > > > > > >> >> > >> graceful check for support for their > version. > > > > > Clients > > > > > > > that > > > > > > > > > do > > > > > > > > > > > that > > > > > > > > > > > >> >> will > > > > > > > > > > > >> >> > >> have a graceful message. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> At some point if we're ever reworking the > > > headers > > > > we > > > > > > > > should > > > > > > > > > > > really > > > > > > > > > > > >> >> > >> consider > > > > > > > > > > > >> >> > >> (a) versioning them and (b) adding a > top-level > > > > error > > > > > > > code > > > > > > > > in > > > > > > > > > > the > > > > > > > > > > > >> >> > response. > > > > > > > > > > > >> >> > >> But given this would be a big breaking > change > > > and > > > > > this > > > > > > > is > > > > > > > > > > really > > > > > > > > > > > >> just > > > > > > > > > > > >> >> to > > > > > > > > > > > >> >> > >> give a nicer error message seems like it > > > probably > > > > > > isn't > > > > > > > > > worth > > > > > > > > > > > it to > > > > > > > > > > > >> >> try > > > > > > > > > > > >> >> > to > > > > > > > > > > > >> >> > >> do something now. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> -Jay > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> On Mon, Oct 12, 2015 at 8:11 PM, Jiangjie > Qin > > > > > > > > > > > >> >> <j...@linkedin.com.invalid > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > >> wrote: > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > >> > I am thinking instead of returning an > empty > > > > > > response, > > > > > > > it > > > > > > > > > > > would be > > > > > > > > > > > >> >> > >> better to > > > > > > > > > > > >> >> > >> > return an explicit > > UnsupportedVersionException > > > > > code. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Today KafkaApis handles the error in the > > > > following > > > > > > > way: > > > > > > > > > > > >> >> > >> > 1. For requests/responses using old Scala > > > > classes, > > > > > > > > > KafkaApis > > > > > > > > > > > uses > > > > > > > > > > > >> >> > >> > RequestOrResponse.handleError() to return > an > > > > error > > > > > > > > > response. > > > > > > > > > > > >> >> > >> > 2. For requests/response using Java > classes > > > > (only > > > > > > > > > > > >> JoinGroupRequest > > > > > > > > > > > >> >> and > > > > > > > > > > > >> >> > >> > Heartbeat now), KafkaApis calls > > > > > > > > > > > >> AbstractRequest.getErrorResponse() > > > > > > > > > > > >> >> to > > > > > > > > > > > >> >> > >> > return an error response. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > In KAFKA-2512, I am returning an > > > > > > > > > UnsupportedVersionException > > > > > > > > > > > for > > > > > > > > > > > >> >> case > > > > > > > > > > > >> >> > >> [1] > > > > > > > > > > > >> >> > >> > when see an unsupported version. This will > > put > > > > the > > > > > > > error > > > > > > > > > > code > > > > > > > > > > > per > > > > > > > > > > > >> >> > topic > > > > > > > > > > > >> >> > >> or > > > > > > > > > > > >> >> > >> > partition for most of the requests, but > > might > > > > not > > > > > > work > > > > > > > > all > > > > > > > > > > the > > > > > > > > > > > >> time. > > > > > > > > > > > >> >> > >> e.g. > > > > > > > > > > > >> >> > >> > TopicMetadataRequest with an empty topic > > set. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Case [2] does not quite work for > unsupported > > > > > > version, > > > > > > > > > > because > > > > > > > > > > > we > > > > > > > > > > > >> >> will > > > > > > > > > > > >> >> > >> > thrown an uncaught exception when version > is > > > not > > > > > > > > > recognized > > > > > > > > > > > (BTW > > > > > > > > > > > >> >> this > > > > > > > > > > > >> >> > >> is a > > > > > > > > > > > >> >> > >> > bug). Part of the reason is that for some > > > > response > > > > > > > > types, > > > > > > > > > > > error > > > > > > > > > > > >> code > > > > > > > > > > > >> >> > is > > > > > > > > > > > >> >> > >> not > > > > > > > > > > > >> >> > >> > part of the response level field. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Maybe it worth checking how each response > is > > > > > dealing > > > > > > > > with > > > > > > > > > > > error > > > > > > > > > > > >> code > > > > > > > > > > > >> >> > >> today. > > > > > > > > > > > >> >> > >> > A scan of the response formats gives the > > > > following > > > > > > > > result: > > > > > > > > > > > >> >> > >> > 1. TopicMetadataResponse - per topic error > > > code, > > > > > > does > > > > > > > > not > > > > > > > > > > work > > > > > > > > > > > >> when > > > > > > > > > > > >> >> > the > > > > > > > > > > > >> >> > >> > topic set is empty in the request. > > > > > > > > > > > >> >> > >> > 2. ProduceResonse - per partition error > > code. > > > > > > > > > > > >> >> > >> > 3. OffsetCommitResponse - per partition. > > > > > > > > > > > >> >> > >> > 4. OffsetFetchResponse - per partition. > > > > > > > > > > > >> >> > >> > 5. OffsetResponse - per partition. > > > > > > > > > > > >> >> > >> > 6. FetchResponse - per partition > > > > > > > > > > > >> >> > >> > 7. ConsumerMetadataResponse - response > level > > > > > > > > > > > >> >> > >> > 8. ControlledShutdownResponse - response > > level > > > > > > > > > > > >> >> > >> > 9. JoinGroupResponse - response level > > > > > > > > > > > >> >> > >> > 10. HearbeatResponse - response level > > > > > > > > > > > >> >> > >> > 11. LeaderAndIsrResponse - response level > > > > > > > > > > > >> >> > >> > 12. StopReplicaResponse - response level > > > > > > > > > > > >> >> > >> > 13. UpdateMetadataResponse - response > level > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > So from the list above it looks for each > > > > response > > > > > we > > > > > > > are > > > > > > > > > > > actually > > > > > > > > > > > >> >> able > > > > > > > > > > > >> >> > >> to > > > > > > > > > > > >> >> > >> > return an error code, as long as we make > > sure > > > > the > > > > > > > topic > > > > > > > > or > > > > > > > > > > > >> partition > > > > > > > > > > > >> >> > >> won't > > > > > > > > > > > >> >> > >> > be empty when the error code is at topic > or > > > > > > partition > > > > > > > > > level. > > > > > > > > > > > >> Luckily > > > > > > > > > > > >> >> > in > > > > > > > > > > > >> >> > >> the > > > > > > > > > > > >> >> > >> > above list we only need to worry about > > > > > > > > > > TopicMetadataResponse. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Maybe error handling is out of the scope > of > > > this > > > > > > KIP, > > > > > > > > but > > > > > > > > > I > > > > > > > > > > > >> prefer > > > > > > > > > > > >> >> we > > > > > > > > > > > >> >> > >> think > > > > > > > > > > > >> >> > >> > through how to deal with error code for > the > > > > > > requests, > > > > > > > > > > because > > > > > > > > > > > >> there > > > > > > > > > > > >> >> > are > > > > > > > > > > > >> >> > >> > more request types to be added in > KAFKA-2464 > > > and > > > > > > > future > > > > > > > > > > > patches. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Thanks, > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > Jiangjie (Becket) Qin > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > On Mon, Oct 12, 2015 at 6:04 PM, Jay > Kreps < > > > > > > > > > > j...@confluent.io> > > > > > > > > > > > >> >> wrote: > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > > Two quick pieces of feedback: > > > > > > > > > > > >> >> > >> > > 1. The use of a version of -1 as magical > > > entry > > > > > > > > dividing > > > > > > > > > > > between > > > > > > > > > > > >> >> > >> > deprecated > > > > > > > > > > > >> >> > >> > > versions is a bit hacky. What about > > instead > > > > > having > > > > > > > an > > > > > > > > > > array > > > > > > > > > > > of > > > > > > > > > > > >> >> > >> supported > > > > > > > > > > > >> >> > >> > > versions and a separate array of > > deprecated > > > > > > > versions. > > > > > > > > > The > > > > > > > > > > > >> >> deprecated > > > > > > > > > > > >> >> > >> > > versions would always be a subset of the > > > > > supported > > > > > > > > > > versions. > > > > > > > > > > > >> Or, > > > > > > > > > > > >> >> > >> > > alternately, since deprecation has no > > > > functional > > > > > > > > impact > > > > > > > > > > and > > > > > > > > > > > is > > > > > > > > > > > >> >> just > > > > > > > > > > > >> >> > a > > > > > > > > > > > >> >> > >> > > message to developers, we could just > leave > > > it > > > > > out > > > > > > of > > > > > > > > the > > > > > > > > > > > >> protocol > > > > > > > > > > > >> >> > and > > > > > > > > > > > >> >> > >> > just > > > > > > > > > > > >> >> > >> > > have it in release notes etc. > > > > > > > > > > > >> >> > >> > > 2. I think including the api name may > > cause > > > > some > > > > > > > > > problems. > > > > > > > > > > > >> >> Currently > > > > > > > > > > > >> >> > >> the > > > > > > > > > > > >> >> > >> > > api key is the primary key that we keep > > > > > consistent > > > > > > > but > > > > > > > > > we > > > > > > > > > > > have > > > > > > > > > > > >> >> > >> actually > > > > > > > > > > > >> >> > >> > > evolved the english description of the > > apis > > > as > > > > > > they > > > > > > > > have > > > > > > > > > > > >> changed. > > > > > > > > > > > >> >> > The > > > > > > > > > > > >> >> > >> > only > > > > > > > > > > > >> >> > >> > > use I can think of for the name would be > > if > > > > > people > > > > > > > > used > > > > > > > > > > the > > > > > > > > > > > >> >> logical > > > > > > > > > > > >> >> > >> name > > > > > > > > > > > >> >> > >> > > and tried to resolve the api key, but > that > > > > would > > > > > > be > > > > > > > > > wrong. > > > > > > > > > > > Not > > > > > > > > > > > >> >> sure > > > > > > > > > > > >> >> > >> if we > > > > > > > > > > > >> >> > >> > > actually need the english name, if there > > is > > > a > > > > > use > > > > > > > > case I > > > > > > > > > > > guess > > > > > > > > > > > >> >> we'll > > > > > > > > > > > >> >> > >> just > > > > > > > > > > > >> >> > >> > > have to be very clear that the name is > > just > > > > > > > > > documentation > > > > > > > > > > > and > > > > > > > > > > > >> can > > > > > > > > > > > >> >> > >> change > > > > > > > > > > > >> >> > >> > > any time. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > -Jay > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > On Fri, Sep 25, 2015 at 2:53 PM, Magnus > > > > > Edenhill < > > > > > > > > > > > >> >> > mag...@edenhill.se> > > > > > > > > > > > >> >> > >> > > wrote: > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > > Good evening, > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > KIP-35 was created to address current > > and > > > > > future > > > > > > > > > > > >> broker-client > > > > > > > > > > > >> >> > >> > > > compatibility. > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > Summary: > > > > > > > > > > > >> >> > >> > > > * allow clients to retrieve the > > broker's > > > > > > protocol > > > > > > > > > > version > > > > > > > > > > > >> >> > >> > > > * make broker handle unknown protocol > > > > > requests > > > > > > > > > > gracefully > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > Feedback and comments welcome! > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > Regards, > > > > > > > > > > > >> >> > >> > > > Magnus > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > > >> >> > >> > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > -- > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > Regards, > > > > > > > > > > > >> >> > > Ashish > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > -- > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > Regards, > > > > > > > > > > > >> >> > Ashish > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > -- > > > > > > > > > > > >> > > > > > > > > > > > > >> > Regards, > > > > > > > > > > > >> > Ashish > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > Ashish > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Ashish > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Regards, > > > Ashish > > > > > >