We have the new warning in 0.8.2. I updated KIP-1 with the new plan: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
I'm waiting a day for additional discussion and if there are no replies, I'll send the [VOTE] email. Gwen On Mon, Jan 19, 2015 at 10:00 AM, Gwen Shapira <gshap...@cloudera.com> wrote: > Sounds good to me. > I'll open a new JIRA for 0.8.2 with just an extra log warning, to > avoid making KAFKA-1697 any more confusing. > > On Mon, Jan 19, 2015 at 9:46 AM, Joe Stein <joe.st...@stealth.ly> wrote: >> << For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 >> and then drop the ack > 1 support in trunk (w/o bumping up the protocol >> version)? >> >> +1 >> >> >> On Mon, Jan 19, 2015 at 12:35 PM, Jun Rao <j...@confluent.io> wrote: >>> >>> For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 >>> and then drop the ack > 1 support in trunk (w/o bumping up the protocol >>> version)? Thanks, >>> >>> Jun >>> >>> On Sun, Jan 18, 2015 at 8:24 PM, Gwen Shapira <gshap...@cloudera.com> >>> wrote: >>>> >>>> Overall, agree on point #1, less sure on point #2. >>>> >>>> 1. Some protocols never ever add new errors, while others add errors >>>> without bumping versions. HTTP is a good example of the second type. >>>> HTTP-451 was added fairly recently, there are some errors specific to >>>> NGINX, etc. No one cares. I think we should properly document in the >>>> wire-protocol doc that new errors can be added, and I think we should >>>> strongly suggest (and implement ourselves) that unknown error codes >>>> should be shown to users (or at least logged), so they can be googled >>>> and understood through our documentation. >>>> In addition, hierarchy of error codes, so clients will know if an >>>> error is retry-able just by looking at the code could be nice. Same >>>> for adding an error string to the protocol. These are future >>>> enhancements that should be discussed separately. >>>> >>>> 2. I think we want to allow admins to upgrade their Kafka brokers >>>> without having to chase down clients in their organization and without >>>> getting blamed if clients break. I think it makes sense to have one >>>> version that will support existing behavior, but log warnings, so >>>> admins will know about misbehaving clients and can track them down >>>> before an upgrade that breaks them (or before the broken config causes >>>> them to lose data!). Hopefully this is indeed a very rare behavior and >>>> we are taking extra precaution for nothing, but I have customers where >>>> one traumatic upgrade means they will never upgrade a Kafka again, so >>>> I'm being conservative. >>>> >>>> Gwen >>>> >>>> >>>> On Sun, Jan 18, 2015 at 3:50 PM, Jun Rao <j...@confluent.io> wrote: >>>> > Overall, I agree with Jay on both points. >>>> > >>>> > 1. I think it's reasonable to add new error codes w/o bumping up the >>>> > protocol version. In most cases, by adding new error codes, we are just >>>> > refining the categorization of those unknown errors. So, a client >>>> > shouldn't >>>> > behave worse than before as long as unknown errors have been properly >>>> > handled. >>>> > >>>> > 2. I think it's reasonable to just document that 0.8.2 will be the last >>>> > release that will support ack > 1 and remove the support completely in >>>> > trunk >>>> > w/o bumping up the protocol. This is because (a) we never included ack >>>> > > 1 >>>> > explicitly in the documentation and so the usage should be limited; (2) >>>> > ack >>>> >> 1 doesn't provide the guarantee that people really want and so it >>>> > shouldn't really be used. >>>> > >>>> > Thanks, >>>> > >>>> > Jun >>>> > >>>> > >>>> > On Sun, Jan 18, 2015 at 11:03 AM, Jay Kreps <jay.kr...@gmail.com> >>>> > wrote: >>>> >> >>>> >> Hey guys, >>>> >> >>>> >> I really think we are discussing two things here: >>>> >> >>>> >> How should we generally handle changes to the set of errors? Should >>>> >> introducing new errors be considered a protocol change or should we >>>> >> reserve >>>> >> the right to introduce new error codes? >>>> >> Given that this particular change is possibly incompatible, how should >>>> >> we >>>> >> handle it? >>>> >> >>>> >> I think it would be good for people who are responding here to be >>>> >> specific >>>> >> about which they are addressing. >>>> >> >>>> >> Here is what I think: >>>> >> >>>> >> 1. Errors should be extensible within a protocol version. >>>> >> >>>> >> We should change the protocol documentation to list the errors that >>>> >> can be >>>> >> given back from each api, their meaning, and how to handle them, BUT >>>> >> we >>>> >> should explicitly state that the set of errors are open ended. That is >>>> >> we >>>> >> should reserve the right to introduce new errors and explicitly state >>>> >> that >>>> >> clients need a blanket "unknown error" handling mechanism. The error >>>> >> can >>>> >> link to the protocol definition (something like "Unknown error 42, see >>>> >> protocol definition at http://link"). We could make this work really >>>> >> well by >>>> >> instructing all the clients to report the error in a very googlable >>>> >> way as >>>> >> Oracle does with their error format (e.g. "ORA-32") so that if you >>>> >> ever get >>>> >> the raw error google will take you to the definition. >>>> >> >>>> >> I agree that a more rigid definition seems like "right thing", but >>>> >> having >>>> >> just implemented two clients and spent a bunch of time on the server >>>> >> side, I >>>> >> think, it will work out poorly in practice. Here is why: >>>> >> >>>> >> I think we will make a lot of mistakes in nailing down the set of >>>> >> error >>>> >> codes up front and we will end up going through 3-4 churns of the >>>> >> protocol >>>> >> definition just realizing the set of errors that can be thrown. I >>>> >> think this >>>> >> churn will actually make life worse for clients that now have to >>>> >> figure out >>>> >> 7 identical versions of the protocol and will be a mess in terms of >>>> >> testing >>>> >> on the server side. I actually know this to be true because while >>>> >> implementing the clients I tried to guess the errors that could be >>>> >> thrown, >>>> >> then checked my guess by close code inspection. It turned out that I >>>> >> always >>>> >> missed things in my belief about errors, but more importantly even >>>> >> after >>>> >> close code inspection I found tons of other errors in my stress >>>> >> testing. >>>> >> In practice error handling always involves calling out one or two >>>> >> meaningful failures that have special recovery and then a blanket case >>>> >> that >>>> >> just handles everything else. It's true that some clients may not have >>>> >> done >>>> >> this well, but I think it is for the best if they fix that. >>>> >> Reserving the right to add errors doesn't mean we will do it without >>>> >> care. >>>> >> We will think through each change and decide whether giving a little >>>> >> more >>>> >> precision in the error is worth the overhead and churn of a protocol >>>> >> version >>>> >> bump. >>>> >> >>>> >> 2. In this case in particular we should not introduce a new protocol >>>> >> version >>>> >> >>>> >> In this particular case we are saying that acks > 1 doesn't make sense >>>> >> and >>>> >> we want to give an error to people specifying this so that they change >>>> >> their >>>> >> configuration. This is a configuration that few people use and we want >>>> >> to >>>> >> just make it an error. The bad behavior will just be that the error >>>> >> will not >>>> >> be as good as it could be. I think that is a better tradeoff than >>>> >> introducing a separate protocol version (this may be true of the java >>>> >> clients too). >>>> >> >>>> >> We will have lots of cases like this in the future and we aren't going >>>> >> to >>>> >> want to churn the protocol for each of them. For example we previously >>>> >> had >>>> >> to get more precise about which characters were legal and which >>>> >> illegal in >>>> >> topic names. >>>> >> >>>> >> -Jay >>>> >> >>>> >> On Fri, Jan 16, 2015 at 11:55 AM, Gwen Shapira <gshap...@cloudera.com> >>>> >> wrote: >>>> >>> >>>> >>> I updated the KIP: Using acks > 1 in version 0 will log a WARN >>>> >>> message >>>> >>> in the broker about client using deprecated behavior (suggested by >>>> >>> Joe >>>> >>> in the JIRA, and I think it makes sense). >>>> >>> >>>> >>> Gwen >>>> >>> >>>> >>> On Fri, Jan 16, 2015 at 10:40 AM, Gwen Shapira >>>> >>> <gshap...@cloudera.com> >>>> >>> wrote: >>>> >>> > How about we continue the discussion on this thread, so we won't >>>> >>> > lose >>>> >>> > the context of this discussion, and put it up for VOTE when this >>>> >>> > has >>>> >>> > been finalized? >>>> >>> > >>>> >>> > On Fri, Jan 16, 2015 at 10:22 AM, Neha Narkhede <n...@confluent.io> >>>> >>> > wrote: >>>> >>> >> Gwen, >>>> >>> >> >>>> >>> >> KIP write-up looks good. According to the rest of the KIP process >>>> >>> >> proposal, >>>> >>> >> would you like to start a DISCUSS/VOTE thread for it? >>>> >>> >> >>>> >>> >> Thanks, >>>> >>> >> Neha >>>> >>> >> >>>> >>> >> On Fri, Jan 16, 2015 at 9:37 AM, Ewen Cheslack-Postava >>>> >>> >> <e...@confluent.io> >>>> >>> >> wrote: >>>> >>> >> >>>> >>> >>> Gwen -- KIP write up looks good. Deprecation schedule probably >>>> >>> >>> needs >>>> >>> >>> to be >>>> >>> >>> more specific, but I think that discussion probably needs to >>>> >>> >>> happen >>>> >>> >>> after a >>>> >>> >>> solution is agreed upon. >>>> >>> >>> >>>> >>> >>> Jay -- I think "older clients will get a bad error message >>>> >>> >>> instead of >>>> >>> >>> a >>>> >>> >>> good one" isn't what would be happening with this change. >>>> >>> >>> Previously >>>> >>> >>> they >>>> >>> >>> wouldn't have received an error and they would have been able to >>>> >>> >>> produce >>>> >>> >>> messages. After the change they'll just receive this new error >>>> >>> >>> message >>>> >>> >>> which their clients can't possibly handle gracefully since it >>>> >>> >>> didn't >>>> >>> >>> exist >>>> >>> >>> when the client was written. Whether the acks > 1 setting was >>>> >>> >>> actually >>>> >>> >>> accomplishing what they thought doesn't matter. Someone could >>>> >>> >>> have >>>> >>> >>> reasonably read the docs on 0.8.1.1, thought acks = 2 is an ok >>>> >>> >>> setting for >>>> >>> >>> their applications, set it as a default across a bunch of apps, >>>> >>> >>> then >>>> >>> >>> follow >>>> >>> >>> the recommended upgrade path of updating brokers to 0.8.2 and all >>>> >>> >>> their >>>> >>> >>> apps will start failing on produce requests. >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> On Thu, Jan 15, 2015 at 8:27 PM, Jay Kreps <jay.kr...@gmail.com> >>>> >>> >>> wrote: >>>> >>> >>> >>>> >>> >>> > This is a good case to discuss. >>>> >>> >>> > >>>> >>> >>> > Let's figure the general case of how we want to handle errors >>>> >>> >>> > and >>>> >>> >>> > get >>>> >>> >>> that >>>> >>> >>> > documented in the protocol. The problem right now is that we >>>> >>> >>> > give >>>> >>> >>> > no >>>> >>> >>> > guidance on this. I actually thought Gwen's suggestion made >>>> >>> >>> > sense >>>> >>> >>> > on the >>>> >>> >>> > guidance we should have given which is that we will enumerate a >>>> >>> >>> > set >>>> >>> >>> > of >>>> >>> >>> > errors and their meaning for each API but it is possible that >>>> >>> >>> > other >>>> >>> >>> errors >>>> >>> >>> > will occur and they should be handled (maybe poorly) in the >>>> >>> >>> > same >>>> >>> >>> > way >>>> >>> >>> > UNKNOWN_ERROR is handled which is our normal escape hatch for >>>> >>> >>> > things like >>>> >>> >>> > OOMException. >>>> >>> >>> > >>>> >>> >>> > I really do think we shouldn't be dogmatic here: In considering >>>> >>> >>> > a >>>> >>> >>> > change >>>> >>> >>> > to errors we should consider the potential ill-effect vs the >>>> >>> >>> > complexity >>>> >>> >>> of >>>> >>> >>> > yet another protocol version. >>>> >>> >>> > >>>> >>> >>> > In this case I actually am not sure we need to bump the >>>> >>> >>> > protocol >>>> >>> >>> > because >>>> >>> >>> > the whole point of the change was to make a setting we think >>>> >>> >>> > doesn't make >>>> >>> >>> > sense break, right? Well this will break it. It seems like the >>>> >>> >>> > only >>>> >>> >>> > downside is that older clients will get a bad error message >>>> >>> >>> > instead >>>> >>> >>> > of a >>>> >>> >>> > good one. But it isn't like we will have rendered a client >>>> >>> >>> > unusable, it >>>> >>> >>> is >>>> >>> >>> > just that they will need to change their config. >>>> >>> >>> > >>>> >>> >>> > -Jay >>>> >>> >>> > >>>> >>> >>> > On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira >>>> >>> >>> > <gshap...@cloudera.com> >>>> >>> >>> > wrote: >>>> >>> >>> > >>>> >>> >>> >> I created a KIP for this suggestion: >>>> >>> >>> >> >>>> >>> >>> >> >>>> >>> >>> >> >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks >>>> >>> >>> >> >>>> >>> >>> >> Basically documenting what was already discussed here. >>>> >>> >>> >> Comments >>>> >>> >>> >> will >>>> >>> >>> >> be awesome! >>>> >>> >>> >> >>>> >>> >>> >> Gwen >>>> >>> >>> >> >>>> >>> >>> >> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira >>>> >>> >>> >> <gshap...@cloudera.com> >>>> >>> >>> >> wrote: >>>> >>> >>> >> > The errors are part of the KIP process now, so I think the >>>> >>> >>> >> > clients are >>>> >>> >>> >> safe :) >>>> >>> >>> >> > >>>> >>> >>> >> > >>>> >>> >>> >> >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals >>>> >>> >>> >> > >>>> >>> >>> >> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin >>>> >>> >>> >> > <steve.mo...@gmail.com> >>>> >>> >>> >> wrote: >>>> >>> >>> >> >> Agree errors should be part of the protocol >>>> >>> >>> >> >> >>>> >>> >>> >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira >>>> >>> >>> >> >>> <gshap...@cloudera.com> >>>> >>> >>> >> wrote: >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> Hi, >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> I got convinced by Joe and Dana that errors are indeed >>>> >>> >>> >> >>> part of >>>> >>> >>> >> >>> the >>>> >>> >>> >> >>> protocol and can't be randomly added. >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> So, it looks like we need to bump version of >>>> >>> >>> >> >>> ProduceRequest in >>>> >>> >>> >> >>> the >>>> >>> >>> >> >>> following way: >>>> >>> >>> >> >>> Version 0 -> accept acks >1. I think we should keep the >>>> >>> >>> >> >>> existing >>>> >>> >>> >> >>> behavior too (i.e. not replace it with -1) to avoid >>>> >>> >>> >> >>> surprising >>>> >>> >>> >> >>> clients, but I'm willing to hear other opinions. >>>> >>> >>> >> >>> Version 1 -> do not accept acks >1 and return an error. >>>> >>> >>> >> >>> Are we ok with the error I added in KAFKA-1697? We can use >>>> >>> >>> >> >>> something >>>> >>> >>> >> >>> less specific like InvalidRequestParameter. This error can >>>> >>> >>> >> >>> be >>>> >>> >>> >> >>> reused >>>> >>> >>> >> >>> in the future and reduce the need to add errors, but will >>>> >>> >>> >> >>> also >>>> >>> >>> >> >>> be >>>> >>> >>> less >>>> >>> >>> >> >>> clear to the client and its users. Maybe even add the >>>> >>> >>> >> >>> error >>>> >>> >>> >> >>> message >>>> >>> >>> >> >>> string to the protocol in addition to the error code? >>>> >>> >>> >> >>> (since >>>> >>> >>> >> >>> we are >>>> >>> >>> >> >>> bumping versions....) >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> I think maintaining the old version throughout 0.8.X makes >>>> >>> >>> >> >>> sense. >>>> >>> >>> IMO >>>> >>> >>> >> >>> dropping it for 0.9 is feasible, but I'll let client >>>> >>> >>> >> >>> owners >>>> >>> >>> >> >>> help >>>> >>> >>> make >>>> >>> >>> >> >>> that call. >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> Am I missing anything? Should I start a KIP? It seems like >>>> >>> >>> >> >>> a >>>> >>> >>> KIP-type >>>> >>> >>> >> >>> discussion :) >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> Gwen >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> >>>> >>> >>> >> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava >>>> >>> >>> >> >>> <e...@confluent.io> wrote: >>>> >>> >>> >> >>>> Gwen, >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> I think the only option that wouldn't require a protocol >>>> >>> >>> >> >>>> version >>>> >>> >>> >> change is >>>> >>> >>> >> >>>> the one where acks > 1 is converted to acks = -1 since >>>> >>> >>> >> >>>> it's >>>> >>> >>> >> >>>> the >>>> >>> >>> only >>>> >>> >>> >> one >>>> >>> >>> >> >>>> that doesn't potentially break older clients. The >>>> >>> >>> >> >>>> protocol >>>> >>> >>> >> >>>> guide >>>> >>> >>> >> says that >>>> >>> >>> >> >>>> the expected upgrade path is servers first, then clients, >>>> >>> >>> >> >>>> so >>>> >>> >>> >> >>>> old >>>> >>> >>> >> clients, >>>> >>> >>> >> >>>> including non-java clients, that may be using acks > 1 >>>> >>> >>> >> >>>> should >>>> >>> >>> >> >>>> be >>>> >>> >>> >> able to >>>> >>> >>> >> >>>> work with a new broker version. >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> It's more work, but I think dealing with the protocol >>>> >>> >>> >> >>>> change >>>> >>> >>> >> >>>> is the >>>> >>> >>> >> right >>>> >>> >>> >> >>>> thing to do since it eventually gets us to the behavior I >>>> >>> >>> >> >>>> think is >>>> >>> >>> >> better -- >>>> >>> >>> >> >>>> the broker should reject requests with invalid values. I >>>> >>> >>> >> >>>> think Joe >>>> >>> >>> >> and I >>>> >>> >>> >> >>>> were basically in agreement. In my mind the major piece >>>> >>> >>> >> >>>> missing >>>> >>> >>> from >>>> >>> >>> >> his >>>> >>> >>> >> >>>> description is how long we're going to maintain his "case >>>> >>> >>> >> >>>> 0" >>>> >>> >>> >> behavior. It's >>>> >>> >>> >> >>>> impractical to maintain old versions forever, but it >>>> >>> >>> >> >>>> sounds >>>> >>> >>> >> >>>> like >>>> >>> >>> >> there >>>> >>> >>> >> >>>> hasn't been a decision on how long to maintain them. >>>> >>> >>> >> >>>> Maybe >>>> >>> >>> >> >>>> that's >>>> >>> >>> >> another >>>> >>> >>> >> >>>> item to add to KIPs -- protocol versions and behavior >>>> >>> >>> >> >>>> need to >>>> >>> >>> >> >>>> be >>>> >>> >>> >> listed as >>>> >>> >>> >> >>>> deprecated and the earliest version in which they'll be >>>> >>> >>> >> >>>> removed >>>> >>> >>> >> should be >>>> >>> >>> >> >>>> specified so users can understand which versions are >>>> >>> >>> >> >>>> guaranteed to >>>> >>> >>> be >>>> >>> >>> >> >>>> compatible, even if they're using (well-written) non-java >>>> >>> >>> >> >>>> clients. >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> -Ewen >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers < >>>> >>> >>> >> dana.pow...@gmail.com> wrote: >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>>> clients don't break on unknown errors >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> maybe true for the official java clients, but I dont >>>> >>> >>> >> >>>>> think >>>> >>> >>> >> >>>>> the >>>> >>> >>> >> assumption >>>> >>> >>> >> >>>>> holds true for community-maintained clients and users of >>>> >>> >>> >> >>>>> those >>>> >>> >>> >> clients. >>>> >>> >>> >> >>>>> kafka-python generally follows the fail-fast philosophy >>>> >>> >>> >> >>>>> and >>>> >>> >>> >> >>>>> raises >>>> >>> >>> >> an >>>> >>> >>> >> >>>>> exception on any unrecognized error code in any server >>>> >>> >>> >> >>>>> response. >>>> >>> >>> >> in this >>>> >>> >>> >> >>>>> case, kafka-python allows users to set their own >>>> >>> >>> >> >>>>> required-acks >>>> >>> >>> >> policy when >>>> >>> >>> >> >>>>> creating a producer instance. It is possible that users >>>> >>> >>> >> >>>>> of >>>> >>> >>> >> kafka-python >>>> >>> >>> >> >>>>> have deployed producer code that uses ack>1 -- perhaps >>>> >>> >>> >> >>>>> in >>>> >>> >>> production >>>> >>> >>> >> >>>>> environments -- and for those users the new error code >>>> >>> >>> >> >>>>> will >>>> >>> >>> >> >>>>> crash >>>> >>> >>> >> their >>>> >>> >>> >> >>>>> producer code. I would not be surprised if the same >>>> >>> >>> >> >>>>> were >>>> >>> >>> >> >>>>> true of >>>> >>> >>> >> other >>>> >>> >>> >> >>>>> community clients. >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> *one reason for the fail-fast approach is that there >>>> >>> >>> >> >>>>> isn't >>>> >>> >>> >> >>>>> great >>>> >>> >>> >> >>>>> documentation on what errors to expect for each request >>>> >>> >>> >> >>>>> / >>>> >>> >>> >> >>>>> response >>>> >>> >>> >> -- so >>>> >>> >>> >> >>>>> we >>>> >>> >>> >> >>>>> use failures to alert that some error case is not >>>> >>> >>> >> >>>>> handled >>>> >>> >>> >> properly. and >>>> >>> >>> >> >>>>> because of that, introducing new error cases without >>>> >>> >>> >> >>>>> bumping >>>> >>> >>> >> >>>>> the >>>> >>> >>> api >>>> >>> >>> >> >>>>> version is likely to cause those errors to get >>>> >>> >>> >> >>>>> raised/thrown >>>> >>> >>> >> >>>>> all >>>> >>> >>> >> the way >>>> >>> >>> >> >>>>> back up to the user. of course we (client maintainers) >>>> >>> >>> >> >>>>> can >>>> >>> >>> >> >>>>> fix >>>> >>> >>> the >>>> >>> >>> >> issues >>>> >>> >>> >> >>>>> in the client libraries and suggest users upgrade, but >>>> >>> >>> >> >>>>> it's >>>> >>> >>> >> >>>>> not >>>> >>> >>> the >>>> >>> >>> >> ideal >>>> >>> >>> >> >>>>> situation. >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> long-winded way of saying: I agree w/ Joe. >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> -Dana >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira < >>>> >>> >>> >> gshap...@cloudera.com> >>>> >>> >>> >> >>>>> wrote: >>>> >>> >>> >> >>>>> >>>> >>> >>> >> >>>>>> Is the protocol bump caused by the behavior change or >>>> >>> >>> >> >>>>>> the >>>> >>> >>> >> >>>>>> new >>>> >>> >>> error >>>> >>> >>> >> >>>>>> code? >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> 1) IMO, error_codes are data, and clients can expect to >>>> >>> >>> >> >>>>>> receive >>>> >>> >>> >> errors >>>> >>> >>> >> >>>>>> that they don't understand (i.e. unknown errors). >>>> >>> >>> >> >>>>>> AFAIK, >>>> >>> >>> >> >>>>>> clients >>>> >>> >>> >> don't >>>> >>> >>> >> >>>>>> break on unknown errors, they are simple more >>>> >>> >>> >> >>>>>> challenging >>>> >>> >>> >> >>>>>> to >>>> >>> >>> >> debug. If >>>> >>> >>> >> >>>>>> we document the new behavior, then its definitely >>>> >>> >>> >> >>>>>> debuggable and >>>> >>> >>> >> >>>>>> fixable. >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> 2) The behavior change is basically a deprecation - >>>> >>> >>> >> >>>>>> i.e. >>>> >>> >>> >> >>>>>> acks > 1 >>>> >>> >>> >> were >>>> >>> >>> >> >>>>>> never documented, and are not supported by Kafka >>>> >>> >>> >> >>>>>> clients >>>> >>> >>> >> >>>>>> starting >>>> >>> >>> >> with >>>> >>> >>> >> >>>>>> version 0.8.2. I'm not sure this requires a protocol >>>> >>> >>> >> >>>>>> bump >>>> >>> >>> >> >>>>>> either, >>>> >>> >>> >> >>>>>> although its a better case than new error codes. >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> Thanks, >>>> >>> >>> >> >>>>>> Gwen >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein < >>>> >>> >>> joe.st...@stealth.ly> >>>> >>> >>> >> >>>>>> wrote: >>>> >>> >>> >> >>>>>>> Looping in the mailing list that the client developers >>>> >>> >>> >> >>>>>>> live on >>>> >>> >>> >> because >>>> >>> >>> >> >>>>>> they >>>> >>> >>> >> >>>>>>> are all not on dev (though they should be if they want >>>> >>> >>> >> >>>>>>> to >>>> >>> >>> >> >>>>>>> be >>>> >>> >>> >> helping >>>> >>> >>> >> >>>>>>> to >>>> >>> >>> >> >>>>>>> build the best client libraries they can). >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> I whole hardily believe that we need to not break >>>> >>> >>> >> >>>>>>> existing >>>> >>> >>> >> >>>>>>> functionality >>>> >>> >>> >> >>>>>> of >>>> >>> >>> >> >>>>>>> the client protocol, ever. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> There are many reasons for this and we have other >>>> >>> >>> >> >>>>>>> threads >>>> >>> >>> >> >>>>>>> on the >>>> >>> >>> >> >>>>>>> mailing >>>> >>> >>> >> >>>>>>> list where we are discussing that topic (no pun >>>> >>> >>> >> >>>>>>> intended) >>>> >>> >>> >> >>>>>>> that I >>>> >>> >>> >> don't >>>> >>> >>> >> >>>>>> want >>>> >>> >>> >> >>>>>>> to re-hash here. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> If we change wire protocol functionality OR the binary >>>> >>> >>> >> >>>>>>> format >>>> >>> >>> >> (either) >>>> >>> >>> >> >>>>>>> we >>>> >>> >>> >> >>>>>>> must bump version AND treat version as a feature flag >>>> >>> >>> >> >>>>>>> with >>>> >>> >>> >> backward >>>> >>> >>> >> >>>>>>> compatibility support until it is deprecated for some >>>> >>> >>> >> >>>>>>> time >>>> >>> >>> >> >>>>>>> for >>>> >>> >>> >> folks >>>> >>> >>> >> >>>>>>> to >>>> >>> >>> >> >>>>>> deal >>>> >>> >>> >> >>>>>>> with it. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> match version = { >>>> >>> >>> >> >>>>>>> case 0: keepDoingWhatWeWereDoing() >>>> >>> >>> >> >>>>>>> case 1: doNewStuff() >>>> >>> >>> >> >>>>>>> case 2: doEvenMoreNewStuff() >>>> >>> >>> >> >>>>>>> } >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> has to be a practice we adopt imho ... I know feature >>>> >>> >>> >> >>>>>>> flags can >>>> >>> >>> be >>>> >>> >>> >> >>>>>> construed >>>> >>> >>> >> >>>>>>> as "messy code" but I am eager to hear another >>>> >>> >>> >> >>>>>>> (better? >>>> >>> >>> >> different?) >>>> >>> >>> >> >>>>>> solution >>>> >>> >>> >> >>>>>>> to this. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> If we don't do a feature flag like this specifically >>>> >>> >>> >> >>>>>>> with >>>> >>> >>> >> >>>>>>> this >>>> >>> >>> >> change >>>> >>> >>> >> >>>>>> then >>>> >>> >>> >> >>>>>>> what happens is that someone upgrades their brokers >>>> >>> >>> >> >>>>>>> with a >>>> >>> >>> rolling >>>> >>> >>> >> >>>>>> restart >>>> >>> >>> >> >>>>>>> in 0.8.3 and every single one of their producer >>>> >>> >>> >> >>>>>>> requests >>>> >>> >>> >> >>>>>>> start >>>> >>> >>> to >>>> >>> >>> >> fail >>>> >>> >>> >> >>>>>> and >>>> >>> >>> >> >>>>>>> they have a major production outage. eeeek!!!! >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> I do 100% agree that > 1 makes no sense and we >>>> >>> >>> >> >>>>>>> *REALLY* >>>> >>> >>> >> >>>>>>> need >>>> >>> >>> >> people to >>>> >>> >>> >> >>>>>> start >>>> >>> >>> >> >>>>>>> using 0,1,-1 but we need to-do that in a way that is >>>> >>> >>> >> >>>>>>> going >>>> >>> >>> >> >>>>>>> to >>>> >>> >>> >> work for >>>> >>> >>> >> >>>>>>> everyone. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> Old producers and consumers must keep working with new >>>> >>> >>> >> >>>>>>> brokers >>>> >>> >>> >> and if >>>> >>> >>> >> >>>>>>> we >>>> >>> >>> >> >>>>>> are >>>> >>> >>> >> >>>>>>> not going to support that then I am unclear what the >>>> >>> >>> >> >>>>>>> use >>>> >>> >>> >> >>>>>>> of >>>> >>> >>> >> "version" >>>> >>> >>> >> >>>>>>> is >>>> >>> >>> >> >>>>>>> based on our original intentions of having it because >>>> >>> >>> >> >>>>>>> of >>>> >>> >>> >> >>>>>>> the >>>> >>> >>> >> >>>>>>> 0.7=>-0.8. >>>> >>> >>> >> >>>>>> We >>>> >>> >>> >> >>>>>>> said no more breaking changes when we did that. >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> - Joe Stein >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen >>>> >>> >>> >> >>>>>>> Cheslack-Postava < >>>> >>> >>> >> >>>>>> e...@confluent.io> >>>> >>> >>> >> >>>>>>> wrote: >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> Right, so this looks like it could create an issue >>>> >>> >>> >> >>>>>>>> similar to >>>> >>> >>> >> what's >>>> >>> >>> >> >>>>>>>> currently being discussed in >>>> >>> >>> >> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 >>>> >>> >>> >> >>>>>>>> where >>>> >>> >>> >> >>>>>>>> users >>>> >>> >>> >> now get >>>> >>> >>> >> >>>>>>>> errors >>>> >>> >>> >> >>>>>>>> under conditions when they previously wouldn't. Old >>>> >>> >>> >> >>>>>>>> clients >>>> >>> >>> won't >>>> >>> >>> >> >>>>>>>> even >>>> >>> >>> >> >>>>>>>> know >>>> >>> >>> >> >>>>>>>> about the error code, so besides failing they won't >>>> >>> >>> >> >>>>>>>> even >>>> >>> >>> >> >>>>>>>> be >>>> >>> >>> able >>>> >>> >>> >> to >>>> >>> >>> >> >>>>>>>> log >>>> >>> >>> >> >>>>>>>> any >>>> >>> >>> >> >>>>>>>> meaningful error messages. >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> I think there are two options for compatibility: >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> 1. An alternative change is to remove the ack > 1 >>>> >>> >>> >> >>>>>>>> code, >>>> >>> >>> >> >>>>>>>> but >>>> >>> >>> >> silently >>>> >>> >>> >> >>>>>>>> "upgrade" requests with acks > 1 to acks = -1. This >>>> >>> >>> >> >>>>>>>> isn't >>>> >>> >>> >> >>>>>>>> the >>>> >>> >>> >> same as >>>> >>> >>> >> >>>>>>>> other >>>> >>> >>> >> >>>>>>>> changes to behavior since the interaction between the >>>> >>> >>> >> >>>>>>>> client >>>> >>> >>> and >>>> >>> >>> >> >>>>>>>> server >>>> >>> >>> >> >>>>>>>> remains the same, no error codes change, etc. The >>>> >>> >>> >> >>>>>>>> client >>>> >>> >>> >> >>>>>>>> might >>>> >>> >>> >> just >>>> >>> >>> >> >>>>>>>> see >>>> >>> >>> >> >>>>>>>> some increased latency since the message might need >>>> >>> >>> >> >>>>>>>> to be >>>> >>> >>> >> replicated >>>> >>> >>> >> >>>>>>>> to >>>> >>> >>> >> >>>>>>>> more brokers than they requested. >>>> >>> >>> >> >>>>>>>> 2. Split this into two patches, one that bumps the >>>> >>> >>> >> >>>>>>>> protocol >>>> >>> >>> >> version >>>> >>> >>> >> >>>>>>>> on >>>> >>> >>> >> >>>>>>>> that >>>> >>> >>> >> >>>>>>>> message to include the new error code but maintains >>>> >>> >>> >> >>>>>>>> both >>>> >>> >>> >> >>>>>>>> old >>>> >>> >>> (now >>>> >>> >>> >> >>>>>>>> deprecated) and new behavior, then a second that >>>> >>> >>> >> >>>>>>>> would be >>>> >>> >>> >> applied in >>>> >>> >>> >> >>>>>>>> a >>>> >>> >>> >> >>>>>>>> later release that removes the old protocol + code >>>> >>> >>> >> >>>>>>>> for >>>> >>> >>> >> >>>>>>>> handling >>>> >>> >>> >> acks >>>> >>> >>> >> >>>>>> 1. >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> 2 is probably the right thing to do. If we specify >>>> >>> >>> >> >>>>>>>> the >>>> >>> >>> >> >>>>>>>> release >>>> >>> >>> >> when >>>> >>> >>> >> >>>>>> we'll >>>> >>> >>> >> >>>>>>>> remove the deprecated protocol at the time of >>>> >>> >>> >> >>>>>>>> deprecation >>>> >>> >>> >> >>>>>>>> it >>>> >>> >>> >> makes >>>> >>> >>> >> >>>>>> things >>>> >>> >>> >> >>>>>>>> a >>>> >>> >>> >> >>>>>>>> lot easier for people writing non-java clients and >>>> >>> >>> >> >>>>>>>> could >>>> >>> >>> >> >>>>>>>> give >>>> >>> >>> >> users >>>> >>> >>> >> >>>>>> better >>>> >>> >>> >> >>>>>>>> predictability (e.g. if clients are at most 1 major >>>> >>> >>> >> >>>>>>>> release >>>> >>> >>> >> behind >>>> >>> >>> >> >>>>>>>> brokers, >>>> >>> >>> >> >>>>>>>> they'll remain compatible but possibly use deprecated >>>> >>> >>> features). >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira < >>>> >>> >>> >> gshap...@cloudera.com> >>>> >>> >>> >> >>>>>>>> wrote: >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>>> Hi Kafka Devs, >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> We are working on KAFKA-1697 - remove code related >>>> >>> >>> >> >>>>>>>>> to >>>> >>> >>> >> >>>>>>>>> ack>1 on >>>> >>> >>> >> the >>>> >>> >>> >> >>>>>>>>> broker. Per Neha's suggestion, I'd like to give >>>> >>> >>> >> >>>>>>>>> everyone >>>> >>> >>> >> >>>>>>>>> a >>>> >>> >>> >> heads up >>>> >>> >>> >> >>>>>>>>> on >>>> >>> >>> >> >>>>>>>>> what these changes mean. >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> Once this patch is included, any produce requests >>>> >>> >>> >> >>>>>>>>> that >>>> >>> >>> >> >>>>>>>>> include >>>> >>> >>> >> >>>>>>>>> request.required.acks > 1 will result in an >>>> >>> >>> >> >>>>>>>>> exception. >>>> >>> >>> >> >>>>>>>>> This >>>> >>> >>> >> will be >>>> >>> >>> >> >>>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I >>>> >>> >>> >> >>>>>>>>> assume) >>>> >>> >>> and >>>> >>> >>> >> >>>>>>>>> UnknownException in existing versions (sorry, but I >>>> >>> >>> >> >>>>>>>>> can't add >>>> >>> >>> >> error >>>> >>> >>> >> >>>>>>>>> codes retroactively). >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> This behavior is already enforced by 0.8.2 producers >>>> >>> >>> >> >>>>>>>>> (sync and >>>> >>> >>> >> >>>>>>>>> new), >>>> >>> >>> >> >>>>>>>>> but we expect impact on users with older producers >>>> >>> >>> >> >>>>>>>>> that >>>> >>> >>> >> >>>>>>>>> relied >>>> >>> >>> >> on >>>> >>> >>> >> >>>>>>>>> acks >>>> >>> >>> >> >>>>>>>>>> 1 and external clients (i.e python, go, etc). >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> Users who relied on acks > 1 are expected to switch >>>> >>> >>> >> >>>>>>>>> to >>>> >>> >>> >> >>>>>>>>> using >>>> >>> >>> >> acks = >>>> >>> >>> >> >>>>>>>>> -1 >>>> >>> >>> >> >>>>>>>>> and a min.isr parameter than matches their user >>>> >>> >>> >> >>>>>>>>> case. >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> This change was discussed in the past in the context >>>> >>> >>> >> >>>>>>>>> of >>>> >>> >>> >> KAFKA-1555 >>>> >>> >>> >> >>>>>>>>> (min.isr), but let us know if you have any questions >>>> >>> >>> >> >>>>>>>>> or >>>> >>> >>> concerns >>>> >>> >>> >> >>>>>>>>> regarding this change. >>>> >>> >>> >> >>>>>>>>> >>>> >>> >>> >> >>>>>>>>> Gwen >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> >>>> >>> >>> >> >>>>>>>> -- >>>> >>> >>> >> >>>>>>>> Thanks, >>>> >>> >>> >> >>>>>>>> Ewen >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> -- >>>> >>> >>> >> >>>>>>> You received this message because you are subscribed >>>> >>> >>> >> >>>>>>> to >>>> >>> >>> >> >>>>>>> the >>>> >>> >>> Google >>>> >>> >>> >> >>>>>>> Groups >>>> >>> >>> >> >>>>>>> "kafka-clients" group. >>>> >>> >>> >> >>>>>>> To unsubscribe from this group and stop receiving >>>> >>> >>> >> >>>>>>> emails >>>> >>> >>> >> >>>>>>> from >>>> >>> >>> it, >>>> >>> >>> >> send >>>> >>> >>> >> >>>>>>> an >>>> >>> >>> >> >>>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>>> >>> >>> >> >>>>>>> To post to this group, send email to >>>> >>> >>> >> kafka-clie...@googlegroups.com. >>>> >>> >>> >> >>>>>>> Visit this group at >>>> >>> >>> http://groups.google.com/group/kafka-clients. >>>> >>> >>> >> >>>>>>> To view this discussion on the web visit >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com >>>> >>> >>> >> >>>>>> . >>>> >>> >>> >> >>>>>>> >>>> >>> >>> >> >>>>>>> For more options, visit >>>> >>> >>> >> >>>>>>> https://groups.google.com/d/optout. >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> -- >>>> >>> >>> >> >>>>>> You received this message because you are subscribed to >>>> >>> >>> >> >>>>>> the >>>> >>> >>> Google >>>> >>> >>> >> >>>>>> Groups >>>> >>> >>> >> >>>>>> "kafka-clients" group. >>>> >>> >>> >> >>>>>> To unsubscribe from this group and stop receiving >>>> >>> >>> >> >>>>>> emails >>>> >>> >>> >> >>>>>> from it, >>>> >>> >>> >> send >>>> >>> >>> >> >>>>>> an >>>> >>> >>> >> >>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>>> >>> >>> >> >>>>>> To post to this group, send email to >>>> >>> >>> >> kafka-clie...@googlegroups.com. >>>> >>> >>> >> >>>>>> Visit this group at >>>> >>> >>> >> >>>>>> http://groups.google.com/group/kafka-clients >>>> >>> >>> . >>>> >>> >>> >> >>>>>> To view this discussion on the web visit >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>>>> >>>> >>> >>> >> >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com >>>> >>> >>> >> >>>>>> . >>>> >>> >>> >> >>>>>> For more options, visit >>>> >>> >>> >> >>>>>> https://groups.google.com/d/optout. >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> >>>> >>> >>> >> >>>> -- >>>> >>> >>> >> >>>> Thanks, >>>> >>> >>> >> >>>> Ewen >>>> >>> >>> >> >>>> >>> >>> > >>>> >>> >>> > >>>> >>> >>> >>>> >>> >>> >>>> >>> >>> -- >>>> >>> >>> Thanks, >>>> >>> >>> Ewen >>>> >>> >>> >>>> >>> >> >>>> >>> >> >>>> >>> >> >>>> >>> >> -- >>>> >>> >> Thanks, >>>> >>> >> Neha >>>> >> >>>> >> >>>> >> -- >>>> >> You received this message because you are subscribed to the Google >>>> >> Groups >>>> >> "kafka-clients" group. >>>> >> To unsubscribe from this group and stop receiving emails from it, send >>>> >> an >>>> >> email to kafka-clients+unsubscr...@googlegroups.com. >>>> >> To post to this group, send email to kafka-clie...@googlegroups.com. >>>> >> Visit this group at http://groups.google.com/group/kafka-clients. >>>> >> To view this discussion on the web visit >>>> >> >>>> >> https://groups.google.com/d/msgid/kafka-clients/CAOeJiJh17CYq%3D-qgPu9rnArsPW%3D7RL9AAW_h%3DrrXx0%2BKhhKgNQ%40mail.gmail.com. >>>> >> >>>> >> For more options, visit https://groups.google.com/d/optout. >>>> > >>>> > >>> >>> >>