Gwen,

KIP write-up looks good. According to the rest of the KIP process proposal,
would you like to start a DISCUSS/VOTE thread for it?

Thanks,
Neha

On Fri, Jan 16, 2015 at 9:37 AM, Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> Gwen -- KIP write up looks good. Deprecation schedule probably needs to be
> more specific, but I think that discussion probably needs to happen after a
> solution is agreed upon.
>
> Jay -- I think "older clients will get a bad error message instead of a
> good one" isn't what would be happening with this change. Previously they
> wouldn't have received an error and they would have been able to produce
> messages. After the change they'll just receive this new error message
> which their clients can't possibly handle gracefully since it didn't exist
> when the client was written. Whether the acks > 1 setting was actually
> accomplishing what they thought doesn't matter. Someone could have
> reasonably read the docs on 0.8.1.1, thought acks = 2 is an ok setting for
> their applications, set it as a default across a bunch of apps, then follow
> the recommended upgrade path of updating brokers to 0.8.2 and all their
> apps will start failing on produce requests.
>
>
> On Thu, Jan 15, 2015 at 8:27 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
> > This is a good case to discuss.
> >
> > Let's figure the general case of how we want to handle errors and get
> that
> > documented in the protocol. The problem right now is that we give no
> > guidance on this. I actually thought Gwen's suggestion made sense on the
> > guidance we should have given which is that we will enumerate a set of
> > errors and their meaning for each API but it is possible that other
> errors
> > will occur and they should be handled (maybe poorly) in the same way
> > UNKNOWN_ERROR is handled which is our normal escape hatch for things like
> > OOMException.
> >
> > I really do think we shouldn't be dogmatic here: In considering a change
> > to errors we should consider the potential ill-effect vs the complexity
> of
> > yet another protocol version.
> >
> > In this case I actually am not sure we need to bump the protocol because
> > the whole point of the change was to make a setting we think doesn't make
> > sense break, right? Well this will break it. It seems like the only
> > downside is that older clients will get a bad error message instead of a
> > good one. But it isn't like we will have rendered a client unusable, it
> is
> > just that they will need to change their config.
> >
> > -Jay
> >
> > On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira <gshap...@cloudera.com>
> > wrote:
> >
> >> I created a KIP for this suggestion:
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks
> >>
> >> Basically documenting what was already discussed here.  Comments will
> >> be awesome!
> >>
> >> Gwen
> >>
> >> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira <gshap...@cloudera.com>
> >> wrote:
> >> > The errors are part of the KIP process now, so I think the clients are
> >> safe :)
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> >> >
> >> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin <steve.mo...@gmail.com>
> >> wrote:
> >> >> Agree errors should be part of the protocol
> >> >>
> >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira <gshap...@cloudera.com>
> >> wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> I got convinced by Joe and Dana that errors are indeed part of the
> >> >>> protocol and can't be randomly added.
> >> >>>
> >> >>> So, it looks like we need to bump version of ProduceRequest in the
> >> >>> following way:
> >> >>> Version 0 -> accept acks >1. I think we should keep the existing
> >> >>> behavior too (i.e. not replace it with -1) to avoid surprising
> >> >>> clients, but I'm willing to hear other opinions.
> >> >>> Version 1 -> do not accept acks >1 and return an error.
> >> >>> Are we ok with the error I added in KAFKA-1697? We can use something
> >> >>> less specific like InvalidRequestParameter. This error can be reused
> >> >>> in the future and reduce the need to add errors, but will also be
> less
> >> >>> clear to the client and its users. Maybe even add the error message
> >> >>> string to the protocol in addition to the error code? (since we are
> >> >>> bumping versions....)
> >> >>>
> >> >>> I think maintaining the old version throughout 0.8.X makes sense.
> IMO
> >> >>> dropping it for 0.9 is feasible, but I'll let client owners help
> make
> >> >>> that call.
> >> >>>
> >> >>> Am I missing anything? Should I start a KIP? It seems like a
> KIP-type
> >> >>> discussion :)
> >> >>>
> >> >>> Gwen
> >> >>>
> >> >>>
> >> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
> >> >>> <e...@confluent.io> wrote:
> >> >>>> Gwen,
> >> >>>>
> >> >>>> I think the only option that wouldn't require a protocol version
> >> change is
> >> >>>> the one where acks > 1 is converted to acks = -1 since it's the
> only
> >> one
> >> >>>> that doesn't potentially break older clients. The protocol guide
> >> says that
> >> >>>> the expected upgrade path is servers first, then clients, so old
> >> clients,
> >> >>>> including non-java clients, that may be using acks > 1 should be
> >> able to
> >> >>>> work with a new broker version.
> >> >>>>
> >> >>>> It's more work, but I think dealing with the protocol change is the
> >> right
> >> >>>> thing to do since it eventually gets us to the behavior I think is
> >> better --
> >> >>>> the broker should reject requests with invalid values. I think Joe
> >> and I
> >> >>>> were basically in agreement. In my mind the major piece missing
> from
> >> his
> >> >>>> description is how long we're going to maintain his "case 0"
> >> behavior. It's
> >> >>>> impractical to maintain old versions forever, but it sounds like
> >> there
> >> >>>> hasn't been a decision on how long to maintain them. Maybe that's
> >> another
> >> >>>> item to add to KIPs -- protocol versions and behavior need to be
> >> listed as
> >> >>>> deprecated and the earliest version in which they'll be removed
> >> should be
> >> >>>> specified so users can understand which versions are guaranteed to
> be
> >> >>>> compatible, even if they're using (well-written) non-java clients.
> >> >>>>
> >> >>>> -Ewen
> >> >>>>
> >> >>>>
> >> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers <
> >> dana.pow...@gmail.com> wrote:
> >> >>>>>
> >> >>>>>> clients don't break on unknown errors
> >> >>>>>
> >> >>>>> maybe true for the official java clients, but I dont think the
> >> assumption
> >> >>>>> holds true for community-maintained clients and users of those
> >> clients.
> >> >>>>> kafka-python generally follows the fail-fast philosophy and raises
> >> an
> >> >>>>> exception on any unrecognized error code in any server response.
> >> in this
> >> >>>>> case, kafka-python allows users to set their own required-acks
> >> policy when
> >> >>>>> creating a producer instance.  It is possible that users of
> >> kafka-python
> >> >>>>> have deployed producer code that uses ack>1 -- perhaps in
> production
> >> >>>>> environments -- and for those users the new error code will crash
> >> their
> >> >>>>> producer code.  I would not be surprised if the same were true of
> >> other
> >> >>>>> community clients.
> >> >>>>>
> >> >>>>> *one reason for the fail-fast approach is that there isn't great
> >> >>>>> documentation on what errors to expect for each request / response
> >> -- so
> >> >>>>> we
> >> >>>>> use failures to alert that some error case is not handled
> >> properly.  and
> >> >>>>> because of that, introducing new error cases without bumping the
> api
> >> >>>>> version is likely to cause those errors to get raised/thrown all
> >> the way
> >> >>>>> back up to the user.  of course we (client maintainers) can fix
> the
> >> issues
> >> >>>>> in the client libraries and suggest users upgrade, but it's not
> the
> >> ideal
> >> >>>>> situation.
> >> >>>>>
> >> >>>>>
> >> >>>>> long-winded way of saying: I agree w/ Joe.
> >> >>>>>
> >> >>>>> -Dana
> >> >>>>>
> >> >>>>>
> >> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira <
> >> gshap...@cloudera.com>
> >> >>>>> wrote:
> >> >>>>>
> >> >>>>>> Is the protocol bump caused by the behavior change or the new
> error
> >> >>>>>> code?
> >> >>>>>>
> >> >>>>>> 1) IMO, error_codes are data, and clients can expect to receive
> >> errors
> >> >>>>>> that they don't understand (i.e. unknown errors). AFAIK, clients
> >> don't
> >> >>>>>> break on unknown errors, they are simple more challenging to
> >> debug. If
> >> >>>>>> we document the new behavior, then its definitely debuggable and
> >> >>>>>> fixable.
> >> >>>>>>
> >> >>>>>> 2) The behavior change is basically a deprecation - i.e. acks > 1
> >> were
> >> >>>>>> never documented, and are not supported by Kafka clients starting
> >> with
> >> >>>>>> version 0.8.2. I'm not sure this requires a protocol bump either,
> >> >>>>>> although its a better case than new error codes.
> >> >>>>>>
> >> >>>>>> Thanks,
> >> >>>>>> Gwen
> >> >>>>>>
> >> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein <
> joe.st...@stealth.ly>
> >> >>>>>> wrote:
> >> >>>>>>> Looping in the mailing list that the client developers live on
> >> because
> >> >>>>>> they
> >> >>>>>>> are all not on dev (though they should be if they want to be
> >> helping
> >> >>>>>>> to
> >> >>>>>>> build the best client libraries they can).
> >> >>>>>>>
> >> >>>>>>> I whole hardily believe that we need to not break existing
> >> >>>>>>> functionality
> >> >>>>>> of
> >> >>>>>>> the client protocol, ever.
> >> >>>>>>>
> >> >>>>>>> There are many reasons for this and we have other threads on the
> >> >>>>>>> mailing
> >> >>>>>>> list where we are discussing that topic (no pun intended) that I
> >> don't
> >> >>>>>> want
> >> >>>>>>> to re-hash here.
> >> >>>>>>>
> >> >>>>>>> If we change wire protocol functionality OR the binary format
> >> (either)
> >> >>>>>>> we
> >> >>>>>>> must bump version AND treat version as a feature flag with
> >> backward
> >> >>>>>>> compatibility support until it is deprecated for some time for
> >> folks
> >> >>>>>>> to
> >> >>>>>> deal
> >> >>>>>>> with it.
> >> >>>>>>>
> >> >>>>>>> match version = {
> >> >>>>>>> case 0: keepDoingWhatWeWereDoing()
> >> >>>>>>> case 1: doNewStuff()
> >> >>>>>>> case 2: doEvenMoreNewStuff()
> >> >>>>>>> }
> >> >>>>>>>
> >> >>>>>>> has to be a practice we adopt imho ... I know feature flags can
> be
> >> >>>>>> construed
> >> >>>>>>> as "messy code" but I am eager to hear another (better?
> >> different?)
> >> >>>>>> solution
> >> >>>>>>> to this.
> >> >>>>>>>
> >> >>>>>>> If we don't do a feature flag like this specifically with this
> >> change
> >> >>>>>> then
> >> >>>>>>> what happens is that someone upgrades their brokers with a
> rolling
> >> >>>>>> restart
> >> >>>>>>> in 0.8.3 and every single one of their producer requests start
> to
> >> fail
> >> >>>>>> and
> >> >>>>>>> they have a major production outage. eeeek!!!!
> >> >>>>>>>
> >> >>>>>>> I do 100% agree that > 1 makes no sense and we *REALLY* need
> >> people to
> >> >>>>>> start
> >> >>>>>>> using 0,1,-1 but we need to-do that in a way that is going to
> >> work for
> >> >>>>>>> everyone.
> >> >>>>>>>
> >> >>>>>>> Old producers and consumers must keep working with new brokers
> >> and if
> >> >>>>>>> we
> >> >>>>>> are
> >> >>>>>>> not going to support that then I am unclear what the use of
> >> "version"
> >> >>>>>>> is
> >> >>>>>>> based on our original intentions of having it because of the
> >> >>>>>>> 0.7=>-0.8.
> >> >>>>>> We
> >> >>>>>>> said no more breaking changes when we did that.
> >> >>>>>>>
> >> >>>>>>> - Joe Stein
> >> >>>>>>>
> >> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava <
> >> >>>>>> e...@confluent.io>
> >> >>>>>>> wrote:
> >> >>>>>>>>
> >> >>>>>>>> Right, so this looks like it could create an issue similar to
> >> what's
> >> >>>>>>>> currently being discussed in
> >> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 where users
> >> now get
> >> >>>>>>>> errors
> >> >>>>>>>> under conditions when they previously wouldn't. Old clients
> won't
> >> >>>>>>>> even
> >> >>>>>>>> know
> >> >>>>>>>> about the error code, so besides failing they won't even be
> able
> >> to
> >> >>>>>>>> log
> >> >>>>>>>> any
> >> >>>>>>>> meaningful error messages.
> >> >>>>>>>>
> >> >>>>>>>> I think there are two options for compatibility:
> >> >>>>>>>>
> >> >>>>>>>> 1. An alternative change is to remove the ack > 1 code, but
> >> silently
> >> >>>>>>>> "upgrade" requests with acks > 1 to acks = -1. This isn't the
> >> same as
> >> >>>>>>>> other
> >> >>>>>>>> changes to behavior since the interaction between the client
> and
> >> >>>>>>>> server
> >> >>>>>>>> remains the same, no error codes change, etc. The client might
> >> just
> >> >>>>>>>> see
> >> >>>>>>>> some increased latency since the message might need to be
> >> replicated
> >> >>>>>>>> to
> >> >>>>>>>> more brokers than they requested.
> >> >>>>>>>> 2. Split this into two patches, one that bumps the protocol
> >> version
> >> >>>>>>>> on
> >> >>>>>>>> that
> >> >>>>>>>> message to include the new error code but maintains both old
> (now
> >> >>>>>>>> deprecated) and new behavior, then a second that would be
> >> applied in
> >> >>>>>>>> a
> >> >>>>>>>> later release that removes the old protocol + code for handling
> >> acks
> >> >>>>>> 1.
> >> >>>>>>>>
> >> >>>>>>>> 2 is probably the right thing to do. If we specify the release
> >> when
> >> >>>>>> we'll
> >> >>>>>>>> remove the deprecated protocol at the time of deprecation it
> >> makes
> >> >>>>>> things
> >> >>>>>>>> a
> >> >>>>>>>> lot easier for people writing non-java clients and could give
> >> users
> >> >>>>>> better
> >> >>>>>>>> predictability (e.g. if clients are at most 1 major release
> >> behind
> >> >>>>>>>> brokers,
> >> >>>>>>>> they'll remain compatible but possibly use deprecated
> features).
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira <
> >> gshap...@cloudera.com>
> >> >>>>>>>> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> Hi Kafka Devs,
> >> >>>>>>>>>
> >> >>>>>>>>> We are working on KAFKA-1697 - remove code related to ack>1 on
> >> the
> >> >>>>>>>>> broker. Per Neha's suggestion, I'd like to give everyone a
> >> heads up
> >> >>>>>>>>> on
> >> >>>>>>>>> what these changes mean.
> >> >>>>>>>>>
> >> >>>>>>>>> Once this patch is included, any produce requests that include
> >> >>>>>>>>> request.required.acks > 1 will result in an exception. This
> >> will be
> >> >>>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I assume)
> and
> >> >>>>>>>>> UnknownException in existing versions (sorry, but I can't add
> >> error
> >> >>>>>>>>> codes retroactively).
> >> >>>>>>>>>
> >> >>>>>>>>> This behavior is already enforced by 0.8.2 producers (sync and
> >> >>>>>>>>> new),
> >> >>>>>>>>> but we expect impact on users with older producers that relied
> >> on
> >> >>>>>>>>> acks
> >> >>>>>>>>>> 1 and external clients (i.e python, go, etc).
> >> >>>>>>>>>
> >> >>>>>>>>> Users who relied on acks > 1 are expected to switch to using
> >> acks =
> >> >>>>>>>>> -1
> >> >>>>>>>>> and a min.isr parameter than matches their user case.
> >> >>>>>>>>>
> >> >>>>>>>>> This change was discussed in the past in the context of
> >> KAFKA-1555
> >> >>>>>>>>> (min.isr), but let us know if you have any questions or
> concerns
> >> >>>>>>>>> regarding this change.
> >> >>>>>>>>>
> >> >>>>>>>>> Gwen
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> --
> >> >>>>>>>> Thanks,
> >> >>>>>>>> Ewen
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> --
> >> >>>>>>> You received this message because you are subscribed to the
> Google
> >> >>>>>>> Groups
> >> >>>>>>> "kafka-clients" group.
> >> >>>>>>> To unsubscribe from this group and stop receiving emails from
> it,
> >> send
> >> >>>>>>> an
> >> >>>>>>> email to kafka-clients+unsubscr...@googlegroups.com.
> >> >>>>>>> To post to this group, send email to
> >> kafka-clie...@googlegroups.com.
> >> >>>>>>> Visit this group at
> http://groups.google.com/group/kafka-clients.
> >> >>>>>>> To view this discussion on the web visit
> >> >>>>>>
> >> >>>>>>
> >>
> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com
> >> >>>>>> .
> >> >>>>>>>
> >> >>>>>>> For more options, visit https://groups.google.com/d/optout.
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> You received this message because you are subscribed to the
> Google
> >> >>>>>> Groups
> >> >>>>>> "kafka-clients" group.
> >> >>>>>> To unsubscribe from this group and stop receiving emails from it,
> >> send
> >> >>>>>> an
> >> >>>>>> email to kafka-clients+unsubscr...@googlegroups.com.
> >> >>>>>> To post to this group, send email to
> >> kafka-clie...@googlegroups.com.
> >> >>>>>> Visit this group at http://groups.google.com/group/kafka-clients
> .
> >> >>>>>> To view this discussion on the web visit
> >> >>>>>>
> >> >>>>>>
> >>
> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com
> >> >>>>>> .
> >> >>>>>> For more options, visit https://groups.google.com/d/optout.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> Thanks,
> >> >>>> Ewen
> >>
> >
> >
>
>
> --
> Thanks,
> Ewen
>



-- 
Thanks,
Neha

Reply via email to