On Fri, Sep 25, 2020, at 10:49, Boyang Chen wrote:
> Hey Jun,
> 
> On Fri, Sep 25, 2020 at 10:19 AM Jun Rao <j...@confluent.io> wrote:
> 
> > Hi, Boyang,
> >
> > Does EnvelopeRequest avoid the need for IBP? How do we know if the
> > controller supports EnvelopeRequest or not?
> >
> > Unfortunately, the EnvelopeRequest is solving the inter-broker
> > communication problem only. Admin clients still need to learn the proper
> > ApiVersion from the broker, which means we need to bump IBP to limit the
> > version range.
> 

Right-- the purpose of EnvelopeRequest is to avoid downconversion / 
upconversion on the forwarding broker.  It unfortunately doesn't avoid the need 
to tie ApiVersionsResponse to IBP.

> > > On Thu, Sep 24, 2020 at 4:53 PM Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Jason,
> > > >
> > > > Yes, the most important thing is to be able to avoid two rolling
> > > > restarts
> > > > in the future. If we have a path to achieve that down the road, the
> > > > changes here are fine.
> > > >

Yeah.  I think it would be good to make IBP a feature flag, as long as it could 
be changed without doing a second rolling restart.  We actually don't want to 
have too many feature flags, since it blows up the test matrix.

best,
Colin

> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Thu, Sep 24, 2020 at 3:20 PM Jason Gustafson <ja...@confluent.io>
> > > > wrote:
> > > >
> > > > > > One of the goals of KIP-584 (feature versioning) is that we can get
> > > rid
> > > > > of
> > > > > IBP in the future. So does this change prevent us from removing IBP
> > in
> > > > the
> > > > > future?
> > > > >
> > > > > That is a good question. I think the problem here is that request
> > > > > forwarding puts an expectation on api version support which covers
> > more
> > > > > than one broker. This is why the normal ApiVersions behavior doesn't
> > > > work.
> > > > > I thought about this a bit and haven't come up with a good
> > alternative.
> > > > One
> > > > > thought I've been considering is letting the controller in the
> > > > post-kip-500
> > > > > world set the maximum range of api support for the cluster. However,
> > > even
> > > > > then we would need some way to tell when the controller quorum itself
> > > is
> > > > > ready to enable support for a new api version. My feeling is that we
> > > will
> > > > > probably always need something like the IBP to control when it is
> > safe
> > > to
> > > > > expose versions of APIs which have a cross-broker dependence.
> > However,
> > > > > KIP-584 would still allow us to manage the IBP at the level of a
> > > feature
> > > > so
> > > > > that we don't need two rolling restarts anymore.
> > > > >
> > > > > Best,
> > > > > Jason
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Sep 24, 2020 at 1:59 PM Jun Rao <j...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Boyang,
> > > > > >
> > > > > > One of the goals of KIP-584 (feature versioning) is that we can get
> > > rid
> > > > > of
> > > > > > IBP in the future. So does this change prevent us from removing IBP
> > > in
> > > > > the
> > > > > > future?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson <
> > ja...@confluent.io
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Boyang,
> > > > > > >
> > > > > > > Thanks for the update. This seems like the best thing we can do.
> > > The
> > > > > > > alternative would be to always ensure that the forwarded APIs are
> > > > safe
> > > > > > for
> > > > > > > conversion between versions, but that would restrict the
> > > flexibility
> > > > > that
> > > > > > > the versioning is providing. It would also be a large effort to
> > > avoid
> > > > > > > introducing regressions through conversion. Sadly this broadens
> > the
> > > > > scope
> > > > > > > of the IBP, but in fact forwarded APIs are inter-broker APIs.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Jason
> > > > > > >
> > > > > > > On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen <
> > > > > reluctanthero...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey there,
> > > > > > > >
> > > > > > > > we spotted a necessary case to handle the redirect request
> > > > > versioning,
> > > > > > > and
> > > > > > > > proposed the following changes:
> > > > > > > >
> > > > > > > > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the
> > > > > > corresponding
> > > > > > > > allowed versions in the ApiVersionResponse will be affected by
> > > the
> > > > > > entire
> > > > > > > > cluster's versioning, not just the receiving broker, since we
> > > need
> > > > to
> > > > > > > > ensure the chosen version get properly handled by all parties.
> > > Thus
> > > > > > from
> > > > > > > > now on, RPC with redirection will be treated as inter-broker
> > RPC,
> > > > and
> > > > > > any
> > > > > > > > version bump for these RPCs has to go through IBP bump as well.
> > > > > > > > ApiVersionResponse will take IBP into considerations for the
> > > > > > redirection
> > > > > > > > RPCs allowable versions.
> > > > > > > >
> > > > > > > > 2. We would do the best effort to maintain the same request
> > > version
> > > > > for
> > > > > > > > the entire admin client -> receiving broker -> controller
> > broker
> > > > > path,
> > > > > > > but
> > > > > > > > for old RPC versions, they may not have flexible fields
> > > introduced
> > > > > yet.
> > > > > > > > Thus, we would have to upgrade the RPC to the minimum version
> > > which
> > > > > > > > supports flexible fields
> > > > > > > > and add another tagged field in the header called
> > > > > > > `OriginalRequestVersion`
> > > > > > > > to help the controller broker correctly deserialize the request
> > > > with
> > > > > > the
> > > > > > > > original admin client sent out version. We would not downgrade
> > > the
> > > > > > > original
> > > > > > > > request in any circumstance, since the flexible field support
> > is
> > > > > > required
> > > > > > > > to be open-ended on the high side.
> > > > > > > >
> > > > > > > > Let me know if you have any questions.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Boyang
> > > > > > > >
> > > > > > > > On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen <
> > > > > reluctanthero...@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey there,
> > > > > > > > >
> > > > > > > > > we are going to introduce a minor change to bump the version
> > of
> > > > > > several
> > > > > > > > > RPCs which are currently not supporting flexible versions. It
> > > is
> > > > > > > > necessary
> > > > > > > > > because they need to be able to construct request header with
> > > > > initial
> > > > > > > > > principal name and client id as optional fields for
> > > redirection.
> > > > > The
> > > > > > > are
> > > > > > > > > only two of them:
> > > > > > > > >
> > > > > > > > > 1. AlterConfig
> > > > > > > > > 2. AlterClientQuotas
> > > > > > > > >
> > > > > > > > > Let me know if you have any questions.
> > > > > > > > >
> > > > > > > > > Boyang
> > > > > > > > >
> > > > > > > > > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen <
> > > > > > > reluctanthero...@gmail.com
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Hey David,
> > > > > > > > >>
> > > > > > > > >> After discussing with Colin offline, I would like to correct
> > > one
> > > > > > case
> > > > > > > in
> > > > > > > > >> the described workflow, where the CLUSTER_ACTION
> > authorization
> > > > > would
> > > > > > > > not be
> > > > > > > > >> based on the initial principal field check, because it is
> > not
> > > a
> > > > > > > secured
> > > > > > > > >> condition which anyone could forge. The revised workflow
> > shall
> > > > be:
> > > > > > > > >>
> > > > > > > > >> Step 1. Filter out resources that are authorized
> > > > > > > > >>          1.1 Use traditional principals to verify first. If
> > > > > > > authorized,
> > > > > > > > >> continue
> > > > > > > > >>          1.2 If not authorized, check whether the request is
> > > > from
> > > > > > the
> > > > > > > > >> control plane. Note that this is a best-effort to verify
> > > whether
> > > > > the
> > > > > > > > >> request is internal.
> > > > > > > > >>          1.3 If the request is not from the control plane,
> > > > return
> > > > > > > > >> authorization failure
> > > > > > > > >>          1.4 If the request is from the control plane, use
> > > > > > > > CLUSTER_ACTION
> > > > > > > > >> to verify and determine the result
> > > > > > > > >>
> > > > > > > > >> Step 2. Check the request context to see if this is a
> > > forwarding
> > > > > > > > request,
> > > > > > > > >> by checking whether it is from control plane and uses extra
> > > > header
> > > > > > > > fields
> > > > > > > > >>         2.1 if the resource is authorized, and if this is
> > the
> > > > > active
> > > > > > > > >> controller, process it
> > > > > > > > >>         2.2 if the resource is authorized but this is not
> > the
> > > > > active
> > > > > > > > >> controller, return NOT_CONTROLLER to the sender (forwarding
> > > > > broker)
> > > > > > > for
> > > > > > > > >> retry
> > > > > > > > >>         2.3 if the resource is not authorized, return
> > > > > > > > >> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the
> > > original
> > > > > > client
> > > > > > > > >> through forwarding broker
> > > > > > > > >> Step 3. If the request is not a forwarding request
> > > > > > > > >>         3.1 If the resource is authorized, and this is the
> > > > active
> > > > > > > > >> controller, process it
> > > > > > > > >>         3.2 If the resource is authorized, but this is not
> > > > active
> > > > > > > > >> controller, put the resource into the preparation for a new
> > > > > > > AlterConfig
> > > > > > > > >> request for forwarding
> > > > > > > > >>         3.3 If the resource is not authorized, reply the
> > > > original
> > > > > > > client
> > > > > > > > >> AUTHORIZATION_FAILURE when the forwarding request is
> > returned
> > > > > > > > >>
> > > > > > > > >> On Thu, Jul 30, 2020 at 3:47 PM Boyang Chen <
> > > > > > > reluctanthero...@gmail.com
> > > > > > > > >
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>> On Thu, Jul 30, 2020 at 7:18 AM David Jacot <
> > > > dja...@confluent.io
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >>>
> > > > > > > > >>>> Hi Boyang,
> > > > > > > > >>>>
> > > > > > > > >>>> Thanks for your answers.
> > > > > > > > >>>>
> > > > > > > > >>>> > The point for using the listener name is more of a
> > > security
> > > > > > > purpose,
> > > > > > > > >>>> to
> > > > > > > > >>>> > detect any forged request to our best effort.
> > > > > > > > >>>> > For throttling I think we could just check the request
> > > > header
> > > > > > for
> > > > > > > > >>>> > *InitialClientId* existence, to distinguish whether to
> > > apply
> > > > > > > > >>>> > throttling strategy as forwarded request or direct
> > > request.
> > > > > > > > >>>>
> > > > > > > > >>>> Reading "security" and "best effort" in the same sentence
> > > > makes
> > > > > > me a
> > > > > > > > >>>> little nervous :).
> > > > > > > > >>>>
> > > > > > > > >>>> The identification issue is also valid for quota as we
> > don't
> > > > > want
> > > > > > > one
> > > > > > > > >>>> to be
> > > > > > > > >>>> able to bypass the quota by forging a request as well,
> > isn't
> > > > it?
> > > > > > > > >>>> Otherwise,
> > > > > > > > >>>> anyone could just set the InitialPrincipal to bypass it. I
> > > > think
> > > > > > > that
> > > > > > > > we
> > > > > > > > >>>> should
> > > > > > > > >>>> only use InitialPrincipal and/or InitialClientId when we
> > > know
> > > > > that
> > > > > > > > they
> > > > > > > > >>>> come
> > > > > > > > >>>> from another broker. Based on what I read in the KIP, it
> > > looks
> > > > > > like
> > > > > > > we
> > > > > > > > >>>> could
> > > > > > > > >>>> only use them when the principal has CLUSTER_ACTION
> > > privilege.
> > > > > Do
> > > > > > I
> > > > > > > > >>>> understand it correctly?
> > > > > > > > >>>>
> > > > > > > > >>>> There is no 100% safe way to distinguish between raw
> > request
> > > > and
> > > > > > > > >>> forward request if you take malicious request into
> > > > consideration,
> > > > > > > which
> > > > > > > > >>> could happen
> > > > > > > > >>> anyway if the attacker prepares those requests to pass the
> > > > checks
> > > > > > > for a
> > > > > > > > >>> pre-KIP-500 cluster.
> > > > > > > > >>> We could at most know whether it is sent to the control
> > > plane,
> > > > or
> > > > > > the
> > > > > > > > >>> data plane, and whether it has extra header fields such as
> > > > > initial
> > > > > > > > >>> principal and client id defined. For a non-malicious
> > request
> > > > > > > > >>> going into the control plane, it must be sent from a valid
> > > > > broker,
> > > > > > > > which
> > > > > > > > >>> is a prerequisite to check its
> > > > > > > > >>> CLUSTER_ACTION principal. Take AlterConfig for an example,
> > > the
> > > > > > > intended
> > > > > > > > >>> workflow for a *KIP-590 broker* would be:
> > > > > > > > >>>
> > > > > > > > >>> Step 1. Check the request context to see if this is a
> > > > forwarding
> > > > > > > > >>> request, by checking whether it is from control plane and
> > > uses
> > > > > > extra
> > > > > > > > header
> > > > > > > > >>> fields
> > > > > > > > >>>         1.1 if it is a forwarding request, use
> > CLUSTER_ACTION
> > > > to
> > > > > > > verify
> > > > > > > > >>> the given resource
> > > > > > > > >>>         1.2 if the resource is authorized, and if this is
> > the
> > > > > > active
> > > > > > > > >>> controller, process it
> > > > > > > > >>>         1.3 if the resource is authorized but this is not
> > the
> > > > > > active
> > > > > > > > >>> controller, return NOT_CONTROLLER to the sender (forwarding
> > > > > broker)
> > > > > > > for
> > > > > > > > >>> retry
> > > > > > > > >>>         1.4 if the resource is not authorized, return
> > > > > > > > >>> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the
> > > original
> > > > > > > client
> > > > > > > > >>> through forwarding broker
> > > > > > > > >>> Step 2. If the request is not a forwarding request
> > > > > > > > >>>         2.1 Verify with normal principal as ALTER on the
> > > given
> > > > > > > > resources
> > > > > > > > >>>         2.2 If the resource is authorized, and this is the
> > > > active
> > > > > > > > >>> controller, process it
> > > > > > > > >>>         2.3 If the resource is authorized, but this is not
> > > > active
> > > > > > > > >>> controller, put the resource into the preparation for a new
> > > > > > > AlterConfig
> > > > > > > > >>> request for forwarding
> > > > > > > > >>>         2.4 If the resource is not authorized, reply the
> > > > original
> > > > > > > > client
> > > > > > > > >>> AUTHORIZATION_FAILURE when the forwarding request is
> > returned
> > > > > > > > >>>
> > > > > > > > >>> When the control plane and data plane are using the same
> > > > > listener,
> > > > > > we
> > > > > > > > >>> couldn't distinguish whether a request is forwarded or not
> > > for
> > > > > > sure,
> > > > > > > > so in
> > > > > > > > >>> terms of the forward request checking, we have to require
> > the
> > > > > extra
> > > > > > > > header
> > > > > > > > >>> fields to present. A stronger checking mechanism could only
> > > be
> > > > > > > enforced
> > > > > > > > >>> when we upgrade to post-KIP-500 with a separate controller
> > > > > network.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>> I have made another pass on the whole KIP, I have few
> > nits:
> > > > > > > > >>>>
> > > > > > > > >>>> - The sentence "Take AlterConfig as an example to
> > understand
> > > > the
> > > > > > > > changes
> > > > > > > > >>>> we are making." does not make much sense anymore in the
> > > > > beginning
> > > > > > of
> > > > > > > > the
> > > > > > > > >>>> "Proposed Changes" chapter.
> > > > > > > > >>>>
> > > > > > > > >>>> Sure, deleted.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>> - When you say "Existing RPCs which are sending directly
> > to
> > > > the
> > > > > > > > >>>> controller
> > > > > > > > >>>> will
> > > > > > > > >>>> rely on forwarding as well.". I suggest to explicitly
> > > mention
> > > > > how
> > > > > > > "old
> > > > > > > > >>>> admin clients"
> > > > > > > > >>>> will work here to complement the sentence. Something like:
> > > > They
> > > > > > will
> > > > > > > > >>>> get a
> > > > > > > > >>>> random
> > > > > > > > >>>> broker id as the controller id in the metadata response
> > and
> > > > > stick
> > > > > > to
> > > > > > > > it
> > > > > > > > >>>> as
> > > > > > > > >>>> you explained.
> > > > > > > > >>>>
> > > > > > > > >>>> Sg, changed.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>> - "The purpose of adding principal name is for the audit
> > > > > logging,
> > > > > > > and
> > > > > > > > >>>> the
> > > > > > > > >>>> client id is
> > > > > > > > >>>> being used to throttling according to KIP-599
> > requirement."
> > > > > > > Actually,
> > > > > > > > >>>> KIP-599 needs
> > > > > > > > >>>> both the principal and the clientId.
> > > > > > > > >>>>
> > > > > > > > >>>> Makes sense.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>> - In the "Routing Request Security" chapter. It is written
> > > > that
> > > > > > the
> > > > > > > > >>>> forwarding broker
> > > > > > > > >>>> will verify the request with its own authorizer and will
> > > just
> > > > > > > forward
> > > > > > > > >>>> it if
> > > > > > > > >>>> the request
> > > > > > > > >>>> looks good. When a request contains for instance multiple
> > > > > topics,
> > > > > > I
> > > > > > > > >>>> suppose
> > > > > > > > >>>> that
> > > > > > > > >>>> we will forward only the authorized ones and not the whole
> > > > > > original
> > > > > > > > >>>> request
> > > > > > > > >>>> as is.
> > > > > > > > >>>> We may want to reword the sentence to make this clear.
> > > > > > > > >>>>
> > > > > > > > >>>> That makes sense, I will put this more detailed version of
> > > > > routing
> > > > > > > > into
> > > > > > > > >>> the design.
> > > > > > > > >>>
> > > > > > > > >>>
> > > > > > > > >>>> - For the record, should we put the previous proposal in
> > the
> > > > > > > rejected
> > > > > > > > >>>> alternatives as
> > > > > > > > >>>> well?
> > > > > > > > >>>>
> > > > > > > > >>>> We do have big changes in this KIP, our current strategy
> > is
> > > to
> > > > > > rely
> > > > > > > on
> > > > > > > > >>> wiki revisions if people
> > > > > > > > >>> are interested to figure out our previous design strategy.
> > > > > Putting
> > > > > > > the
> > > > > > > > >>> full design in current KIP proposal
> > > > > > > > >>> would distract too much for readers as we discussed during
> > > the
> > > > > last
> > > > > > > > time
> > > > > > > > >>> of big refactoring.
> > > > > > > > >>>
> > > > > > > > >>> Best,
> > > > > > > > >>>> David
> > > > > > > > >>>>
> > > > > > > > >>>> On Thu, Jul 30, 2020 at 3:51 AM Boyang Chen <
> > > > > > > > reluctanthero...@gmail.com
> > > > > > > > >>>> >
> > > > > > > > >>>> wrote:
> > > > > > > > >>>>
> > > > > > > > >>>> > Thanks David for the feedback!
> > > > > > > > >>>> >
> > > > > > > > >>>> > On Wed, Jul 29, 2020 at 7:53 AM David Jacot <
> > > > > > dja...@confluent.io>
> > > > > > > > >>>> wrote:
> > > > > > > > >>>> >
> > > > > > > > >>>> > > Hi, Colin, Boyang,
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Colin, thanks for the clarification. Somehow, I
> > thought
> > > > that
> > > > > > > even
> > > > > > > > >>>> if the
> > > > > > > > >>>> > > controller is ran independently, it
> > > > > > > > >>>> > > would still run the listeners of the broker and thus
> > > would
> > > > > be
> > > > > > > > >>>> accessible
> > > > > > > > >>>> > by
> > > > > > > > >>>> > > redirecting on the loopback
> > > > > > > > >>>> > > interface. My mistake.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Boyang, I have few questions/comments regarding the
> > > > updated
> > > > > > KIP:
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > 1. I think that it would be great if we could clarify
> > > how
> > > > > old
> > > > > > > > admin
> > > > > > > > >>>> > clients
> > > > > > > > >>>> > > which are directly talking to the
> > > > > > > > >>>> > > controller will work with this KIP. I read between the
> > > > lines
> > > > > > > that,
> > > > > > > > >>>> as we
> > > > > > > > >>>> > > propose to provide a random
> > > > > > > > >>>> > > broker Id as the controller Id in the metadata
> > response,
> > > > > they
> > > > > > > will
> > > > > > > > >>>> use a
> > > > > > > > >>>> > > single node as a proxy. Is that
> > > > > > > > >>>> > > correct? This deserves to be called out more
> > explicitly
> > > in
> > > > > the
> > > > > > > > >>>> design
> > > > > > > > >>>> > > section instead of being hidden
> > > > > > > > >>>> > > in the protocol bump of the metadata RPC.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Makes sense, I stress this point in the compatibility
> > > > > section.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > 1.1 If I understand correctly, could we assume that
> > old
> > > > > admin
> > > > > > > > >>>> clients
> > > > > > > > >>>> > will
> > > > > > > > >>>> > > stick to the same "fake controller"
> > > > > > > > >>>> > > until they refresh their metadata? Refreshing the
> > > metadata
> > > > > > > usually
> > > > > > > > >>>> > happens
> > > > > > > > >>>> > > when NOT_CONTROLLER
> > > > > > > > >>>> > > is received but this won't happen anymore so they
> > should
> > > > > > change
> > > > > > > > >>>> > > infrequently.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > That is correct, old admin clients would not try to
> > > > refresh
> > > > > > > their
> > > > > > > > >>>> > metadata
> > > > > > > > >>>> > due to NOT_CONTROLLER,
> > > > > > > > >>>> > which is impossible to happen with the new broker
> > cluster.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > 2. For the new admin client, I suppose that we plan on
> > > > using
> > > > > > > > >>>> > > LeastLoadedNodeProvider for the
> > > > > > > > >>>> > > requests that are using ControllerNodeProvider. We
> > could
> > > > > > perhaps
> > > > > > > > >>>> mention
> > > > > > > > >>>> > > it.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Sure, added.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > 3. Pre KIP-500, will we have a way to distinguish if a
> > > > > request
> > > > > > > > that
> > > > > > > > >>>> is
> > > > > > > > >>>> > > received by the controller is
> > > > > > > > >>>> > > coming directly from a client or from a broker? You
> > > > mention
> > > > > > that
> > > > > > > > the
> > > > > > > > >>>> > > listener can be used to do
> > > > > > > > >>>> > > this but as you pointed out, it is not mandatory. Do
> > we
> > > > have
> > > > > > > > another
> > > > > > > > >>>> > > reliable method? I am asking
> > > > > > > > >>>> > > in the context of KIP-599 with the current controller,
> > > we
> > > > > may
> > > > > > > need
> > > > > > > > >>>> to
> > > > > > > > >>>> > > throttle differently if the
> > > > > > > > >>>> > > request comes from a client or from a broker.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > The point for using the listener name is more of a
> > > > security
> > > > > > > > >>>> purpose, to
> > > > > > > > >>>> > detect any forged request to our best effort.
> > > > > > > > >>>> > For throttling I think we could just check the request
> > > > header
> > > > > > for
> > > > > > > > >>>> > *InitialClientId* existence, to distinguish whether to
> > > apply
> > > > > > > > >>>> > throttling strategy as forwarded request or direct
> > > request.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > 4. Could we add `InitialClientId` as well? This will
> > be
> > > > > > required
> > > > > > > > >>>> for the
> > > > > > > > >>>> > > quota as we can apply them
> > > > > > > > >>>> > > by principal and/or clientId.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Sounds good, added.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > 5. A small remark regarding the structure of the KIP.
> > It
> > > > is
> > > > > a
> > > > > > > bit
> > > > > > > > >>>> weird
> > > > > > > > >>>> > > that requests that do not go
> > > > > > > > >>>> > > to the controller are mentioned in the Proposed Design
> > > > > section
> > > > > > > and
> > > > > > > > >>>> the
> > > > > > > > >>>> > > requests that go to the
> > > > > > > > >>>> > > controller are mentioned in the Public Interfaces.
> > When
> > > > one
> > > > > > read
> > > > > > > > the
> > > > > > > > >>>> > > Proposed Design, it does not
> > > > > > > > >>>> > > get a full picture of the whole new routing proposal
> > for
> > > > old
> > > > > > and
> > > > > > > > new
> > > > > > > > >>>> > > clients. It would be great if we
> > > > > > > > >>>> > > could have a full overview in that section.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Good point, I will move the pieces around.
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > Overall the change makes sense to me. I will work on
> > > > > drafting
> > > > > > an
> > > > > > > > >>>> addendum
> > > > > > > > >>>> > > to KIP-599 to
> > > > > > > > >>>> > > alter the design to cope with these changes. At a
> > first
> > > > > > glance,
> > > > > > > > that
> > > > > > > > >>>> > seems
> > > > > > > > >>>> > > doable if 1.1, 3
> > > > > > > > >>>> > > and 4 are OK.
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > Thank you for the help!
> > > > > > > > >>>> >
> > > > > > > > >>>> >
> > > > > > > > >>>> > > Thanks,
> > > > > > > > >>>> > > David
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > On Wed, Jul 29, 2020 at 5:29 AM Boyang Chen <
> > > > > > > > >>>> reluctanthero...@gmail.com>
> > > > > > > > >>>> > > wrote:
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > > Thanks for the feedback Colin!
> > > > > > > > >>>> > > >
> > > > > > > > >>>> > > > On Tue, Jul 28, 2020 at 2:11 PM Colin McCabe <
> > > > > > > > cmcc...@apache.org>
> > > > > > > > >>>> > wrote:
> > > > > > > > >>>> > > >
> > > > > > > > >>>> > > > > Hi Boyang,
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > Thanks for updating this.  A few comments below:
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > In the "Routing Request Security" section, there
> > is
> > > a
> > > > > > > > reference
> > > > > > > > >>>> to
> > > > > > > > >>>> > > "older
> > > > > > > > >>>> > > > > requests that need redirection."  But after these
> > > new
> > > > > > > > >>>> revisions, both
> > > > > > > > >>>> > > new
> > > > > > > > >>>> > > > > and old requests need redirection.  So we should
> > > > > rephrase
> > > > > > > > this.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > > In addition, to avoid exposing this forwarding
> > > power
> > > > > to
> > > > > > > the
> > > > > > > > >>>> admin
> > > > > > > > >>>> > > > > clients,
> > > > > > > > >>>> > > > > > the routing request shall be forwarded towards
> > the
> > > > > > > > controller
> > > > > > > > >>>> > broker
> > > > > > > > >>>> > > > > internal
> > > > > > > > >>>> > > > > > endpoint which should be only visible to other
> > > > brokers
> > > > > > > > inside
> > > > > > > > >>>> the
> > > > > > > > >>>> > > > > cluster
> > > > > > > > >>>> > > > > > in the KIP-500 controller. Any admin
> > configuration
> > > > > > request
> > > > > > > > >>>> with
> > > > > > > > >>>> > > broker
> > > > > > > > >>>> > > > > > principal should not be going through the public
> > > > > > endpoint
> > > > > > > > and
> > > > > > > > >>>> will
> > > > > > > > >>>> > be
> > > > > > > > >>>> > > > > > rejected for security purpose.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > We should also describe how this will work in the
> > > > > > > pre-KIP-500
> > > > > > > > >>>> case.
> > > > > > > > >>>> > In
> > > > > > > > >>>> > > > > that case, CLUSTER_ACTION gets the extra
> > permissions
> > > > > > > described
> > > > > > > > >>>> here
> > > > > > > > >>>> > > only
> > > > > > > > >>>> > > > > when the message comes in on the inter-broker
> > > > listener.
> > > > > > We
> > > > > > > > >>>> should
> > > > > > > > >>>> > > state
> > > > > > > > >>>> > > > > that here.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > (I can see that you have this information later on
> > > in
> > > > > the
> > > > > > > > >>>> "Security
> > > > > > > > >>>> > > > Access
> > > > > > > > >>>> > > > > Changes" section, but it would be good to have it
> > > here
> > > > > as
> > > > > > > > well,
> > > > > > > > >>>> to
> > > > > > > > >>>> > > avoid
> > > > > > > > >>>> > > > > confusion.)
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > > To be more strict of protecting controller
> > > > > information,
> > > > > > > the
> > > > > > > > >>>> > > > > "ControllerId"
> > > > > > > > >>>> > > > > > field in new MetadataResponse shall be set to -1
> > > > when
> > > > > > the
> > > > > > > > >>>> original
> > > > > > > > >>>> > > > > request
> > > > > > > > >>>> > > > > > comes from a non-broker client and it is already
> > > on
> > > > > v10.
> > > > > > > We
> > > > > > > > >>>> shall
> > > > > > > > >>>> > use
> > > > > > > > >>>> > > > the
> > > > > > > > >>>> > > > > > request listener name to distinguish whether a
> > > given
> > > > > > > request
> > > > > > > > >>>> is
> > > > > > > > >>>> > > > > inter-broker,
> > > > > > > > >>>> > > > > > or from the client.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > I'm not sure why we would need to distinguish
> > > between
> > > > > > broker
> > > > > > > > >>>> clients
> > > > > > > > >>>> > > and
> > > > > > > > >>>> > > > > non-broker clients.  Brokers don't generally send
> > > > > > > > >>>> MetadataRequests to
> > > > > > > > >>>> > > > other
> > > > > > > > >>>> > > > > brokers, do they?  Brokers learn about metadata
> > from
> > > > > > > > >>>> > > > UpdateMetadataRequest
> > > > > > > > >>>> > > > > and LeaderAndIsrRequest, not by sending
> > > > MetadataRequests
> > > > > > to
> > > > > > > > >>>> other
> > > > > > > > >>>> > > > brokers.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > We do have one use case where the MetadataRequest
> > > gets
> > > > > > sent
> > > > > > > > >>>> between
> > > > > > > > >>>> > the
> > > > > > > > >>>> > > > brokers, which is the InterBrokerSendThread.
> > Currently
> > > > we
> > > > > > > don't
> > > > > > > > >>>> rely on
> > > > > > > > >>>> > > it
> > > > > > > > >>>> > > > to get the controller id, so I guess your suggestion
> > > > > should
> > > > > > be
> > > > > > > > >>>> good to
> > > > > > > > >>>> > > > enforce. We could use some meta comment on the
> > > > > NetworkClient
> > > > > > > > that
> > > > > > > > >>>> it
> > > > > > > > >>>> > > should
> > > > > > > > >>>> > > > not be used to get the controller location.
> > > > > > > > >>>> > > >
> > > > > > > > >>>> > > > Probably what we want here is: v0-v9: return a
> > random
> > > > > broker
> > > > > > > in
> > > > > > > > >>>> the
> > > > > > > > >>>> > > cluster
> > > > > > > > >>>> > > > > as the controller ID.  v10: no controllerID
> > present
> > > in
> > > > > the
> > > > > > > > >>>> > > > > MetadataResponse.  We should also deprecate the
> > > > > > adminClient
> > > > > > > > >>>> method
> > > > > > > > >>>> > > which
> > > > > > > > >>>> > > > > gets the controllerId.
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > > BROKER_AUTHORIZATION_FAILURE(92, "Authorization
> > > > failed
> > > > > > for
> > > > > > > > the
> > > > > > > > >>>> > > > > > request during forwarding, this indicates an
> > > > internal
> > > > > > > error
> > > > > > > > >>>> on the
> > > > > > > > >>>> > > > broker
> > > > > > > > >>>> > > > > > cluster security setup.",
> > > > > > > > >>>> > BrokerAuthorizationFailureException::new);
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > Grammar nitpick: It would be good to have a period
> > > > > between
> > > > > > > > >>>> > "forwarding"
> > > > > > > > >>>> > > > > and "this" to avoid a run-on sentence :)
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > best,
> > > > > > > > >>>> > > > > Colin
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > On Mon, Jul 27, 2020, at 21:47, Boyang Chen wrote:
> > > > > > > > >>>> > > > > > Hey there,
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > I'm re-opening this thread because after some
> > > > initial
> > > > > > > > >>>> > implementations
> > > > > > > > >>>> > > > > > started, we spotted some gaps in the approved
> > KIP
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > >>>> some
> > > > > > > > >>>> > > > > > inconsistencies with KIP-631 controller. There
> > > are a
> > > > > > > couple
> > > > > > > > of
> > > > > > > > >>>> > > > addendums
> > > > > > > > >>>> > > > > to
> > > > > > > > >>>> > > > > > the existing KIP, specifically:
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > 1. As the controller is foreseen to be only
> > > > accessible
> > > > > > to
> > > > > > > > the
> > > > > > > > >>>> > > brokers,
> > > > > > > > >>>> > > > > the
> > > > > > > > >>>> > > > > > new admin client would not have direct access to
> > > the
> > > > > > > > >>>> controller. It
> > > > > > > > >>>> > > is
> > > > > > > > >>>> > > > > > guaranteed on the MetadataResponse level which
> > no
> > > > > longer
> > > > > > > > >>>> provides
> > > > > > > > >>>> > > > > > `ControllerId` to client side requests.
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > 2. The broker would forward any direct ZK path
> > > > > mutation
> > > > > > > > >>>> requests,
> > > > > > > > >>>> > > > > including
> > > > > > > > >>>> > > > > > topic creation/deletion, reassignment, etc since
> > > we
> > > > > > > > deprecate
> > > > > > > > >>>> the
> > > > > > > > >>>> > > > direct
> > > > > > > > >>>> > > > > > controller access on the client side. No more
> > > > protocol
> > > > > > > > >>>> version bump
> > > > > > > > >>>> > > is
> > > > > > > > >>>> > > > > > necessary for the configuration requests.
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > 3. To make sure forwarding requests pass the
> > > > > > > authorization,
> > > > > > > > >>>> broker
> > > > > > > > >>>> > > > > > principal CLUSTER_ACTION would be allowed to be
> > > used
> > > > > as
> > > > > > an
> > > > > > > > >>>> > > alternative
> > > > > > > > >>>> > > > > > authentication method for a variety of principal
> > > > > > > operations,
> > > > > > > > >>>> > > including
> > > > > > > > >>>> > > > > > ALTER, ALTER_CONFIG, DELETE, etc. It is because
> > > the
> > > > > > > > forwarding
> > > > > > > > >>>> > > request
> > > > > > > > >>>> > > > > > needs to use the proxy broker's own principal,
> > > which
> > > > > is
> > > > > > > > >>>> currently
> > > > > > > > >>>> > not
> > > > > > > > >>>> > > > > > supported to be used for many configuration
> > change
> > > > > > > > >>>> authentication
> > > > > > > > >>>> > > > listed
> > > > > > > > >>>> > > > > > above. The full list could be found in the KIP.
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > 4. Add a new BROKER_AUTHORIZATION_FAILURE error
> > > code
> > > > > to
> > > > > > > > >>>> indicate
> > > > > > > > >>>> > any
> > > > > > > > >>>> > > > > > internal security configuration failure, when
> > the
> > > > > > > forwarded
> > > > > > > > >>>> request
> > > > > > > > >>>> > > > > failed
> > > > > > > > >>>> > > > > > authentication on the controller side.
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > Let me know what you think. With such a major
> > > > > refinement
> > > > > > > of
> > > > > > > > >>>> the
> > > > > > > > >>>> > KIP,
> > > > > > > > >>>> > > > I'm
> > > > > > > > >>>> > > > > > open for re-vote after discussions converge.
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > Boyang
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > On Wed, Jul 1, 2020 at 2:17 PM Boyang Chen <
> > > > > > > > >>>> > > reluctanthero...@gmail.com
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > > > > wrote:
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > > > > Hey folks,
> > > > > > > > >>>> > > > > > >
> > > > > > > > >>>> > > > > > > I have also synced on the KIP-578 which was
> > > doing
> > > > > the
> > > > > > > > >>>> partition
> > > > > > > > >>>> > > > limit,
> > > > > > > > >>>> > > > > to
> > > > > > > > >>>> > > > > > > make sure the partition limit error code would
> > > be
> > > > > > > properly
> > > > > > > > >>>> > > propagated
> > > > > > > > >>>> > > > > once
> > > > > > > > >>>> > > > > > > it is done on top of KIP-590. Let me know if
> > you
> > > > > have
> > > > > > > > >>>> further
> > > > > > > > >>>> > > > > questions or
> > > > > > > > >>>> > > > > > > concerns.
> > > > > > > > >>>> > > > > > >
> > > > > > > > >>>> > > > > > > Boyang
> > > > > > > > >>>> > > > > > >
> > > > > > > > >>>> > > > > > > On Tue, Jun 23, 2020 at 5:08 PM Boyang Chen <
> > > > > > > > >>>> > > > > reluctanthero...@gmail.com>
> > > > > > > > >>>> > > > > > > wrote:
> > > > > > > > >>>> > > > > > >
> > > > > > > > >>>> > > > > > >> Thanks for the clarification, Colin and
> > Ismael.
> > > > > > > > Personally
> > > > > > > > >>>> I
> > > > > > > > >>>> > also
> > > > > > > > >>>> > > > feel
> > > > > > > > >>>> > > > > > >> Option A is better to prioritize fixing the
> > > gap.
> > > > > Just
> > > > > > > to
> > > > > > > > be
> > > > > > > > >>>> > clear,
> > > > > > > > >>>> > > > the
> > > > > > > > >>>> > > > > > >> proposed solution would be:
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >> 1. Bump the Metadata RPC version to return
> > > > > > > > >>>> POLICY_VIOLATION. In
> > > > > > > > >>>> > > the
> > > > > > > > >>>> > > > > > >> application level, we should swap the error
> > > > message
> > > > > > > with
> > > > > > > > >>>> the
> > > > > > > > >>>> > > actual
> > > > > > > > >>>> > > > > failure
> > > > > > > > >>>> > > > > > >> reason such as "violation of topic creation
> > > > policy
> > > > > > when
> > > > > > > > >>>> > attempting
> > > > > > > > >>>> > > > to
> > > > > > > > >>>> > > > > auto
> > > > > > > > >>>> > > > > > >> create internal topic through
> > MetadataRequest."
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >> 2. For older Metadata RPC, return
> > > > > > AUTHORIZATION_FAILED
> > > > > > > to
> > > > > > > > >>>> fail
> > > > > > > > >>>> > > fast.
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >> Will address our other discussed points as
> > well
> > > > in
> > > > > > the
> > > > > > > > >>>> KIP, let
> > > > > > > > >>>> > me
> > > > > > > > >>>> > > > > know
> > > > > > > > >>>> > > > > > >> if you have further questions.
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >> Thanks,
> > > > > > > > >>>> > > > > > >> Boyang
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >> On Tue, Jun 23, 2020 at 10:41 AM Ismael Juma
> > <
> > > > > > > > >>>> ism...@juma.me.uk
> > > > > > > > >>>> > >
> > > > > > > > >>>> > > > > wrote:
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > > >>> Option A is basically what I was thinking.
> > But
> > > > > with
> > > > > > a
> > > > > > > > >>>> slight
> > > > > > > > >>>> > > > > adjustment:
> > > > > > > > >>>> > > > > > >>>
> > > > > > > > >>>> > > > > > >>> New versions of MetadataResponse return
> > > > > > > > POLICY_VIOLATION,
> > > > > > > > >>>> old
> > > > > > > > >>>> > > > > versions
> > > > > > > > >>>> > > > > > >>> return AUTHORIZATION_FAILED. The latter
> > works
> > > > > > > correctly
> > > > > > > > >>>> with
> > > > > > > > >>>> > old
> > > > > > > > >>>> > > > Java
> > > > > > > > >>>> > > > > > >>> clients (i.e. the client fails fast and
> > > > propagates
> > > > > > the
> > > > > > > > >>>> error),
> > > > > > > > >>>> > > I've
> > > > > > > > >>>> > > > > > >>> tested
> > > > > > > > >>>> > > > > > >>> it. Adjust new clients to treat
> > > POLICY_VIOLATION
> > > > > > like
> > > > > > > > >>>> > > > > > >>> AUTHORIZATION_FAILED,
> > > > > > > > >>>> > > > > > >>> but propagate the custom error message.
> > > > > > > > >>>> > > > > > >>>
> > > > > > > > >>>> > > > > > >>> Ismael
> > > > > > > > >>>> > > > > > >>>
> > > > > > > > >>>> > > > > > >>> On Mon, Jun 22, 2020 at 11:00 PM Colin
> > McCabe
> > > <
> > > > > > > > >>>> > > cmcc...@apache.org>
> > > > > > > > >>>> > > > > > >>> wrote:
> > > > > > > > >>>> > > > > > >>>
> > > > > > > > >>>> > > > > > >>> > > > > On Fri, Jun 19, 2020 at 3:18 PM
> > Ismael
> > > > > Juma
> > > > > > <
> > > > > > > > >>>> > > > > ism...@juma.me.uk>
> > > > > > > > >>>> > > > > > >>> > wrote:
> > > > > > > > >>>> > > > > > >>> > > > >
> > > > > > > > >>>> > > > > > >>> > > > > > Hi Colin,
> > > > > > > > >>>> > > > > > >>> > > > > >
> > > > > > > > >>>> > > > > > >>> > > > > > The KIP states in the
> > Compatibility
> > > > > > section
> > > > > > > > (not
> > > > > > > > >>>> > Future
> > > > > > > > >>>> > > > > work):
> > > > > > > > >>>> > > > > > >>> > > > > >
> > > > > > > > >>>> > > > > > >>> > > > > > "To support the proxy of requests,
> > > we
> > > > > need
> > > > > > > to
> > > > > > > > >>>> build a
> > > > > > > > >>>> > > > > channel
> > > > > > > > >>>> > > > > > >>> for
> > > > > > > > >>>> > > > > > >>> > > > > > brokers to talk directly to the
> > > > > > controller.
> > > > > > > > >>>> This part
> > > > > > > > >>>> > > of
> > > > > > > > >>>> > > > > the
> > > > > > > > >>>> > > > > > >>> design
> > > > > > > > >>>> > > > > > >>> > > > > > is internal change only and won’t
> > > > block
> > > > > > the
> > > > > > > > KIP
> > > > > > > > >>>> > > > progress."
> > > > > > > > >>>> > > > > > >>> > > > > >
> > > > > > > > >>>> > > > > > >>> > > > > > I am clarifying that this is not
> > > > > internal
> > > > > > > only
> > > > > > > > >>>> due to
> > > > > > > > >>>> > > the
> > > > > > > > >>>> > > > > > >>> config.
> > > > > > > > >>>> > > > > > >>> > If we
> > > > > > > > >>>> > > > > > >>> > > > > > say that this KIP depends on
> > another
> > > > KIP
> > > > > > > > before
> > > > > > > > >>>> we
> > > > > > > > >>>> > can
> > > > > > > > >>>> > > > > merge
> > > > > > > > >>>> > > > > > >>> > > > > > it, that's fine although it feels
> > a
> > > > bit
> > > > > > > > >>>> unnecessary.
> > > > > > > > >>>> > > > > > >>> > > > > >
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > Hi Ismael,
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > I didn't realize there was still a
> > reference
> > > > to
> > > > > > the
> > > > > > > > >>>> separate
> > > > > > > > >>>> > > > > controller
> > > > > > > > >>>> > > > > > >>> > channel in the "Compatibility,
> > Deprecation,
> > > > and
> > > > > > > > >>>> Migration
> > > > > > > > >>>> > Plan"
> > > > > > > > >>>> > > > > > >>> section.  I
> > > > > > > > >>>> > > > > > >>> > agree that it doesn't really belong there.
> > > > > Given
> > > > > > > that
> > > > > > > > >>>> this
> > > > > > > > >>>> > is
> > > > > > > > >>>> > > > > creating
> > > > > > > > >>>> > > > > > >>> > confusion, I would suggest that we just
> > drop
> > > > > this
> > > > > > > from
> > > > > > > > >>>> the
> > > > > > > > >>>> > KIP
> > > > > > > > >>>> > > > > > >>> entirely.
> > > > > > > > >>>> > > > > > >>> > It really is orthogonal to what this KIP
> > is
> > > > > > about--
> > > > > > > we
> > > > > > > > >>>> don't
> > > > > > > > >>>> > > > need a
> > > > > > > > >>>> > > > > > >>> > separate channel to implement redirection.
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > Boyang wrote:
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > >
> > > > > > > > >>>> > > > > > >>> > > We are only opening the doors for
> > specific
> > > > > > > internal
> > > > > > > > >>>> topics
> > > > > > > > >>>> > > > > (offsets,
> > > > > > > > >>>> > > > > > >>> txn
> > > > > > > > >>>> > > > > > >>> > > log), which I assume the client should
> > > have
> > > > no
> > > > > > > > >>>> possibility
> > > > > > > > >>>> > to
> > > > > > > > >>>> > > > > mutate
> > > > > > > > >>>> > > > > > >>> the
> > > > > > > > >>>> > > > > > >>> > > topic policy?
> > > > > > > > >>>> > > > > > >>> > >
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > Hi Boyang,
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > I think you and Ismael are talking about
> > > > > different
> > > > > > > > >>>> scenarios.
> > > > > > > > >>>> > > > You
> > > > > > > > >>>> > > > > are
> > > > > > > > >>>> > > > > > >>> > describing the scenario where the broker
> > is
> > > > > > > > >>>> auto-creating the
> > > > > > > > >>>> > > > > > >>> transaction
> > > > > > > > >>>> > > > > > >>> > log topic or consumer offset topic.  This
> > > > > scenario
> > > > > > > > >>>> indeed
> > > > > > > > >>>> > > should
> > > > > > > > >>>> > > > > not
> > > > > > > > >>>> > > > > > >>> happen
> > > > > > > > >>>> > > > > > >>> > in a properly-configured cluster.
> > However,
> > > > > Ismael
> > > > > > > is
> > > > > > > > >>>> > > describing
> > > > > > > > >>>> > > > a
> > > > > > > > >>>> > > > > > >>> scenario
> > > > > > > > >>>> > > > > > >>> > where the client is auto-creating some
> > > > arbitrary
> > > > > > > > >>>> non-internal
> > > > > > > > >>>> > > > topic
> > > > > > > > >>>> > > > > > >>> just by
> > > > > > > > >>>> > > > > > >>> > sending a metadata request.
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > As far as I can see, there are two
> > solutions
> > > > > here:
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > A. Close the hole in CreateTopicsPolicy
> > > > > > immediately.
> > > > > > > > >>>> In new
> > > > > > > > >>>> > > > > versions,
> > > > > > > > >>>> > > > > > >>> > allow MetadataResponse to return
> > > > > > > AUTHORIZATION_FAILED
> > > > > > > > >>>> if we
> > > > > > > > >>>> > > tried
> > > > > > > > >>>> > > > > to
> > > > > > > > >>>> > > > > > >>> > auto-create a topic and failed.  Find some
> > > > other
> > > > > > > error
> > > > > > > > >>>> code
> > > > > > > > >>>> > to
> > > > > > > > >>>> > > > > return
> > > > > > > > >>>> > > > > > >>> for
> > > > > > > > >>>> > > > > > >>> > existing versions.
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > B. Keep the hole in CreateTopicsPolicy and
> > > add
> > > > > > some
> > > > > > > > >>>> > > configuration
> > > > > > > > >>>> > > > > to
> > > > > > > > >>>> > > > > > >>> allow
> > > > > > > > >>>> > > > > > >>> > admins to gradually migrate to closing it.
> > > In
> > > > > > > > >>>> practice, this
> > > > > > > > >>>> > > > > probably
> > > > > > > > >>>> > > > > > >>> > means a configuration toggle that enables
> > > > direct
> > > > > > ZK
> > > > > > > > >>>> access,
> > > > > > > > >>>> > > that
> > > > > > > > >>>> > > > > > >>> starts off
> > > > > > > > >>>> > > > > > >>> > as enabled.  Then we can eventually
> > default
> > > it
> > > > > to
> > > > > > > > false
> > > > > > > > >>>> and
> > > > > > > > >>>> > > then
> > > > > > > > >>>> > > > > > >>> remove it
> > > > > > > > >>>> > > > > > >>> > entirely over time.
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>> > best,
> > > > > > > > >>>> > > > > > >>> > Colin
> > > > > > > > >>>> > > > > > >>> >
> > > > > > > > >>>> > > > > > >>>
> > > > > > > > >>>> > > > > > >>
> > > > > > > > >>>> > > > > >
> > > > > > > > >>>> > > > >
> > > > > > > > >>>> > > >
> > > > > > > > >>>> > >
> > > > > > > > >>>> >
> > > > > > > > >>>>
> > > > > > > > >>>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to