On Fri, Sep 25, 2020, at 10:49, Boyang Chen wrote: > Hey Jun, > > On Fri, Sep 25, 2020 at 10:19 AM Jun Rao <j...@confluent.io> wrote: > > > Hi, Boyang, > > > > Does EnvelopeRequest avoid the need for IBP? How do we know if the > > controller supports EnvelopeRequest or not? > > > > Unfortunately, the EnvelopeRequest is solving the inter-broker > > communication problem only. Admin clients still need to learn the proper > > ApiVersion from the broker, which means we need to bump IBP to limit the > > version range. >
Right-- the purpose of EnvelopeRequest is to avoid downconversion / upconversion on the forwarding broker. It unfortunately doesn't avoid the need to tie ApiVersionsResponse to IBP. > > > On Thu, Sep 24, 2020 at 4:53 PM Jun Rao <j...@confluent.io> wrote: > > > > > > > Hi, Jason, > > > > > > > > Yes, the most important thing is to be able to avoid two rolling > > > > restarts > > > > in the future. If we have a path to achieve that down the road, the > > > > changes here are fine. > > > > Yeah. I think it would be good to make IBP a feature flag, as long as it could be changed without doing a second rolling restart. We actually don't want to have too many feature flags, since it blows up the test matrix. best, Colin > > > > Thanks, > > > > > > > > Jun > > > > > > > > On Thu, Sep 24, 2020 at 3:20 PM Jason Gustafson <ja...@confluent.io> > > > > wrote: > > > > > > > > > > One of the goals of KIP-584 (feature versioning) is that we can get > > > rid > > > > > of > > > > > IBP in the future. So does this change prevent us from removing IBP > > in > > > > the > > > > > future? > > > > > > > > > > That is a good question. I think the problem here is that request > > > > > forwarding puts an expectation on api version support which covers > > more > > > > > than one broker. This is why the normal ApiVersions behavior doesn't > > > > work. > > > > > I thought about this a bit and haven't come up with a good > > alternative. > > > > One > > > > > thought I've been considering is letting the controller in the > > > > post-kip-500 > > > > > world set the maximum range of api support for the cluster. However, > > > even > > > > > then we would need some way to tell when the controller quorum itself > > > is > > > > > ready to enable support for a new api version. My feeling is that we > > > will > > > > > probably always need something like the IBP to control when it is > > safe > > > to > > > > > expose versions of APIs which have a cross-broker dependence. > > However, > > > > > KIP-584 would still allow us to manage the IBP at the level of a > > > feature > > > > so > > > > > that we don't need two rolling restarts anymore. > > > > > > > > > > Best, > > > > > Jason > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 24, 2020 at 1:59 PM Jun Rao <j...@confluent.io> wrote: > > > > > > > > > > > Hi, Boyang, > > > > > > > > > > > > One of the goals of KIP-584 (feature versioning) is that we can get > > > rid > > > > > of > > > > > > IBP in the future. So does this change prevent us from removing IBP > > > in > > > > > the > > > > > > future? > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Jun > > > > > > > > > > > > On Thu, Sep 24, 2020 at 12:46 PM Jason Gustafson < > > ja...@confluent.io > > > > > > > > > > wrote: > > > > > > > > > > > > > Hey Boyang, > > > > > > > > > > > > > > Thanks for the update. This seems like the best thing we can do. > > > The > > > > > > > alternative would be to always ensure that the forwarded APIs are > > > > safe > > > > > > for > > > > > > > conversion between versions, but that would restrict the > > > flexibility > > > > > that > > > > > > > the versioning is providing. It would also be a large effort to > > > avoid > > > > > > > introducing regressions through conversion. Sadly this broadens > > the > > > > > scope > > > > > > > of the IBP, but in fact forwarded APIs are inter-broker APIs. > > > > > > > > > > > > > > Thanks, > > > > > > > Jason > > > > > > > > > > > > > > On Thu, Sep 24, 2020 at 9:23 AM Boyang Chen < > > > > > reluctanthero...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hey there, > > > > > > > > > > > > > > > > we spotted a necessary case to handle the redirect request > > > > > versioning, > > > > > > > and > > > > > > > > proposed the following changes: > > > > > > > > > > > > > > > > 1. For redirection RPCs (AlterConfig, Acl, Token etc), the > > > > > > corresponding > > > > > > > > allowed versions in the ApiVersionResponse will be affected by > > > the > > > > > > entire > > > > > > > > cluster's versioning, not just the receiving broker, since we > > > need > > > > to > > > > > > > > ensure the chosen version get properly handled by all parties. > > > Thus > > > > > > from > > > > > > > > now on, RPC with redirection will be treated as inter-broker > > RPC, > > > > and > > > > > > any > > > > > > > > version bump for these RPCs has to go through IBP bump as well. > > > > > > > > ApiVersionResponse will take IBP into considerations for the > > > > > > redirection > > > > > > > > RPCs allowable versions. > > > > > > > > > > > > > > > > 2. We would do the best effort to maintain the same request > > > version > > > > > for > > > > > > > > the entire admin client -> receiving broker -> controller > > broker > > > > > path, > > > > > > > but > > > > > > > > for old RPC versions, they may not have flexible fields > > > introduced > > > > > yet. > > > > > > > > Thus, we would have to upgrade the RPC to the minimum version > > > which > > > > > > > > supports flexible fields > > > > > > > > and add another tagged field in the header called > > > > > > > `OriginalRequestVersion` > > > > > > > > to help the controller broker correctly deserialize the request > > > > with > > > > > > the > > > > > > > > original admin client sent out version. We would not downgrade > > > the > > > > > > > original > > > > > > > > request in any circumstance, since the flexible field support > > is > > > > > > required > > > > > > > > to be open-ended on the high side. > > > > > > > > > > > > > > > > Let me know if you have any questions. > > > > > > > > > > > > > > > > Best, > > > > > > > > Boyang > > > > > > > > > > > > > > > > On Thu, Aug 6, 2020 at 6:11 PM Boyang Chen < > > > > > reluctanthero...@gmail.com > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hey there, > > > > > > > > > > > > > > > > > > we are going to introduce a minor change to bump the version > > of > > > > > > several > > > > > > > > > RPCs which are currently not supporting flexible versions. It > > > is > > > > > > > > necessary > > > > > > > > > because they need to be able to construct request header with > > > > > initial > > > > > > > > > principal name and client id as optional fields for > > > redirection. > > > > > The > > > > > > > are > > > > > > > > > only two of them: > > > > > > > > > > > > > > > > > > 1. AlterConfig > > > > > > > > > 2. AlterClientQuotas > > > > > > > > > > > > > > > > > > Let me know if you have any questions. > > > > > > > > > > > > > > > > > > Boyang > > > > > > > > > > > > > > > > > > On Fri, Jul 31, 2020 at 11:42 AM Boyang Chen < > > > > > > > reluctanthero...@gmail.com > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> Hey David, > > > > > > > > >> > > > > > > > > >> After discussing with Colin offline, I would like to correct > > > one > > > > > > case > > > > > > > in > > > > > > > > >> the described workflow, where the CLUSTER_ACTION > > authorization > > > > > would > > > > > > > > not be > > > > > > > > >> based on the initial principal field check, because it is > > not > > > a > > > > > > > secured > > > > > > > > >> condition which anyone could forge. The revised workflow > > shall > > > > be: > > > > > > > > >> > > > > > > > > >> Step 1. Filter out resources that are authorized > > > > > > > > >> 1.1 Use traditional principals to verify first. If > > > > > > > authorized, > > > > > > > > >> continue > > > > > > > > >> 1.2 If not authorized, check whether the request is > > > > from > > > > > > the > > > > > > > > >> control plane. Note that this is a best-effort to verify > > > whether > > > > > the > > > > > > > > >> request is internal. > > > > > > > > >> 1.3 If the request is not from the control plane, > > > > return > > > > > > > > >> authorization failure > > > > > > > > >> 1.4 If the request is from the control plane, use > > > > > > > > CLUSTER_ACTION > > > > > > > > >> to verify and determine the result > > > > > > > > >> > > > > > > > > >> Step 2. Check the request context to see if this is a > > > forwarding > > > > > > > > request, > > > > > > > > >> by checking whether it is from control plane and uses extra > > > > header > > > > > > > > fields > > > > > > > > >> 2.1 if the resource is authorized, and if this is > > the > > > > > active > > > > > > > > >> controller, process it > > > > > > > > >> 2.2 if the resource is authorized but this is not > > the > > > > > active > > > > > > > > >> controller, return NOT_CONTROLLER to the sender (forwarding > > > > > broker) > > > > > > > for > > > > > > > > >> retry > > > > > > > > >> 2.3 if the resource is not authorized, return > > > > > > > > >> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the > > > original > > > > > > client > > > > > > > > >> through forwarding broker > > > > > > > > >> Step 3. If the request is not a forwarding request > > > > > > > > >> 3.1 If the resource is authorized, and this is the > > > > active > > > > > > > > >> controller, process it > > > > > > > > >> 3.2 If the resource is authorized, but this is not > > > > active > > > > > > > > >> controller, put the resource into the preparation for a new > > > > > > > AlterConfig > > > > > > > > >> request for forwarding > > > > > > > > >> 3.3 If the resource is not authorized, reply the > > > > original > > > > > > > client > > > > > > > > >> AUTHORIZATION_FAILURE when the forwarding request is > > returned > > > > > > > > >> > > > > > > > > >> On Thu, Jul 30, 2020 at 3:47 PM Boyang Chen < > > > > > > > reluctanthero...@gmail.com > > > > > > > > > > > > > > > > > >> wrote: > > > > > > > > >> > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> On Thu, Jul 30, 2020 at 7:18 AM David Jacot < > > > > dja...@confluent.io > > > > > > > > > > > > > > wrote: > > > > > > > > >>> > > > > > > > > >>>> Hi Boyang, > > > > > > > > >>>> > > > > > > > > >>>> Thanks for your answers. > > > > > > > > >>>> > > > > > > > > >>>> > The point for using the listener name is more of a > > > security > > > > > > > purpose, > > > > > > > > >>>> to > > > > > > > > >>>> > detect any forged request to our best effort. > > > > > > > > >>>> > For throttling I think we could just check the request > > > > header > > > > > > for > > > > > > > > >>>> > *InitialClientId* existence, to distinguish whether to > > > apply > > > > > > > > >>>> > throttling strategy as forwarded request or direct > > > request. > > > > > > > > >>>> > > > > > > > > >>>> Reading "security" and "best effort" in the same sentence > > > > makes > > > > > > me a > > > > > > > > >>>> little nervous :). > > > > > > > > >>>> > > > > > > > > >>>> The identification issue is also valid for quota as we > > don't > > > > > want > > > > > > > one > > > > > > > > >>>> to be > > > > > > > > >>>> able to bypass the quota by forging a request as well, > > isn't > > > > it? > > > > > > > > >>>> Otherwise, > > > > > > > > >>>> anyone could just set the InitialPrincipal to bypass it. I > > > > think > > > > > > > that > > > > > > > > we > > > > > > > > >>>> should > > > > > > > > >>>> only use InitialPrincipal and/or InitialClientId when we > > > know > > > > > that > > > > > > > > they > > > > > > > > >>>> come > > > > > > > > >>>> from another broker. Based on what I read in the KIP, it > > > looks > > > > > > like > > > > > > > we > > > > > > > > >>>> could > > > > > > > > >>>> only use them when the principal has CLUSTER_ACTION > > > privilege. > > > > > Do > > > > > > I > > > > > > > > >>>> understand it correctly? > > > > > > > > >>>> > > > > > > > > >>>> There is no 100% safe way to distinguish between raw > > request > > > > and > > > > > > > > >>> forward request if you take malicious request into > > > > consideration, > > > > > > > which > > > > > > > > >>> could happen > > > > > > > > >>> anyway if the attacker prepares those requests to pass the > > > > checks > > > > > > > for a > > > > > > > > >>> pre-KIP-500 cluster. > > > > > > > > >>> We could at most know whether it is sent to the control > > > plane, > > > > or > > > > > > the > > > > > > > > >>> data plane, and whether it has extra header fields such as > > > > > initial > > > > > > > > >>> principal and client id defined. For a non-malicious > > request > > > > > > > > >>> going into the control plane, it must be sent from a valid > > > > > broker, > > > > > > > > which > > > > > > > > >>> is a prerequisite to check its > > > > > > > > >>> CLUSTER_ACTION principal. Take AlterConfig for an example, > > > the > > > > > > > intended > > > > > > > > >>> workflow for a *KIP-590 broker* would be: > > > > > > > > >>> > > > > > > > > >>> Step 1. Check the request context to see if this is a > > > > forwarding > > > > > > > > >>> request, by checking whether it is from control plane and > > > uses > > > > > > extra > > > > > > > > header > > > > > > > > >>> fields > > > > > > > > >>> 1.1 if it is a forwarding request, use > > CLUSTER_ACTION > > > > to > > > > > > > verify > > > > > > > > >>> the given resource > > > > > > > > >>> 1.2 if the resource is authorized, and if this is > > the > > > > > > active > > > > > > > > >>> controller, process it > > > > > > > > >>> 1.3 if the resource is authorized but this is not > > the > > > > > > active > > > > > > > > >>> controller, return NOT_CONTROLLER to the sender (forwarding > > > > > broker) > > > > > > > for > > > > > > > > >>> retry > > > > > > > > >>> 1.4 if the resource is not authorized, return > > > > > > > > >>> CLUSTER_AUTHORIZATION_FAILURE to propagate back to the > > > original > > > > > > > client > > > > > > > > >>> through forwarding broker > > > > > > > > >>> Step 2. If the request is not a forwarding request > > > > > > > > >>> 2.1 Verify with normal principal as ALTER on the > > > given > > > > > > > > resources > > > > > > > > >>> 2.2 If the resource is authorized, and this is the > > > > active > > > > > > > > >>> controller, process it > > > > > > > > >>> 2.3 If the resource is authorized, but this is not > > > > active > > > > > > > > >>> controller, put the resource into the preparation for a new > > > > > > > AlterConfig > > > > > > > > >>> request for forwarding > > > > > > > > >>> 2.4 If the resource is not authorized, reply the > > > > original > > > > > > > > client > > > > > > > > >>> AUTHORIZATION_FAILURE when the forwarding request is > > returned > > > > > > > > >>> > > > > > > > > >>> When the control plane and data plane are using the same > > > > > listener, > > > > > > we > > > > > > > > >>> couldn't distinguish whether a request is forwarded or not > > > for > > > > > > sure, > > > > > > > > so in > > > > > > > > >>> terms of the forward request checking, we have to require > > the > > > > > extra > > > > > > > > header > > > > > > > > >>> fields to present. A stronger checking mechanism could only > > > be > > > > > > > enforced > > > > > > > > >>> when we upgrade to post-KIP-500 with a separate controller > > > > > network. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>>> I have made another pass on the whole KIP, I have few > > nits: > > > > > > > > >>>> > > > > > > > > >>>> - The sentence "Take AlterConfig as an example to > > understand > > > > the > > > > > > > > changes > > > > > > > > >>>> we are making." does not make much sense anymore in the > > > > > beginning > > > > > > of > > > > > > > > the > > > > > > > > >>>> "Proposed Changes" chapter. > > > > > > > > >>>> > > > > > > > > >>>> Sure, deleted. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>>> - When you say "Existing RPCs which are sending directly > > to > > > > the > > > > > > > > >>>> controller > > > > > > > > >>>> will > > > > > > > > >>>> rely on forwarding as well.". I suggest to explicitly > > > mention > > > > > how > > > > > > > "old > > > > > > > > >>>> admin clients" > > > > > > > > >>>> will work here to complement the sentence. Something like: > > > > They > > > > > > will > > > > > > > > >>>> get a > > > > > > > > >>>> random > > > > > > > > >>>> broker id as the controller id in the metadata response > > and > > > > > stick > > > > > > to > > > > > > > > it > > > > > > > > >>>> as > > > > > > > > >>>> you explained. > > > > > > > > >>>> > > > > > > > > >>>> Sg, changed. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>>> - "The purpose of adding principal name is for the audit > > > > > logging, > > > > > > > and > > > > > > > > >>>> the > > > > > > > > >>>> client id is > > > > > > > > >>>> being used to throttling according to KIP-599 > > requirement." > > > > > > > Actually, > > > > > > > > >>>> KIP-599 needs > > > > > > > > >>>> both the principal and the clientId. > > > > > > > > >>>> > > > > > > > > >>>> Makes sense. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>>> - In the "Routing Request Security" chapter. It is written > > > > that > > > > > > the > > > > > > > > >>>> forwarding broker > > > > > > > > >>>> will verify the request with its own authorizer and will > > > just > > > > > > > forward > > > > > > > > >>>> it if > > > > > > > > >>>> the request > > > > > > > > >>>> looks good. When a request contains for instance multiple > > > > > topics, > > > > > > I > > > > > > > > >>>> suppose > > > > > > > > >>>> that > > > > > > > > >>>> we will forward only the authorized ones and not the whole > > > > > > original > > > > > > > > >>>> request > > > > > > > > >>>> as is. > > > > > > > > >>>> We may want to reword the sentence to make this clear. > > > > > > > > >>>> > > > > > > > > >>>> That makes sense, I will put this more detailed version of > > > > > routing > > > > > > > > into > > > > > > > > >>> the design. > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>>> - For the record, should we put the previous proposal in > > the > > > > > > > rejected > > > > > > > > >>>> alternatives as > > > > > > > > >>>> well? > > > > > > > > >>>> > > > > > > > > >>>> We do have big changes in this KIP, our current strategy > > is > > > to > > > > > > rely > > > > > > > on > > > > > > > > >>> wiki revisions if people > > > > > > > > >>> are interested to figure out our previous design strategy. > > > > > Putting > > > > > > > the > > > > > > > > >>> full design in current KIP proposal > > > > > > > > >>> would distract too much for readers as we discussed during > > > the > > > > > last > > > > > > > > time > > > > > > > > >>> of big refactoring. > > > > > > > > >>> > > > > > > > > >>> Best, > > > > > > > > >>>> David > > > > > > > > >>>> > > > > > > > > >>>> On Thu, Jul 30, 2020 at 3:51 AM Boyang Chen < > > > > > > > > reluctanthero...@gmail.com > > > > > > > > >>>> > > > > > > > > > >>>> wrote: > > > > > > > > >>>> > > > > > > > > >>>> > Thanks David for the feedback! > > > > > > > > >>>> > > > > > > > > > >>>> > On Wed, Jul 29, 2020 at 7:53 AM David Jacot < > > > > > > dja...@confluent.io> > > > > > > > > >>>> wrote: > > > > > > > > >>>> > > > > > > > > > >>>> > > Hi, Colin, Boyang, > > > > > > > > >>>> > > > > > > > > > > >>>> > > Colin, thanks for the clarification. Somehow, I > > thought > > > > that > > > > > > > even > > > > > > > > >>>> if the > > > > > > > > >>>> > > controller is ran independently, it > > > > > > > > >>>> > > would still run the listeners of the broker and thus > > > would > > > > > be > > > > > > > > >>>> accessible > > > > > > > > >>>> > by > > > > > > > > >>>> > > redirecting on the loopback > > > > > > > > >>>> > > interface. My mistake. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Boyang, I have few questions/comments regarding the > > > > updated > > > > > > KIP: > > > > > > > > >>>> > > > > > > > > > > >>>> > > 1. I think that it would be great if we could clarify > > > how > > > > > old > > > > > > > > admin > > > > > > > > >>>> > clients > > > > > > > > >>>> > > which are directly talking to the > > > > > > > > >>>> > > controller will work with this KIP. I read between the > > > > lines > > > > > > > that, > > > > > > > > >>>> as we > > > > > > > > >>>> > > propose to provide a random > > > > > > > > >>>> > > broker Id as the controller Id in the metadata > > response, > > > > > they > > > > > > > will > > > > > > > > >>>> use a > > > > > > > > >>>> > > single node as a proxy. Is that > > > > > > > > >>>> > > correct? This deserves to be called out more > > explicitly > > > in > > > > > the > > > > > > > > >>>> design > > > > > > > > >>>> > > section instead of being hidden > > > > > > > > >>>> > > in the protocol bump of the metadata RPC. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Makes sense, I stress this point in the compatibility > > > > > section. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > 1.1 If I understand correctly, could we assume that > > old > > > > > admin > > > > > > > > >>>> clients > > > > > > > > >>>> > will > > > > > > > > >>>> > > stick to the same "fake controller" > > > > > > > > >>>> > > until they refresh their metadata? Refreshing the > > > metadata > > > > > > > usually > > > > > > > > >>>> > happens > > > > > > > > >>>> > > when NOT_CONTROLLER > > > > > > > > >>>> > > is received but this won't happen anymore so they > > should > > > > > > change > > > > > > > > >>>> > > infrequently. > > > > > > > > >>>> > > > > > > > > > > >>>> > > That is correct, old admin clients would not try to > > > > refresh > > > > > > > their > > > > > > > > >>>> > metadata > > > > > > > > >>>> > due to NOT_CONTROLLER, > > > > > > > > >>>> > which is impossible to happen with the new broker > > cluster. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > 2. For the new admin client, I suppose that we plan on > > > > using > > > > > > > > >>>> > > LeastLoadedNodeProvider for the > > > > > > > > >>>> > > requests that are using ControllerNodeProvider. We > > could > > > > > > perhaps > > > > > > > > >>>> mention > > > > > > > > >>>> > > it. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Sure, added. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > 3. Pre KIP-500, will we have a way to distinguish if a > > > > > request > > > > > > > > that > > > > > > > > >>>> is > > > > > > > > >>>> > > received by the controller is > > > > > > > > >>>> > > coming directly from a client or from a broker? You > > > > mention > > > > > > that > > > > > > > > the > > > > > > > > >>>> > > listener can be used to do > > > > > > > > >>>> > > this but as you pointed out, it is not mandatory. Do > > we > > > > have > > > > > > > > another > > > > > > > > >>>> > > reliable method? I am asking > > > > > > > > >>>> > > in the context of KIP-599 with the current controller, > > > we > > > > > may > > > > > > > need > > > > > > > > >>>> to > > > > > > > > >>>> > > throttle differently if the > > > > > > > > >>>> > > request comes from a client or from a broker. > > > > > > > > >>>> > > > > > > > > > > >>>> > > The point for using the listener name is more of a > > > > security > > > > > > > > >>>> purpose, to > > > > > > > > >>>> > detect any forged request to our best effort. > > > > > > > > >>>> > For throttling I think we could just check the request > > > > header > > > > > > for > > > > > > > > >>>> > *InitialClientId* existence, to distinguish whether to > > > apply > > > > > > > > >>>> > throttling strategy as forwarded request or direct > > > request. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > 4. Could we add `InitialClientId` as well? This will > > be > > > > > > required > > > > > > > > >>>> for the > > > > > > > > >>>> > > quota as we can apply them > > > > > > > > >>>> > > by principal and/or clientId. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Sounds good, added. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > 5. A small remark regarding the structure of the KIP. > > It > > > > is > > > > > a > > > > > > > bit > > > > > > > > >>>> weird > > > > > > > > >>>> > > that requests that do not go > > > > > > > > >>>> > > to the controller are mentioned in the Proposed Design > > > > > section > > > > > > > and > > > > > > > > >>>> the > > > > > > > > >>>> > > requests that go to the > > > > > > > > >>>> > > controller are mentioned in the Public Interfaces. > > When > > > > one > > > > > > read > > > > > > > > the > > > > > > > > >>>> > > Proposed Design, it does not > > > > > > > > >>>> > > get a full picture of the whole new routing proposal > > for > > > > old > > > > > > and > > > > > > > > new > > > > > > > > >>>> > > clients. It would be great if we > > > > > > > > >>>> > > could have a full overview in that section. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Good point, I will move the pieces around. > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > Overall the change makes sense to me. I will work on > > > > > drafting > > > > > > an > > > > > > > > >>>> addendum > > > > > > > > >>>> > > to KIP-599 to > > > > > > > > >>>> > > alter the design to cope with these changes. At a > > first > > > > > > glance, > > > > > > > > that > > > > > > > > >>>> > seems > > > > > > > > >>>> > > doable if 1.1, 3 > > > > > > > > >>>> > > and 4 are OK. > > > > > > > > >>>> > > > > > > > > > > >>>> > > Thank you for the help! > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > > >>>> > > Thanks, > > > > > > > > >>>> > > David > > > > > > > > >>>> > > > > > > > > > > >>>> > > On Wed, Jul 29, 2020 at 5:29 AM Boyang Chen < > > > > > > > > >>>> reluctanthero...@gmail.com> > > > > > > > > >>>> > > wrote: > > > > > > > > >>>> > > > > > > > > > > >>>> > > > Thanks for the feedback Colin! > > > > > > > > >>>> > > > > > > > > > > > >>>> > > > On Tue, Jul 28, 2020 at 2:11 PM Colin McCabe < > > > > > > > > cmcc...@apache.org> > > > > > > > > >>>> > wrote: > > > > > > > > >>>> > > > > > > > > > > > >>>> > > > > Hi Boyang, > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > Thanks for updating this. A few comments below: > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > In the "Routing Request Security" section, there > > is > > > a > > > > > > > > reference > > > > > > > > >>>> to > > > > > > > > >>>> > > "older > > > > > > > > >>>> > > > > requests that need redirection." But after these > > > new > > > > > > > > >>>> revisions, both > > > > > > > > >>>> > > new > > > > > > > > >>>> > > > > and old requests need redirection. So we should > > > > > rephrase > > > > > > > > this. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > In addition, to avoid exposing this forwarding > > > power > > > > > to > > > > > > > the > > > > > > > > >>>> admin > > > > > > > > >>>> > > > > clients, > > > > > > > > >>>> > > > > > the routing request shall be forwarded towards > > the > > > > > > > > controller > > > > > > > > >>>> > broker > > > > > > > > >>>> > > > > internal > > > > > > > > >>>> > > > > > endpoint which should be only visible to other > > > > brokers > > > > > > > > inside > > > > > > > > >>>> the > > > > > > > > >>>> > > > > cluster > > > > > > > > >>>> > > > > > in the KIP-500 controller. Any admin > > configuration > > > > > > request > > > > > > > > >>>> with > > > > > > > > >>>> > > broker > > > > > > > > >>>> > > > > > principal should not be going through the public > > > > > > endpoint > > > > > > > > and > > > > > > > > >>>> will > > > > > > > > >>>> > be > > > > > > > > >>>> > > > > > rejected for security purpose. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > We should also describe how this will work in the > > > > > > > pre-KIP-500 > > > > > > > > >>>> case. > > > > > > > > >>>> > In > > > > > > > > >>>> > > > > that case, CLUSTER_ACTION gets the extra > > permissions > > > > > > > described > > > > > > > > >>>> here > > > > > > > > >>>> > > only > > > > > > > > >>>> > > > > when the message comes in on the inter-broker > > > > listener. > > > > > > We > > > > > > > > >>>> should > > > > > > > > >>>> > > state > > > > > > > > >>>> > > > > that here. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > (I can see that you have this information later on > > > in > > > > > the > > > > > > > > >>>> "Security > > > > > > > > >>>> > > > Access > > > > > > > > >>>> > > > > Changes" section, but it would be good to have it > > > here > > > > > as > > > > > > > > well, > > > > > > > > >>>> to > > > > > > > > >>>> > > avoid > > > > > > > > >>>> > > > > confusion.) > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > To be more strict of protecting controller > > > > > information, > > > > > > > the > > > > > > > > >>>> > > > > "ControllerId" > > > > > > > > >>>> > > > > > field in new MetadataResponse shall be set to -1 > > > > when > > > > > > the > > > > > > > > >>>> original > > > > > > > > >>>> > > > > request > > > > > > > > >>>> > > > > > comes from a non-broker client and it is already > > > on > > > > > v10. > > > > > > > We > > > > > > > > >>>> shall > > > > > > > > >>>> > use > > > > > > > > >>>> > > > the > > > > > > > > >>>> > > > > > request listener name to distinguish whether a > > > given > > > > > > > request > > > > > > > > >>>> is > > > > > > > > >>>> > > > > inter-broker, > > > > > > > > >>>> > > > > > or from the client. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > I'm not sure why we would need to distinguish > > > between > > > > > > broker > > > > > > > > >>>> clients > > > > > > > > >>>> > > and > > > > > > > > >>>> > > > > non-broker clients. Brokers don't generally send > > > > > > > > >>>> MetadataRequests to > > > > > > > > >>>> > > > other > > > > > > > > >>>> > > > > brokers, do they? Brokers learn about metadata > > from > > > > > > > > >>>> > > > UpdateMetadataRequest > > > > > > > > >>>> > > > > and LeaderAndIsrRequest, not by sending > > > > MetadataRequests > > > > > > to > > > > > > > > >>>> other > > > > > > > > >>>> > > > brokers. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > We do have one use case where the MetadataRequest > > > gets > > > > > > sent > > > > > > > > >>>> between > > > > > > > > >>>> > the > > > > > > > > >>>> > > > brokers, which is the InterBrokerSendThread. > > Currently > > > > we > > > > > > > don't > > > > > > > > >>>> rely on > > > > > > > > >>>> > > it > > > > > > > > >>>> > > > to get the controller id, so I guess your suggestion > > > > > should > > > > > > be > > > > > > > > >>>> good to > > > > > > > > >>>> > > > enforce. We could use some meta comment on the > > > > > NetworkClient > > > > > > > > that > > > > > > > > >>>> it > > > > > > > > >>>> > > should > > > > > > > > >>>> > > > not be used to get the controller location. > > > > > > > > >>>> > > > > > > > > > > > >>>> > > > Probably what we want here is: v0-v9: return a > > random > > > > > broker > > > > > > > in > > > > > > > > >>>> the > > > > > > > > >>>> > > cluster > > > > > > > > >>>> > > > > as the controller ID. v10: no controllerID > > present > > > in > > > > > the > > > > > > > > >>>> > > > > MetadataResponse. We should also deprecate the > > > > > > adminClient > > > > > > > > >>>> method > > > > > > > > >>>> > > which > > > > > > > > >>>> > > > > gets the controllerId. > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > BROKER_AUTHORIZATION_FAILURE(92, "Authorization > > > > failed > > > > > > for > > > > > > > > the > > > > > > > > >>>> > > > > > request during forwarding, this indicates an > > > > internal > > > > > > > error > > > > > > > > >>>> on the > > > > > > > > >>>> > > > broker > > > > > > > > >>>> > > > > > cluster security setup.", > > > > > > > > >>>> > BrokerAuthorizationFailureException::new); > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > Grammar nitpick: It would be good to have a period > > > > > between > > > > > > > > >>>> > "forwarding" > > > > > > > > >>>> > > > > and "this" to avoid a run-on sentence :) > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > best, > > > > > > > > >>>> > > > > Colin > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > On Mon, Jul 27, 2020, at 21:47, Boyang Chen wrote: > > > > > > > > >>>> > > > > > Hey there, > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > I'm re-opening this thread because after some > > > > initial > > > > > > > > >>>> > implementations > > > > > > > > >>>> > > > > > started, we spotted some gaps in the approved > > KIP > > > as > > > > > > well > > > > > > > as > > > > > > > > >>>> some > > > > > > > > >>>> > > > > > inconsistencies with KIP-631 controller. There > > > are a > > > > > > > couple > > > > > > > > of > > > > > > > > >>>> > > > addendums > > > > > > > > >>>> > > > > to > > > > > > > > >>>> > > > > > the existing KIP, specifically: > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > 1. As the controller is foreseen to be only > > > > accessible > > > > > > to > > > > > > > > the > > > > > > > > >>>> > > brokers, > > > > > > > > >>>> > > > > the > > > > > > > > >>>> > > > > > new admin client would not have direct access to > > > the > > > > > > > > >>>> controller. It > > > > > > > > >>>> > > is > > > > > > > > >>>> > > > > > guaranteed on the MetadataResponse level which > > no > > > > > longer > > > > > > > > >>>> provides > > > > > > > > >>>> > > > > > `ControllerId` to client side requests. > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > 2. The broker would forward any direct ZK path > > > > > mutation > > > > > > > > >>>> requests, > > > > > > > > >>>> > > > > including > > > > > > > > >>>> > > > > > topic creation/deletion, reassignment, etc since > > > we > > > > > > > > deprecate > > > > > > > > >>>> the > > > > > > > > >>>> > > > direct > > > > > > > > >>>> > > > > > controller access on the client side. No more > > > > protocol > > > > > > > > >>>> version bump > > > > > > > > >>>> > > is > > > > > > > > >>>> > > > > > necessary for the configuration requests. > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > 3. To make sure forwarding requests pass the > > > > > > > authorization, > > > > > > > > >>>> broker > > > > > > > > >>>> > > > > > principal CLUSTER_ACTION would be allowed to be > > > used > > > > > as > > > > > > an > > > > > > > > >>>> > > alternative > > > > > > > > >>>> > > > > > authentication method for a variety of principal > > > > > > > operations, > > > > > > > > >>>> > > including > > > > > > > > >>>> > > > > > ALTER, ALTER_CONFIG, DELETE, etc. It is because > > > the > > > > > > > > forwarding > > > > > > > > >>>> > > request > > > > > > > > >>>> > > > > > needs to use the proxy broker's own principal, > > > which > > > > > is > > > > > > > > >>>> currently > > > > > > > > >>>> > not > > > > > > > > >>>> > > > > > supported to be used for many configuration > > change > > > > > > > > >>>> authentication > > > > > > > > >>>> > > > listed > > > > > > > > >>>> > > > > > above. The full list could be found in the KIP. > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > 4. Add a new BROKER_AUTHORIZATION_FAILURE error > > > code > > > > > to > > > > > > > > >>>> indicate > > > > > > > > >>>> > any > > > > > > > > >>>> > > > > > internal security configuration failure, when > > the > > > > > > > forwarded > > > > > > > > >>>> request > > > > > > > > >>>> > > > > failed > > > > > > > > >>>> > > > > > authentication on the controller side. > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > Let me know what you think. With such a major > > > > > refinement > > > > > > > of > > > > > > > > >>>> the > > > > > > > > >>>> > KIP, > > > > > > > > >>>> > > > I'm > > > > > > > > >>>> > > > > > open for re-vote after discussions converge. > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > Boyang > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > On Wed, Jul 1, 2020 at 2:17 PM Boyang Chen < > > > > > > > > >>>> > > reluctanthero...@gmail.com > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > wrote: > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > > Hey folks, > > > > > > > > >>>> > > > > > > > > > > > > > > >>>> > > > > > > I have also synced on the KIP-578 which was > > > doing > > > > > the > > > > > > > > >>>> partition > > > > > > > > >>>> > > > limit, > > > > > > > > >>>> > > > > to > > > > > > > > >>>> > > > > > > make sure the partition limit error code would > > > be > > > > > > > properly > > > > > > > > >>>> > > propagated > > > > > > > > >>>> > > > > once > > > > > > > > >>>> > > > > > > it is done on top of KIP-590. Let me know if > > you > > > > > have > > > > > > > > >>>> further > > > > > > > > >>>> > > > > questions or > > > > > > > > >>>> > > > > > > concerns. > > > > > > > > >>>> > > > > > > > > > > > > > > >>>> > > > > > > Boyang > > > > > > > > >>>> > > > > > > > > > > > > > > >>>> > > > > > > On Tue, Jun 23, 2020 at 5:08 PM Boyang Chen < > > > > > > > > >>>> > > > > reluctanthero...@gmail.com> > > > > > > > > >>>> > > > > > > wrote: > > > > > > > > >>>> > > > > > > > > > > > > > > >>>> > > > > > >> Thanks for the clarification, Colin and > > Ismael. > > > > > > > > Personally > > > > > > > > >>>> I > > > > > > > > >>>> > also > > > > > > > > >>>> > > > feel > > > > > > > > >>>> > > > > > >> Option A is better to prioritize fixing the > > > gap. > > > > > Just > > > > > > > to > > > > > > > > be > > > > > > > > >>>> > clear, > > > > > > > > >>>> > > > the > > > > > > > > >>>> > > > > > >> proposed solution would be: > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >> 1. Bump the Metadata RPC version to return > > > > > > > > >>>> POLICY_VIOLATION. In > > > > > > > > >>>> > > the > > > > > > > > >>>> > > > > > >> application level, we should swap the error > > > > message > > > > > > > with > > > > > > > > >>>> the > > > > > > > > >>>> > > actual > > > > > > > > >>>> > > > > failure > > > > > > > > >>>> > > > > > >> reason such as "violation of topic creation > > > > policy > > > > > > when > > > > > > > > >>>> > attempting > > > > > > > > >>>> > > > to > > > > > > > > >>>> > > > > auto > > > > > > > > >>>> > > > > > >> create internal topic through > > MetadataRequest." > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >> 2. For older Metadata RPC, return > > > > > > AUTHORIZATION_FAILED > > > > > > > to > > > > > > > > >>>> fail > > > > > > > > >>>> > > fast. > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >> Will address our other discussed points as > > well > > > > in > > > > > > the > > > > > > > > >>>> KIP, let > > > > > > > > >>>> > me > > > > > > > > >>>> > > > > know > > > > > > > > >>>> > > > > > >> if you have further questions. > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >> Thanks, > > > > > > > > >>>> > > > > > >> Boyang > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >> On Tue, Jun 23, 2020 at 10:41 AM Ismael Juma > > < > > > > > > > > >>>> ism...@juma.me.uk > > > > > > > > >>>> > > > > > > > > > > >>>> > > > > wrote: > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > >>> Option A is basically what I was thinking. > > But > > > > > with > > > > > > a > > > > > > > > >>>> slight > > > > > > > > >>>> > > > > adjustment: > > > > > > > > >>>> > > > > > >>> > > > > > > > > >>>> > > > > > >>> New versions of MetadataResponse return > > > > > > > > POLICY_VIOLATION, > > > > > > > > >>>> old > > > > > > > > >>>> > > > > versions > > > > > > > > >>>> > > > > > >>> return AUTHORIZATION_FAILED. The latter > > works > > > > > > > correctly > > > > > > > > >>>> with > > > > > > > > >>>> > old > > > > > > > > >>>> > > > Java > > > > > > > > >>>> > > > > > >>> clients (i.e. the client fails fast and > > > > propagates > > > > > > the > > > > > > > > >>>> error), > > > > > > > > >>>> > > I've > > > > > > > > >>>> > > > > > >>> tested > > > > > > > > >>>> > > > > > >>> it. Adjust new clients to treat > > > POLICY_VIOLATION > > > > > > like > > > > > > > > >>>> > > > > > >>> AUTHORIZATION_FAILED, > > > > > > > > >>>> > > > > > >>> but propagate the custom error message. > > > > > > > > >>>> > > > > > >>> > > > > > > > > >>>> > > > > > >>> Ismael > > > > > > > > >>>> > > > > > >>> > > > > > > > > >>>> > > > > > >>> On Mon, Jun 22, 2020 at 11:00 PM Colin > > McCabe > > > < > > > > > > > > >>>> > > cmcc...@apache.org> > > > > > > > > >>>> > > > > > >>> wrote: > > > > > > > > >>>> > > > > > >>> > > > > > > > > >>>> > > > > > >>> > > > > On Fri, Jun 19, 2020 at 3:18 PM > > Ismael > > > > > Juma > > > > > > < > > > > > > > > >>>> > > > > ism...@juma.me.uk> > > > > > > > > >>>> > > > > > >>> > wrote: > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > > > >>>> > > > > > >>> > > > > > Hi Colin, > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > > > > >>>> > > > > > >>> > > > > > The KIP states in the > > Compatibility > > > > > > section > > > > > > > > (not > > > > > > > > >>>> > Future > > > > > > > > >>>> > > > > work): > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > > > > >>>> > > > > > >>> > > > > > "To support the proxy of requests, > > > we > > > > > need > > > > > > > to > > > > > > > > >>>> build a > > > > > > > > >>>> > > > > channel > > > > > > > > >>>> > > > > > >>> for > > > > > > > > >>>> > > > > > >>> > > > > > brokers to talk directly to the > > > > > > controller. > > > > > > > > >>>> This part > > > > > > > > >>>> > > of > > > > > > > > >>>> > > > > the > > > > > > > > >>>> > > > > > >>> design > > > > > > > > >>>> > > > > > >>> > > > > > is internal change only and won’t > > > > block > > > > > > the > > > > > > > > KIP > > > > > > > > >>>> > > > progress." > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > > > > >>>> > > > > > >>> > > > > > I am clarifying that this is not > > > > > internal > > > > > > > only > > > > > > > > >>>> due to > > > > > > > > >>>> > > the > > > > > > > > >>>> > > > > > >>> config. > > > > > > > > >>>> > > > > > >>> > If we > > > > > > > > >>>> > > > > > >>> > > > > > say that this KIP depends on > > another > > > > KIP > > > > > > > > before > > > > > > > > >>>> we > > > > > > > > >>>> > can > > > > > > > > >>>> > > > > merge > > > > > > > > >>>> > > > > > >>> > > > > > it, that's fine although it feels > > a > > > > bit > > > > > > > > >>>> unnecessary. > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > Hi Ismael, > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > I didn't realize there was still a > > reference > > > > to > > > > > > the > > > > > > > > >>>> separate > > > > > > > > >>>> > > > > controller > > > > > > > > >>>> > > > > > >>> > channel in the "Compatibility, > > Deprecation, > > > > and > > > > > > > > >>>> Migration > > > > > > > > >>>> > Plan" > > > > > > > > >>>> > > > > > >>> section. I > > > > > > > > >>>> > > > > > >>> > agree that it doesn't really belong there. > > > > > Given > > > > > > > that > > > > > > > > >>>> this > > > > > > > > >>>> > is > > > > > > > > >>>> > > > > creating > > > > > > > > >>>> > > > > > >>> > confusion, I would suggest that we just > > drop > > > > > this > > > > > > > from > > > > > > > > >>>> the > > > > > > > > >>>> > KIP > > > > > > > > >>>> > > > > > >>> entirely. > > > > > > > > >>>> > > > > > >>> > It really is orthogonal to what this KIP > > is > > > > > > about-- > > > > > > > we > > > > > > > > >>>> don't > > > > > > > > >>>> > > > need a > > > > > > > > >>>> > > > > > >>> > separate channel to implement redirection. > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > Boyang wrote: > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > >>>> > > > > > >>> > > We are only opening the doors for > > specific > > > > > > > internal > > > > > > > > >>>> topics > > > > > > > > >>>> > > > > (offsets, > > > > > > > > >>>> > > > > > >>> txn > > > > > > > > >>>> > > > > > >>> > > log), which I assume the client should > > > have > > > > no > > > > > > > > >>>> possibility > > > > > > > > >>>> > to > > > > > > > > >>>> > > > > mutate > > > > > > > > >>>> > > > > > >>> the > > > > > > > > >>>> > > > > > >>> > > topic policy? > > > > > > > > >>>> > > > > > >>> > > > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > Hi Boyang, > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > I think you and Ismael are talking about > > > > > different > > > > > > > > >>>> scenarios. > > > > > > > > >>>> > > > You > > > > > > > > >>>> > > > > are > > > > > > > > >>>> > > > > > >>> > describing the scenario where the broker > > is > > > > > > > > >>>> auto-creating the > > > > > > > > >>>> > > > > > >>> transaction > > > > > > > > >>>> > > > > > >>> > log topic or consumer offset topic. This > > > > > scenario > > > > > > > > >>>> indeed > > > > > > > > >>>> > > should > > > > > > > > >>>> > > > > not > > > > > > > > >>>> > > > > > >>> happen > > > > > > > > >>>> > > > > > >>> > in a properly-configured cluster. > > However, > > > > > Ismael > > > > > > > is > > > > > > > > >>>> > > describing > > > > > > > > >>>> > > > a > > > > > > > > >>>> > > > > > >>> scenario > > > > > > > > >>>> > > > > > >>> > where the client is auto-creating some > > > > arbitrary > > > > > > > > >>>> non-internal > > > > > > > > >>>> > > > topic > > > > > > > > >>>> > > > > > >>> just by > > > > > > > > >>>> > > > > > >>> > sending a metadata request. > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > As far as I can see, there are two > > solutions > > > > > here: > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > A. Close the hole in CreateTopicsPolicy > > > > > > immediately. > > > > > > > > >>>> In new > > > > > > > > >>>> > > > > versions, > > > > > > > > >>>> > > > > > >>> > allow MetadataResponse to return > > > > > > > AUTHORIZATION_FAILED > > > > > > > > >>>> if we > > > > > > > > >>>> > > tried > > > > > > > > >>>> > > > > to > > > > > > > > >>>> > > > > > >>> > auto-create a topic and failed. Find some > > > > other > > > > > > > error > > > > > > > > >>>> code > > > > > > > > >>>> > to > > > > > > > > >>>> > > > > return > > > > > > > > >>>> > > > > > >>> for > > > > > > > > >>>> > > > > > >>> > existing versions. > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > B. Keep the hole in CreateTopicsPolicy and > > > add > > > > > > some > > > > > > > > >>>> > > configuration > > > > > > > > >>>> > > > > to > > > > > > > > >>>> > > > > > >>> allow > > > > > > > > >>>> > > > > > >>> > admins to gradually migrate to closing it. > > > In > > > > > > > > >>>> practice, this > > > > > > > > >>>> > > > > probably > > > > > > > > >>>> > > > > > >>> > means a configuration toggle that enables > > > > direct > > > > > > ZK > > > > > > > > >>>> access, > > > > > > > > >>>> > > that > > > > > > > > >>>> > > > > > >>> starts off > > > > > > > > >>>> > > > > > >>> > as enabled. Then we can eventually > > default > > > it > > > > > to > > > > > > > > false > > > > > > > > >>>> and > > > > > > > > >>>> > > then > > > > > > > > >>>> > > > > > >>> remove it > > > > > > > > >>>> > > > > > >>> > entirely over time. > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > best, > > > > > > > > >>>> > > > > > >>> > Colin > > > > > > > > >>>> > > > > > >>> > > > > > > > > > >>>> > > > > > >>> > > > > > > > > >>>> > > > > > >> > > > > > > > > >>>> > > > > > > > > > > > > > >>>> > > > > > > > > > > > > >>>> > > > > > > > > > > > >>>> > > > > > > > > > > >>>> > > > > > > > > > >>>> > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >