I've added RequestedTopicState, as discussed in my last email. I've also added a paragraph to the migration plan about old clients making policy-violating delete topics or delete records request.
If no further comments a forthcoming in the next day or two then I will start a vote. Thanks, Tom On 5 October 2017 at 12:41, Tom Bentley <t.j.bent...@gmail.com> wrote: > I'd like to raise a somewhat subtle point about how the proposed API > should behave. > > The current CreateTopicPolicy gets passed either the request partition > count and replication factor, or the requested assignment. So if the > request had specified partition count and replication factor, the policy > sees a null replicaAssignments(). Likewise if the request specified a > replica assignment the policy would get back null from numPartitions() and > replicationFactor(). > > These semantics mean the policy can't reject an assignment that happened > to be auto-generated (or rather, it's obvious to the policy that the > assignment is auto generated, because it can't see such assignments), > though it can reject a request because the assignment was auto-generated, > or vice versa. > > Retaining these semantics makes the TopicState less symmetric between it's > use in requestedState() and the current state available from the > ClusterState, and also less symmetric between its use for createTopic() and > for alterTopic(). This can make it harder to write a policy. For example, > if I want the policy "the number of partitions must be < 100", if the > requestedState().numPartitions() can be null I need to cope with that > and figure it out from inspecting the replicasAssignments(). It would be > much better for the policy writer to just be able to write: > > if (request.requestedState().numPartitions() >= 100) > throw new PolicyViolationException("#partitions must be < 100") > > An alternative would be to keep the symmetry (and thus > TopicState.replicasAssignments() > would never return null, and TopicState.numPartitions() and > TopicState.replicationFactor() could each be primitives), but expose the > auto-generatedness of the replicaAssignments() explicitly, perhaps by using > a subtype of TopicState for the return type of requestedState(): > > interface RequestedTopicState extends TopicState { > /** > * True if the {@link TopicState#replicasAssignments()} > * in this request we generated by the broker, false if > * they were explicitly requested by the client. > */ > boolean generatedReplicaAssignments(); > } > > Thoughts? > > On 4 October 2017 at 11:06, Tom Bentley <t.j.bent...@gmail.com> wrote: > >> Good point. Then I guess I can do those items too. I would also need to >> do the same changes for DeleteRecordsRequest and Response. >> >> On 4 October 2017 at 10:37, Ismael Juma <ism...@juma.me.uk> wrote: >> >>> Those two points are related to policies in the following sense: >>> >>> 1. A policy that can't send errors to clients is much less useful >>> 2. Testing policies is much easier with `validateOnly` >>> >>> Ismael >>> >>> On Wed, Oct 4, 2017 at 9:20 AM, Tom Bentley <t.j.bent...@gmail.com> >>> wrote: >>> >>> > Thanks Edoardo, >>> > >>> > I've added that motivation to the KIP. >>> > >>> > KIP-201 doesn't address two points raised in KIP-170: Adding a >>> > validationOnly flag to >>> > DeleteTopicRequest and adding an error message to DeleteTopicResponse. >>> > Since those are not policy-related I think they're best left out of >>> > KIP-201. I suppose it is up to you and Mickael whether to narrow the >>> scope >>> > of KIP-170 to address those points. >>> > >>> > Thanks again, >>> > >>> > Tom >>> > >>> > On 4 October 2017 at 08:20, Edoardo Comar <eco...@uk.ibm.com> wrote: >>> > >>> > > Thanks Tom, >>> > > looks got to me and KIP-201 could supersede KIP-170 >>> > > but could you please add a missing motivation bullet that was behind >>> > > KIP-170: >>> > > >>> > > introducing ClusterState to allow validation of create/alter topic >>> > request >>> > > >>> > > not just against the request metadata but also >>> > > against the current amount of resources already used in the cluster >>> (eg >>> > > number of partitions). >>> > > >>> > > thanks >>> > > Edo >>> > > -------------------------------------------------- >>> > > >>> > > Edoardo Comar >>> > > >>> > > IBM Message Hub >>> > > >>> > > IBM UK Ltd, Hursley Park, SO21 2JN >>> > > >>> > > >>> > > >>> > > From: Tom Bentley <t.j.bent...@gmail.com> >>> > > To: dev@kafka.apache.org >>> > > Date: 02/10/2017 15:15 >>> > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy >>> interfaces >>> > > >>> > > >>> > > >>> > > Hi All, >>> > > >>> > > I've updated KIP-201 again so there is now a single policy interface >>> (and >>> > > thus a single key by which to configure it) for topic creation, >>> > > modification, deletion and record deletion, which each have their own >>> > > validation method. >>> > > >>> > > There are still a few loose ends: >>> > > >>> > > 1. I currently propose validateAlterTopic(), but it would be >>> possible to >>> > > be >>> > > more fine grained about this: validateAlterConfig(), >>> validAddPartitions() >>> > > and validateReassignPartitions(), for example. Obviously this >>> results in >>> > a >>> > > policy method per operation, and makes it more clear what is being >>> > > changed. >>> > > I guess the down side is its more work for implementer, and >>> potentially >>> > > makes it harder to change the interface in the future. >>> > > >>> > > 2. A couple of TODOs about what the TopicState interface should >>> return >>> > > when >>> > > a topic's partitions are being reassigned. >>> > > >>> > > Your thoughts on these or any other points are welcome. >>> > > >>> > > Thanks, >>> > > >>> > > Tom >>> > > >>> > > On 27 September 2017 at 11:45, Paolo Patierno <ppatie...@live.com> >>> > wrote: >>> > > >>> > > > Hi Ismael, >>> > > > >>> > > > >>> > > > 1. I don't have a real requirement now but "deleting" is an >>> > operation >>> > > > that could be really dangerous so it's always better having a way >>> for >>> > > > having more control on that. I know that we have the authorizer >>> used >>> > for >>> > > > that (delete on topic) but fine grained control could be better >>> (even >>> > > > already happens for topic deletion). >>> > > > 2. I know about the problem of restarting broker due to changes >>> on >>> > > > policies but what do you mean by doing that on the clients ? >>> > > > >>> > > > >>> > > > Paolo Patierno >>> > > > Senior Software Engineer (IoT) @ Red Hat >>> > > > Microsoft MVP on Azure & IoT >>> > > > Microsoft Azure Advisor >>> > > > >>> > > > Twitter : @ppatierno< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter. >>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD- >>> > HdJD8d_F4-Bsw25-Y&e= >>> > > > >>> > > > Linkedin : paolopatierno< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it. >>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6 >>> > R5Drocu0TMRLk&e= >>> > > > >>> > > > Blog : DevExperience< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__ >>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_ >>> > jKpedna692z4&e= >>> > > > >>> > > > >>> > > > >>> > > > ________________________________ >>> > > > From: isma...@gmail.com <isma...@gmail.com> on behalf of Ismael >>> Juma < >>> > > > ism...@juma.me.uk> >>> > > > Sent: Wednesday, September 27, 2017 10:30 AM >>> > > > To: dev@kafka.apache.org >>> > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces >>> > > > >>> > > > A couple of questions: >>> > > > >>> > > > 1. Is this a concrete requirement from a user or is it >>> hypothetical? >>> > > > 2. You sure you would want to do this in the broker instead of the >>> > > clients? >>> > > > It's worth remembering that updating broker policies involves a >>> rolling >>> > > > restart of the cluster, so it's not the right place for things that >>> > > change >>> > > > frequently. >>> > > > >>> > > > Ismael >>> > > > >>> > > > On Wed, Sep 27, 2017 at 11:26 AM, Paolo Patierno < >>> ppatie...@live.com> >>> > > > wrote: >>> > > > >>> > > > > Hi Ismael, >>> > > > > >>> > > > > regarding motivations for delete records, as I said during the >>> > > discussion >>> > > > > on KIP-204, it gives the possibility to avoid deleting messages >>> for >>> > > > > specific partitions (inside the topic) and starting from a >>> specific >>> > > > offset. >>> > > > > I could think on some users solutions where they know exactly >>> what >>> > the >>> > > > > partitions means in a specific topic (because they are using a >>> custom >>> > > > > partitioner on the producer side) so they know what kind of >>> messages >>> > > are >>> > > > > inside a partition allowing to delete them but not the others. >>> In >>> > > such a >>> > > > > policy a user could also check the timestamp related to the >>> offset >>> > for >>> > > > > allowing or not deletion on time base. >>> > > > > >>> > > > > >>> > > > > Paolo Patierno >>> > > > > Senior Software Engineer (IoT) @ Red Hat >>> > > > > Microsoft MVP on Azure & IoT >>> > > > > Microsoft Azure Advisor >>> > > > > >>> > > > > Twitter : @ppatierno< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter. >>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD- >>> > HdJD8d_F4-Bsw25-Y&e= >>> > > > >>> > > > > Linkedin : paolopatierno< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it. >>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6 >>> > R5Drocu0TMRLk&e= >>> > > > >>> > > > > Blog : DevExperience< >>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__ >>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r= >>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y- >>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_ >>> > jKpedna692z4&e= >>> > > > >>> > > > > >>> > > > > >>> > > > > ________________________________ >>> > > > > From: isma...@gmail.com <isma...@gmail.com> on behalf of Ismael >>> > Juma < >>> > > > > ism...@juma.me.uk> >>> > > > > Sent: Wednesday, September 27, 2017 10:18 AM >>> > > > > To: dev@kafka.apache.org >>> > > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces >>> > > > > >>> > > > > A couple more comments: >>> > > > > >>> > > > > 1. "If this KIP is accepted for Kafka 1.1.0 this removal could >>> happen >>> > > in >>> > > > > Kafka 1.2.0 or a later release." -> we only remove code in major >>> > > > releases. >>> > > > > So, if it's deprecated in 1.1.0, it would be removed in 2.0.0. >>> > > > > >>> > > > > 2. Deleting all messages in a topic is not really the same as >>> > deleting >>> > > a >>> > > > > topic. The latter will cause consumers and producers to error out >>> > > while >>> > > > the >>> > > > > former will not. It would be good to motivate the need for the >>> delete >>> > > > > records policy more. >>> > > > > >>> > > > > Ismael >>> > > > > >>> > > > > On Wed, Sep 27, 2017 at 11:12 AM, Ismael Juma <ism...@juma.me.uk >>> > >>> > > wrote: >>> > > > > >>> > > > > > Another quick comment: the KIP states that having multiple >>> > > interfaces >>> > > > > > imply that the logic must be in 2 places. That is not true >>> because >>> > > the >>> > > > > same >>> > > > > > class can implement multiple interfaces (this aspect was >>> considered >>> > > > when >>> > > > > we >>> > > > > > decided to introduce policies incrementally). >>> > > > > > >>> > > > > > The main reason why I think the original approach doesn't work >>> well >>> > > is >>> > > > > > that there is no direct mapping between an operation and the >>> > policy. >>> > > > That >>> > > > > > is, we initially thought we would have create/alter/delete >>> topics, >>> > > but >>> > > > > that >>> > > > > > didn't work out as the alter case is better served by multiple >>> > > request >>> > > > > > types. Given that, it's a bit awkward to maintain the original >>> > > approach >>> > > > > and >>> > > > > > a policy for topic management seemed easier to understand. On >>> that >>> > > > note, >>> > > > > > would `TopicManagementPolicy` be a better name? >>> > > > > > >>> > > > > > Looking at the updated KIP, I notice that we actually have a >>> > > > > > TopicDeletionPolicy with a separate config. That seems to be a >>> > > halfway >>> > > > > > house. Not sure about that. >>> > > > > > >>> > > > > > Ismael >>> > > > > > >>> > > > > > On Wed, Sep 27, 2017 at 10:47 AM, Tom Bentley >>> > > <t.j.bent...@gmail.com> >>> > > > > > wrote: >>> > > > > > >>> > > > > >> I have updated the KIP to add a common policy interface for >>> topic >>> > > and >>> > > > > >> message deletion. This included pulling ClusterState and >>> > TopicState >>> > > > > >> interfaces up to the top level so that they can be shared >>> between >>> > > the >>> > > > > two >>> > > > > >> policies. >>> > > > > >> >>> > > > > >> Cheers, >>> > > > > >> >>> > > > > >> Tom >>> > > > > >> >>> > > > > >> On 26 September 2017 at 18:09, Edoardo Comar < >>> eco...@uk.ibm.com> >>> > > > wrote: >>> > > > > >> >>> > > > > >> > Thanks Tom, >>> > > > > >> > In my original KIP-170 I mentioned that the method >>> > > > > >> > >>> > > > > >> > public Map<String, Integer> topicsPartitionCount(); >>> > > > > >> > >>> > > > > >> > was just a starting point for a general purpose ClusterState >>> > > > > >> > as it happened to be exactly the info we needed for our >>> policy >>> > > > > >> > implementation :-) >>> > > > > >> > it definitely doesn't feel general purpose enough. >>> > > > > >> > >>> > > > > >> > what about >>> > > > > >> > >>> > > > > >> > interface ClusterState { >>> > > > > >> > public TopicState topicState(String topicName); >>> > > > > >> > public Set<String> topics(); >>> > > > > >> > } >>> > > > > >> > >>> > > > > >> > I think that the implementation of ClusterState that the >>> server >>> > > will >>> > > > > >> pass >>> > > > > >> > to the policy.validate method >>> > > > > >> > would just lazily tap into MetadataCache. No need for big >>> > upfront >>> > > > > >> > allocations. >>> > > > > >> > >>> > > > > >> > ciao, >>> > > > > >> > Edo >>> > > > > >> > -------------------------------------------------- >>> > > > > >> > >>> > > > > >> > Edoardo Comar >>> > > > > >> > >>> > > > > >> > IBM Message Hub >>> > > > > >> > >>> > > > > >> > IBM UK Ltd, Hursley Park, SO21 2JN >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > From: Tom Bentley <t.j.bent...@gmail.com> >>> > > > > >> > To: dev@kafka.apache.org >>> > > > > >> > Date: 26/09/2017 17:39 >>> > > > > >> > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy >>> > > > interfaces >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > Hi Edoardo, >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > what about a single method in ClusterState >>> > > > > >> > > >>> > > > > >> > > interface ClusterState { >>> > > > > >> > > public Map<String,TopicState> topicsState(); >>> > > > > >> > > >>> > > > > >> > > } >>> > > > > >> > > >>> > > > > >> > > which could return a read-only snapshot of the cluster >>> > metadata >>> > > ? >>> > > > > >> > > >>> > > > > >> > >>> > > > > >> > Sure that would work too. A concern with that is that we >>> end up >>> > > > > >> allocating >>> > > > > >> > a potentially rather large amount for the Map and the >>> > collections >>> > > > > >> present >>> > > > > >> > in the TopicStates in order to provide the snapshot. The >>> caller >>> > > > might >>> > > > > >> only >>> > > > > >> > be interested in one item from the TopicState for one topic >>> in >>> > > the >>> > > > > map. >>> > > > > >> > Accessing this information via methods means the caller only >>> > pays >>> > > > for >>> > > > > >> what >>> > > > > >> > they use. >>> > > > > >> > >>> > > > > >> > Cheers, >>> > > > > >> > >>> > > > > >> > Tom >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > > >> > Unless stated otherwise above: >>> > > > > >> > IBM United Kingdom Limited - Registered in England and Wales >>> > with >>> > > > > number >>> > > > > >> > 741598. >>> > > > > >> > Registered office: PO Box 41, North Harbour, Portsmouth, >>> > > Hampshire >>> > > > PO6 >>> > > > > >> 3AU >>> > > > > >> > >>> > > > > >> >>> > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >>> > > >>> > > >>> > > Unless stated otherwise above: >>> > > IBM United Kingdom Limited - Registered in England and Wales with >>> number >>> > > 741598. >>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire >>> PO6 >>> > 3AU >>> > > >>> > >>> >> >> >