Hi, I gave the KIP another look after talking to some people at the Kafka
Summit in London. And I would like to clear up the motivation of this KIP.


At the moment, MM2 has some opinionated decisions that are creating issues
for teams that use IaC, federated solutions or have a capacity/budget
planning system for Kafka destination clusters. To explain it better, let's
assume we have MM2 with the following configurations to highlight these
problems.

```

topics = .*

refresh.topics.enabled = true

sync.topic.configs.enabled = true

sync.topic.acls.enabled = true

// Maybe in futrue we can have sync.group.acls.enabled = true

```


These configurations allow us to run MM2 with the value of its full
features. However, there are two main concerns when we run on a scale with
these configs:

1. *Capacity/Budgeting Planning:*

Functionality or features that impact capacity planning using MM2 are:

   1. MM2 automatically creates topics (breaking the rule of
   `auto.create.topics.enable=false`) and creates topic partitions on
   destination clusters if the number of partitions increases on the source.
   In the previous example, this functionality will apply to any topic that
   matches the regex of the `topics` config.
   2. Sync topic configs include configurations that impact capacity like `
   retention.ms` and `retention.bytes`.

These 2 points lead to adding new untracked capacity to destination
clusters without a way to count for them up-front or safeguard the cluster.
The team that runs the cluster will only see the capacity issue when their
disk usage hits the threshold for their alerts. The desk capacity issue can
be avoided if MM2 is flexible enough to

   - have a way for teams that run their ecosystem to have MM2 behave
   within their system.
   - disable the auto-creation and avoid syncing configs that impact
   capacity


2. *Provisioning conflict:*

In the previous MM2 configurations; we ended up with conflict as MM2 used
`AdminClient` directly to perform the following functionality

   -  Create a Kafka topic (no way to disable this at the moment)
   -  Add new Kafka partitions (no way to disable this at the moment)
   -  Sync Kafka Topic configurations (can be disabled, but then this
   reduces the value of MM2 potential for users)
   -  Sync Kafka topic's ACLs (can be disabled, but this reduces the users'
   value). Disabling this feature also means that users must ensure they have
   the right ACLs to the mirrored topics on the destination cluster before
   switching their consumers, especially when MM2 is used for disaster
   recovery. It may lead to extra downtime for them.


All these functionalities are using AdminClient; which causes an issue with
teams that

   - Manage their Kafka resources using tools like Strimizi or custom
   federated solutions. For example, Strimizi's UserOperator doesn't sync the
   topic ACLs when MM2 is enabled. Strimzi documentation mentions that users
   must to disable MM2 `sync.topic.acls.enabled` if they use `UserOperator`.
   On the other hand, Strimizi's TopicOperator doesn't have the same issue
   because it has a bi-directional reconciliation process that watches the
   topics state on the Kafka cluster and updates KafkaTopic resources for
   Strimzi. This design works fine with Kafka MM2 for Topics but not for
   syncing ACLs. Strimizi TopicOperator also doesn't have a way to stop
   syncing config that impact capacity for example retention configs.


   - Teams that run MM2 but don't own the destination cluster. In this
   case, these teams don't have Admin access, but they may have Kafka
   management solutions, such as yahoo/CMAK or an in-house solution. For such
   a tool as CMAK, these teams can update/create resources using CMAK REST API.


The Proposed KIP gives users the flexibility to integrate MM2 within their
ecosystem without disabling any MM2 features. We can achieve this
flexibility with one of the following solutions.

   1. Introduce a new interface that hides Admin interactions in one place.
   Then users can provide their way of resource management. As well as clean
   up the MM2 code by having one place that manages the resources, as at the
   moment, MM2 usage of AdminClient is all over the code.
   2. The second solution could be to add only a new config that allows the
   users to override AdminClient with another implementation of the
   AdminClient interface, as Ryanne suggested before. The downside is that
   AdminClient is enormous and constantly under development, so any users who
   opt-in for custom implementation will need to carry this burden.

I favour the first solution as it will make it either later to add any new
feature related to resource management. But don't mind if others think that
the second solution is easier for MM2 design.


*Note*: There are two possible future KIPs following this KIP to

   1. Add config to disable MM2 from auto creating or adding new topic
   partitions.
   2. Add a way to exclude a specific topic's configuration from being
   synced.


I hope this clears up the problem better. Please let me know what do you
think.


On Mon, Feb 7, 2022 at 3:26 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
wrote:

> Hi Mickael. Thanks for the feedback. I address some of your points below.
>
> *> This seems to address a relatively advanced and specific use case*
> The main point of the KIP is that MM2 is making a massive assumption that
> it has the right to alter/create resources. This assumption isn't valid in
> the world of Infra-as-Code, federated solutions and popularity of OS Kafka
> Kubernetes operators; these infra/resource management solutions aren't
> advanced use-cases anymore nowadays. These concerns had been raised in the
> past, especially regarding the assumption that MM2 can create topics on the
> destination cluster. For example,
> https://issues.apache.org/jira/browse/KAFKA-12753 and
> https://www.mail-archive.com/dev@kafka.apache.org/msg119340.html.
> The primary motivation is giving some power to data
> platform/infrastructure team to make MM2 part of their internal Kafka
> ecosystem without dropping the main features that make MM2 valuable, like
> syncing topic configs. For example, if someone uses any OS Kafka k8s
> operator, they can implement the class to interact with the k8s operator to
> create these resources.
>
> *> My initial concern is this may make it hard to evolve MirrorMaker as
> we'll likely need to update this new interface if new features are added.*
> I agree it's a disadvantage to adding a new interface however adding more
> admin interactions from MM2 to alter/create resources and access will feed
> the main issue as I mentioned above with the popularity of IaC and
> federated solutions; most data platform/infrastructure teams will endup
> disabling these new features.
> Also, at the moment, most MM2 interactions with the admin client are
> scattered across the codebase so having one place where all admin
> interactions are listed isn't a bad thing.
>
> *> For example if we wanted to sync group ACLs.*
> As I mentioned before, altering any resource's configurations with MM2 is
> the one main concern for any data platform/infrastructure team that wants
> to have control over their clusters and use MM2. So the main question with
> adding any new altering feature like sync group ACLs will raise the same
> question of how many teams will actually enable this feature.
>
>
>
>
>
>
>
>
>
> *>Regarding the proposed API, I have a few suggestions: >- What about
> using configure() instead of the constructor to pass the >configuration,
> especially as it's implementing Configurable >- It's not clear what all the
> arguments of createTopicPartitions()>are. What's the difference between
> partitionCounts and newPartitions?>Should we have separate methods for
> creating topics and partitions? >- Do we really need
> createCompactedTopic()? >- Instead of updateTopicConfigs() and updateAcls()
> should we use the >"alter" prefix to stay consistent with Admin?*
>
> These are good suggestions that will update the KIP to address these.
> Regarding the `createCompactedTopic` MM2 is using this method to create
> internal topics.
>
> Thanks
>
> On Wed, Jan 26, 2022 at 1:55 PM Mickael Maison <mickael.mai...@gmail.com>
> wrote:
>
>> Hi Omnia,
>>
>> Thanks for the KIP, sorry for taking so long to comment. I've only had
>> time to take a quick look so far.
>>
>> This seems to address a relatively advanced and specific use case. My
>> initial concern is this may make it hard to evolve MirrorMaker as
>> we'll likely need to update this new interface if new features are
>> added. For example if we wanted to sync group ACLs.
>> I'm wondering if it's something you've thought about. I'm not saying
>> it's a blocker but we have to weigh the pros and cons when introducing
>> new features.
>>
>> Regarding the proposed API, I have a few suggestions:
>> - What about using configure() instead of the constructor to pass the
>> configuration, especially as it's implementing Configurable
>> - It's not clear what all the arguments of createTopicPartitions()
>> are. What's the difference between partitionCounts and newPartitions?
>> Should we have separate methods for creating topics and partitions?
>> - Do we really need createCompactedTopic()?
>> - Instead of updateTopicConfigs() and updateAcls() should we use the
>> "alter" prefix to stay consistent with Admin?
>>
>> Thanks,
>> Mickael
>>
>> On Wed, Jan 26, 2022 at 11:26 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
>> wrote:
>> >
>> > Hi,
>> > If there are no more concerns regarding the proposal can I get some
>> votes on the KIP here
>> https://lists.apache.org/thread/950lpxjb5kbjm8qdszlpxm9h4j4sfyjx
>> >
>> > Thanks
>> >
>> > On Wed, Oct 27, 2021 at 3:54 PM Ryanne Dolan <ryannedo...@gmail.com>
>> wrote:
>> >>
>> >> Well I'm convinced! Thanks for looking into it.
>> >>
>> >> Ryanne
>> >>
>> >> On Wed, Oct 27, 2021, 8:49 AM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
>> wrote:
>> >>
>> >> > I checked the difference between the number of methods in the Admin
>> >> > interface and the number of methods MM2 invokes from Admin, and this
>> diff
>> >> > is enormous (it's more than 70 methods).
>> >> > As far as I can see, the following methods MM2 depends on (in
>> >> > MirrorSourceConnector, MirrorMaker, MirrorCheckpointTask and
>> >> > MirrorCheckpointConnector), this will leave 73 methods on the Admin
>> >> > interface that customer will need to return dummy data for,
>> >> >
>> >> >    1. create(conf)
>> >> >    2. close
>> >> >    3. listTopics()
>> >> >    4. createTopics(newTopics, createTopicsOptions)
>> >> >    5. createPartitions(newPartitions)
>> >> >    6. alterConfigs(configs)  // this method is marked for
>> deprecation in
>> >> >    Admin and the ConfigResource MM2 use is only TOPIC
>> >> >    7. createAcls(aclBindings) // the list of bindings always
>> filtered by
>> >> >    TOPIC
>> >> >    8. describeAcls(aclBindingFilter) // filter is always
>> ANY_TOPIC_ACL
>> >> >    9. describeConfigs(configResources) // Always for TOPIC resources
>> >> >    10. listConsumerGroupOffsets(groupId)
>> >> >    11. listConsumerGroups()
>> >> >    12. alterConsumerGroupOffsets(groupId, offsets)
>> >> >    13. describeCluster() // this is invoked from
>> >> > ConnectUtils.lookupKafkaClusterId(conf),
>> >> >    but MM2 isn't the one that initialize the AdminClient
>> >> >
>> >> > Going with the Admin interface in practice will make any custom Admin
>> >> > implementation humongous even for a fringe use case because of the
>> number
>> >> > of methods that need to return dummy data,
>> >> >
>> >> > I am still leaning toward a new interface as it abstract all MM2's
>> >> > interaction with Kafka Resources in one place; this makes it easier
>> to
>> >> > maintain and make it easier for the use cases where customers wish to
>> >> > provide a different method to handle resources.
>> >> >
>> >> > Omnia
>> >> >
>> >> > On Tue, Oct 26, 2021 at 4:10 PM Ryanne Dolan <ryannedo...@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > I like the idea of failing-fast whenever a custom impl is
>> provided, but I
>> >> > > suppose that that could be done for Admin as well. I agree your
>> proposal
>> >> > is
>> >> > > more ergonomic, but maybe it's okay to have a little friction in
>> such
>> >> > > fringe use-cases.
>> >> > >
>> >> > > Ryanne
>> >> > >
>> >> > >
>> >> > > On Tue, Oct 26, 2021, 6:23 AM Omnia Ibrahim <
>> o.g.h.ibra...@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > > > Hey Ryanne, Thanks fo the quick feedback.
>> >> > > > Using the Admin interface would make everything easier, as MM2
>> will
>> >> > need
>> >> > > > only to configure the classpath for the new implementation and
>> use it
>> >> > > > instead of AdminClient.
>> >> > > > However, I have two concerns
>> >> > > > 1. The Admin interface is enormous, and the MM2 users will need
>> to know
>> >> > > the
>> >> > > > list of methods MM2 depends on and override these only instead of
>> >> > > > implementing the whole Admin interface.
>> >> > > > 2. MM2 users will need keep an eye on any changes to Admin
>> interface
>> >> > that
>> >> > > > impact MM2 for example deprecating methods.
>> >> > > > Am not sure if adding these concerns on the users is acceptable
>> or not.
>> >> > > > One solution to address these concerns could be adding some
>> checks to
>> >> > > make
>> >> > > > sure the methods MM2 uses from the Admin interface exists to fail
>> >> > faster.
>> >> > > > What do you think
>> >> > > >
>> >> > > > Omnia
>> >> > > >
>> >> > > >
>> >> > > > On Mon, Oct 25, 2021 at 11:24 PM Ryanne Dolan <
>> ryannedo...@gmail.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > Thanks Omnia, neat idea. I wonder if we could use the existing
>> Admin
>> >> > > > > interface instead of defining a new one?
>> >> > > > >
>> >> > > > > Ryanne
>> >> > > > >
>> >> > > > > On Mon, Oct 25, 2021, 12:54 PM Omnia Ibrahim <
>> >> > o.g.h.ibra...@gmail.com>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > > Hey everyone,
>> >> > > > > > Please take a look at KIP-787
>> >> > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-787%3A+MM2+Interface+to+manage+Kafka+resources
>> >> > > > > > <
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-787%3A+MM2+Interface+to+manage+Kafka+resources
>> >> > > > > > >
>> >> > > > > >
>> >> > > > > > Thanks for the feedback and support
>> >> > > > > > Omnia
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>>
>

Reply via email to