Hi Omnia,

Thanks, I updated the KIP with the DefaultTopicFilter changes, to reflect
the new mm2.*.internal pattern.

Best,
Patrik

On Mon, Aug 19, 2024 at 2:24 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
wrote:

> Thanks Patrik for updating the KIP.
> One last feedback, should we update `TOPICS_EXCLUDE_DEFAULT` also to
> reflect the same pattern instead of `.*[\\-\\.]internal`?
>
> Best,
> Omnia
>
> > On 16 Aug 2024, at 13:31, Patrik Marton <pmar...@cloudera.com.INVALID>
> wrote:
> >
> > Hi All,
> >
> > I updated the KIP with the new proposal, and changed the name to better
> > describe the latest proposal and the goal.
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Allow+the+replication+of+user+internal+topics
> >
> > Let me know your thoughts!
> >
> > Thanks!
> > Patrik
> >
> > On Wed, Aug 14, 2024 at 10:50 AM Viktor Somogyi-Vass
> > <viktor.somo...@cloudera.com.invalid> wrote:
> >
> >> Hi Omina,
> >>
> >> We have not considered this path yet that you proposed and I like it as
> >> well and I would also be in favor of a solution that doesn't try to
> >> introduce a new config. Since this KIP would only make it into 4.0
> anyway
> >> and your proposal seems like a smaller compatibility break that can
> >> be documented in the upgrade notes and also circumvented by a custom
> >> policy, I think we can do this.
> >>
> >> I'd be also happy to receive some feedback from those who have been in
> this
> >> conversation to form a consensus around this.
> >>
> >> Best,
> >> Viktor
> >>
> >> On Mon, Aug 12, 2024 at 2:19 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> >> wrote:
> >>
> >>> Hi Patrick,
> >>>> One thing that comes up for me is whether we should update the
> >>> TopicFilter
> >>>> too to have this more specific pattern, like "mm2-.*.internal".
> Ideally
> >>> I'd
> >>>> say yes, but what if a user already has existing .internal user topics
> >>> and
> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate
> those,
> >>> and
> >>>> by changing both to a more specific pattern, we could technically
> >> enable
> >>>> their replication. I know this is an edge case, so am I being too
> >> careful
> >>>> here?
> >>>
> >>> I would say for the ReplicationPolicy blocking all `.internal` and
> >>> `-internal` as MM2 / Kafka internals is a bug. So anyone who has been
> >>> relaying on this bug to block topics is relaying on a faulty code
> >>> especially that none of the Kafka, connect or streams internals do
> create
> >>> any topic with these suffixes except MM2 and all of them can be
> >> identified
> >>> more accurately without this generic suffix.
> >>>
> >>>
> >>> The other potential cases we might break are
> >>> 1. where `replication.policy.internal.topic.separator.enabled` is
> >> enabled,
> >>> but we might only need to adjust the code to account for this either in
> >>> ReplicationPolicy itself
> >>> ` default boolean isMM2InternalTopic(String topic) { return
> >>> (topic.startsWith("mm2-") && topic.endsWith("internal")) ||
> >>> isCheckpointsTopic(topic); }`
> >>> Or in DefaultReplicationPolicy implementation of `isMM2InternalTopic`.
> >> So
> >>> we don’t break the backward compatibility for this case
> >>> 2. Any use case has a customised ReplicationPolicy with an override of
> >>> isMM2InternalTopic. This one might also break the backward
> compatibility
> >>>
> >>> As 4.0 coming soon we can leverage this and introduce a breaking change
> >> to
> >>> MM2 with clarification of the KIP's impact on backward compatibility
> for
> >>> the following cases,
> >>> - For anyone has been relaying on this bug to block normal topics with
> >>> `internal` suffix they will need to update topic filter config to block
> >>> this suffix manually if they wish for.
> >>> - For anyone overriding  isMM2InternalTopic in their custom
> >>> ReplicationPolicy, they might need to update their code account for the
> >> new
> >>> change to block only mm2 internal topics.
> >>>
> >>> If this path is acceptable with breaking the compatibility with the
> next
> >>> major Kafka release we should consider fixing the issue and not
> >> introducing
> >>> another config to go around it especially that this seems more like
> >> mistake
> >>> from the beginning as MM2 original KIP intention was trying to target
> MM2
> >>> and Kafka internal topics only and not all topics with this suffix.
> >>>
> >>> What does everyone think?
> >>>
> >>> Thanks
> >>> Omnia
> >>>> On 9 Aug 2024, at 10:40, Patrik Marton <pmar...@cloudera.com.INVALID>
> >>> wrote:
> >>>>
> >>>> Hi All,
> >>>>
> >>>> Thanks again for your feedback!
> >>>>
> >>>> Regarding the boolean solution, yes this would technically let the
> >>> customer
> >>>> accidentally replicate "__.*" topics, but only if they modify the
> >>>> TopicFilter to allow their replication, as by default topics matching
> >>> this
> >>>> pattern are excluded. But in this case we can still hardcode "__.*"
> >>> topics
> >>>> as internal in the replication policy, if we feel like we need that
> >> extra
> >>>> layer of safety.
> >>>>
> >>>> Thanks Omnia for your ideas!
> >>>> About #2, I think this is something that we should consider besides
> the
> >>>> boolean solution, as I like the idea that this way we can avoid adding
> >>>> another configuration to mm2, and simply define a more specific
> >> criteria
> >>>> for topics to be considered internal by the replication policy.
> >>>> One thing that comes up for me is whether we should update the
> >>> TopicFilter
> >>>> too to have this more specific pattern, like "mm2-.*.internal".
> Ideally
> >>> I'd
> >>>> say yes, but what if a user already has existing .internal user topics
> >>> and
> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate
> those,
> >>> and
> >>>> by changing both to a more specific pattern, we could technically
> >> enable
> >>>> their replication. I know this is an edge case, so am I being too
> >> careful
> >>>> here?
> >>>>
> >>>> Let me know your thoughts, and I will update the KIP too.
> >>>>
> >>>> Best,
> >>>> Patrik
> >>>>
> >>>> On Thu, Aug 8, 2024 at 2:59 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com
> >
> >>>> wrote:
> >>>>
> >>>>> Hi
> >>>>> Thinking of it more, I feel leaving customers to mirroring Kafka’s
> >>>>> internal topics `__.*` either by mistake or intently will be eating
> up
> >>>>> resources from both mm2 and the clusters for no actual value as the
> >>> data is
> >>>>> not that useful outside the context of the original cluster.
> >>>>> These topics usually are unbalanced and clients are the one the
> >> control
> >>>>> their sizes which will lead to uneven distribution of MM2 resource
> >>> usages.
> >>>>> And if MM2 missed the tombstone records from the source cluster which
> >> is
> >>>>> used to cleanup metadata then mirrored records of these topics
> >> wouldn’t
> >>> get
> >>>>> cleaned appropriately.
> >>>>>
> >>>>> Other options to solve this
> >>>>>
> >>>>> 1. introduce a way to specify MM2’s internal topics suffix
> >> (checkpoints,
> >>>>> offset sync, offsets, status, and config topics) and by default this
> >> is
> >>>>> `.internal`, in case users are allowed to have `.internal` in the
> >> topic
> >>>>> name then MM2 can be setup with different suffix.
> >>>>> `-internal` was set as a way to identify connect internal topics but
> >> as
> >>>>> far as I can see there is no topic set by MM2 itself with `-internal`
> >> so
> >>>>> maybe we can drop `-internal`.
> >>>>> Then  for  __.* these aren’t allowed to be mirrored.
> >>>>>
> >>>>> 2. other option is updating `ReplicationPolicy::isMM2InternalTopic`
> to
> >>>>> check if topic is starting with `mm2` and end with `.internal` as
> this
> >>> is
> >>>>> the pattern for default topics so far.
> >>>>>> default boolean isMM2InternalTopic(String topic) {
> >>>>>>   // MM2 has a pattern for most topics that shouldn't be mirriord
> >>>>> which is mm2-*.*.internal:
> >>>>>>   // "mm2-offset-syncs." + clusterAlias + ".internal"
> >>>>>>   // "mm2-configs." + sourceAndTarget.source() + ".internal"
> >>>>>>   // "mm2-status." + sourceAndTarget.source() + ".internal"
> >>>>>>   // "mm2-offsets." + sourceAndTarget.source() + ".internal"
> >>>>>>   // The only exception is checkpoints which is clusterAlias +
> >>>>> ".checkpoints.internal"
> >>>>>>   return (topic.startsWith("mm2-") && topic.endsWith(".internal"))
> >> ||
> >>>>> isCheckpointsTopic(topic);
> >>>>>> }
> >>>>>
> >>>>> As far as I can see this will cover offset-sync topic, all default
> >>> connect
> >>>>> topic names used by mirror and checkpoints which what the method
> >>> originally
> >>>>> is trying to target. This way we will have some sense of “fake
> >>> namespace”
> >>>>> without introducing one.
> >>>>>
> >>>>> Then `ReplicationPolicy::isInternalTopic` can be
> >>>>>> default boolean isInternalTopic(String topic) {
> >>>>>>   boolean isKafkaInternalTopic = topic.startsWith("__") ||
> >>>>> topic.startsWith(".");
> >>>>>>   return isMM2InternalTopic(topic) || isKafkaInternalTopic;
> >>>>>> }
> >>>>>
> >>>>> `-internal` will be dropped as I can’t see any topic name with this
> >>> suffix
> >>>>> setup by MM2.
> >>>>>
> >>>>> Best,
> >>>>> Omnia
> >>>>>
> >>>>>> On 7 Aug 2024, at 17:26, Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> >>> wrote:
> >>>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> I have a couple of notes to add
> >>>>>>
> >>>>>> OI-1.Regarding the proposed default regex in the KIP,
> >>>>> "__.*|\\..*|.*-internal|.*\\.internal", when users wish to change
> >> this,
> >>>>> wouldn’t they mostly need to keep the __.* part? Especially since __
> >> is
> >>>>> primarily reserved for broker internal metadata topics.
> >>>>>>
> >>>>>> OI-2.Whether we choose a boolean or regex, I think we need to
> >> identify
> >>>>> if this will include replicating the internal metadata topics. The
> >>> metadata
> >>>>> within these topics are not that useful outside of their original
> >>> cluster,
> >>>>> so replicating them doesn’t add much value for anyone.
> >>>>>> The other problem with mistakenly replicating these topics is that
> >> they
> >>>>> are set up with a compacted policy, which means they don’t get
> >> truncated
> >>>>> naturally by retention. These metadata topics are cleaned in a
> special
> >>> way
> >>>>> to stay within the metadata retention milliseconds or expiration
> >>> configs.
> >>>>> As a result of this special situation, replicating these by mistake
> >>> might
> >>>>> put the destination cluster’s capacity at risk.
> >>>>>> Maybe we should be clear that any topic in Topic::INTERNAL_TOPICS is
> >>> not
> >>>>> going to be replicated? WDYT?
> >>>>>>
> >>>>>> Thanks
> >>>>>> Omnia
> >>>>>>
> >>>>>>> On 7 Aug 2024, at 15:46, Viktor Somogyi-Vass <
> >>>>> viktor.somo...@cloudera.com.INVALID> wrote:
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> 1. Yea, thinking of it more, I'm also concerned about the upgrade
> >> path
> >>>>> and
> >>>>>>> I think that it may not be worth adding a ton of code to work
> around
> >>>>> that
> >>>>>>> one time upgrade. It likely needs extra configs as well or
> >>>>> documentation to
> >>>>>>> instruct users, so it may not be worth the price. So let's add this
> >> to
> >>>>> the
> >>>>>>> rejected alternatives.
> >>>>>>>
> >>>>>>> 2. As Chris mentioned, I also feel that users may want to improve
> it
> >>>>> with
> >>>>>>> regexes and after a certain number of topics it may not be easily
> >>>>>>> manageable. And indeed, it feels a little bit like duplicating the
> >>>>>>> functionality of the topic filter. Overall I like the boolean
> config
> >>>>> best
> >>>>>>> as that prevents most of the unwanted replication cycles by default
> >>> with
> >>>>>>> the TopicFilter.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Viktor
> >>>>>>>
> >>>>>>> On Wed, Aug 7, 2024 at 10:59 AM Patrik Marton
> >>>>> <pmar...@cloudera.com.invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi All,
> >>>>>>>>
> >>>>>>>> Thanks again for your feedback!
> >>>>>>>>
> >>>>>>>> My thoughts on Viktor's ideas:
> >>>>>>>>
> >>>>>>>> 1. This would definitely be a clean solution in the end, but I
> also
> >>>>> feel
> >>>>>>>> that the complexity of this solution might outweigh the benefits.
> >> If
> >>>>> you
> >>>>>>>> guys think, we can give it a thought and figure out what exactly
> >>> needs
> >>>>> to
> >>>>>>>> be changed and how the upgrade path would look like, but my
> opinion
> >>> is
> >>>>> that
> >>>>>>>> since it is a rare use-case, a simple solution might be better.
> >>>>>>>>
> >>>>>>>> 2. I initially wanted to propose something like this, but my
> >> concern
> >>>>> was
> >>>>>>>> that by adding another way a topic can be whitelisted, we would
> >> also
> >>>>>>>> increase the complexity and it would require more attention from
> >>>>> users. But
> >>>>>>>> I also think that this would probably be the safest solution of
> >> all,
> >>>>> and
> >>>>>>>> less risky than modifying regexes or using that simple boolean.
> >>>>>>>> Since the TopicFilter is not really the problem as its exclude
> list
> >>>>> can be
> >>>>>>>> modified, and it basically acts as another layer of safety, we
> >> could
> >>>>>>>> potentially add this new config (a list of topics) to the
> >>>>>>>> DefaultReplicationPolicy, and any topic listed in the new config
> >>> would
> >>>>> not
> >>>>>>>> be considered internal by the replication policy. This way we
> would
> >>> not
> >>>>>>>> duplicate the work of the Topic Filter, simply provide a way to
> >> tell
> >>>>> the
> >>>>>>>> DefaultReplicationPolicy that topics in this list seem internal,
> >> but
> >>>>> they
> >>>>>>>> are not.
> >>>>>>>>
> >>>>>>>> What do you think?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Patrik
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Aug 6, 2024 at 7:17 PM Chris Egerton
> >> <chr...@aiven.io.invalid
> >>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Viktor,
> >>>>>>>>>
> >>>>>>>>> Nice to hear from you! My thoughts:
> >>>>>>>>>
> >>>>>>>>> 1. I initially liked this idea, but it doesn't seem like the
> costs
> >>> to
> >>>>> all
> >>>>>>>>> existing users outweigh the benefits that would be reaped by just
> >> a
> >>>>> few.
> >>>>>>>>> This is possibly because I'm assuming the upgrade path would be
> >>> pretty
> >>>>>>>>> gnarly, though; if we can make it really, really smooth,
> >> especially
> >>>>> for
> >>>>>>>>> existing users who don't need to be able to replicate topics
> >> ending
> >>> in
> >>>>>>>>> "-internal", then I think it'd be a viable option.
> >>>>>>>>>
> >>>>>>>>> 2. This is definitely safer than a simple boolean property, but
> >>>>> doesn't
> >>>>>>>> it
> >>>>>>>>> feel like we're duplicating the work of the topic filter at this
> >>>>> point,
> >>>>>>>>> except with something a little more cumbersome for users? People
> >>>>> already
> >>>>>>>>> have a way of include- and exclude-listing topics by regex with
> >> that
> >>>>>>>>> mechanism, and I'm worried we're not placing enough trust in the
> >>>>>>>>> flexibility and safety it already provides us and are adding
> >> layers
> >>> of
> >>>>>>>>> safety that are ultimately redundant. I might also be pessimistic
> >>> here
> >>>>>>>> but
> >>>>>>>>> I can't shake the feeling that we'd eventually get a KIP where
> >>> someone
> >>>>>>>>> states that it's too hard to add topics one-by-one and would like
> >> to
> >>>>>>>>> include them by regex, prefix match, or some other broader
> >>> mechanism.
> >>>>>>>>>
> >>>>>>>>> Do you find this convincing? I'm also curious what Greg and
> Patrik
> >>>>> think.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>>
> >>>>>>>>> Chris
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 6, 2024 at 5:08 AM Viktor Somogyi-Vass
> >>>>>>>>> <viktor.somo...@cloudera.com.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I think there are a couple of other alternatives to discuss.
> >>>>>>>>>>
> >>>>>>>>>> 1. We may just simply change the internal topic naming to
> >> something
> >>>>>>>> like
> >>>>>>>>>> "mm2internal" and provide upgrade functionality as well. In this
> >>>>>>>> scenario
> >>>>>>>>>> we wouldn't need to tinker with the prefixes or the naming
> >>> patterns,
> >>>>> we
> >>>>>>>>>> just simply free up the "internal" keyword and after upgrade
> >> allow
> >>>>> all
> >>>>>>>>>> topics with internal to be mirrored. The upgrade however could
> be
> >>>>> very
> >>>>>>>>>> complicated as we'd need to provide a way for users to move away
> >>> from
> >>>>>>>> the
> >>>>>>>>>> old topics (what happens in case of EOS, how do we make sure to
> >>>>>>>>> discontinue
> >>>>>>>>>> the old topics, what happens if the upgrade gets interrupted
> >> etc.).
> >>>>>>>> But I
> >>>>>>>>>> think this is worth considering and for the upgrade, we may take
> >> a
> >>>>> few
> >>>>>>>>>> ideas from the zk->kraft migration to make it easier.
> >>>>>>>>>>
> >>>>>>>>>> 2. A new config to force the replication of certain topics. In
> >>> this,
> >>>>>>>>> users
> >>>>>>>>>> would have to specify topics 1 by 1 and this would bypass the
> >>> filters
> >>>>>>>> and
> >>>>>>>>>> the replication policy. Since forcing is usually not the normal
> >> way
> >>>>> of
> >>>>>>>>>> doing things in general, users may be more willing to accept
> >>>>> unintended
> >>>>>>>>>> side effects.
> >>>>>>>>>>
> >>>>>>>>>> What do you all think?
> >>>>>>>>>> (If they aren't better than the original ideas, I'm absolutely
> >> good
> >>>>>>>> with
> >>>>>>>>>> putting these to the rejected alternatives but wanted to write
> it
> >>> all
> >>>>>>>>> out.)
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Viktor
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Aug 5, 2024 at 8:18 PM Chris Egerton
> >>> <chr...@aiven.io.invalid
> >>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Patrik,
> >>>>>>>>>>>
> >>>>>>>>>>> I think I owe you an apology. After mulling this over for a few
> >>> more
> >>>>>>>>>> hours
> >>>>>>>>>>> and going over the code base some more, it seems like your
> >> initial
> >>>>>>>>>> approach
> >>>>>>>>>>> was actually better in some ways.
> >>>>>>>>>>>
> >>>>>>>>>>> With the regex-based approach, it becomes possible for users to
> >>>>>>>>> configure
> >>>>>>>>>>> the DefaultReplicationPolicy class in a way that for some topic
> >> T,
> >>>>>>>>>>> isInternalTopic(T) returns false, but isCheckpointsTopic(T),
> >>>>>>>>>>> isHeartbeatsTopic(T), or even isMM2InternalTopic(T) return
> true.
> >>> It
> >>>>>>>>> seems
> >>>>>>>>>>> like this kind of inconsistency might cause confusion to users
> >> and
> >>>>>>>>>> possibly
> >>>>>>>>>>> even break things down the road depending on how we leverage
> >> these
> >>>>>>>>>> methods
> >>>>>>>>>>> in the future.
> >>>>>>>>>>>
> >>>>>>>>>>> With the initial approach, using a simple boolean to enable the
> >>>>>>>>>> replication
> >>>>>>>>>>> of what appear to be internal topics, there is some risk of
> >> cycles
> >>>>>>>> and
> >>>>>>>>>>> accidental replication of genuinely internal topics, but less
> >>> than I
> >>>>>>>>>>> believed at first. In order for an internal topic to be
> >>> replicated,
> >>>>>>>> not
> >>>>>>>>>>> only would that boolean property have to be explicitly modified
> >> by
> >>>>>>>>> users,
> >>>>>>>>>>> but they would also have to be using a topic filter that allows
> >>>>> those
> >>>>>>>>>>> topics to pass through as well.
> >>>>>>>>>>>
> >>>>>>>>>>> I think a simple boolean is better here, as it's easier for
> >> users
> >>> to
> >>>>>>>>>>> configure, and less prone to logical inconsistencies. My only
> >>>>>>>>> suggestions
> >>>>>>>>>>> are:
> >>>>>>>>>>> - Apply the property to the MirrorSourceConnector class instead
> >> of
> >>>>>>>> the
> >>>>>>>>>>> DefaultReplicationPolicy class, since we're not actually
> >> changing
> >>>>> the
> >>>>>>>>>>> definition of what is and isn't an internal topic; instead,
> >> we're
> >>>>>>>>> simply
> >>>>>>>>>>> allowing topics to be replicated even if they appear to be
> >>> internal
> >>>>>>>>>> topics
> >>>>>>>>>>> - Rename it to something less verbose like
> >>>>>>>> "replicate.internal.topics"
> >>>>>>>>>>> - Add a warning to the docs for the property explicitly
> pointing
> >>>>>>>> people
> >>>>>>>>>> to
> >>>>>>>>>>> the topic filter class, stating that they should still make
> sure
> >>>>> that
> >>>>>>>>>>> genuinely internal topics (instead of topics that
> >>>>>>>>>>> ReplicationPolicy::isInternalTopic returns true for but which
> >> are
> >>>>> not
> >>>>>>>>>>> actually internal) are excluded by the filter
> >>>>>>>>>>>
> >>>>>>>>>>> Patrik, Greg, how does that sound?
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>>
> >>>>>>>>>>> Chris
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Aug 5, 2024 at 4:34 AM Patrik Marton
> >>>>>>>>>> <pmar...@cloudera.com.invalid
> >>>>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Greg and Chris,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you for your feedbacks!
> >>>>>>>>>>>>
> >>>>>>>>>>>> I added the currently existing workarounds to the “Rejected
> >>>>>>>>>> Alternatives”
> >>>>>>>>>>>> section, and explained why I still think we need a separate
> >>>>>>>>> solution. I
> >>>>>>>>>>>> also added the previous proposal to the rejected alternatives,
> >>> as I
> >>>>>>>>>> agree
> >>>>>>>>>>>> with your concerns.
> >>>>>>>>>>>> I tried to come up with a bit different regex based solution,
> >> and
> >>>>>>>>>>> modified
> >>>>>>>>>>>> the KIP based on that. I think it is a more straightforward
> way
> >>> to
> >>>>>>>>> get
> >>>>>>>>>>> the
> >>>>>>>>>>>> desired behavior.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Let me know your thoughts.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Patrik
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 2024/07/30 18:57:18 Chris Egerton wrote:
> >>>>>>>>>>>>> Hi Patrick,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I share Greg's concerns with the feature as it's currently
> >>>>>>>>> proposed.
> >>>>>>>>>> I
> >>>>>>>>>>>>> don't think I could vote for something unless it made
> >>> replication
> >>>>>>>>> of
> >>>>>>>>>>>>> genuinely internal topics and replication cycles impossible,
> >> or
> >>>>>>>> at
> >>>>>>>>>>> least
> >>>>>>>>>>>>> significantly less likely.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Chris
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Jul 30, 2024, 14:51 Greg Harris
> >> <gr...@aiven.io.invalid
> >>>>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Patrik,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for the KIP!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Your motivation for this KIP is reasonable, because it is
> >>>>>>>>>> definitely
> >>>>>>>>>>>>>> possible for the ".internal" suffix to collide with real
> >>>>>>>> topics.
> >>>>>>>>> It
> >>>>>>>>>>>> would
> >>>>>>>>>>>>>> have been nice if the original design included some
> >>>>>>>> mm2-specific
> >>>>>>>>>>>> namespace
> >>>>>>>>>>>>>> like "mm2.internal" to lessen the likelihood of a collision.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> However, this is a problem that has numerous existing
> >>>>>>>>> workarounds:
> >>>>>>>>>>>>>> * Use a custom ReplicationPolicy and override the methods
> >> (for
> >>>>>>>>>>> existing
> >>>>>>>>>>>>>> workloads/mirror makers)
> >>>>>>>>>>>>>> * Use non-conflicting user topic names (for new user topics)
> >>>>>>>>>>>>>> * Use the replication.policy.separator to use a
> >> non-conflicting
> >>>>>>>>>>>> separator
> >>>>>>>>>>>>>> character (for new mirror maker setups)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> And the feature as-described has significant risks attached:
> >>>>>>>>>>>>>> * May allow replication cycles and runaway replication
> >>>>>>>>>>>>>> * Mirrors internal topics that are unusable on the
> >> destination
> >>>>>>>>>>> cluster
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> While these risks can be accounted for if a user is
> attentive
> >>>>>>>>> (e.g.
> >>>>>>>>>>>> when
> >>>>>>>>>>>>>> they're writing their own ReplicationPolicy) it is not a
> >>>>>>>>> risk-free
> >>>>>>>>>>>>>> configuration that composes well with other out-of-the-box
> >>>>>>>>>>>> configurations.
> >>>>>>>>>>>>>> For example, someone may expect to take their existing
> >>>>>>>>>> configuration,
> >>>>>>>>>>>> turn
> >>>>>>>>>>>>>> on this new option, and expect reasonable behavior, which
> >> isn't
> >>>>>>>>>>> always
> >>>>>>>>>>>>>> guaranteed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If you're still interested in this feature, please reference
> >>>>>>>> the
> >>>>>>>>>>>> existing
> >>>>>>>>>>>>>> workarounds and include them as rejected alternatives so we
> >> can
> >>>>>>>>>> know
> >>>>>>>>>>>> where
> >>>>>>>>>>>>>> the existing solutions fall short.
> >>>>>>>>>>>>>> We'd also have to figure out if and how the risks I
> mentioned
> >>>>>>>>> could
> >>>>>>>>>>> be
> >>>>>>>>>>>>>> mitigated.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Greg
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Jul 30, 2024 at 5:49 AM Patrik Marton
> >>>>>>>>>>>> <pmar...@cloudera.com.invalid
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Team,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I would like to start a discussion on KIP-1074: Make the
> >>>>>>>>>>> replication
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>>> internal topics configurable
> >>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Make+the+replication+of+internal+topics+configurable
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The goal of this KIP is to make it easier to replicate
> >> topics
> >>>>>>>>>> that
> >>>>>>>>>>>> seem
> >>>>>>>>>>>>>>> internal (for example ending in .internal suffix), but are
> >>>>>>>> just
> >>>>>>>>>>>> normal
> >>>>>>>>>>>>>>> business topics created by the user.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I appreciate any feedback and recommendations!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks!
> >>>>>>>>>>>>>>> Patrik
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>

Reply via email to