Hi everyone!

Thanks again for all your feedback! The discussion thread has been silent
for a week now, I'll wait a bit for any new feedback and will start the
voting in a couple of days.

Best,
Patrik

On Wed, Aug 21, 2024 at 10:28 AM Patrik Marton <pmar...@cloudera.com> wrote:

> Hi Omnia,
>
> Thanks, I updated the KIP with the DefaultTopicFilter changes, to reflect
> the new mm2.*.internal pattern.
>
> Best,
> Patrik
>
> On Mon, Aug 19, 2024 at 2:24 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
>
>> Thanks Patrik for updating the KIP.
>> One last feedback, should we update `TOPICS_EXCLUDE_DEFAULT` also to
>> reflect the same pattern instead of `.*[\\-\\.]internal`?
>>
>> Best,
>> Omnia
>>
>> > On 16 Aug 2024, at 13:31, Patrik Marton <pmar...@cloudera.com.INVALID>
>> wrote:
>> >
>> > Hi All,
>> >
>> > I updated the KIP with the new proposal, and changed the name to better
>> > describe the latest proposal and the goal.
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Allow+the+replication+of+user+internal+topics
>> >
>> > Let me know your thoughts!
>> >
>> > Thanks!
>> > Patrik
>> >
>> > On Wed, Aug 14, 2024 at 10:50 AM Viktor Somogyi-Vass
>> > <viktor.somo...@cloudera.com.invalid> wrote:
>> >
>> >> Hi Omina,
>> >>
>> >> We have not considered this path yet that you proposed and I like it as
>> >> well and I would also be in favor of a solution that doesn't try to
>> >> introduce a new config. Since this KIP would only make it into 4.0
>> anyway
>> >> and your proposal seems like a smaller compatibility break that can
>> >> be documented in the upgrade notes and also circumvented by a custom
>> >> policy, I think we can do this.
>> >>
>> >> I'd be also happy to receive some feedback from those who have been in
>> this
>> >> conversation to form a consensus around this.
>> >>
>> >> Best,
>> >> Viktor
>> >>
>> >> On Mon, Aug 12, 2024 at 2:19 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com
>> >
>> >> wrote:
>> >>
>> >>> Hi Patrick,
>> >>>> One thing that comes up for me is whether we should update the
>> >>> TopicFilter
>> >>>> too to have this more specific pattern, like "mm2-.*.internal".
>> Ideally
>> >>> I'd
>> >>>> say yes, but what if a user already has existing .internal user
>> topics
>> >>> and
>> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate
>> those,
>> >>> and
>> >>>> by changing both to a more specific pattern, we could technically
>> >> enable
>> >>>> their replication. I know this is an edge case, so am I being too
>> >> careful
>> >>>> here?
>> >>>
>> >>> I would say for the ReplicationPolicy blocking all `.internal` and
>> >>> `-internal` as MM2 / Kafka internals is a bug. So anyone who has been
>> >>> relaying on this bug to block topics is relaying on a faulty code
>> >>> especially that none of the Kafka, connect or streams internals do
>> create
>> >>> any topic with these suffixes except MM2 and all of them can be
>> >> identified
>> >>> more accurately without this generic suffix.
>> >>>
>> >>>
>> >>> The other potential cases we might break are
>> >>> 1. where `replication.policy.internal.topic.separator.enabled` is
>> >> enabled,
>> >>> but we might only need to adjust the code to account for this either
>> in
>> >>> ReplicationPolicy itself
>> >>> ` default boolean isMM2InternalTopic(String topic) { return
>> >>> (topic.startsWith("mm2-") && topic.endsWith("internal")) ||
>> >>> isCheckpointsTopic(topic); }`
>> >>> Or in DefaultReplicationPolicy implementation of `isMM2InternalTopic`.
>> >> So
>> >>> we don’t break the backward compatibility for this case
>> >>> 2. Any use case has a customised ReplicationPolicy with an override of
>> >>> isMM2InternalTopic. This one might also break the backward
>> compatibility
>> >>>
>> >>> As 4.0 coming soon we can leverage this and introduce a breaking
>> change
>> >> to
>> >>> MM2 with clarification of the KIP's impact on backward compatibility
>> for
>> >>> the following cases,
>> >>> - For anyone has been relaying on this bug to block normal topics with
>> >>> `internal` suffix they will need to update topic filter config to
>> block
>> >>> this suffix manually if they wish for.
>> >>> - For anyone overriding  isMM2InternalTopic in their custom
>> >>> ReplicationPolicy, they might need to update their code account for
>> the
>> >> new
>> >>> change to block only mm2 internal topics.
>> >>>
>> >>> If this path is acceptable with breaking the compatibility with the
>> next
>> >>> major Kafka release we should consider fixing the issue and not
>> >> introducing
>> >>> another config to go around it especially that this seems more like
>> >> mistake
>> >>> from the beginning as MM2 original KIP intention was trying to target
>> MM2
>> >>> and Kafka internal topics only and not all topics with this suffix.
>> >>>
>> >>> What does everyone think?
>> >>>
>> >>> Thanks
>> >>> Omnia
>> >>>> On 9 Aug 2024, at 10:40, Patrik Marton <pmar...@cloudera.com.INVALID
>> >
>> >>> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> Thanks again for your feedback!
>> >>>>
>> >>>> Regarding the boolean solution, yes this would technically let the
>> >>> customer
>> >>>> accidentally replicate "__.*" topics, but only if they modify the
>> >>>> TopicFilter to allow their replication, as by default topics matching
>> >>> this
>> >>>> pattern are excluded. But in this case we can still hardcode "__.*"
>> >>> topics
>> >>>> as internal in the replication policy, if we feel like we need that
>> >> extra
>> >>>> layer of safety.
>> >>>>
>> >>>> Thanks Omnia for your ideas!
>> >>>> About #2, I think this is something that we should consider besides
>> the
>> >>>> boolean solution, as I like the idea that this way we can avoid
>> adding
>> >>>> another configuration to mm2, and simply define a more specific
>> >> criteria
>> >>>> for topics to be considered internal by the replication policy.
>> >>>> One thing that comes up for me is whether we should update the
>> >>> TopicFilter
>> >>>> too to have this more specific pattern, like "mm2-.*.internal".
>> Ideally
>> >>> I'd
>> >>>> say yes, but what if a user already has existing .internal user
>> topics
>> >>> and
>> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate
>> those,
>> >>> and
>> >>>> by changing both to a more specific pattern, we could technically
>> >> enable
>> >>>> their replication. I know this is an edge case, so am I being too
>> >> careful
>> >>>> here?
>> >>>>
>> >>>> Let me know your thoughts, and I will update the KIP too.
>> >>>>
>> >>>> Best,
>> >>>> Patrik
>> >>>>
>> >>>> On Thu, Aug 8, 2024 at 2:59 PM Omnia Ibrahim <
>> o.g.h.ibra...@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Hi
>> >>>>> Thinking of it more, I feel leaving customers to mirroring Kafka’s
>> >>>>> internal topics `__.*` either by mistake or intently will be eating
>> up
>> >>>>> resources from both mm2 and the clusters for no actual value as the
>> >>> data is
>> >>>>> not that useful outside the context of the original cluster.
>> >>>>> These topics usually are unbalanced and clients are the one the
>> >> control
>> >>>>> their sizes which will lead to uneven distribution of MM2 resource
>> >>> usages.
>> >>>>> And if MM2 missed the tombstone records from the source cluster
>> which
>> >> is
>> >>>>> used to cleanup metadata then mirrored records of these topics
>> >> wouldn’t
>> >>> get
>> >>>>> cleaned appropriately.
>> >>>>>
>> >>>>> Other options to solve this
>> >>>>>
>> >>>>> 1. introduce a way to specify MM2’s internal topics suffix
>> >> (checkpoints,
>> >>>>> offset sync, offsets, status, and config topics) and by default this
>> >> is
>> >>>>> `.internal`, in case users are allowed to have `.internal` in the
>> >> topic
>> >>>>> name then MM2 can be setup with different suffix.
>> >>>>> `-internal` was set as a way to identify connect internal topics but
>> >> as
>> >>>>> far as I can see there is no topic set by MM2 itself with
>> `-internal`
>> >> so
>> >>>>> maybe we can drop `-internal`.
>> >>>>> Then  for  __.* these aren’t allowed to be mirrored.
>> >>>>>
>> >>>>> 2. other option is updating `ReplicationPolicy::isMM2InternalTopic`
>> to
>> >>>>> check if topic is starting with `mm2` and end with `.internal` as
>> this
>> >>> is
>> >>>>> the pattern for default topics so far.
>> >>>>>> default boolean isMM2InternalTopic(String topic) {
>> >>>>>>   // MM2 has a pattern for most topics that shouldn't be mirriord
>> >>>>> which is mm2-*.*.internal:
>> >>>>>>   // "mm2-offset-syncs." + clusterAlias + ".internal"
>> >>>>>>   // "mm2-configs." + sourceAndTarget.source() + ".internal"
>> >>>>>>   // "mm2-status." + sourceAndTarget.source() + ".internal"
>> >>>>>>   // "mm2-offsets." + sourceAndTarget.source() + ".internal"
>> >>>>>>   // The only exception is checkpoints which is clusterAlias +
>> >>>>> ".checkpoints.internal"
>> >>>>>>   return (topic.startsWith("mm2-") && topic.endsWith(".internal"))
>> >> ||
>> >>>>> isCheckpointsTopic(topic);
>> >>>>>> }
>> >>>>>
>> >>>>> As far as I can see this will cover offset-sync topic, all default
>> >>> connect
>> >>>>> topic names used by mirror and checkpoints which what the method
>> >>> originally
>> >>>>> is trying to target. This way we will have some sense of “fake
>> >>> namespace”
>> >>>>> without introducing one.
>> >>>>>
>> >>>>> Then `ReplicationPolicy::isInternalTopic` can be
>> >>>>>> default boolean isInternalTopic(String topic) {
>> >>>>>>   boolean isKafkaInternalTopic = topic.startsWith("__") ||
>> >>>>> topic.startsWith(".");
>> >>>>>>   return isMM2InternalTopic(topic) || isKafkaInternalTopic;
>> >>>>>> }
>> >>>>>
>> >>>>> `-internal` will be dropped as I can’t see any topic name with this
>> >>> suffix
>> >>>>> setup by MM2.
>> >>>>>
>> >>>>> Best,
>> >>>>> Omnia
>> >>>>>
>> >>>>>> On 7 Aug 2024, at 17:26, Omnia Ibrahim <o.g.h.ibra...@gmail.com>
>> >>> wrote:
>> >>>>>>
>> >>>>>> Hi All,
>> >>>>>>
>> >>>>>> I have a couple of notes to add
>> >>>>>>
>> >>>>>> OI-1.Regarding the proposed default regex in the KIP,
>> >>>>> "__.*|\\..*|.*-internal|.*\\.internal", when users wish to change
>> >> this,
>> >>>>> wouldn’t they mostly need to keep the __.* part? Especially since __
>> >> is
>> >>>>> primarily reserved for broker internal metadata topics.
>> >>>>>>
>> >>>>>> OI-2.Whether we choose a boolean or regex, I think we need to
>> >> identify
>> >>>>> if this will include replicating the internal metadata topics. The
>> >>> metadata
>> >>>>> within these topics are not that useful outside of their original
>> >>> cluster,
>> >>>>> so replicating them doesn’t add much value for anyone.
>> >>>>>> The other problem with mistakenly replicating these topics is that
>> >> they
>> >>>>> are set up with a compacted policy, which means they don’t get
>> >> truncated
>> >>>>> naturally by retention. These metadata topics are cleaned in a
>> special
>> >>> way
>> >>>>> to stay within the metadata retention milliseconds or expiration
>> >>> configs.
>> >>>>> As a result of this special situation, replicating these by mistake
>> >>> might
>> >>>>> put the destination cluster’s capacity at risk.
>> >>>>>> Maybe we should be clear that any topic in Topic::INTERNAL_TOPICS
>> is
>> >>> not
>> >>>>> going to be replicated? WDYT?
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Omnia
>> >>>>>>
>> >>>>>>> On 7 Aug 2024, at 15:46, Viktor Somogyi-Vass <
>> >>>>> viktor.somo...@cloudera.com.INVALID> wrote:
>> >>>>>>>
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> 1. Yea, thinking of it more, I'm also concerned about the upgrade
>> >> path
>> >>>>> and
>> >>>>>>> I think that it may not be worth adding a ton of code to work
>> around
>> >>>>> that
>> >>>>>>> one time upgrade. It likely needs extra configs as well or
>> >>>>> documentation to
>> >>>>>>> instruct users, so it may not be worth the price. So let's add
>> this
>> >> to
>> >>>>> the
>> >>>>>>> rejected alternatives.
>> >>>>>>>
>> >>>>>>> 2. As Chris mentioned, I also feel that users may want to improve
>> it
>> >>>>> with
>> >>>>>>> regexes and after a certain number of topics it may not be easily
>> >>>>>>> manageable. And indeed, it feels a little bit like duplicating the
>> >>>>>>> functionality of the topic filter. Overall I like the boolean
>> config
>> >>>>> best
>> >>>>>>> as that prevents most of the unwanted replication cycles by
>> default
>> >>> with
>> >>>>>>> the TopicFilter.
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Viktor
>> >>>>>>>
>> >>>>>>> On Wed, Aug 7, 2024 at 10:59 AM Patrik Marton
>> >>>>> <pmar...@cloudera.com.invalid>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi All,
>> >>>>>>>>
>> >>>>>>>> Thanks again for your feedback!
>> >>>>>>>>
>> >>>>>>>> My thoughts on Viktor's ideas:
>> >>>>>>>>
>> >>>>>>>> 1. This would definitely be a clean solution in the end, but I
>> also
>> >>>>> feel
>> >>>>>>>> that the complexity of this solution might outweigh the benefits.
>> >> If
>> >>>>> you
>> >>>>>>>> guys think, we can give it a thought and figure out what exactly
>> >>> needs
>> >>>>> to
>> >>>>>>>> be changed and how the upgrade path would look like, but my
>> opinion
>> >>> is
>> >>>>> that
>> >>>>>>>> since it is a rare use-case, a simple solution might be better.
>> >>>>>>>>
>> >>>>>>>> 2. I initially wanted to propose something like this, but my
>> >> concern
>> >>>>> was
>> >>>>>>>> that by adding another way a topic can be whitelisted, we would
>> >> also
>> >>>>>>>> increase the complexity and it would require more attention from
>> >>>>> users. But
>> >>>>>>>> I also think that this would probably be the safest solution of
>> >> all,
>> >>>>> and
>> >>>>>>>> less risky than modifying regexes or using that simple boolean.
>> >>>>>>>> Since the TopicFilter is not really the problem as its exclude
>> list
>> >>>>> can be
>> >>>>>>>> modified, and it basically acts as another layer of safety, we
>> >> could
>> >>>>>>>> potentially add this new config (a list of topics) to the
>> >>>>>>>> DefaultReplicationPolicy, and any topic listed in the new config
>> >>> would
>> >>>>> not
>> >>>>>>>> be considered internal by the replication policy. This way we
>> would
>> >>> not
>> >>>>>>>> duplicate the work of the Topic Filter, simply provide a way to
>> >> tell
>> >>>>> the
>> >>>>>>>> DefaultReplicationPolicy that topics in this list seem internal,
>> >> but
>> >>>>> they
>> >>>>>>>> are not.
>> >>>>>>>>
>> >>>>>>>> What do you think?
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>> Patrik
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Tue, Aug 6, 2024 at 7:17 PM Chris Egerton
>> >> <chr...@aiven.io.invalid
>> >>>>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Hi Viktor,
>> >>>>>>>>>
>> >>>>>>>>> Nice to hear from you! My thoughts:
>> >>>>>>>>>
>> >>>>>>>>> 1. I initially liked this idea, but it doesn't seem like the
>> costs
>> >>> to
>> >>>>> all
>> >>>>>>>>> existing users outweigh the benefits that would be reaped by
>> just
>> >> a
>> >>>>> few.
>> >>>>>>>>> This is possibly because I'm assuming the upgrade path would be
>> >>> pretty
>> >>>>>>>>> gnarly, though; if we can make it really, really smooth,
>> >> especially
>> >>>>> for
>> >>>>>>>>> existing users who don't need to be able to replicate topics
>> >> ending
>> >>> in
>> >>>>>>>>> "-internal", then I think it'd be a viable option.
>> >>>>>>>>>
>> >>>>>>>>> 2. This is definitely safer than a simple boolean property, but
>> >>>>> doesn't
>> >>>>>>>> it
>> >>>>>>>>> feel like we're duplicating the work of the topic filter at this
>> >>>>> point,
>> >>>>>>>>> except with something a little more cumbersome for users? People
>> >>>>> already
>> >>>>>>>>> have a way of include- and exclude-listing topics by regex with
>> >> that
>> >>>>>>>>> mechanism, and I'm worried we're not placing enough trust in the
>> >>>>>>>>> flexibility and safety it already provides us and are adding
>> >> layers
>> >>> of
>> >>>>>>>>> safety that are ultimately redundant. I might also be
>> pessimistic
>> >>> here
>> >>>>>>>> but
>> >>>>>>>>> I can't shake the feeling that we'd eventually get a KIP where
>> >>> someone
>> >>>>>>>>> states that it's too hard to add topics one-by-one and would
>> like
>> >> to
>> >>>>>>>>> include them by regex, prefix match, or some other broader
>> >>> mechanism.
>> >>>>>>>>>
>> >>>>>>>>> Do you find this convincing? I'm also curious what Greg and
>> Patrik
>> >>>>> think.
>> >>>>>>>>>
>> >>>>>>>>> Cheers,
>> >>>>>>>>>
>> >>>>>>>>> Chris
>> >>>>>>>>>
>> >>>>>>>>> On Tue, Aug 6, 2024 at 5:08 AM Viktor Somogyi-Vass
>> >>>>>>>>> <viktor.somo...@cloudera.com.invalid> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi all,
>> >>>>>>>>>>
>> >>>>>>>>>> I think there are a couple of other alternatives to discuss.
>> >>>>>>>>>>
>> >>>>>>>>>> 1. We may just simply change the internal topic naming to
>> >> something
>> >>>>>>>> like
>> >>>>>>>>>> "mm2internal" and provide upgrade functionality as well. In
>> this
>> >>>>>>>> scenario
>> >>>>>>>>>> we wouldn't need to tinker with the prefixes or the naming
>> >>> patterns,
>> >>>>> we
>> >>>>>>>>>> just simply free up the "internal" keyword and after upgrade
>> >> allow
>> >>>>> all
>> >>>>>>>>>> topics with internal to be mirrored. The upgrade however could
>> be
>> >>>>> very
>> >>>>>>>>>> complicated as we'd need to provide a way for users to move
>> away
>> >>> from
>> >>>>>>>> the
>> >>>>>>>>>> old topics (what happens in case of EOS, how do we make sure to
>> >>>>>>>>> discontinue
>> >>>>>>>>>> the old topics, what happens if the upgrade gets interrupted
>> >> etc.).
>> >>>>>>>> But I
>> >>>>>>>>>> think this is worth considering and for the upgrade, we may
>> take
>> >> a
>> >>>>> few
>> >>>>>>>>>> ideas from the zk->kraft migration to make it easier.
>> >>>>>>>>>>
>> >>>>>>>>>> 2. A new config to force the replication of certain topics. In
>> >>> this,
>> >>>>>>>>> users
>> >>>>>>>>>> would have to specify topics 1 by 1 and this would bypass the
>> >>> filters
>> >>>>>>>> and
>> >>>>>>>>>> the replication policy. Since forcing is usually not the normal
>> >> way
>> >>>>> of
>> >>>>>>>>>> doing things in general, users may be more willing to accept
>> >>>>> unintended
>> >>>>>>>>>> side effects.
>> >>>>>>>>>>
>> >>>>>>>>>> What do you all think?
>> >>>>>>>>>> (If they aren't better than the original ideas, I'm absolutely
>> >> good
>> >>>>>>>> with
>> >>>>>>>>>> putting these to the rejected alternatives but wanted to write
>> it
>> >>> all
>> >>>>>>>>> out.)
>> >>>>>>>>>>
>> >>>>>>>>>> Best,
>> >>>>>>>>>> Viktor
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Aug 5, 2024 at 8:18 PM Chris Egerton
>> >>> <chr...@aiven.io.invalid
>> >>>>>>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Hi Patrik,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I think I owe you an apology. After mulling this over for a
>> few
>> >>> more
>> >>>>>>>>>> hours
>> >>>>>>>>>>> and going over the code base some more, it seems like your
>> >> initial
>> >>>>>>>>>> approach
>> >>>>>>>>>>> was actually better in some ways.
>> >>>>>>>>>>>
>> >>>>>>>>>>> With the regex-based approach, it becomes possible for users
>> to
>> >>>>>>>>> configure
>> >>>>>>>>>>> the DefaultReplicationPolicy class in a way that for some
>> topic
>> >> T,
>> >>>>>>>>>>> isInternalTopic(T) returns false, but isCheckpointsTopic(T),
>> >>>>>>>>>>> isHeartbeatsTopic(T), or even isMM2InternalTopic(T) return
>> true.
>> >>> It
>> >>>>>>>>> seems
>> >>>>>>>>>>> like this kind of inconsistency might cause confusion to users
>> >> and
>> >>>>>>>>>> possibly
>> >>>>>>>>>>> even break things down the road depending on how we leverage
>> >> these
>> >>>>>>>>>> methods
>> >>>>>>>>>>> in the future.
>> >>>>>>>>>>>
>> >>>>>>>>>>> With the initial approach, using a simple boolean to enable
>> the
>> >>>>>>>>>> replication
>> >>>>>>>>>>> of what appear to be internal topics, there is some risk of
>> >> cycles
>> >>>>>>>> and
>> >>>>>>>>>>> accidental replication of genuinely internal topics, but less
>> >>> than I
>> >>>>>>>>>>> believed at first. In order for an internal topic to be
>> >>> replicated,
>> >>>>>>>> not
>> >>>>>>>>>>> only would that boolean property have to be explicitly
>> modified
>> >> by
>> >>>>>>>>> users,
>> >>>>>>>>>>> but they would also have to be using a topic filter that
>> allows
>> >>>>> those
>> >>>>>>>>>>> topics to pass through as well.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I think a simple boolean is better here, as it's easier for
>> >> users
>> >>> to
>> >>>>>>>>>>> configure, and less prone to logical inconsistencies. My only
>> >>>>>>>>> suggestions
>> >>>>>>>>>>> are:
>> >>>>>>>>>>> - Apply the property to the MirrorSourceConnector class
>> instead
>> >> of
>> >>>>>>>> the
>> >>>>>>>>>>> DefaultReplicationPolicy class, since we're not actually
>> >> changing
>> >>>>> the
>> >>>>>>>>>>> definition of what is and isn't an internal topic; instead,
>> >> we're
>> >>>>>>>>> simply
>> >>>>>>>>>>> allowing topics to be replicated even if they appear to be
>> >>> internal
>> >>>>>>>>>> topics
>> >>>>>>>>>>> - Rename it to something less verbose like
>> >>>>>>>> "replicate.internal.topics"
>> >>>>>>>>>>> - Add a warning to the docs for the property explicitly
>> pointing
>> >>>>>>>> people
>> >>>>>>>>>> to
>> >>>>>>>>>>> the topic filter class, stating that they should still make
>> sure
>> >>>>> that
>> >>>>>>>>>>> genuinely internal topics (instead of topics that
>> >>>>>>>>>>> ReplicationPolicy::isInternalTopic returns true for but which
>> >> are
>> >>>>> not
>> >>>>>>>>>>> actually internal) are excluded by the filter
>> >>>>>>>>>>>
>> >>>>>>>>>>> Patrik, Greg, how does that sound?
>> >>>>>>>>>>>
>> >>>>>>>>>>> Best,
>> >>>>>>>>>>>
>> >>>>>>>>>>> Chris
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Aug 5, 2024 at 4:34 AM Patrik Marton
>> >>>>>>>>>> <pmar...@cloudera.com.invalid
>> >>>>>>>>>>>>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi Greg and Chris,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thank you for your feedbacks!
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I added the currently existing workarounds to the “Rejected
>> >>>>>>>>>> Alternatives”
>> >>>>>>>>>>>> section, and explained why I still think we need a separate
>> >>>>>>>>> solution. I
>> >>>>>>>>>>>> also added the previous proposal to the rejected
>> alternatives,
>> >>> as I
>> >>>>>>>>>> agree
>> >>>>>>>>>>>> with your concerns.
>> >>>>>>>>>>>> I tried to come up with a bit different regex based solution,
>> >> and
>> >>>>>>>>>>> modified
>> >>>>>>>>>>>> the KIP based on that. I think it is a more straightforward
>> way
>> >>> to
>> >>>>>>>>> get
>> >>>>>>>>>>> the
>> >>>>>>>>>>>> desired behavior.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Let me know your thoughts.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>> Patrik
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On 2024/07/30 18:57:18 Chris Egerton wrote:
>> >>>>>>>>>>>>> Hi Patrick,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I share Greg's concerns with the feature as it's currently
>> >>>>>>>>> proposed.
>> >>>>>>>>>> I
>> >>>>>>>>>>>>> don't think I could vote for something unless it made
>> >>> replication
>> >>>>>>>>> of
>> >>>>>>>>>>>>> genuinely internal topics and replication cycles impossible,
>> >> or
>> >>>>>>>> at
>> >>>>>>>>>>> least
>> >>>>>>>>>>>>> significantly less likely.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Best,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Chris
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, Jul 30, 2024, 14:51 Greg Harris
>> >> <gr...@aiven.io.invalid
>> >>>>
>> >>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Hi Patrik,
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Thanks for the KIP!
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Your motivation for this KIP is reasonable, because it is
>> >>>>>>>>>> definitely
>> >>>>>>>>>>>>>> possible for the ".internal" suffix to collide with real
>> >>>>>>>> topics.
>> >>>>>>>>> It
>> >>>>>>>>>>>> would
>> >>>>>>>>>>>>>> have been nice if the original design included some
>> >>>>>>>> mm2-specific
>> >>>>>>>>>>>> namespace
>> >>>>>>>>>>>>>> like "mm2.internal" to lessen the likelihood of a
>> collision.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> However, this is a problem that has numerous existing
>> >>>>>>>>> workarounds:
>> >>>>>>>>>>>>>> * Use a custom ReplicationPolicy and override the methods
>> >> (for
>> >>>>>>>>>>> existing
>> >>>>>>>>>>>>>> workloads/mirror makers)
>> >>>>>>>>>>>>>> * Use non-conflicting user topic names (for new user
>> topics)
>> >>>>>>>>>>>>>> * Use the replication.policy.separator to use a
>> >> non-conflicting
>> >>>>>>>>>>>> separator
>> >>>>>>>>>>>>>> character (for new mirror maker setups)
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> And the feature as-described has significant risks
>> attached:
>> >>>>>>>>>>>>>> * May allow replication cycles and runaway replication
>> >>>>>>>>>>>>>> * Mirrors internal topics that are unusable on the
>> >> destination
>> >>>>>>>>>>> cluster
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> While these risks can be accounted for if a user is
>> attentive
>> >>>>>>>>> (e.g.
>> >>>>>>>>>>>> when
>> >>>>>>>>>>>>>> they're writing their own ReplicationPolicy) it is not a
>> >>>>>>>>> risk-free
>> >>>>>>>>>>>>>> configuration that composes well with other out-of-the-box
>> >>>>>>>>>>>> configurations.
>> >>>>>>>>>>>>>> For example, someone may expect to take their existing
>> >>>>>>>>>> configuration,
>> >>>>>>>>>>>> turn
>> >>>>>>>>>>>>>> on this new option, and expect reasonable behavior, which
>> >> isn't
>> >>>>>>>>>>> always
>> >>>>>>>>>>>>>> guaranteed.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> If you're still interested in this feature, please
>> reference
>> >>>>>>>> the
>> >>>>>>>>>>>> existing
>> >>>>>>>>>>>>>> workarounds and include them as rejected alternatives so we
>> >> can
>> >>>>>>>>>> know
>> >>>>>>>>>>>> where
>> >>>>>>>>>>>>>> the existing solutions fall short.
>> >>>>>>>>>>>>>> We'd also have to figure out if and how the risks I
>> mentioned
>> >>>>>>>>> could
>> >>>>>>>>>>> be
>> >>>>>>>>>>>>>> mitigated.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>>> Greg
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Tue, Jul 30, 2024 at 5:49 AM Patrik Marton
>> >>>>>>>>>>>> <pmar...@cloudera.com.invalid
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Hi Team,
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> I would like to start a discussion on KIP-1074: Make the
>> >>>>>>>>>>> replication
>> >>>>>>>>>>>> of
>> >>>>>>>>>>>>>>> internal topics configurable
>> >>>>>>>>>>>>>>> <
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>
>> >>>
>> >>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Make+the+replication+of+internal+topics+configurable
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> The goal of this KIP is to make it easier to replicate
>> >> topics
>> >>>>>>>>>> that
>> >>>>>>>>>>>> seem
>> >>>>>>>>>>>>>>> internal (for example ending in .internal suffix), but are
>> >>>>>>>> just
>> >>>>>>>>>>>> normal
>> >>>>>>>>>>>>>>> business topics created by the user.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> I appreciate any feedback and recommendations!
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Thanks!
>> >>>>>>>>>>>>>>> Patrik
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>
>> >>>
>> >>
>>
>>

Reply via email to