Hi everyone! Thanks again for all your feedback! The discussion thread has been silent for a week now, I'll wait a bit for any new feedback and will start the voting in a couple of days.
Best, Patrik On Wed, Aug 21, 2024 at 10:28 AM Patrik Marton <pmar...@cloudera.com> wrote: > Hi Omnia, > > Thanks, I updated the KIP with the DefaultTopicFilter changes, to reflect > the new mm2.*.internal pattern. > > Best, > Patrik > > On Mon, Aug 19, 2024 at 2:24 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com> > wrote: > >> Thanks Patrik for updating the KIP. >> One last feedback, should we update `TOPICS_EXCLUDE_DEFAULT` also to >> reflect the same pattern instead of `.*[\\-\\.]internal`? >> >> Best, >> Omnia >> >> > On 16 Aug 2024, at 13:31, Patrik Marton <pmar...@cloudera.com.INVALID> >> wrote: >> > >> > Hi All, >> > >> > I updated the KIP with the new proposal, and changed the name to better >> > describe the latest proposal and the goal. >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Allow+the+replication+of+user+internal+topics >> > >> > Let me know your thoughts! >> > >> > Thanks! >> > Patrik >> > >> > On Wed, Aug 14, 2024 at 10:50 AM Viktor Somogyi-Vass >> > <viktor.somo...@cloudera.com.invalid> wrote: >> > >> >> Hi Omina, >> >> >> >> We have not considered this path yet that you proposed and I like it as >> >> well and I would also be in favor of a solution that doesn't try to >> >> introduce a new config. Since this KIP would only make it into 4.0 >> anyway >> >> and your proposal seems like a smaller compatibility break that can >> >> be documented in the upgrade notes and also circumvented by a custom >> >> policy, I think we can do this. >> >> >> >> I'd be also happy to receive some feedback from those who have been in >> this >> >> conversation to form a consensus around this. >> >> >> >> Best, >> >> Viktor >> >> >> >> On Mon, Aug 12, 2024 at 2:19 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com >> > >> >> wrote: >> >> >> >>> Hi Patrick, >> >>>> One thing that comes up for me is whether we should update the >> >>> TopicFilter >> >>>> too to have this more specific pattern, like "mm2-.*.internal". >> Ideally >> >>> I'd >> >>>> say yes, but what if a user already has existing .internal user >> topics >> >>> and >> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate >> those, >> >>> and >> >>>> by changing both to a more specific pattern, we could technically >> >> enable >> >>>> their replication. I know this is an edge case, so am I being too >> >> careful >> >>>> here? >> >>> >> >>> I would say for the ReplicationPolicy blocking all `.internal` and >> >>> `-internal` as MM2 / Kafka internals is a bug. So anyone who has been >> >>> relaying on this bug to block topics is relaying on a faulty code >> >>> especially that none of the Kafka, connect or streams internals do >> create >> >>> any topic with these suffixes except MM2 and all of them can be >> >> identified >> >>> more accurately without this generic suffix. >> >>> >> >>> >> >>> The other potential cases we might break are >> >>> 1. where `replication.policy.internal.topic.separator.enabled` is >> >> enabled, >> >>> but we might only need to adjust the code to account for this either >> in >> >>> ReplicationPolicy itself >> >>> ` default boolean isMM2InternalTopic(String topic) { return >> >>> (topic.startsWith("mm2-") && topic.endsWith("internal")) || >> >>> isCheckpointsTopic(topic); }` >> >>> Or in DefaultReplicationPolicy implementation of `isMM2InternalTopic`. >> >> So >> >>> we don’t break the backward compatibility for this case >> >>> 2. Any use case has a customised ReplicationPolicy with an override of >> >>> isMM2InternalTopic. This one might also break the backward >> compatibility >> >>> >> >>> As 4.0 coming soon we can leverage this and introduce a breaking >> change >> >> to >> >>> MM2 with clarification of the KIP's impact on backward compatibility >> for >> >>> the following cases, >> >>> - For anyone has been relaying on this bug to block normal topics with >> >>> `internal` suffix they will need to update topic filter config to >> block >> >>> this suffix manually if they wish for. >> >>> - For anyone overriding isMM2InternalTopic in their custom >> >>> ReplicationPolicy, they might need to update their code account for >> the >> >> new >> >>> change to block only mm2 internal topics. >> >>> >> >>> If this path is acceptable with breaking the compatibility with the >> next >> >>> major Kafka release we should consider fixing the issue and not >> >> introducing >> >>> another config to go around it especially that this seems more like >> >> mistake >> >>> from the beginning as MM2 original KIP intention was trying to target >> MM2 >> >>> and Kafka internal topics only and not all topics with this suffix. >> >>> >> >>> What does everyone think? >> >>> >> >>> Thanks >> >>> Omnia >> >>>> On 9 Aug 2024, at 10:40, Patrik Marton <pmar...@cloudera.com.INVALID >> > >> >>> wrote: >> >>>> >> >>>> Hi All, >> >>>> >> >>>> Thanks again for your feedback! >> >>>> >> >>>> Regarding the boolean solution, yes this would technically let the >> >>> customer >> >>>> accidentally replicate "__.*" topics, but only if they modify the >> >>>> TopicFilter to allow their replication, as by default topics matching >> >>> this >> >>>> pattern are excluded. But in this case we can still hardcode "__.*" >> >>> topics >> >>>> as internal in the replication policy, if we feel like we need that >> >> extra >> >>>> layer of safety. >> >>>> >> >>>> Thanks Omnia for your ideas! >> >>>> About #2, I think this is something that we should consider besides >> the >> >>>> boolean solution, as I like the idea that this way we can avoid >> adding >> >>>> another configuration to mm2, and simply define a more specific >> >> criteria >> >>>> for topics to be considered internal by the replication policy. >> >>>> One thing that comes up for me is whether we should update the >> >>> TopicFilter >> >>>> too to have this more specific pattern, like "mm2-.*.internal". >> Ideally >> >>> I'd >> >>>> say yes, but what if a user already has existing .internal user >> topics >> >>> and >> >>>> relies on the ReplicationPolicy and TopicFilter not to replicate >> those, >> >>> and >> >>>> by changing both to a more specific pattern, we could technically >> >> enable >> >>>> their replication. I know this is an edge case, so am I being too >> >> careful >> >>>> here? >> >>>> >> >>>> Let me know your thoughts, and I will update the KIP too. >> >>>> >> >>>> Best, >> >>>> Patrik >> >>>> >> >>>> On Thu, Aug 8, 2024 at 2:59 PM Omnia Ibrahim < >> o.g.h.ibra...@gmail.com> >> >>>> wrote: >> >>>> >> >>>>> Hi >> >>>>> Thinking of it more, I feel leaving customers to mirroring Kafka’s >> >>>>> internal topics `__.*` either by mistake or intently will be eating >> up >> >>>>> resources from both mm2 and the clusters for no actual value as the >> >>> data is >> >>>>> not that useful outside the context of the original cluster. >> >>>>> These topics usually are unbalanced and clients are the one the >> >> control >> >>>>> their sizes which will lead to uneven distribution of MM2 resource >> >>> usages. >> >>>>> And if MM2 missed the tombstone records from the source cluster >> which >> >> is >> >>>>> used to cleanup metadata then mirrored records of these topics >> >> wouldn’t >> >>> get >> >>>>> cleaned appropriately. >> >>>>> >> >>>>> Other options to solve this >> >>>>> >> >>>>> 1. introduce a way to specify MM2’s internal topics suffix >> >> (checkpoints, >> >>>>> offset sync, offsets, status, and config topics) and by default this >> >> is >> >>>>> `.internal`, in case users are allowed to have `.internal` in the >> >> topic >> >>>>> name then MM2 can be setup with different suffix. >> >>>>> `-internal` was set as a way to identify connect internal topics but >> >> as >> >>>>> far as I can see there is no topic set by MM2 itself with >> `-internal` >> >> so >> >>>>> maybe we can drop `-internal`. >> >>>>> Then for __.* these aren’t allowed to be mirrored. >> >>>>> >> >>>>> 2. other option is updating `ReplicationPolicy::isMM2InternalTopic` >> to >> >>>>> check if topic is starting with `mm2` and end with `.internal` as >> this >> >>> is >> >>>>> the pattern for default topics so far. >> >>>>>> default boolean isMM2InternalTopic(String topic) { >> >>>>>> // MM2 has a pattern for most topics that shouldn't be mirriord >> >>>>> which is mm2-*.*.internal: >> >>>>>> // "mm2-offset-syncs." + clusterAlias + ".internal" >> >>>>>> // "mm2-configs." + sourceAndTarget.source() + ".internal" >> >>>>>> // "mm2-status." + sourceAndTarget.source() + ".internal" >> >>>>>> // "mm2-offsets." + sourceAndTarget.source() + ".internal" >> >>>>>> // The only exception is checkpoints which is clusterAlias + >> >>>>> ".checkpoints.internal" >> >>>>>> return (topic.startsWith("mm2-") && topic.endsWith(".internal")) >> >> || >> >>>>> isCheckpointsTopic(topic); >> >>>>>> } >> >>>>> >> >>>>> As far as I can see this will cover offset-sync topic, all default >> >>> connect >> >>>>> topic names used by mirror and checkpoints which what the method >> >>> originally >> >>>>> is trying to target. This way we will have some sense of “fake >> >>> namespace” >> >>>>> without introducing one. >> >>>>> >> >>>>> Then `ReplicationPolicy::isInternalTopic` can be >> >>>>>> default boolean isInternalTopic(String topic) { >> >>>>>> boolean isKafkaInternalTopic = topic.startsWith("__") || >> >>>>> topic.startsWith("."); >> >>>>>> return isMM2InternalTopic(topic) || isKafkaInternalTopic; >> >>>>>> } >> >>>>> >> >>>>> `-internal` will be dropped as I can’t see any topic name with this >> >>> suffix >> >>>>> setup by MM2. >> >>>>> >> >>>>> Best, >> >>>>> Omnia >> >>>>> >> >>>>>> On 7 Aug 2024, at 17:26, Omnia Ibrahim <o.g.h.ibra...@gmail.com> >> >>> wrote: >> >>>>>> >> >>>>>> Hi All, >> >>>>>> >> >>>>>> I have a couple of notes to add >> >>>>>> >> >>>>>> OI-1.Regarding the proposed default regex in the KIP, >> >>>>> "__.*|\\..*|.*-internal|.*\\.internal", when users wish to change >> >> this, >> >>>>> wouldn’t they mostly need to keep the __.* part? Especially since __ >> >> is >> >>>>> primarily reserved for broker internal metadata topics. >> >>>>>> >> >>>>>> OI-2.Whether we choose a boolean or regex, I think we need to >> >> identify >> >>>>> if this will include replicating the internal metadata topics. The >> >>> metadata >> >>>>> within these topics are not that useful outside of their original >> >>> cluster, >> >>>>> so replicating them doesn’t add much value for anyone. >> >>>>>> The other problem with mistakenly replicating these topics is that >> >> they >> >>>>> are set up with a compacted policy, which means they don’t get >> >> truncated >> >>>>> naturally by retention. These metadata topics are cleaned in a >> special >> >>> way >> >>>>> to stay within the metadata retention milliseconds or expiration >> >>> configs. >> >>>>> As a result of this special situation, replicating these by mistake >> >>> might >> >>>>> put the destination cluster’s capacity at risk. >> >>>>>> Maybe we should be clear that any topic in Topic::INTERNAL_TOPICS >> is >> >>> not >> >>>>> going to be replicated? WDYT? >> >>>>>> >> >>>>>> Thanks >> >>>>>> Omnia >> >>>>>> >> >>>>>>> On 7 Aug 2024, at 15:46, Viktor Somogyi-Vass < >> >>>>> viktor.somo...@cloudera.com.INVALID> wrote: >> >>>>>>> >> >>>>>>> Hi all, >> >>>>>>> >> >>>>>>> 1. Yea, thinking of it more, I'm also concerned about the upgrade >> >> path >> >>>>> and >> >>>>>>> I think that it may not be worth adding a ton of code to work >> around >> >>>>> that >> >>>>>>> one time upgrade. It likely needs extra configs as well or >> >>>>> documentation to >> >>>>>>> instruct users, so it may not be worth the price. So let's add >> this >> >> to >> >>>>> the >> >>>>>>> rejected alternatives. >> >>>>>>> >> >>>>>>> 2. As Chris mentioned, I also feel that users may want to improve >> it >> >>>>> with >> >>>>>>> regexes and after a certain number of topics it may not be easily >> >>>>>>> manageable. And indeed, it feels a little bit like duplicating the >> >>>>>>> functionality of the topic filter. Overall I like the boolean >> config >> >>>>> best >> >>>>>>> as that prevents most of the unwanted replication cycles by >> default >> >>> with >> >>>>>>> the TopicFilter. >> >>>>>>> >> >>>>>>> Best, >> >>>>>>> Viktor >> >>>>>>> >> >>>>>>> On Wed, Aug 7, 2024 at 10:59 AM Patrik Marton >> >>>>> <pmar...@cloudera.com.invalid> >> >>>>>>> wrote: >> >>>>>>> >> >>>>>>>> Hi All, >> >>>>>>>> >> >>>>>>>> Thanks again for your feedback! >> >>>>>>>> >> >>>>>>>> My thoughts on Viktor's ideas: >> >>>>>>>> >> >>>>>>>> 1. This would definitely be a clean solution in the end, but I >> also >> >>>>> feel >> >>>>>>>> that the complexity of this solution might outweigh the benefits. >> >> If >> >>>>> you >> >>>>>>>> guys think, we can give it a thought and figure out what exactly >> >>> needs >> >>>>> to >> >>>>>>>> be changed and how the upgrade path would look like, but my >> opinion >> >>> is >> >>>>> that >> >>>>>>>> since it is a rare use-case, a simple solution might be better. >> >>>>>>>> >> >>>>>>>> 2. I initially wanted to propose something like this, but my >> >> concern >> >>>>> was >> >>>>>>>> that by adding another way a topic can be whitelisted, we would >> >> also >> >>>>>>>> increase the complexity and it would require more attention from >> >>>>> users. But >> >>>>>>>> I also think that this would probably be the safest solution of >> >> all, >> >>>>> and >> >>>>>>>> less risky than modifying regexes or using that simple boolean. >> >>>>>>>> Since the TopicFilter is not really the problem as its exclude >> list >> >>>>> can be >> >>>>>>>> modified, and it basically acts as another layer of safety, we >> >> could >> >>>>>>>> potentially add this new config (a list of topics) to the >> >>>>>>>> DefaultReplicationPolicy, and any topic listed in the new config >> >>> would >> >>>>> not >> >>>>>>>> be considered internal by the replication policy. This way we >> would >> >>> not >> >>>>>>>> duplicate the work of the Topic Filter, simply provide a way to >> >> tell >> >>>>> the >> >>>>>>>> DefaultReplicationPolicy that topics in this list seem internal, >> >> but >> >>>>> they >> >>>>>>>> are not. >> >>>>>>>> >> >>>>>>>> What do you think? >> >>>>>>>> >> >>>>>>>> Best, >> >>>>>>>> Patrik >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> On Tue, Aug 6, 2024 at 7:17 PM Chris Egerton >> >> <chr...@aiven.io.invalid >> >>>> >> >>>>>>>> wrote: >> >>>>>>>> >> >>>>>>>>> Hi Viktor, >> >>>>>>>>> >> >>>>>>>>> Nice to hear from you! My thoughts: >> >>>>>>>>> >> >>>>>>>>> 1. I initially liked this idea, but it doesn't seem like the >> costs >> >>> to >> >>>>> all >> >>>>>>>>> existing users outweigh the benefits that would be reaped by >> just >> >> a >> >>>>> few. >> >>>>>>>>> This is possibly because I'm assuming the upgrade path would be >> >>> pretty >> >>>>>>>>> gnarly, though; if we can make it really, really smooth, >> >> especially >> >>>>> for >> >>>>>>>>> existing users who don't need to be able to replicate topics >> >> ending >> >>> in >> >>>>>>>>> "-internal", then I think it'd be a viable option. >> >>>>>>>>> >> >>>>>>>>> 2. This is definitely safer than a simple boolean property, but >> >>>>> doesn't >> >>>>>>>> it >> >>>>>>>>> feel like we're duplicating the work of the topic filter at this >> >>>>> point, >> >>>>>>>>> except with something a little more cumbersome for users? People >> >>>>> already >> >>>>>>>>> have a way of include- and exclude-listing topics by regex with >> >> that >> >>>>>>>>> mechanism, and I'm worried we're not placing enough trust in the >> >>>>>>>>> flexibility and safety it already provides us and are adding >> >> layers >> >>> of >> >>>>>>>>> safety that are ultimately redundant. I might also be >> pessimistic >> >>> here >> >>>>>>>> but >> >>>>>>>>> I can't shake the feeling that we'd eventually get a KIP where >> >>> someone >> >>>>>>>>> states that it's too hard to add topics one-by-one and would >> like >> >> to >> >>>>>>>>> include them by regex, prefix match, or some other broader >> >>> mechanism. >> >>>>>>>>> >> >>>>>>>>> Do you find this convincing? I'm also curious what Greg and >> Patrik >> >>>>> think. >> >>>>>>>>> >> >>>>>>>>> Cheers, >> >>>>>>>>> >> >>>>>>>>> Chris >> >>>>>>>>> >> >>>>>>>>> On Tue, Aug 6, 2024 at 5:08 AM Viktor Somogyi-Vass >> >>>>>>>>> <viktor.somo...@cloudera.com.invalid> wrote: >> >>>>>>>>> >> >>>>>>>>>> Hi all, >> >>>>>>>>>> >> >>>>>>>>>> I think there are a couple of other alternatives to discuss. >> >>>>>>>>>> >> >>>>>>>>>> 1. We may just simply change the internal topic naming to >> >> something >> >>>>>>>> like >> >>>>>>>>>> "mm2internal" and provide upgrade functionality as well. In >> this >> >>>>>>>> scenario >> >>>>>>>>>> we wouldn't need to tinker with the prefixes or the naming >> >>> patterns, >> >>>>> we >> >>>>>>>>>> just simply free up the "internal" keyword and after upgrade >> >> allow >> >>>>> all >> >>>>>>>>>> topics with internal to be mirrored. The upgrade however could >> be >> >>>>> very >> >>>>>>>>>> complicated as we'd need to provide a way for users to move >> away >> >>> from >> >>>>>>>> the >> >>>>>>>>>> old topics (what happens in case of EOS, how do we make sure to >> >>>>>>>>> discontinue >> >>>>>>>>>> the old topics, what happens if the upgrade gets interrupted >> >> etc.). >> >>>>>>>> But I >> >>>>>>>>>> think this is worth considering and for the upgrade, we may >> take >> >> a >> >>>>> few >> >>>>>>>>>> ideas from the zk->kraft migration to make it easier. >> >>>>>>>>>> >> >>>>>>>>>> 2. A new config to force the replication of certain topics. In >> >>> this, >> >>>>>>>>> users >> >>>>>>>>>> would have to specify topics 1 by 1 and this would bypass the >> >>> filters >> >>>>>>>> and >> >>>>>>>>>> the replication policy. Since forcing is usually not the normal >> >> way >> >>>>> of >> >>>>>>>>>> doing things in general, users may be more willing to accept >> >>>>> unintended >> >>>>>>>>>> side effects. >> >>>>>>>>>> >> >>>>>>>>>> What do you all think? >> >>>>>>>>>> (If they aren't better than the original ideas, I'm absolutely >> >> good >> >>>>>>>> with >> >>>>>>>>>> putting these to the rejected alternatives but wanted to write >> it >> >>> all >> >>>>>>>>> out.) >> >>>>>>>>>> >> >>>>>>>>>> Best, >> >>>>>>>>>> Viktor >> >>>>>>>>>> >> >>>>>>>>>> On Mon, Aug 5, 2024 at 8:18 PM Chris Egerton >> >>> <chr...@aiven.io.invalid >> >>>>>> >> >>>>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> Hi Patrik, >> >>>>>>>>>>> >> >>>>>>>>>>> I think I owe you an apology. After mulling this over for a >> few >> >>> more >> >>>>>>>>>> hours >> >>>>>>>>>>> and going over the code base some more, it seems like your >> >> initial >> >>>>>>>>>> approach >> >>>>>>>>>>> was actually better in some ways. >> >>>>>>>>>>> >> >>>>>>>>>>> With the regex-based approach, it becomes possible for users >> to >> >>>>>>>>> configure >> >>>>>>>>>>> the DefaultReplicationPolicy class in a way that for some >> topic >> >> T, >> >>>>>>>>>>> isInternalTopic(T) returns false, but isCheckpointsTopic(T), >> >>>>>>>>>>> isHeartbeatsTopic(T), or even isMM2InternalTopic(T) return >> true. >> >>> It >> >>>>>>>>> seems >> >>>>>>>>>>> like this kind of inconsistency might cause confusion to users >> >> and >> >>>>>>>>>> possibly >> >>>>>>>>>>> even break things down the road depending on how we leverage >> >> these >> >>>>>>>>>> methods >> >>>>>>>>>>> in the future. >> >>>>>>>>>>> >> >>>>>>>>>>> With the initial approach, using a simple boolean to enable >> the >> >>>>>>>>>> replication >> >>>>>>>>>>> of what appear to be internal topics, there is some risk of >> >> cycles >> >>>>>>>> and >> >>>>>>>>>>> accidental replication of genuinely internal topics, but less >> >>> than I >> >>>>>>>>>>> believed at first. In order for an internal topic to be >> >>> replicated, >> >>>>>>>> not >> >>>>>>>>>>> only would that boolean property have to be explicitly >> modified >> >> by >> >>>>>>>>> users, >> >>>>>>>>>>> but they would also have to be using a topic filter that >> allows >> >>>>> those >> >>>>>>>>>>> topics to pass through as well. >> >>>>>>>>>>> >> >>>>>>>>>>> I think a simple boolean is better here, as it's easier for >> >> users >> >>> to >> >>>>>>>>>>> configure, and less prone to logical inconsistencies. My only >> >>>>>>>>> suggestions >> >>>>>>>>>>> are: >> >>>>>>>>>>> - Apply the property to the MirrorSourceConnector class >> instead >> >> of >> >>>>>>>> the >> >>>>>>>>>>> DefaultReplicationPolicy class, since we're not actually >> >> changing >> >>>>> the >> >>>>>>>>>>> definition of what is and isn't an internal topic; instead, >> >> we're >> >>>>>>>>> simply >> >>>>>>>>>>> allowing topics to be replicated even if they appear to be >> >>> internal >> >>>>>>>>>> topics >> >>>>>>>>>>> - Rename it to something less verbose like >> >>>>>>>> "replicate.internal.topics" >> >>>>>>>>>>> - Add a warning to the docs for the property explicitly >> pointing >> >>>>>>>> people >> >>>>>>>>>> to >> >>>>>>>>>>> the topic filter class, stating that they should still make >> sure >> >>>>> that >> >>>>>>>>>>> genuinely internal topics (instead of topics that >> >>>>>>>>>>> ReplicationPolicy::isInternalTopic returns true for but which >> >> are >> >>>>> not >> >>>>>>>>>>> actually internal) are excluded by the filter >> >>>>>>>>>>> >> >>>>>>>>>>> Patrik, Greg, how does that sound? >> >>>>>>>>>>> >> >>>>>>>>>>> Best, >> >>>>>>>>>>> >> >>>>>>>>>>> Chris >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> On Mon, Aug 5, 2024 at 4:34 AM Patrik Marton >> >>>>>>>>>> <pmar...@cloudera.com.invalid >> >>>>>>>>>>>> >> >>>>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> Hi Greg and Chris, >> >>>>>>>>>>>> >> >>>>>>>>>>>> Thank you for your feedbacks! >> >>>>>>>>>>>> >> >>>>>>>>>>>> I added the currently existing workarounds to the “Rejected >> >>>>>>>>>> Alternatives” >> >>>>>>>>>>>> section, and explained why I still think we need a separate >> >>>>>>>>> solution. I >> >>>>>>>>>>>> also added the previous proposal to the rejected >> alternatives, >> >>> as I >> >>>>>>>>>> agree >> >>>>>>>>>>>> with your concerns. >> >>>>>>>>>>>> I tried to come up with a bit different regex based solution, >> >> and >> >>>>>>>>>>> modified >> >>>>>>>>>>>> the KIP based on that. I think it is a more straightforward >> way >> >>> to >> >>>>>>>>> get >> >>>>>>>>>>> the >> >>>>>>>>>>>> desired behavior. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Let me know your thoughts. >> >>>>>>>>>>>> >> >>>>>>>>>>>> Thanks, >> >>>>>>>>>>>> Patrik >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>>> On 2024/07/30 18:57:18 Chris Egerton wrote: >> >>>>>>>>>>>>> Hi Patrick, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> I share Greg's concerns with the feature as it's currently >> >>>>>>>>> proposed. >> >>>>>>>>>> I >> >>>>>>>>>>>>> don't think I could vote for something unless it made >> >>> replication >> >>>>>>>>> of >> >>>>>>>>>>>>> genuinely internal topics and replication cycles impossible, >> >> or >> >>>>>>>> at >> >>>>>>>>>>> least >> >>>>>>>>>>>>> significantly less likely. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Best, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Chris >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> On Tue, Jul 30, 2024, 14:51 Greg Harris >> >> <gr...@aiven.io.invalid >> >>>> >> >>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>> >> >>>>>>>>>>>>>> Hi Patrik, >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Thanks for the KIP! >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Your motivation for this KIP is reasonable, because it is >> >>>>>>>>>> definitely >> >>>>>>>>>>>>>> possible for the ".internal" suffix to collide with real >> >>>>>>>> topics. >> >>>>>>>>> It >> >>>>>>>>>>>> would >> >>>>>>>>>>>>>> have been nice if the original design included some >> >>>>>>>> mm2-specific >> >>>>>>>>>>>> namespace >> >>>>>>>>>>>>>> like "mm2.internal" to lessen the likelihood of a >> collision. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> However, this is a problem that has numerous existing >> >>>>>>>>> workarounds: >> >>>>>>>>>>>>>> * Use a custom ReplicationPolicy and override the methods >> >> (for >> >>>>>>>>>>> existing >> >>>>>>>>>>>>>> workloads/mirror makers) >> >>>>>>>>>>>>>> * Use non-conflicting user topic names (for new user >> topics) >> >>>>>>>>>>>>>> * Use the replication.policy.separator to use a >> >> non-conflicting >> >>>>>>>>>>>> separator >> >>>>>>>>>>>>>> character (for new mirror maker setups) >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> And the feature as-described has significant risks >> attached: >> >>>>>>>>>>>>>> * May allow replication cycles and runaway replication >> >>>>>>>>>>>>>> * Mirrors internal topics that are unusable on the >> >> destination >> >>>>>>>>>>> cluster >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> While these risks can be accounted for if a user is >> attentive >> >>>>>>>>> (e.g. >> >>>>>>>>>>>> when >> >>>>>>>>>>>>>> they're writing their own ReplicationPolicy) it is not a >> >>>>>>>>> risk-free >> >>>>>>>>>>>>>> configuration that composes well with other out-of-the-box >> >>>>>>>>>>>> configurations. >> >>>>>>>>>>>>>> For example, someone may expect to take their existing >> >>>>>>>>>> configuration, >> >>>>>>>>>>>> turn >> >>>>>>>>>>>>>> on this new option, and expect reasonable behavior, which >> >> isn't >> >>>>>>>>>>> always >> >>>>>>>>>>>>>> guaranteed. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> If you're still interested in this feature, please >> reference >> >>>>>>>> the >> >>>>>>>>>>>> existing >> >>>>>>>>>>>>>> workarounds and include them as rejected alternatives so we >> >> can >> >>>>>>>>>> know >> >>>>>>>>>>>> where >> >>>>>>>>>>>>>> the existing solutions fall short. >> >>>>>>>>>>>>>> We'd also have to figure out if and how the risks I >> mentioned >> >>>>>>>>> could >> >>>>>>>>>>> be >> >>>>>>>>>>>>>> mitigated. >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> Thanks, >> >>>>>>>>>>>>>> Greg >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>> On Tue, Jul 30, 2024 at 5:49 AM Patrik Marton >> >>>>>>>>>>>> <pmar...@cloudera.com.invalid >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> wrote: >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> Hi Team, >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> I would like to start a discussion on KIP-1074: Make the >> >>>>>>>>>>> replication >> >>>>>>>>>>>> of >> >>>>>>>>>>>>>>> internal topics configurable >> >>>>>>>>>>>>>>> < >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>> >> >>> >> >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1074%3A+Make+the+replication+of+internal+topics+configurable >> >>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> The goal of this KIP is to make it easier to replicate >> >> topics >> >>>>>>>>>> that >> >>>>>>>>>>>> seem >> >>>>>>>>>>>>>>> internal (for example ending in .internal suffix), but are >> >>>>>>>> just >> >>>>>>>>>>>> normal >> >>>>>>>>>>>>>>> business topics created by the user. >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> I appreciate any feedback and recommendations! >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>> Thanks! >> >>>>>>>>>>>>>>> Patrik >> >>>>>>>>>>>>>>> >> >>>>>>>>>>>>>> >> >>>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>> >> >>>>> >> >>>>> >> >>> >> >>> >> >> >> >>