Hi Anton,

I think the tool as-is is mostly redundant, since we already have the kafka-topics CLI. Maybe instead of accepting command-line parameters, it could just take a properties file that recognizes the exact same syntax we use in the worker config?
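
As a rough illustration, assuming the tool reuses the existing distributed worker property names for the internal topics, the input file could look something like the sketch below. The topic names and values are placeholders, and nothing here is part of the KIP yet:

```properties
# Hypothetical input file for the proposed tool; all values are placeholders.
bootstrap.servers=localhost:9092

offset.storage.topic=connect-offsets
offset.storage.partitions=25
offset.storage.replication.factor=3

# The config topic always has a single partition, so there is no
# partitions property for it.
config.storage.topic=connect-configs
config.storage.replication.factor=3

status.storage.topic=connect-status
status.storage.partitions=5
status.storage.replication.factor=3
```

That way the same file could be handed to the tool first and to the workers afterwards.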

Cheers,

Chris

On Thu, Sep 18, 2025, 05:10 Anton Liauchuk <[email protected]> wrote:

> Hi,
>
> I’ve added the CLI description to the proposal based on our discussion:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1209%3A+Add+configuration+to+control+internal+topic+creation+in+Kafka+Connect#KIP1209:AddconfigurationtocontrolinternaltopiccreationinKafkaConnect-InternalTopicsManagementScript
>
> On Mon, Sep 15, 2025 at 12:17 PM Anton Liauchuk <[email protected]> wrote:
>
> > Thanks for the feedback!
> >
> >> Just to make sure we're all on the same page about the problem space
> >> here--it seems like the only serious impact here could come from the
> >> offsets topic. Creating a new, empty config or status topic should not
> >> lead to any problems if the existing one is accidentally deleted. Is
> >> this correct?
> >
> > Yes, that’s correct - the offsets topic is the most critical case.
> >
> >> An alternative flow could involve introducing a new CLI tool for
> >> creating Connect's internal topics, with the intention that it could be
> >> used before bringing up any workers in the cluster, and then all
> >> workers in the cluster could be configured with internal topic creation
> >> disabled from day 1. We could also recommend the use of that tool in
> >> the error message if internal topic creation is disabled and a topic
> >> isn't found.
> >
> > I agree - it makes total sense to have such a tool, which could be
> > especially useful before creating a new Kafka Connect cluster. Should I
> > update the KIP to include this proposed command-line tool?
> >
> > On Fri, Sep 12, 2025 at 10:21 PM Chris Egerton <[email protected]> wrote:
> >
> >> Hi Anton,
> >>
> >> I agree that, while rare, the case you've mentioned where a missing
> >> offsets topic leads to reprocessing of potentially massive amounts of
> >> data is pretty nasty and it'd be nice to try to prevent it.
> >>
> >> Just to make sure we're all on the same page about the problem space
> >> here--it seems like the only serious impact here could come from the
> >> offsets topic. Creating a new, empty config or status topic should not
> >> lead to any problems if the existing one is accidentally deleted. Is
> >> this correct?
> >>
> >> Separately, it seems like the intended flow is a bit awkward: users
> >> bring up a worker which automatically creates the internal topics,
> >> probably bring up more workers to flesh out the cluster, then have to
> >> edit the configs for every worker in the cluster in order to disable
> >> internal topic creation. This would be okay in the case of a cluster
> >> migration, but it'd be somewhat cumbersome in the general case where a
> >> cluster has been brought up and, with no plans of migration, users want
> >> to prevent accidental deletion of the offsets topic from causing a ton
> >> of data to be reprocessed.
> >>
> >> An alternative flow could involve introducing a new CLI tool for
> >> creating Connect's internal topics, with the intention that it could be
> >> used before bringing up any workers in the cluster, and then all
> >> workers in the cluster could be configured with internal topic creation
> >> disabled from day 1. We could also recommend the use of that tool in
> >> the error message if internal topic creation is disabled and a topic
> >> isn't found.
> >>
> >> Thoughts?
> >>
> >> On 2025/09/12 17:11:12 Anton Liauchuk wrote:
> >> > Hi,
> >> >
> >> > Thanks for the feedback!
> >> >
> >> > I agree this should be a rare case, but the potential impact could be
> >> > catastrophic in situations where many long-running connectors are
> >> > involved, as they might start reprocessing everything due to
> >> > misconfiguration. Such misconfiguration can happen because of invalid
> >> > topic names or mistakes during migration to a new cluster. A recent
> >> > example of this is the migration to a new Kafka cluster on AWS, the
> >> > MSK cluster must be recreated to migrate from ZooKeeper to KRaft.
> >> >
> >> > I’ve updated the KIP and PR:
> >> >
> >> > - Renamed the config to *internal.topics.creation.enable*
> >> > - Extended the error messages. Example:
> >> >
> >> >   *Topic 'non-existent-offset' specified via the
> >> >   'offset.storage.topic' property is missing. The config
> >> >   'internal.topics.creation.enable' is set to 'false', so automatic
> >> >   creation of internal topics is disabled. Either enable automatic
> >> >   creation or create the topics manually before starting the worker.*
> >> >
> >> > Please take another look and share your thoughts
> >> >
> >> > On Fri, Sep 12, 2025 at 6:33 PM Mickael Maison <[email protected]> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Thanks for the KIP.
> >> > >
> >> > > To be honest I'm unsure about this feature. I find the motivation
> >> > > light, this targets a very specific error scenario.
> >> > > If we set the new configuration to false, you then must manually
> >> > > create the topics before the first start. Also all the internal
> >> > > topic configurations (offset.storage.partitions, etc) become unused.
> >> > >
> >> > > Then regarding the proposed configuration, have you considered
> >> > > internal.topics.creation.enable?
> >> > > Will the error message also specify the require configurations for
> >> > > the missing topics?
> >> > >
> >> > > Thanks,
> >> > > Mickael
> >> > >
> >> > > On Wed, Sep 10, 2025 at 10:38 AM Andrei Rudkouski <[email protected]> wrote:
> >> > > >
> >> > > > Hi,
> >> > > >
> >> > > > +1 (non-binding)
> >> > > >
> >> > > > Thanks for the KIP
> >> > > >
> >> > > > Best regards,
> >> > > > Andrei Rudkouski
> >> > > >
> >> > > > On 2025/09/09 09:07:14 Anton Liauchuk wrote:
> >> > > > > hi
> >> > > > >
> >> > > > > I’ve added an integration test for this new config:
> >> > > > >
> >> > > > > https://github.com/apache/kafka/pull/20384/files#diff-0f86ed224068b85289d9214ae1dad88865d159b8e3658a4231544de36f429bd5R248
> >> > > > >
> >> > > > > Please take a look at the KIP - it’s a really small configuration.
> >> > > > >
> >> > > > > Once we get the required number of votes, the PR can be merged.
> >> > > > >
> >> > > > > On Thu, Sep 4, 2025 at 9:49 PM Hector Geraldino (BLOOMBERG/ 919 3RD A) <[email protected]> wrote:
> >> > > > >
> >> > > > > > +1 (non-binding)
> >> > > > > >
> >> > > > > > Thanks for the KIP
> >> > > > > >
> >> > > > > > From: [email protected] At: 09/04/25 13:24:06 UTC-4:00
> >> > > > > > To: [email protected]
> >> > > > > > Subject: [VOTE] KIP-1209: Add configuration to control internal
> >> > > > > > topic creation in Kafka Connect
> >> > > > > >
> >> > > > > > hi
> >> > > > > >
> >> > > > > > I would like to start a vote on *KIP-1209: Add configuration to
> >> > > > > > control internal topic creation in Kafka Connect*
> >> > > > > >
> >> > > > > > KIP: https://cwiki.apache.org/confluence/x/GAq2Fg
> >> > > > > > Discussion thread:
> >> > > > > > https://lists.apache.org/thread/l89rzzm0jd4ml7xx66mkxfztst281so7
