[
https://issues.apache.org/jira/browse/KAFKA-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Egerton resolved KAFKA-16838.
-----------------------------------
Fix Version/s: 3.9.0
Resolution: Fixed
> Kafka Connect loads old tasks from removed connectors
> -----------------------------------------------------
>
> Key: KAFKA-16838
> URL: https://issues.apache.org/jira/browse/KAFKA-16838
> Project: Kafka
> Issue Type: Bug
> Components: connect
> Affects Versions: 3.5.1, 3.6.1, 3.8.0
> Reporter: Sergey Ivanov
> Assignee: Chris Egerton
> Priority: Major
> Fix For: 3.9.0
>
>
> Hello,
> When creating connector we faced an error from one of our ConfigProviders
> about not existing resource, but we didn't try to set that resource as config
> value:
> {code:java}
> [2024-05-24T12:08:24.362][ERROR][request_id= ][tenant_id=
> ][thread=DistributedHerder-connect-1-1][class=org.apache.kafka.connect.runtime.distributed.DistributedHerder][method=lambda$reconfigureConnectorTasksWithExponentialBackoffRetries$44]
> [Worker clientId=connect-1, groupId=streaming-service_streaming_service]
> Failed to reconfigure connector's tasks (local-file-sink), retrying after
> backoff.
> org.apache.kafka.common.config.ConfigException: Could not read properties
> from file /opt/kafka/provider.properties
> at
> org.apache.kafka.common.config.provider.FileConfigProvider.get(FileConfigProvider.java:98)
> at
> org.apache.kafka.common.config.ConfigTransformer.transform(ConfigTransformer.java:103)
> at
> org.apache.kafka.connect.runtime.WorkerConfigTransformer.transform(WorkerConfigTransformer.java:58)
> at
> org.apache.kafka.connect.storage.ClusterConfigState.taskConfig(ClusterConfigState.java:181)
> at
> org.apache.kafka.connect.runtime.AbstractHerder.taskConfigsChanged(AbstractHerder.java:804)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.publishConnectorTaskConfigs(DistributedHerder.java:2089)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.reconfigureConnector(DistributedHerder.java:2082)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.reconfigureConnectorTasksWithExponentialBackoffRetries(DistributedHerder.java:2025)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$null$42(DistributedHerder.java:2038)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.runRequest(DistributedHerder.java:2232)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:470)
> at
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:371)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base/java.lang.Thread.run(Thread.java:840)
> {code}
> It looked like there already was connector with the same name and same
> config, +but it wasn't.+
> After investigation we found out, that few months ago on that cloud there was
> the connector with the same name and another value for config provider. Then
> it was removed, but by some reason when we tried to create connector with the
> same name months ago AbstractHerder tried to update tasks from our previous
> connector
> As an example I used FileConfigProvider, but actually any ConfigProvider is
> acceptable which could raise exception if something wrong with config (like
> result doesn't exist).
> We continued our investigation and found the issue
> https://issues.apache.org/jira/browse/KAFKA-7745 that says Connect doesn't
> send tombstone message for *commit* and *task* records in the config topic of
> Kafka Connect. As we remember, the config topic is `compact` *that means
> commit and tasks are are always stored* (months, years after connector
> removing) while tombstones for connector messages are cleaned with
> {{delete.retention.ms}} property. That impacts further connector creations
> with the same name.
> We didn't investigate reasons in ConfigClusterStore and how to avoid that
> issue, because would {+}like to ask{+}, probably it's better to fix
> KAFKA-7745 and send tombstones for commit and task messages as connect does
> for connector and target messages?
> In the common way the TC looks like:
> # Create connector with config provider to resource1
> # Remove connector
> # Remove resouce1
> # Wait 2-4 weeks :) (until config topic being compacted and tombstone
> messages about config and target connector are removed)
> # Try to create connector with the same name and config provider to resource2
> I can provide synthetic TC to reproduce that error if needed.
>
> This is linked with https://issues.apache.org/jira/browse/KAFKA-16837 but
> it's not the same issue.
> As WA we can remove connector one more time, to get *tombstone* message for
> connector in config topic.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)