Martin Andersson created KAFKA-19697:
----------------------------------------
Summary: NPE Cannot invoke
org.apache.kafka.connect.runtime.ConnectMetrics$MetricGroup.close()
Key: KAFKA-19697
URL: https://issues.apache.org/jira/browse/KAFKA-19697
Project: Kafka
Issue Type: Bug
Components: connect
Affects Versions: 4.0.0
Environment: Kafka connect cluster with 20 workers running in
kubernetes, on homebrewed kafka images built from
eclipse-temurin:21-jre-alpine-3.21
Reporter: Martin Andersson
Several tasks in a sink connector in a long-running connect cluster broke
spontaneously with the following stacktrace:
{code:java}
java.lang.NullPointerException: Cannot invoke "org.apache.kafka.connect.runtime.ConnectMetrics$MetricGroup.close()" because the return value of "java.util.concurrent.ConcurrentMap.get(Object)" is null
at org.apache.kafka.connect.runtime.Worker$ConnectorStatusMetricsGroup.recordTaskRemoved(Worker.java:2333)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:707)
at org.apache.kafka.connect.runtime.Worker.startSinkTask(Worker.java:568)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:2009)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$getTaskStartingCallable$39(DistributedHerder.java:2059)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
{code}
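The exception message suggests that a value returned by ConcurrentMap.get(Object) is dereferenced without a null check inside recordTaskRemoved. The following is only a minimal sketch of that general pattern and of one null-safe alternative. It is not the Kafka Connect source: the class MetricGroupRegistry, its nested Group type, and all method names are hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for illustration; NOT the actual Kafka Connect code.
class MetricGroupRegistry {
    // Minimal stand-in for a closeable metric group.
    static final class Group implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    private final ConcurrentMap<String, Group> groups = new ConcurrentHashMap<>();

    // Registers a group for the given id if none exists yet.
    public void recordTaskAdded(String taskId) {
        groups.computeIfAbsent(taskId, id -> new Group());
    }

    // Unsafe variant: get() returns null for an unknown id, so close()
    // throws a NullPointerException like the one in the stack trace.
    public void removeUnsafe(String taskId) {
        groups.get(taskId).close();
        groups.remove(taskId);
    }

    // Null-safe variant: remove() returns the previous value or null,
    // so the group is closed only when it was actually registered.
    public boolean removeSafe(String taskId) {
        Group group = groups.remove(taskId);
        if (group == null) {
            return false;
        }
        group.close();
        return true;
    }
}
```

With this sketch, calling removeUnsafe for an id that was never added (or was already removed) fails in the same way as above, while removeSafe simply returns false. Whether a guard like this is the right fix for the reported bug, or whether the missing map entry itself is the real problem, is not something the stack trace alone can answer.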
Restarting the failed tasks through the REST API led to another task failure with
the following stacktrace:
{code:java}
java.lang.NullPointerException: Cannot invoke "java.util.Map.size()" because "inputMap" is null
at org.apache.kafka.common.utils.Utils.castToStringObjectMap(Utils.java:1476)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:112)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:146)
at org.apache.kafka.connect.runtime.TaskConfig.<init>(TaskConfig.java:51)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:661)
at org.apache.kafka.connect.runtime.Worker.startSinkTask(Worker.java:568)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:2009)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.lambda$getTaskStartingCallable$39(DistributedHerder.java:2059)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
{code}
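This second trace indicates that a null map reached the TaskConfig constructor and was only detected deep inside castToStringObjectMap. As a hedged illustration of failing earlier with a more descriptive error, here is a small sketch; TaskConfigGuard and toStringObjectMap are hypothetical names invented for this example and do not exist in Kafka, and the sketch does not explain why the task configuration was null in the first place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; NOT part of Kafka.
class TaskConfigGuard {
    // Copies the task properties into a String -> Object map, rejecting a
    // missing (null) configuration with a message that names the task.
    public static Map<String, Object> toStringObjectMap(String taskId, Map<String, String> props) {
        if (props == null) {
            throw new IllegalStateException("No configuration found for task " + taskId);
        }
        return new HashMap<>(props);
    }
}
```

A check of this kind would turn the opaque "inputMap is null" failure into an error that identifies the affected task.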
The failed tasks did not show up in the _connector-failed-task-count_ metric
(or in the _restarting/paused/failed_ task metrics), but they did disappear
from the _connector-running-task-count_ metric.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)