Steven Schlansker created KAFKA-19165:
-----------------------------------------
Summary: PartitionLeaderStrategy has very high error rate during
topic initialization
Key: KAFKA-19165
URL: https://issues.apache.org/jira/browse/KAFKA-19165
Project: Kafka
Issue Type: Improvement
Components: clients, streams
Affects Versions: 3.9.0
Reporter: Steven Schlansker
We implemented a Kafka Streams app.
Some integration tests run a Kafka broker and then connect the Streams app to
it, to ensure our application functions as desired.
When initializing each test case, all the Streams topics must be created. This
is expected, since each integration test runs its own "copy" of the app
(with a different `application.id`).
The application is *very* chatty about this process.
We see hundreds of thousands of errors like:
{code:java}
2025-04-16T17:17:33.841Z [kafka-admin-client-thread |
search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient
clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
Received unknown topic error for topic
search-indexing-2025-04-16-r67a-notification-group-eoc-merge-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server
does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread |
search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient
clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
Received unknown topic error for topic
search-indexing-2025-04-16-r67a-current-time-store-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server
does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread |
search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient
clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
Received unknown topic error for topic
search-indexing-2025-04-16-r67a-notification-group-eoc-merge-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server
does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread |
search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient
clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin]
Received unknown topic error for topic
search-indexing-2025-04-16-r67a-current-time-store-changelog
{code}
For a single topic, we get more than 6000 errors in just a few seconds. The log
file ends up being many megabytes of this, to the point where some less-powerful
text editors struggle to even render the file.
Having so many errors that are in fact expected and non-actionable harms
observability of the Kafka Streams platform. Would it be sensible to suppress
"expected" exceptions of this type while topics are being created? Or at least
rate-limit them, for example by printing every 10 seconds "Waiting for topics
[..., ...] to be created for 30s..."?
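To make the rate-limiting idea concrete, here is a rough sketch of what I have in mind. It is not based on the actual PartitionLeaderStrategy code: the class name `PendingTopicReporter`, its methods, and the exact message format are all hypothetical. It collapses every "unknown topic" response into at most one summary line per interval.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of the Kafka client: collapses repeated
// "unknown topic" responses into at most one summary line per interval.
final class PendingTopicReporter {
    private final long intervalNanos;
    private final SortedSet<String> pending = new TreeSet<>();
    private boolean started;
    private long firstSeenNanos;
    private long lastReportNanos;
    private long collapsed;

    PendingTopicReporter(Duration interval) {
        this.intervalNanos = interval.toNanos();
    }

    // Records one "unknown topic" response. Returns a summary line once the
    // interval has elapsed since the last summary, otherwise an empty Optional.
    synchronized Optional<String> onUnknownTopic(String topic, long nowNanos) {
        if (!started) {
            started = true;
            firstSeenNanos = nowNanos;
            lastReportNanos = nowNanos;
        }
        pending.add(topic);
        collapsed++;
        if (nowNanos - lastReportNanos < intervalNanos) {
            return Optional.empty();
        }
        long waitedSeconds = (nowNanos - firstSeenNanos) / 1_000_000_000L;
        String line = "Waiting for topics " + pending + " to be created for "
                + waitedSeconds + "s (" + collapsed + " responses collapsed)";
        lastReportNanos = nowNanos;
        collapsed = 0;
        return Optional.of(line);
    }

    // Stops reporting a topic once the broker knows about it.
    synchronized void onTopicFound(String topic) {
        pending.remove(topic);
    }
}
```

A caller would pass `System.nanoTime()` as `nowNanos` and log the returned line, if any, at INFO or WARN instead of logging each exception at ERROR.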
I also wonder if the admin client should rate limit how often it pings the
broker, to reduce broker load in this case.
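For the polling side, something like capped exponential backoff with jitter between lookups is what I would expect. The sketch below only illustrates that idea; `LookupBackoff` and its parameters are hypothetical, and I have not checked how the admin client currently schedules these retries.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not Kafka's implementation: capped exponential
// backoff with random jitter between repeated metadata lookups.
final class LookupBackoff {
    private final long initialMs;
    private final long maxMs;
    private final double jitter;

    LookupBackoff(long initialMs, long maxMs, double jitter) {
        if (initialMs <= 0 || maxMs < initialMs || jitter < 0.0 || jitter >= 1.0) {
            throw new IllegalArgumentException("invalid backoff parameters");
        }
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.jitter = jitter;
    }

    // Delay before jitter: doubles with each zero-based attempt, capped at maxMs.
    long baseDelayMs(int attempt) {
        long delay = initialMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }

    // Delay with jitter applied: base * (1 +/- jitter), so that many clients
    // do not retry in lockstep.
    long delayMs(int attempt) {
        long base = baseDelayMs(attempt);
        if (jitter == 0.0) {
            return base;
        }
        double factor = 1.0 + ThreadLocalRandom.current().nextDouble(-jitter, jitter);
        return Math.round(base * factor);
    }
}
```

With `new LookupBackoff(100, 1000, 0.2)`, the un-jittered delays would be 100, 200, 400, 800 and then 1000 ms for every later attempt, which would cap a single topic at roughly one lookup per second instead of thousands.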