[
https://issues.apache.org/jira/browse/KAFKA-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Chen resolved KAFKA-19596.
-------------------------------
Fix Version/s: 4.2.0
Resolution: Fixed
> Log auto topic creation failures more visibly
> ---------------------------------------------
>
> Key: KAFKA-19596
> URL: https://issues.apache.org/jira/browse/KAFKA-19596
> Project: Kafka
> Issue Type: Improvement
> Components: logging
> Reporter: Rob Young
> Assignee: Rob Young
> Priority: Trivial
> Fix For: 4.2.0
>
>
> Hi, I was playing with Share groups on the 4.1.0-rc2 and found it a little
> opaque to detect a failure to auto create the `__share_group_state` topic due
> to the replication factor not being satisfiable.
> I start up a cluster like this (taken from the dockerhub instructions):
> {code:java}
> podman run --rm \
> -p 9092:9092 \
> -e KAFKA_NODE_ID=1 \
> -e KAFKA_PROCESS_ROLES=broker,controller \
> -e KAFKA_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093 \
> -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
> -e KAFKA_CONTROLLER_LISTENER_NAMES=CONTROLLER \
> -e
> KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
> \
> -e KAFKA_CONTROLLER_QUORUM_VOTERS=1@localhost:9093 \
> -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
> -e KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 \
> -e KAFKA_TRANSACTION_STATE_LOG_MIN_ISR=1 \
> -e KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0 \
> -e KAFKA_NUM_PARTITIONS=3 \
> apache/kafka:4.1.0-rc2{code}
> and then run the commands:
> {code:java}
> robeyoun:kafka_2.13-4.1.0$ bin/kafka-features.sh --bootstrap-server
> localhost:9092 upgrade --feature share.version=1
> share.version was upgraded to 1.
> robeyoun:kafka_2.13-4.1.0$ bin/kafka-topics.sh --create --topic my_topic
> --bootstrap-server localhost:9092
> WARNING: Due to limitations in metric names, topics with a period ('.') or
> underscore ('_') could collide. To avoid issues it is best to use either, but
> not both.
> Created topic my_topic.
> robeyoun:kafka_2.13-4.1.0$ bin/kafka-console-share-consumer.sh
> --bootstrap-server localhost:9092 --topic quickstart-events
> [2025-08-12 09:50:13,130] WARN Share groups and KafkaShareConsumer are part
> of a preview feature introduced by KIP-932, and are not recommended for use
> in production.
> (org.apache.kafka.clients.consumer.internals.ShareConsumerDelegateCreator)
> [2025-08-12 09:50:13,357] WARN [ShareConsumer
> clientId=console-share-consumer, groupId=console-share-consumer] The metadata
> response from the cluster reported a recoverable issue with correlation id 20
> : {quickstart-events=UNKNOWN_TOPIC_OR_PARTITION}
> (org.apache.kafka.clients.NetworkClient)
> {code}
> the consumer now sits there, appearing to be consuming happily, but the
> broker is emitting a stream of logs like:
> {code:java}
> [2025-08-11 21:51:03,648] INFO Sent auto-creation request for
> Set(__share_group_state) to the active controller.
> (kafka.server.DefaultAutoTopicCreationManager)
> [2025-08-11 21:51:03,648] WARN Received retriable error in find coordinator
> for InitializeStateHandler using key
> SharePartitionKey{groupId=console-share-consumer,
> topicIdPartition=sI9vYBBGSKW_3BfG9ZJWhg:null-1}: The coordinator is not
> available.
> (org.apache.kafka.server.share.persister.PersisterStateManager$InitializeStateHandler){code}
> So there's a clue there that it's repeatedly trying to auto-create the topic.
> the underlying problem is the default replication factor of 3 for the
> __share_group_state topic, but I have to crank up the log level to DEBUG
> (setting -e KAFKA_LOG4J_LOGGERS="kafka=DEBUG") to see the cause:
> {code:java}
> [2025-08-11 21:53:58,343] INFO Sent auto-creation request for
> Set(__share_group_state) to the active controller.
> (kafka.server.DefaultAutoTopicCreationManager)
> [2025-08-11 21:53:58,343] WARN Received retriable error in find coordinator
> for InitializeStateHandler using key
> SharePartitionKey{groupId=console-share-consumer,
> topicIdPartition=8kjLkvNzSCy7geIIriJVCQ:null-2}: The coordinator is not
> available.
> (org.apache.kafka.server.share.persister.PersisterStateManager$InitializeStateHandler)
> [2025-08-11 21:53:58,345] DEBUG
> [broker-1-to-controller-forwarding-channel-manager]: Request
> CreateTopicsRequestData(topics=[CreatableTopic(name='__share_group_state',
> numPartitions=50, replicationFactor=3, assignments=[],
> configs=[CreatableTopicConfig(name='compression.type', value='producer'),
> CreatableTopicConfig(name='cleanup.policy', value='delete'),
> CreatableTopicConfig(name='min.insync.replicas', value='2'),
> CreatableTopicConfig(name='segment.bytes', value='104857600'),
> CreatableTopicConfig(name='retention.ms', value='-1')])], timeoutMs=30000,
> validateOnly=false) received ClientResponse(receivedTimeMs=1754949238345,
> latencyMs=2, disconnected=false, timedOut=false,
> requestHeader=RequestHeader(apiKey=CREATE_TOPICS, apiVersion=7, clientId=1,
> correlationId=22, headerVersion=2),
> responseBody=CreateTopicsResponseData(throttleTimeMs=0,
> topics=[CreatableTopicResult(name='__share_group_state',
> topicId=AAAAAAAAAAAAAAAAAAAAAA, errorCode=38, errorMessage='Unable to
> replicate the partition 3 time(s): The target replication factor of 3 cannot
> be reached because only 1 broker(s) are registered.', topicConfigErrorCode=0,
> numPartitions=-1, replicationFactor=-1, configs=[])]))
> (kafka.server.NodeToControllerRequestThread)
> [2025-08-11 21:53:58,345] DEBUG Cleared inflight topic creation state for
> HashMap(__share_group_state -> CreatableTopic(name='__share_group_state',
> numPartitions=50, replicationFactor=3, assignments=[],
> configs=[CreatableTopicConfig(name='compression.type', value='producer'),
> CreatableTopicConfig(name='cleanup.policy', value='delete'),
> CreatableTopicConfig(name='min.insync.replicas', value='2'),
> CreatableTopicConfig(name='segment.bytes', value='104857600'),
> CreatableTopicConfig(name='retention.ms', value='-1')]))
> (kafka.server.DefaultAutoTopicCreationManager)
> [2025-08-11 21:53:58,345] DEBUG Auto topic creation completed for
> Set(__share_group_state) with response
> CreateTopicsResponseData(throttleTimeMs=0,
> topics=[CreatableTopicResult(name='__share_group_state',
> topicId=AAAAAAAAAAAAAAAAAAAAAA, errorCode=38, errorMessage='Unable to
> replicate the partition 3 time(s): The target replication factor of 3 cannot
> be reached because only 1 broker(s) are registered.', topicConfigErrorCode=0,
> numPartitions=-1, replicationFactor=-1, configs=[])]).
> (kafka.server.DefaultAutoTopicCreationManager)
> {code}
> The point where we log the auto creation outcome has two cases where it warn
> logs, but then this case is being handled by a catch all debug log.
> [https://github.com/apache/kafka/blob/18045c6ac30921503deffbef1744bb365dc599fb/core/src/main/scala/kafka/server/AutoTopicCreationManager.scala#L141]
> If there's an error code set, maybe it should log at warn like the other
> cases? I'm happy to be assigned to this.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)