Pranav Rathi created KAFKA-20533:
------------------------------------
Summary: ShareFetch returns UNKNOWN_SERVER_ERROR when topic is
deleted during active share consumption
Key: KAFKA-20533
URL: https://issues.apache.org/jira/browse/KAFKA-20533
Project: Kafka
Issue Type: Bug
Reporter: Pranav Rathi
While working on the librdkafka KIP-932 share consumer implementation, I found
that when a topic is deleted while a share consumer is actively fetching from
it, the broker returns {{UNKNOWN_SERVER_ERROR}} (-1) as the top-level error in
the ShareFetchResponse.
I enabled DEBUG logging on the broker for {{kafka.server.KafkaApis}} and
{{org.apache.kafka.server.share}} and found that the broker internally
identifies the correct exception, {{UnknownTopicOrPartitionException}}, but
by the time the response reaches the wire, the error code has been replaced
with {{UNKNOWN_SERVER_ERROR}}. The expected behavior is for the broker to
return the per-partition error code it already identifies internally
({{UNKNOWN_TOPIC_OR_PARTITION}} or {{UNKNOWN_TOPIC_ID}}) so that
clients can identify the cause and respond accordingly.
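To illustrate why the specific code matters to a client, here is a minimal sketch of an error-handling policy. The function name and the returned action strings are hypothetical and are not librdkafka's actual logic. The numeric value 100 for {{UNKNOWN_TOPIC_ID}} is taken from the Kafka protocol error table. The other two values are the ones quoted in this report.

```python
# Kafka protocol error codes relevant to this report.
UNKNOWN_SERVER_ERROR = -1
UNKNOWN_TOPIC_OR_PARTITION = 3
UNKNOWN_TOPIC_ID = 100


def classify_share_fetch_error(error_code: int) -> str:
    """Hypothetical client-side policy for a ShareFetch error code."""
    if error_code in (UNKNOWN_TOPIC_OR_PARTITION, UNKNOWN_TOPIC_ID):
        # The cause is known: the topic no longer exists, so the client
        # can refresh metadata and stop fetching from the partition.
        return "refresh_metadata_and_drop_partition"
    if error_code == UNKNOWN_SERVER_ERROR:
        # The cause is opaque: the client can only back off and retry.
        return "backoff_and_retry"
    return "other"
```

With the current broker behavior, a client following a policy like this only ever takes the {{backoff_and_retry}} branch, although the broker already knows the topic is gone.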
h3. Reproduction
# Create topic {{demo-1}} with 1 partition
# Start a producer sending messages to {{demo-1}} at 1 msg/s
# Start a share consumer subscribed to {{demo-1}} and confirm that it receives messages normally
# Stop the producer
# Delete topic {{demo-1}}
The share consumer starts receiving {{UNKNOWN_SERVER_ERROR}} almost immediately
(about 350 ms after the deletion in the run below) and at a very high rate
(roughly 1,500 responses per second).
h3. Observed Timeline
||Event||Timestamp||
|Topic deleted (log renamed and scheduled for deletion)|15:37:31.800|
|First {{UNKNOWN_SERVER_ERROR}} received by client|15:37:32.150|
|Last {{UNKNOWN_SERVER_ERROR}} received by client|15:37:34.231|
|Total {{UNKNOWN_SERVER_ERROR}} responses|3,187|
The errors lasted approximately 2 seconds until the share session stopped
including the partition.
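For reference, the delay, duration, and error rate implied by the timeline can be worked out as follows. This is only arithmetic on the timestamps in the table, and the variable names are illustrative.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Timestamps copied from the observed timeline.
topic_deleted = datetime.strptime("15:37:31.800", FMT)
first_error = datetime.strptime("15:37:32.150", FMT)
last_error = datetime.strptime("15:37:34.231", FMT)
total_errors = 3187

# Time from topic deletion to the first error seen by the client.
delay_s = (first_error - topic_deleted).total_seconds()
# Length of the window in which errors were received.
window_s = (last_error - first_error).total_seconds()
# Average number of error responses per second in that window.
errors_per_second = total_errors / window_s

print(f"delay: {delay_s:.3f} s")
print(f"window: {window_s:.3f} s")
print(f"rate: {errors_per_second:.0f} errors/s")
```

The window is a little over 2 seconds, which puts the average at roughly 1,500 error responses per second.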
h3. Broker vs Client Error Mismatch
The broker DEBUG logs show all 3,187 occurrences identified as
{{UnknownTopicOrPartitionException}} (error code 3):
{code:java}
[KafkaApi-1] Share Fetch request with correlation id 273 from client rdkafka
on partition 38WyjFvWQeeprA7A2i8blg:null-0 failed due to
org.apache.kafka.common.errors.UnknownTopicOrPartitionException
... (3,187 identical entries)
{code}
But the client receives {{UNKNOWN_SERVER_ERROR}} (error code -1) for all 3,187
responses:
{code:java}
ShareFetch response error UNKNOWN: ''
... (3,187 identical errors)
{code}
The counts match exactly: every request where the broker internally identifies
{{UnknownTopicOrPartitionException}} results in the client receiving
{{UNKNOWN_SERVER_ERROR}}.
Full broker and client logs from the latest reproduction are attached.
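The count comparison can be repeated against the attached logs with a short script. The sketch below assumes each log entry is on a single line and matches on substrings taken from the excerpts above. The file names in the usage comment are hypothetical.

```python
# Substrings taken from the broker and client log excerpts in this report.
BROKER_MARKER = "org.apache.kafka.common.errors.UnknownTopicOrPartitionException"
CLIENT_MARKER = "ShareFetch response error UNKNOWN"


def count_occurrences(lines, marker):
    """Count the lines that contain the given marker substring."""
    return sum(1 for line in lines if marker in line)


def counts_match(broker_lines, client_lines):
    """Return (broker_count, client_count, whether the two counts are equal)."""
    broker_count = count_occurrences(broker_lines, BROKER_MARKER)
    client_count = count_occurrences(client_lines, CLIENT_MARKER)
    return broker_count, client_count, broker_count == client_count


# Usage against log files (file names are hypothetical):
# with open("broker.log") as b, open("client.log") as c:
#     print(counts_match(b, c))
```

Equal counts do not prove a one-to-one mapping between requests and responses, but they are consistent with every broker-side {{UnknownTopicOrPartitionException}} being surfaced to the client as {{UNKNOWN_SERVER_ERROR}}.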
------------------------------------------------------------------------------
In a separate earlier test run with a multi-broker cluster and a similar
topic-deletion-while-subscribed scenario, I also observed this ERROR in the
broker logs:
{code:java}
[2026-04-08 19:01:28,379] ERROR Unable to perform write state RPC for key
SharePartitionKey{groupId=share-topic-deletion-while-subscribed,
topicIdPartition=aR9iBS_7SyKidYbzhwJf_g:null-0}:
Write operation on uninitialized share partition not allowed.
org.apache.kafka.common.errors.UnknownServerException:
Error in write state RPC. Write operation on uninitialized share partition
not allowed.
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)