GitHub user Radiancebobo edited a discussion: Issues with Topic-Level Policies
after Cluster Restart in Pulsar 3.0.5
We are experiencing an issue with Pulsar 3.0.5 in our environment and would
like to seek your advice.
# Environment Details
Pulsar Version: 3.0.5
Cluster Setup: 3 Bookies, 3 Brokers
System Topic Enabled: Yes
Topic-Level Policies Enabled: Yes
Reason for Using Topic-Level Policies: Different replication requirements for
topics under the same namespace.
## Bookie Configuration:
```yaml
journalSyncData: "true"
journalWriteData: "true"
```
## Broker Configuration:
```yaml
allowAutoTopicCreation: "true"
brokerDeleteInactiveTopicsEnabled: "false"
defaultNumPartitions: "3"
defaultRetentionSizeInMB: "103424"
defaultRetentionTimeInMinutes: "4320"
managedLedgerDefaultAckQuorum: "2"
managedLedgerDefaultEnsembleSize: "2"
managedLedgerDefaultWriteQuorum: "2"
managedLedgerMaxEntriesPerLedger: "50000"
managedLedgerMaxLedgerRolloverTimeMinutes: "240"
managedLedgerMinLedgerRolloverTimeMinutes: "10"
systemTopicEnabled: "true"
topicLevelPoliciesEnabled: "true"
```
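As a quick sanity check on the quorum values above, here is a minimal Python sketch. The class and property names are hypothetical and are not part of Pulsar or BookKeeper. It only encodes two general BookKeeper rules: ensemble size >= write quorum >= ack quorum, and an entry is acknowledged once the ack quorum of bookies has stored it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper mirroring the managedLedgerDefault* settings."""
    ensemble_size: int
    write_quorum: int
    ack_quorum: int

    def validate(self) -> None:
        # BookKeeper expects ensemble >= write quorum >= ack quorum >= 1.
        if not (self.ensemble_size >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError("expected ensemble >= write quorum >= ack quorum >= 1")

    @property
    def guaranteed_copies(self) -> int:
        # An acknowledged entry has been stored by at least ack_quorum bookies.
        return self.ack_quorum

    @property
    def ack_slack(self) -> int:
        # Bookies in the write set that may lag without delaying the ack.
        return self.write_quorum - self.ack_quorum

    @property
    def is_striped(self) -> bool:
        # Entries are spread over the ensemble only when it exceeds the write quorum.
        return self.ensemble_size > self.write_quorum


# Values taken from the broker configuration in this post.
posted = QuorumConfig(ensemble_size=2, write_quorum=2, ack_quorum=2)
posted.validate()
print(posted.guaranteed_copies, posted.ack_slack, posted.is_striped)
```

With the posted values, every acknowledged entry should exist on two bookies, and neither of them may lag at write time.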
# Problem Description
After a cluster restart (including one caused by a power outage), some topics
occasionally hit the following errors and can no longer produce or consume:
## Error 1: BrokerService Exception
```text
2025-10-31T14:08:29,658+0000 [pulsar-io-5-8] ERROR
org.apache.pulsar.broker.service.BrokerService - Topic creation encountered an
exception by initialize topic policies service.
topic_name=persistent://10001001/default/log-partition-4 error_message=The
subscription multiTopicsReader-f5fb22e226 of the topic
persistent://10001001/default/__change_events-partition-0 gets the last message
id was failed
{"errorMsg":"Failed to read last entry of the compacted Ledger Error while
reading ledger","reqId":4227693217171430891,
"remote":"pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650",
"local":"/22.25.102.149:59422"}
org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: The
subscription multiTopicsReader-f5fb22e226 of the topic
persistent://10001001/default/__change_events-partition-0 gets the last message
id was failed
{"errorMsg":"Failed to read last entry of the compacted Ledger Error while
reading ledger","reqId":4227693217171430891,
"remote":"pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650",
"local":"/22.25.102.149:59422"}
    at org.apache.pulsar.client.api.PulsarClientException.wrap(PulsarClientException.java:993) ~[org.apache.pulsar-pulsar-client-api-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
    at org.apache.pulsar.client.impl.ConsumerImpl.lambda$internalGetLastMessageIdAsync$64(ConsumerImpl.java:2566) ~[org.apache.pulsar-pulsar-client-original-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
    at org.apache.pulsar.client.impl.ClientCnx.handleError(ClientCnx.java:792) ~[org.apache.pulsar-pulsar-client-original-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
    at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:192) ~[org.apache.pulsar-pulsar-common-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[io.netty-netty-codec-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[io.netty-netty-codec-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.handler.flush.FlushConsolidationHandler.channelRead(FlushConsolidationHandler.java:152) ~[io.netty-netty-handler-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
    at java.lang.Thread.run(Thread.java:840) ~[?:?]
```
## Error 2: BookKeeper Read Failures
```text
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR
org.apache.bookkeeper.proto.PerChannelBookieClient - Read for failed on bookie
pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181 code EIO
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO
org.apache.bookkeeper.client.PendingReadOp - Error: Error while reading ledger
while reading L674 E0 from bookie:
pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR
org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L674
E0-E0, Sent to [pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181,
pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181], Heard from [] :
bitset = {}, Error = 'Error while reading ledger'. First unread entry is (-1,
rc = null)
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN
org.apache.pulsar.broker.service.ServerCnx -
[/22.25.102.149:59422][persistent://business/sec/__change_events-partition-1][multiTopicsReader-f080765823]
Failed to create consumer: consumerId=2407, Error while reading ledger -
ledger=674 - operation=Failed to read entry - entry=0
2025-10-31T14:08:30,972+0000 [pulsar-io-5-8] WARN
org.apache.pulsar.client.impl.ClientCnx - [id: 0xc7781652,
L:/22.25.102.149:59422 -
R:pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650]
Received error from server: Error while reading ledger - ledger=674 -
operation=Failed to read entry - entry=0
2025-10-31T14:08:30,972+0000 [pulsar-io-5-8] WARN
org.apache.pulsar.client.impl.ConsumerImpl -
[persistent://business/sec/__change_events-partition-1][multiTopicsReader-f080765823]
Failed to subscribe to topic on
pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650
```
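To list which ledgers and bookies are involved when many such lines pile up, the `PendingReadOp` error can be parsed mechanically. The following is a rough Python sketch. The names `FailedRead` and `find_failed_reads` are hypothetical, and the pattern is based only on the message format visible in the log above, so it may need adjusting for other BookKeeper versions.

```python
import re
from typing import List, NamedTuple, Tuple


class FailedRead(NamedTuple):
    ledger_id: int
    first_entry: int
    last_entry: int
    bookies: Tuple[str, ...]


# Matches e.g. "Read of ledger entry failed: L674 E0-E0, Sent to [a:3181, b:3181]"
PATTERN = re.compile(
    r"Read of ledger entry failed: L(\d+) E(\d+)-E(\d+), Sent to \[([^\]]*)\]"
)


def find_failed_reads(log_text: str) -> List[FailedRead]:
    # Collapse all whitespace first so hard-wrapped log lines still match.
    flat = " ".join(log_text.split())
    results = []
    for m in PATTERN.finditer(flat):
        bookies = tuple(b.strip() for b in m.group(4).split(",") if b.strip())
        results.append(
            FailedRead(int(m.group(1)), int(m.group(2)), int(m.group(3)), bookies)
        )
    return results


# Excerpt copied from the log above, including its line wrapping.
SAMPLE = """
org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L674
E0-E0, Sent to [pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181,
pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181], Heard from [] :
bitset = {}, Error = 'Error while reading ledger'.
"""

for failure in find_failed_reads(SAMPLE):
    print(failure.ledger_id, failure.first_entry, failure.last_entry, failure.bookies)
```

Running it over the full broker logs would give the set of affected ledger IDs, which can then be compared against the ledgers of the `__change_events` partitions.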
## Error 3: Admin API Failures
Running `pulsar-admin topics stats
persistent://business/notification/entry-partition-0` returns:
```text
--- An unexpected error occurred in the server ---
Message: Topic creation encountered an exception by initialize topic policies
service. topic_name=persistent://business/notification/entry-partition-0
error_message={"errorMsg":"Error while reading ledger - ledger=684 -
operation=Failed to read entry - entry=0","reqId":3147838028978352523,
"remote":"pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local/22.25.106.148:6650",
"local":"/22.25.102.143:43158"}
Stacktrace:
org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException:
Topic creation encountered an exception by initialize topic policies service.
topic_name=persistent://business/notification/entry-partition-0
error_message={"errorMsg":"Error while reading ledger - ledger=684 -
operation=Failed to read entry - entry=0","reqId":3147838028978352523,
"remote":"pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local/22.25.106.148:6650",
"local":"/22.25.102.143:43158"}
    at org.apache.pulsar.broker.service.BrokerService.lambda$getTopic$28(BrokerService.java:1080)
    at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
    at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
    at org.apache.pulsar.client.impl.PulsarClientImpl.lambda$createSingleTopicReaderAsync$14(PulsarClientImpl.java:689)
    at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
    at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
    at org.apache.pulsar.client.impl.MultiTopicsConsumerImpl.lambda$new$2(MultiTopicsConsumerImpl.java:193)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
    at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
    at org.apache.pulsar.client.impl.MultiTopicsConsumerImpl.lambda$closeAsync$24(MultiTopicsConsumerImpl.java:634)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
    at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
    at org.apache.pulsar.client.impl.ConsumerBase.lambda$failPendingReceive$1(ConsumerBase.java:349)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
```
# Attempted Recovery Methods
Recreating __change_events Topics: The errors persisted.
Disabling Topic-Level Policies: This resolved the production/consumption issues.
# Questions for the Community
Is this a known bug in Pulsar? If so, in which version was it fixed?
Are there any temporary workarounds besides disabling topic-level policies
entirely?
We are currently considering reverting to namespace-level policies. Are there
other recommended ways to keep topic-level policies while avoiding this issue?
Any insights or suggestions would be greatly appreciated.
Thank you!
GitHub link: https://github.com/apache/pulsar/discussions/24930