soumya-ghosh commented on issue #23920: URL: https://github.com/apache/pulsar/issues/23920#issuecomment-2630606792
Looking back at the Datadog metrics, once `pulsar.pulsar_publish_rate_limit_times` went back down to 0, the producer returned to normal and records were delivered to consumers. We are currently using the Failover subscription mode.

I have captured some thread dumps and a heap dump from one Pulsar broker and will try to share them.

Regarding stats for the partitioned topic while the consumer was not receiving records, we captured them using `pulsar-admin topics partitioned-stats persistent://public/default/topic_name` (without the `--per-partition` flag): https://gist.github.com/soumya-ghosh/e97e19f716e8b917eae9cb55f91b9aa9

We had reported this issue on Slack: https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1737060570361949

The backlog metric is non-zero when a console consumer is started from the Pulsar CLI. Messages are delivered after 10 minutes and the backlog goes to 0. We will try to capture the partition-level stats and internal stats when the issue occurs again.

We suspect the producer and consumer issues are related, as we observed them happening at the same time.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
