kenbaev opened a new issue #10171: URL: https://github.com/apache/pulsar/issues/10171
**Describe the bug**
The Pulsar broker stops returning metrics to Prometheus for no apparent reason. The following entries appear in the broker log:

`11:25:52.830 [pulsar-web-42-7] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:25:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
11:44:07.847 [pulsar-web-42-20] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30001
11:43:52.838 [pulsar-web-42-5] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
11:43:37.841 [pulsar-web-42-17] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:07 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30004
11:43:22.857 [pulsar-web-42-3] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:52 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30011
11:43:07.842 [pulsar-web-42-14] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002`

At the same time, the proxy starts having problems connecting to the failed broker:

`11:25:11.666 [pulsar-proxy-io-2-1] WARN org.apache.pulsar.proxy.server.LookupProxyHandler - [non-persistent://loadtest/pcm-reply/reply-unlock-project-lb-24] failed to get Partitioned metadata : Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
org.apache.pulsar.client.api.PulsarClientException: Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
    at org.apache.pulsar.client.impl.ClientCnx.channelInactive(ClientCnx.java:245) [org.apache.pulsar-pulsar-client-original-2.7.0.6.jar:2.7.0.6]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]`

**To Reproduce**
The problem has occurred several times. We were unable to identify the steps to reproduce it.

**Expected behavior**
Stable operation of the Pulsar broker.

**Desktop (please complete the following information):**
- Apache Pulsar 2.7.0.6
- Kubernetes 1.18.3
- Cluster with 3 ZK, 5 bookies, 5 brokers, 5 proxies

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org