[ 
https://issues.apache.org/jira/browse/KAFKA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783530#comment-16783530
 ] 

Michael Kemmerer commented on KAFKA-8008:
-----------------------------------------

Same here. I have 3 clusters on the same V-Net in Azure. All of them were 
deployed at the same time using similar configurations, only one is 
experiencing this issue. I've even torn down the brokers of the affected 
cluster and rebuilt them without using SASL and they're still experiencing the 
same error state.

> Clients unable to connect and replicas are not able to connect to each other
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-8008
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8008
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller, core
>    Affects Versions: 2.1.0, 2.1.1
>         Environment: Java 11
>            Reporter: Abhi
>            Priority: Critical
>
> Hi,
> I upgrade to Kafka v2.1.1 recently and seeing the below exceptions in all the 
> servers. The kafka-network-thread-1-ListenerName are all consuming full cpu 
> cycles. Lots of TCP connections are in CLOSE_WAIT state.
> My broker setup is using kerberos authentication with 
> -Dsun.security.jgss.native=true.
> I am not sure how to handle this? Will increasing the kafka-network thread 
> count help if it is possible?
> Does this seem like a bug? I am happy to help in anyway I can as this issue 
> blocking our production usage and would like to get it resolved as early as 
> possible.
> *server.log snippet from one of the servers:*
> [2019-02-27 00:00:02,948] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=3] Built full fetch (sessionId=1488865423, epoch=INITIAL) for node 
> 2 with 3 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=3] Initiating connection to node mwkafka-prod-02.nyc.foo.com:9092 
> (id: 2 rack: null) using address mwkafka-prod-02.nyc.foo.com/10.219.247.26 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
> SEND_APIVERSIONS_REQUEST 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:02,949] DEBUG Creating SaslClient: 
> client=kafka/mwkafka-prod-01.nyc.foo....@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-02.nyc.foo.com;mechs=[GSSAPI]
>  (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=3] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
> SO_TIMEOUT = 0 to node 2 (org.apache.kafka.common.network.Selector)
> [2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
> RECEIVE_APIVERSIONS_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
> fetcherId=3] Completed connection to node 2. Ready. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:03,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=0] Built full fetch (sessionId=2039987243, epoch=INITIAL) for node 
> 5 with 0 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:03,317] INFO [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Error sending fetch request (sessionId=397037945, epoch=INITIAL) 
> to node 5: java.net.SocketTimeoutException: Failed to connect within 30000 
> ms. (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:03,317] WARN [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Error in response for fetch request (type=FetchRequest, 
> replicaId=1, maxWait=10000, minBytes=1, maxBytes=10485760, 
> fetchData={reddyvel-159-0=(fetchOffset=3173198, logStartOffset=3173198, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-331-0=(fetchOffset=3173197, logStartOffset=3173197, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-newtp-5-64-0=(fetchOffset=8936, logStartOffset=8936, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> reddyvel-tp9-78-0=(fetchOffset=247943, logStartOffset=247943, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> reddyvel-tp3-58-0=(fetchOffset=264495, logStartOffset=264495, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> fps.trsy.fe_prvt-0=(fetchOffset=24, logStartOffset=8, maxBytes=1048576, 
> currentLeaderEpoch=Optional[3]), reddyvel-7-0=(fetchOffset=3173199, 
> logStartOffset=3173199, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-298-0=(fetchOffset=3173197, logStartOffset=3173197, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.guas.peeq.fe_marb_us-0=(fetchOffset=2, logStartOffset=2, 
> maxBytes=1048576, currentLeaderEpoch=Optional[6]), 
> reddyvel-108-0=(fetchOffset=3173198, logStartOffset=3173198, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-988-0=(fetchOffset=3173185, logStartOffset=3173185, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-111-0=(fetchOffset=3173198, logStartOffset=3173198, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-409-0=(fetchOffset=3173194, logStartOffset=3173194, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-104-0=(fetchOffset=3173198, logStartOffset=3173198, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.priveq.reins-0=(fetchOffset=12, logStartOffset=6, maxBytes=1048576, 
> currentLeaderEpoch=Optional[5]), reddyvel-353-0=(fetchOffset=3173197, 
> logStartOffset=3173197, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-tp10-63-0=(fetchOffset=220652, logStartOffset=220652, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> reddyvel-newtp-5-86-0=(fetchOffset=8935, logStartOffset=8935, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> reddyvel-878-0=(fetchOffset=3173187, logStartOffset=3173187, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-621-0=(fetchOffset=3173190, logStartOffset=3173190, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.agg.uopt.opt-0=(fetchOffset=28297, logStartOffset=28297, 
> maxBytes=1048576, currentLeaderEpoch=Optional[8]), 
> reddyvel-661-0=(fetchOffset=3173190, logStartOffset=3173190, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.guas.jpeq.fe_marb-0=(fetchOffset=532, logStartOffset=10, 
> maxBytes=1048576, currentLeaderEpoch=Optional[3]), 
> reddyvel-607-0=(fetchOffset=3173191, logStartOffset=3173191, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.seed.ornt.desim_ornt-0=(fetchOffset=4060, logStartOffset=2433, 
> maxBytes=1048576, currentLeaderEpoch=Optional[5]), 
> reddyvel-962-0=(fetchOffset=3173185, logStartOffset=3173185, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> fps.agg.ornt.desim_ornt-0=(fetchOffset=1177, logStartOffset=1177, 
> maxBytes=1048576, currentLeaderEpoch=Optional[8]), 
> reddyvel-tp6-71-0=(fetchOffset=256309, logStartOffset=256309, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> fps.trsy.macro-0=(fetchOffset=324, logStartOffset=106, maxBytes=1048576, 
> currentLeaderEpoch=Optional[3]), fps.agg.dist.treas-0=(fetchOffset=0, 
> logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[8]), 
> reddyvel-newtp-8-111-0=(fetchOffset=1, logStartOffset=1, maxBytes=1048576, 
> currentLeaderEpoch=Optional[18]), reddyvel-94-0=(fetchOffset=3173198, 
> logStartOffset=3173198, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-544-0=(fetchOffset=3173192, logStartOffset=3173192, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-newtp-2-72-0=(fetchOffset=8679, logStartOffset=8679, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> fps.guas.useq.eq_stat_arb_useq-0=(fetchOffset=6071629, 
> logStartOffset=4470145, maxBytes=1048576, currentLeaderEpoch=Optional[4]), 
> reddyvel-newtp-10-75-0=(fetchOffset=9768, logStartOffset=9768, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> reddyvel-tp8-51-0=(fetchOffset=249873, logStartOffset=249873, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> reddyvel-newtp-8-122-0=(fetchOffset=1, logStartOffset=1, maxBytes=1048576, 
> currentLeaderEpoch=Optional[18]), fps.seed.trsy.pe_china-0=(fetchOffset=24, 
> logStartOffset=8, maxBytes=1048576, currentLeaderEpoch=Optional[3]), 
> fps.seed.trsy.jcas-0=(fetchOffset=93, logStartOffset=31, maxBytes=1048576, 
> currentLeaderEpoch=Optional[3]), reddyvel-tp8-99-0=(fetchOffset=249871, 
> logStartOffset=249871, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> reddyvel-643-0=(fetchOffset=3173190, logStartOffset=3173190, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-577-0=(fetchOffset=3173191, logStartOffset=3173191, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-tp6-97-0=(fetchOffset=256307, logStartOffset=256307, 
> maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
> reddyvel-newtp-6-72-0=(fetchOffset=7652, logStartOffset=7652, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> reddyvel-959-0=(fetchOffset=3173185, logStartOffset=3173185, 
> maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
> reddyvel-newtp-4-70-0=(fetchOffset=8828, logStartOffset=8828, 
> maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
> fps.seed.trsy.opt-0=(fetchOffset=114, logStartOffset=38, maxBytes=1048576, 
> currentLeaderEpoch=Optional[4])}, isolationLevel=READ_UNCOMMITTED, toForget=, 
> metadata=(sessionId=397037945, epoch=INITIAL)) 
> (kafka.server.ReplicaFetcherThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:95)
>         at 
> kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:192)
>         at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:274)
>         at 
> kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:132)
>         at 
> kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:131)
>         at scala.Option.foreach(Option.scala:257)
>         at 
> kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:131)
>         at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 00:00:04,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=0] Built full fetch (sessionId=2039987243, epoch=INITIAL) for node 
> 5 with 45 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:04,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=0] Initiating connection to node mwkafka-prod-01.tbd.foo.com:9092 
> (id: 5 rack: null) using address mwkafka-prod-01.tbd.foo.com/10.236.30.30 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:04,008] DEBUG Set SASL client state to 
> SEND_APIVERSIONS_REQUEST 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:04,008] DEBUG Creating SaslClient: 
> client=kafka/mwkafka-prod-01.nyc.foo....@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-01.tbd.foo.com;mechs=[GSSAPI]
>  (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:04,008] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=0] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
> SO_TIMEOUT = 0 to node 5 (org.apache.kafka.common.network.Selector)
> [2019-02-27 00:00:04,008] DEBUG Set SASL client state to 
> RECEIVE_APIVERSIONS_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:04,008] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=0] Completed connection to node 5. Ready. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:04,318] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Built full fetch (sessionId=397037945, epoch=INITIAL) for node 5 
> with 0 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:05,318] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Built full fetch (sessionId=397037945, epoch=INITIAL) for node 5 
> with 48 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
> [2019-02-27 00:00:05,318] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Initiating connection to node mwkafka-prod-01.tbd.foo.com:9092 
> (id: 5 rack: null) using address mwkafka-prod-01.tbd.foo.com/10.236.30.30 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:05,318] DEBUG Set SASL client state to 
> SEND_APIVERSIONS_REQUEST 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:05,318] DEBUG Creating SaslClient: 
> client=kafka/mwkafka-prod-01.nyc.foo....@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-01.tbd.foo.com;mechs=[GSSAPI]
>  (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:05,319] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
> SO_TIMEOUT = 0 to node 5 (org.apache.kafka.common.network.Selector)
> [2019-02-27 00:00:05,319] DEBUG Set SASL client state to 
> RECEIVE_APIVERSIONS_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 00:00:05,319] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
> fetcherId=1] Completed connection to node 5. Ready. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 00:00:05,477] DEBUG Set SASL server state to AUTHENTICATE 
> (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
> [2019-02-27 00:00:05,477] DEBUG [SocketServer brokerId=1] Connection with 
> /10.236.30.31 disconnected (org.apache.kafka.common.network.Selector)
> java.io.EOFException
>         at 
> org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
>         at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:237)
>         at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
>         at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
>         at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
>         at kafka.network.Processor.poll(SocketServer.scala:689)
>         at kafka.network.Processor.run(SocketServer.scala:594)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> [2019-02-27 00:00:05,477] DEBUG Handling Kafka request SASL_HANDSHAKE 
> (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
> [2019-02-27 00:00:05,477] DEBUG Using SASL mechanism 'GSSAPI' provided by 
> client 
> (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
> [2019-02-27 00:00:05,477] DEBUG Creating SaslServer for 
> kafka/mwkafka-prod-01.nyc.foo....@unix.foo.com with mechanism GSSAPI 
> (org.apache.kafka.common.security.authenticator.SaslServerAuthenticator)
> [2019-02-27 00:00:06,056] INFO [ReplicaFetcher replicaId=1, leaderId=4, 
> fetcherId=2] Error sending fetch request (sessionId=373847113, epoch=INITIAL) 
> to node 4: java.net.SocketTimeoutException: Failed to connect within 30000 
> ms. (org.apache.kafka.clients.FetchSessionHandler)
> *Lsof output:*
> kafka...@mwkafka-prod-01.nyc[toa]:/local/kafka/logs> lsof -P -p 103485 | grep 
> TCP | grep CLOSE
> java    103485 kafkagod  635u     IPv4           86522305       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.nyc.foo.com:46014 
> (CLOSE_WAIT)
> java    103485 kafkagod  639u     IPv4           86519040       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.nyc.foo.com:45926 
> (CLOSE_WAIT)
> java    103485 kafkagod  642u     IPv4           86519057       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-01.dr.foo.com:47424 
> (CLOSE_WAIT)
> java    103485 kafkagod  643u     IPv4           86519058       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-01.dr.foo.com:47428 
> (CLOSE_WAIT)
> java    103485 kafkagod  683u     IPv4           86509505       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-01.tbd.foo.com:57856 
> (CLOSE_WAIT)
> java    103485 kafkagod  684u     IPv4           86522910       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-01.tbd.foo.com:57894 
> (CLOSE_WAIT)
> java    103485 kafkagod  688u     IPv4           86522176       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.nyc.foo.com:45966 
> (CLOSE_WAIT)
> java    103485 kafkagod  690u     IPv4           86522306       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.tbd.foo.com:35326 
> (CLOSE_WAIT)
> java    103485 kafkagod  695u     IPv4           86522192       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.nyc.foo.com:45968 
> (CLOSE_WAIT)
> java    103485 kafkagod  696u     IPv4           86509516       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.dr.foo.com:39676 
> (CLOSE_WAIT)
> java    103485 kafkagod  697u     IPv4           86522307       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.tbd.foo.com:35328 
> (CLOSE_WAIT)
> java    103485 kafkagod  705u     IPv4           86519026       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-01.tbd.foo.com:57838 
> (CLOSE_WAIT)
> java    103485 kafkagod  726u     IPv4           86509517       0t0       TCP 
> mwkafka-prod-01.nyc.foo.com:9092->mwkafka-prod-02.tbd.foo.com:35258 
> (CLOSE_WAIT)
> *contoller logs below show that no broker is able connect to each other:*
> [2019-02-27 03:39:16,135] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-02.dr.foo.com:9092 (id: 4 rack: 
> dr.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 03:39:20,915] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-02.tbd.foo.com:9092 (id: 6 rack: 
> tbd.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 03:39:25,839] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-01.dr.foo.com:9092 (id: 3 rack: 
> dr.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 03:39:31,371] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-01.tbd.foo.com:9092 (id: 5 rack: 
> tbd.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 03:39:40,440] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-01.nyc.foo.com:9092 (id: 1 rack: 
> nyc.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> [2019-02-27 03:39:42,149] WARN [RequestSendThread controllerId=1] Controller 
> 1's connection to broker mwkafka-prod-02.nyc.foo.com:9092 (id: 2 rack: 
> nyc.foo.com) was unsuccessful (kafka.controller.RequestSendThread)
> java.net.SocketTimeoutException: Failed to connect within 30000 ms
>         at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:280)
>         at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:233)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
> *Client consumer logs: consumer is not able to connect to the brokers*
> [2019-02-27 03:37:37,587] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to 
> node -6 (org.apache.kafka.common.network.Selector)
> [2019-02-27 03:37:37,587] DEBUG Set SASL client state to 
> RECEIVE_APIVERSIONS_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:37:37,587] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Completed connection to node -6. Fetching API versions. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 03:38:02,488] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:38:02,488] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-02.tbd.deshaw.com:9092 
> (id: -6 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:38:08,618] DEBUG Set SASL client state to INITIAL 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:38:08,619] DEBUG Set SASL client state to INTERMEDIATE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:38:29,288] DEBUG Set SASL client state to 
> SEND_HANDSHAKE_REQUEST 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:38:29,288] DEBUG Set SASL client state to 
> RECEIVE_HANDSHAKE_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:38:32,493] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:38:32,493] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-02.dr.deshaw.com:9092 
> (id: -4 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:39:02,498] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:39:02,498] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-02.dr.deshaw.com:9092 
> (id: -3 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:39:32,501] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:39:32,501] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-01.nyc.deshaw.com:9092 
> (id: -1 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:39:33,050] DEBUG Set SASL client state to 
> SEND_HANDSHAKE_REQUEST 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:39:33,051] DEBUG Set SASL client state to 
> RECEIVE_HANDSHAKE_RESPONSE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:39:34,683] DEBUG Set SASL client state to CLIENT_COMPLETE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:39:41,887] DEBUG Set SASL client state to CLIENT_COMPLETE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:40:02,503] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:40:02,503] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-02.tbd.deshaw.com:9092 
> (id: -6 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:40:31,269] DEBUG Set SASL client state to COMPLETE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
> [2019-02-27 03:40:31,269] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Initiating API versions fetch from node -1. 
> (org.apache.kafka.clients.NetworkClient)
> [2019-02-27 03:40:32,507] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Coordinator discovery failed, refreshing metadata 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:40:32,507] DEBUG [Consumer clientId=test_con, groupId=chow] 
> Sending FindCoordinator request to broker mwkafka-prod-02.dr.deshaw.com:9092 
> (id: -4 rack: null) 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2019-02-27 03:40:42,438] DEBUG Set SASL client state to COMPLETE 
> (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to