[ 
https://issues.apache.org/jira/browse/KAFKA-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Aliase updated KAFKA-7457:
---------------------------------
    Summary: AbstractCoordinator struck in Discover  (was: AbstractCoordinator 
struck in discover)

> AbstractCoordinator struck in Discover
> --------------------------------------
>
>                 Key: KAFKA-7457
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7457
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.10.1.1
>         Environment: Linux
>            Reporter: Joseph Aliase
>            Priority: Minor
>
> AbstractCoordinator in kafka-client is stuck in discover and never rejoins 
> the group. Post restart application is able to join the consumer group and 
> consume from the topic.
> We see below logs every 10 minute. The sequence of events are:
> a) NetworkClient complains that connection is idle and closes the connection.
> b) Consumer client tries to determine co-ordinator by sending request to Node 
> 2.
> c) Node 2 responds by saying Node 3 is group co-ordinator.
> d) Consumer client connects to group co-ordinator.
> e) There is radio silence for 10 minutes and the sequence gets repeated. 
>  
> 2018-09-28 16:35:59.771 TRACE org.apache.kafka.common.network.Selector 
> [pool-4-thread-50] [active] [wc] About to close the idle connection from 
> 2147483644 due to being idle for 540140 millis
> 2018-09-28 16:35:59.771 DEBUG org.apache.kafka.clients.NetworkClient 
> [pool-4-thread-50] [active] [wc] Node 2147483644 disconnected.
> 2018-09-28 16:35:59.771 INFO 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Marking the coordinator kafka03-wc.net:9092 
> (id: 2147483644 rack: null) dead for group test
> 2018-09-28 16:35:59.771 DEBUG 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Sending coordinator request for group test 
> to broker kafka02-wc.net:9092 (id: 2 rack: null)
> 2018-09-28 16:35:59.771 DEBUG org.apache.kafka.clients.NetworkClient 
> [pool-4-thread-50] [active] [wc] Sending metadata request \{topics=[address]} 
> to node 2
> 2018-09-28 16:35:59.796 DEBUG org.apache.kafka.clients.Metadata 
> [pool-4-thread-50] [active] [wc] Updated cluster metadata version 401 to 
> Cluster(id = oC0BPXfOT42WqN7-v6b5Gw, nodes = [kafka02-wc.net:9092 (id: 2 
> rack: null), kafka03-wc.net:9092 (id: 3 rack: null), kafka05-wc.net:9092 (id: 
> 5 rack: null), kafka01-wc.net:9092 (id: 1 rack: null), kafka04-wc.net:9092 
> (id: 4 rack: null)], partitions = [Partition(topic = address, partition = 2, 
> leader = 1, replicas = [1,5,4,], isr = [5,1,4,]), Partition(topic = address, 
> partition = 1, leader = 5, replicas = [5,3,4,], isr = [5,3,4,]), 
> Partition(topic = address, partition = 0, leader = 4, replicas = [2,3,4,], 
> isr = [4,3,2,]), Partition(topic = address, partition = 6, leader = 5, 
> replicas = [1,5,4,], isr = [5,1,4,]), Partition(topic = address, partition = 
> 5, leader = 4, replicas = [5,3,4,], isr = [5,4,3,]), Partition(topic = 
> address, partition = 4, leader = 3, replicas = [1,2,3,], isr = [1,3,2,]), 
> Partition(topic = address, partition = 3, leader = 2, replicas = [1,5,2,], 
> isr = [5,1,2,]), Partition(topic = address, partition = 16, leader = 5, 
> replicas = [5,2,3,], isr = [5,3,2,]), Partition(topic = address, partition = 
> 15, leader = 4, replicas = [1,2,4,], isr = [1,4,2,]), Partition(topic = 
> address, partition = 10, leader = 4, replicas = [1,5,4,], isr = [5,1,4,]), 
> Partition(topic = address, partition = 9, leader = 3, replicas = [2,3,4,], 
> isr = [3,4,2,]), Partition(topic = address, partition = 8, leader = 2, 
> replicas = [1,2,3,], isr = [1,3,2,]), Partition(topic = address, partition = 
> 7, leader = 1, replicas = [1,5,2,], isr = [5,1,2,]), Partition(topic = 
> address, partition = 14, leader = 3, replicas = [5,3,4,], isr = [5,4,3,]), 
> Partition(topic = address, partition = 13, leader = 2, replicas = [2,3,4,], 
> isr = [3,4,2,]), Partition(topic = address, partition = 12, leader = 1, 
> replicas = [1,2,3,], isr = [1,3,2,]), Partition(topic = address, partition = 
> 11, leader = 5, replicas = [1,5,2,], isr = [5,1,2,])])
> 2018-09-28 16:35:59.797 DEBUG 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Received group coordinator response 
> ClientResponse(receivedTimeMs=1538152559797, disconnected=false, 
> request=ClientRequest(expectResponse=true, 
> callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@2531ff5f,
>  
> request=RequestSend(header=\{api_key=10,api_version=0,correlation_id=8354,client_id=active-wc-1-256},
>  body=\{group_id=test}), createdTimeMs=1538152559771, 
> sendTimeMs=1538152559771), 
> responseBody=\{error_code=0,coordinator={node_id=3,host=kafka03-wc.net,port=9092}})
> 2018-09-28 16:35:59.797 INFO 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Discovered coordinator kafka03-wc.net:9092 
> (id: 2147483644 rack: null) for group test.
> 2018-09-28 16:35:59.797 INFO 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Marking the coordinator kafka03-wc.net:9092 
> (id: 2147483644 rack: null) dead for group test
> 2018-09-28 16:36:09.797 DEBUG 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Sending coordinator request for group test 
> to broker kafka02-wc.net:9092 (id: 2 rack: null)
> 2018-09-28 16:36:09.822 DEBUG 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Received group coordinator response 
> ClientResponse(receivedTimeMs=1538152569822, disconnected=false, 
> request=ClientRequest(expectResponse=true, 
> callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@773b5447,
>  
> request=RequestSend(header=\{api_key=10,api_version=0,correlation_id=8356,client_id=active-wc-1-256},
>  body=\{group_id=test}), createdTimeMs=1538152569797, 
> sendTimeMs=1538152569797), 
> responseBody=\{error_code=0,coordinator={node_id=3,host=kafka03-wc.net,port=9092}})
> 2018-09-28 16:36:09.822 INFO 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator 
> [pool-4-thread-50] [active] [wc] Discovered coordinator kafka03-wc.net:9092 
> (id: 2147483644 rack: null) for group test.
> 2018-09-28 16:36:09.822 DEBUG org.apache.kafka.clients.NetworkClient 
> [pool-4-thread-50] [active] [wc] Initiating connection to node 2147483644 at 
> kafka03-wc.net:9092.
> 2018-09-28 16:36:09.846 DEBUG org.apache.kafka.common.network.Selector 
> [pool-4-thread-50] [active] [wc] Created socket with SO_RCVBUF = 65536, 
> SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node 2147483644
> 2018-09-28 16:36:09.846 DEBUG org.apache.kafka.clients.NetworkClient 
> [pool-4-thread-50] [active] [wc] Completed connection to node 2147483644



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to