[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-09-15 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261
 ] 

Fangbin Sun edited comment on KAFKA-6582 at 9/16/19 6:54 AM:
-

Someone encountered a similar issue in version 2.1.1 (KAFKA-7870). Is the issue 
indeed resolved in 2.1.1?


was (Author: fangbin):
Someone encountered similar issue in 
[KAFKA-7870|https://issues.apache.org/jira/browse/KAFKA-7870], is the issue 
indeed resolved in 2.1.1?

> Partitions get underreplicated, with a single ISR, and doesn't recover. Other 
> brokers do not take over and we need to manually restart the broker.
> --
>
> Key: KAFKA-6582
> URL: https://issues.apache.org/jira/browse/KAFKA-6582
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 1.0.0
> Environment: Ubuntu 16.04
> Linux kafka04 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 
> x86_64 x86_64 x86_64 GNU/Linux
> java version "9.0.1"
> Java(TM) SE Runtime Environment (build 9.0.1+11)
> Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode) 
> but also tried with the latest JVM 8 before with the same result.
>Reporter: Jurriaan Pruis
>Priority: Major
> Attachments: Screenshot 2019-01-18 at 13.08.17.png, Screenshot 
> 2019-01-18 at 13.16.59.png
>
>
> Partitions get underreplicated, with a single ISR, and don't recover. Other 
> brokers do not take over and we need to manually restart the 'single ISR' 
> broker (if you describe the partitions of a replicated topic it is clear that 
> some partitions are only in sync on this broker).
> This bug resembles KAFKA-4477 a lot, but since that issue is marked as 
> resolved, this is probably something different but similar.
> We have the same issue (or at least it looks pretty similar) on Kafka 1.0; 
> we've had these issues since upgrading from Kafka 0.10.2.1 to Kafka 1.0 in 
> November 2017.
> This happens almost every 24-48 hours on a random broker, which is why we 
> currently have a cronjob that restarts every broker every 24 hours. 
> During this issue the ISR broker shows the following in its server log: 
> {code:java}
> [2018-02-20 12:02:08,342] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.148.20:56352-96708 (kafka.network.Processor)
> [2018-02-20 12:02:08,364] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.150.25:54412-96715 (kafka.network.Processor)
> [2018-02-20 12:02:08,349] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.149.18:35182-96705 (kafka.network.Processor)
> [2018-02-20 12:02:08,379] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.150.25:54456-96717 (kafka.network.Processor)
> [2018-02-20 12:02:08,448] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.159.20:36388-96720 (kafka.network.Processor)
> [2018-02-20 12:02:08,683] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.132.0.32:9092-10.14.157.110:41922-96740 (kafka.network.Processor)
> {code}
> Also on the ISR broker, the controller log shows this:
> {code:java}
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-3-send-thread]: 
> Controller 3 connected to 10.132.0.32:9092 (id: 3 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,927] INFO [Controller-3-to-broker-0-send-thread]: 
> Controller 3 connected to 10.132.0.10:9092 (id: 0 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread)
> [2018-02-20 12:02:14,928] INFO [Controller-3-to-broker-1-send-thread]: 
> Controller 3 connected to 10.132.0.12:9092 (id: 1 rack: null) for sending 
> state change requests (kafka.controller.RequestSendThread){code}
> And the non-ISR brokers show these kinds of errors:
>  
> {code:java}
> 2018-02-20 12:02:29,204] WARN [ReplicaFetcher replicaId=1, leaderId=3, 
> fetcherId=0] Error in fetch to broker 3, request (type=FetchRequest, 
> replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, 
> fetchData={..}, isolationLevel=READ_UNCOMMITTED) 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 3 was disconnected before the response was 
> read
>  at 
> org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:95)
>  at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:96)
>  at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.sca

[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-09-15 Thread Fangbin Sun (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930261#comment-16930261
 ] 

Fangbin Sun edited comment on KAFKA-6582 at 9/16/19 6:55 AM:
-

Someone encountered a similar issue in version 2.1.1 (KAFKA-7870). Is the issue 
indeed resolved in 2.1.1?


was (Author: fangbin):
Someone encountered similar issue in version 2.1.1, KAFKA-7870, is the issue 
indeed resolved in 2.1.1?


[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-03-01 Thread JIRA


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780986#comment-16780986
 ] 

Jan Mynařík edited comment on KAFKA-6582 at 3/1/19 9:53 AM:


Does anyone know if this issue affects 2.0.1? We have it in production 
and had a similar problem.


was (Author: pogo):
Does anyone know if this issues affects 2.0.1. Because we have it in production 
and had a similar problem.


[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-01-18 Thread Juris Pavlyuchenkov (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746193#comment-16746193
 ] 

Juris Pavlyuchenkov edited comment on KAFKA-6582 at 1/18/19 11:30 AM:
--

We had the same issue after upgrading from 1.0.1 to 2.1; rolling back to 1.0.1 
helped. We are running a cluster of 5 brokers with 26 topics of 16 partitions 
each, plus the {{__consumer_offsets}} topic with 50 partitions. Each topic has a 
replication factor of 2.

The only thing I have noticed is that after we upgraded from 1.0.1 to 2.1, the 
{{kafka_server_fetcherlagmetrics_consumerlag}} metric started to grow and 
behaved weirdly for all the partitions. I tried increasing 
{{replica.fetch.max.bytes}} and {{num.replica.fetchers}}, but it did not help.
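
For context, both of those are broker-level settings in {{server.properties}}; the snippet below is only an illustrative sketch, and the values shown are assumptions rather than the ones actually used in this cluster:
{code}
# server.properties -- illustrative values only
# maximum bytes to fetch per partition in each replica fetch request
replica.fetch.max.bytes=10485760
# number of replica fetcher threads per source broker
num.replica.fetchers=4
{code}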

Here is an example for one of the topics. The flat line in the middle happened 
during the event when one of the brokers got stuck.

!Screenshot 2019-01-18 at 13.08.17.png!

And this is how fetcher lag looks before and after rollback:

!Screenshot 2019-01-18 at 13.16.59.png!  

I also tried to check the lag with {{kafka-replica-verification.sh}}, but it did 
not show any issues there:
{code:java}
# ./kafka-replica-verification.sh --broker-list 
kafka-0.kafka:9092,kafka-1.kafka:9092,kafka-2.kafka:9092,kafka-3.kafka:9092,kafka-4.kafka:9092
2019-01-18 00:21:58,854: verification process is started.
2019-01-18 00:22:28,800: max lag is 2 for partition presence-2 at offset 11587 
among 451 partitions
2019-01-18 00:22:58,803: max lag is 2 for partition mqtt-presence-13 at offset 
10418 among 451 partitions
2019-01-18 00:23:28,806: max lag is 1 for partition users-11 at offset 3529 
among 451 partitions
2019-01-18 00:23:58,810: max lag is 51 for partition notifications-14 at offset 
193179 among 451 partitions
2019-01-18 00:24:28,811: max lag is 3 for partition api-responses-0 at offset 
56367 among 451 partitions
2019-01-18 00:24:58,813: max lag is 1 for partition follows-6 at offset 1059 
among 451 partitions{code}
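
As a reference for the "describe the partitions" check mentioned in the issue description, partitions whose ISR has shrunk can also be listed with the stock topics tool; a minimal sketch, where the ZooKeeper address is a placeholder:
{code}
# list partitions that are currently under-replicated (ISR smaller than the replica set)
./kafka-topics.sh --zookeeper zk-host:2181 --describe --under-replicated-partitions
{code}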


was (Author: juris):
Had the same issue after upgrading from 1.0.1 to 2.1. Rolling back to 1.0.1 
helped. We are running cluster of 5 brokers with 26 topics, 16 partitions each 
+ {{__consumer_offsets}} topic with 50 partitions. Each topic has replication 
factor of 2.

The only thing I have noticed is that after we had upgraded from 1.0.1 to 2.1, 
{{kafka_server_fetcherlagmetrics_consumerlag}} metric started to grow and 
behaved weirdly for all the partitions. I tried increasing 
`replica.fetch.max.bytes` and `num.replica.fetchers`, but it did not help.

Here is an example for one of the topics. The flat line in the middle happened 
during the event when one of the brokers got stuck.

!Screenshot 2019-01-18 at 13.08.17.png!

 

And this is how fetcher lag looks before and after rollback:

!Screenshot 2019-01-18 at 13.16.59.png!

 

I also tried to check lag with `kafka-replica-verification.sh` but it did not 
show any issues there:
{code:java}
# ./kafka-replica-verification.sh --broker-list 
kafka-0.kafka:9092,kafka-1.kafka:9092,kafka-2.kafka:9092,kafka-3.kafka:9092,kafka-4.kafka:9092
2019-01-18 00:21:58,854: verification process is started.
2019-01-18 00:22:28,800: max lag is 2 for partition presence-2 at offset 11587 
among 451 partitions
2019-01-18 00:22:58,803: max lag is 2 for partition mqtt-presence-13 at offset 
10418 among 451 partitions
2019-01-18 00:23:28,806: max lag is 1 for partition users-11 at offset 3529 
among 451 partitions
2019-01-18 00:23:58,810: max lag is 51 for partition notifications-14 at offset 
193179 among 451 partitions
2019-01-18 00:24:28,811: max lag is 3 for partition api-responses-0 at offset 
56367 among 451 partitions
2019-01-18 00:24:58,813: max lag is 1 for partition follows-6 at offset 1059 
among 451 partitions{code}


[jira] [Comment Edited] (KAFKA-6582) Partitions get underreplicated, with a single ISR, and doesn't recover. Other brokers do not take over and we need to manually restart the broker.

2019-02-24 Thread Sebastian Schmitz (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776419#comment-16776419
 ] 

Sebastian Schmitz edited comment on KAFKA-6582 at 2/24/19 9:59 PM:
---

We are using 2.1.0.

We will update to 2.1.1 today to see if it's related to the mentioned deadlock.


was (Author: paxi):
We are using 2.1.0
