[ https://issues.apache.org/jira/browse/KAFKA-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alok Nikhil updated KAFKA-12345:
--------------------------------
    Description: 
Occasionally, a scheduler thread on a broker crashes with this stack trace:

 
{code:java}
[2021-02-19 01:04:24,683] ERROR Uncaught exception in scheduled task 'send-alter-isr' (kafka.utils.KafkaScheduler)
java.lang.NullPointerException
    at kafka.server.AlterIsrManagerImpl.sendRequest(AlterIsrManager.scala:117)
    at kafka.server.AlterIsrManagerImpl.propagateIsrChanges(AlterIsrManager.scala:85)
    at kafka.server.AlterIsrManagerImpl.$anonfun$start$1(AlterIsrManager.scala:66)
    at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
{code}
 

After that, the broker is unable to fetch records from any other broker (and
vice versa):
{code:java}
[2021-02-19 01:05:07,000] INFO [ReplicaFetcher replicaId=0, leaderId=4, fetcherId=0] Error sending fetch request (sessionId=1644324092, epoch=957) to node 4: (org.apache.kafka.clients.FetchSessionHandler)
java.io.IOException: Connection to 4 was disconnected before the response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:215)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:139)
    at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:138)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:121)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
 

  was:
Occasionally, a scheduler thread on a broker crashes with this stack trace:

```
[2021-02-19 01:04:24,683] ERROR Uncaught exception in scheduled task 'send-alter-isr' (kafka.utils.KafkaScheduler)
java.lang.NullPointerException
    at kafka.server.AlterIsrManagerImpl.sendRequest(AlterIsrManager.scala:117)
    at kafka.server.AlterIsrManagerImpl.propagateIsrChanges(AlterIsrManager.scala:85)
    at kafka.server.AlterIsrManagerImpl.$anonfun$start$1(AlterIsrManager.scala:66)
    at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
```


After that, the broker is unable to fetch records from any other broker
(and vice versa):

```
[2021-02-19 01:05:07,000] INFO [ReplicaFetcher replicaId=0, leaderId=4, fetcherId=0] Error sending fetch request (sessionId=1644324092, epoch=957) to node 4: (org.apache.kafka.clients.FetchSessionHandler)
java.io.IOException: Connection to 4 was disconnected before the response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:215)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:139)
    at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:138)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:121)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
```


> KIP-500: AlterIsrManager crashes on broker idle-state
> -----------------------------------------------------
>
>                 Key: KAFKA-12345
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12345
>             Project: Kafka
>          Issue Type: Task
>          Components: core
>            Reporter: Alok Nikhil
>            Priority: Major
>              Labels: kip-500
>
> Occasionally, a scheduler thread on a broker crashes with this stack trace:
>  
> {code:java}
> [2021-02-19 01:04:24,683] ERROR Uncaught exception in scheduled task 'send-alter-isr' (kafka.utils.KafkaScheduler)
> java.lang.NullPointerException
>     at kafka.server.AlterIsrManagerImpl.sendRequest(AlterIsrManager.scala:117)
>     at kafka.server.AlterIsrManagerImpl.propagateIsrChanges(AlterIsrManager.scala:85)
>     at kafka.server.AlterIsrManagerImpl.$anonfun$start$1(AlterIsrManager.scala:66)
>     at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  
> After that, the broker is unable to fetch records from any other broker
> (and vice versa):
> {code:java}
> [2021-02-19 01:05:07,000] INFO [ReplicaFetcher replicaId=0, leaderId=4, fetcherId=0] Error sending fetch request (sessionId=1644324092, epoch=957) to node 4: (org.apache.kafka.clients.FetchSessionHandler)
> java.io.IOException: Connection to 4 was disconnected before the response was read
>     at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
>     at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110)
>     at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:215)
>     at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
>     at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:139)
>     at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:138)
>     at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:121)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
