Hi tao xiao and Jiangjie Qin,
I have run into the same issue after my node recovered from a high-load problem
(caused by another application).
This is what kafka-topics shows:
Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs:
Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0
Topic: ad_click_sts Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 3 Leader: 0 Replicas: 0,1 Isr: 0
Topic: ad_click_sts Partition: 4 Leader: 1 Replicas: 1,0 Isr: 1
Topic: ad_click_sts Partition: 5 Leader: 0 Replicas: 0,1 Isr: 0
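In other words, every partition has two replicas but only its leader in the ISR. To make that easier to check, here is a minimal Python sketch that lists the replicas missing from the ISR in pasted output like the above. The regex and the helper name under_replicated are my own and not part of Kafka; it assumes the describe output has the field layout shown here.

```python
import re

# Matches one partition entry of the describe output. \s+ also spans line
# breaks, so entries whose "Isr:" field was wrapped onto the next line match too.
PARTITION_RE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+"
    r"Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose ISR does not contain all of its assigned replicas."""
    result = []
    for topic, part, _leader, replicas, isr in PARTITION_RE.findall(describe_output):
        replica_ids = [int(r) for r in replicas.split(",") if r]
        isr_ids = {int(r) for r in isr.split(",") if r}
        missing = [r for r in replica_ids if r not in isr_ids]
        if missing:
            result.append((topic, int(part), missing))
    return result


# First two partitions from the listing above, pasted as-is.
sample = """
Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs:
Topic: ad_click_sts Partition: 0 Leader: 1 Replicas: 1,0
Isr: 1
Topic: ad_click_sts Partition: 1 Leader: 0 Replicas: 0,1
Isr: 0
"""

for topic, part, missing in under_replicated(sample):
    print(topic, part, "missing from ISR:", missing)
```

For the two sample partitions this reports broker 0 missing for partition 0 and broker 1 missing for partition 1, which matches what the listing shows.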
ReplicaFetcherThread messages extracted from the Kafka server.log:
[2015-03-09 21:06:05,450] ERROR [ReplicaFetcherThread-0-0], Error in fetch
Name: FetchRequest; Version: 0; CorrelationId: 7331; ClientId:
ReplicaFetcherThread-0-0; ReplicaId: 1; MaxWait: 500 ms; MinBytes: 1 bytes;
RequestInfo: [ad_click_sts,5] ->
PartitionFetchInfo(6149699,1048576),[ad_click_sts,3] ->
PartitionFetchInfo(6147835,1048576),[ad_click_sts,1] ->
PartitionFetchInfo(6235071,1048576) (kafka.server.ReplicaFetcherThread)
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
……..
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:108)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:107)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2015-03-09 21:06:05,450] WARN Reconnect due to socket error: null
(kafka.consumer.SimpleConsumer)
[2015-03-09 21:05:57,116] INFO Partition [ad_click_sts,4] on broker 1: Cached
zkVersion [556] not equal to that in zookeeper, skip updating ISR
(kafka.cluster.Partition)
[2015-03-09 21:06:05,772] INFO Partition [ad_click_sts,2] on broker 1:
Shrinking ISR for partition [ad_click_sts,2] from 1,0 to 1
(kafka.cluster.Partition)
How can I fix this ISR problem? Is there a command I can run?
Regards
sy.pan