[
https://issues.apache.org/jira/browse/KAFKA-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605691#comment-15605691
]
Guozhang Wang commented on KAFKA-3536:
--------------------------------------
This issue is observed in 0.9.0 as well, copying from mailing list:
{code}
Hello,
The error log indicates that, by the time the broker tries to complete the
fetch request, it is no longer the leader for some of the requested
topic-partitions. This can happen when brokers are shutting down or restarting,
causing the controller to migrate leaders automatically, or when leaders are
migrated manually for operational reasons. Such error logs should be transient,
since the follower replicas should be able to refresh their metadata and
discover the new leaders.
The confusion comes from the fact that the fetch request from
ReplicaFetcherThread-0-2141642 does not actually contain partition
[retail.d.ris.spider.request, 1], but it causes the HW (high watermark) of some
of the fetched partitions to be advanced. The broker therefore checks whether
some OTHER delayed fetch requests can now be satisfied and completed because of
the advanced HW (in function tryCompleteDelayedRequests), and some of those
requests contain the partition whose leader is no longer broker 2141642.
I think the issue you encountered is summarized in KAFKA-3536 and has not been
resolved yet; for now you can ignore the spurious error logs.
Guozhang
On Tue, Oct 25, 2016 at 5:32 AM, Json Tu <[email protected]> wrote:
Hi all,
I use Kafka 0.9.0.0, and we have a cluster with 6 nodes. When I restart
a broker, we find many logs like the following:
[2016-10-24 15:29:00,914] ERROR [KafkaApi-2141642] error when handling request
Name: FetchRequest; Version: 1; CorrelationId: 4928; ClientId:
ReplicaFetcherThread-0-2141642; ReplicaId: 2141386; MaxWait: 500 ms; MinBytes:
1 bytes; RequestInfo: [retail.c.order.logistics,0] ->
PartitionFetchInfo(215258,1048576),[waimai_c_ugc_msg,1] ->
PartitionFetchInfo(12426588,1048576),[waimai_c_ucenter_asyncrelationbind_delay,25]
-> PartitionFetchInfo(0,1048576),[waimai_c_order_databus_wmorder,44] ->
PartitionFetchInfo(49555913,1048576),[__consumer_offsets,23] ->
PartitionFetchInfo(11846051,1048576),[retail.m.sp.sku.update,3] ->
PartitionFetchInfo(21563,1048576),[waimai_c_monitor_orderlogisticsstatus,28] ->
PartitionFetchInfo(26926356,1048576),[waimai_c_ucenter_loadrelation,0] ->
PartitionFetchInfo(54583,1048576),[__consumer_offsets,29] ->
PartitionFetchInfo(23479045,1048576),[waimai_c_order_databus_wmorder,14] ->
PartitionFetchInfo(49568225,1048576),[waimai_c_ordertag_orderdealremark,31] ->
PartitionFetchInfo(1829838,1048576),[retail.d.ris.spider.request,0] ->
PartitionFetchInfo(709845,1048576),[__consumer_offsets,13] ->
PartitionFetchInfo(9376691,1048576),[waimai_c_ugc_msg_staging,2] ->
PartitionFetchInfo(38,1048576),[retail.b.openapi.push.retry.stage,0] ->
PartitionFetchInfo(0,1048576),[waimai_c_ucenter_favoritepoi,15] ->
PartitionFetchInfo(390045,1048576),[retail.b.order.phonecall,0] ->
PartitionFetchInfo(1,1048576),[waimai_c_ucenter_loadrelation,45] ->
PartitionFetchInfo(53975,1048576),[waimai_c_ordertag_orderdealremark,1] ->
PartitionFetchInfo(1829848,1048576),[retail.d.ris.spider.jddj.request,0] ->
PartitionFetchInfo(5116337,1048576),[waimai_c_ucenter_asyncrelationbind_delay,13]
-> PartitionFetchInfo(0,1048576),[waimai_c_ucenter_asyncrelationbind_delay,55]
-> PartitionFetchInfo(0,1048576),[waimai_push_e_operate_prod,3] ->
PartitionFetchInfo(442564,1048576),[waimai_ordersa_topic_user_order_in_poi_count_diff,5]
-> PartitionFetchInfo(23791010,1048576),[retail.c.order.create,4] ->
PartitionFetchInfo(72902,1048576),[waimai_c_ucenter_asyncrelationbind_staging,2]
-> PartitionFetchInfo(66,1048576),[waimai_c_order_orderevent_topic,35] ->
PartitionFetchInfo(0,1048576),[waimai_c_ucenter_syncuserrelation,43] ->
PartitionFetchInfo(0,1048576),[waimai_c_order_databus_wmorder,48] ->
PartitionFetchInfo(49496018,1048576),[waimai_c_monitor_orderstatus,32] ->
PartitionFetchInfo(50623699,1048576),[waimai_c_ucenter_loadrelation,15] ->
PartitionFetchInfo(54360,1048576),[waimai_c_monitor_orderstatus,2] ->
PartitionFetchInfo(50624881,1048576),[waimai_c_order_databus_wmorder,24] ->
PartitionFetchInfo(49548334,1048576),[waimai_c_order_databus_wmorder,18] ->
PartitionFetchInfo(49489397,1048576),[waimai_c_ucenter_asyncrelationbind,36] ->
PartitionFetchInfo(53430,1048576) (kafka.server.KafkaApis)
kafka.common.NotLeaderForPartitionException: Leader not local for partition [retail.d.ris.spider.request,1] on broker 2141642
        at kafka.server.ReplicaManager.getLeaderReplicaIfLocal(ReplicaManager.scala:296)
        at kafka.server.DelayedFetch$$anonfun$tryComplete$1.apply(DelayedFetch.scala:77)
        at kafka.server.DelayedFetch$$anonfun$tryComplete$1.apply(DelayedFetch.scala:72)
        at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
        at kafka.server.DelayedFetch.tryComplete(DelayedFetch.scala:72)
        at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:307)
        at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:227)
        at kafka.server.ReplicaManager.tryCompleteDelayedFetch(ReplicaManager.scala:202)
        at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:372)
        at kafka.cluster.Partition.maybeExpandIsr(Partition.scala:294)
        at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:243)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:852)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:849)
        at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
        at kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:849)
        at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:467)
        at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:434)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:69)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
        at java.lang.Thread.run(Thread.java:745)
What confuses me is that retail.d.ris.spider.request partition 1 is not
contained in this request, so why is it logged in handleFetchRequest? How can
this happen, and how can it be resolved?
{code}
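The mechanism described in the comment above can be sketched as a tiny standalone model: a delayed fetch parked in the purgatory watches several partitions, and advancing the high watermark of any one of them re-checks all of them. Note this is a hypothetical illustration, not Kafka code; all class and method names here are invented.

```java
import java.util.*;

// Miniature model of the delayed-fetch purgatory behavior (names invented):
// completing a delayed fetch re-checks ALL of its partitions, so an error can
// surface for a partition that the triggering request never mentioned.
public class PurgatorySketch {

    // A parked (delayed) fetch request waiting on several partitions.
    static class DelayedFetch {
        final List<String> partitions;
        DelayedFetch(List<String> partitions) { this.partitions = partitions; }
    }

    // Partitions this broker currently leads.
    static final Set<String> localLeaders = new HashSet<>(List.of("topicA-0"));

    // Partition -> delayed fetches watching it (the purgatory's watcher lists).
    static final Map<String, List<DelayedFetch>> watchers = new HashMap<>();

    // Called when a fetch advances the high watermark of `partition`: every
    // delayed fetch watching that partition is re-checked, and the re-check
    // touches all of that fetch's partitions, not just the triggering one.
    static List<String> onHighWatermarkAdvanced(String partition) {
        List<String> errors = new ArrayList<>();
        for (DelayedFetch df : watchers.getOrDefault(partition, List.of())) {
            for (String p : df.partitions) {
                if (!localLeaders.contains(p)) {
                    // This is the point where the real broker logs a
                    // NotLeaderForPartitionException.
                    errors.add("NotLeaderForPartition: " + p);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // A delayed fetch watches topicA-0 and topicB-1.
        DelayedFetch df = new DelayedFetch(List.of("topicA-0", "topicB-1"));
        watchers.computeIfAbsent("topicA-0", k -> new ArrayList<>()).add(df);

        // Leadership of topicB-1 has migrated away (e.g. a broker restart).
        // An unrelated request then advances the HW of topicA-0 only...
        List<String> errors = onHighWatermarkAdvanced("topicA-0");

        // ...yet the resulting error names topicB-1, which the triggering
        // request never contained.
        System.out.println(errors);
    }
}
```

This matches the stack trace in the question: the path runs from the fetch handler through tryCompleteDelayedRequests into the delayed fetch's completion check, which is where the "foreign" partition appears.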
> ReplicaFetcherThread should not log errors when leadership changes
> ------------------------------------------------------------------
>
> Key: KAFKA-3536
> URL: https://issues.apache.org/jira/browse/KAFKA-3536
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Stig Rohde Døssing
> Priority: Minor
>
> When there is a leadership change, ReplicaFetcherThread will spam the log
> with errors similar to the log snippet below.
> {noformat}
> [ReplicaFetcherThread-0-2], Error for partition [ticketupdate,7] to broker
> 2:class kafka.common.NotLeaderForPartitionException
> (kafka.server.ReplicaFetcherThread)
> {noformat}
> ReplicaFetcherThread/AbstractFetcherThread should log those exceptions at a
> lower log level, since they don't actually indicate an error.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)