[
https://issues.apache.org/jira/browse/KAFKA-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608708#comment-15608708
]
Json Tu commented on KAFKA-3536:
--------------------------------
Thank you, but I don't think I can ignore it, because it prevents the cluster
from providing normal service: some produce or fetch requests fail for a long
time until I restart the current broker.
As you describe, this error may come from an earlier fetch request that
contained this partition, which leads to many fetch response errors.
DelayedFetch's tryComplete() function is implemented as below:
{noformat}
override def tryComplete(): Boolean = {
  var accumulatedSize = 0
  fetchMetadata.fetchPartitionStatus.foreach {
    case (topicAndPartition, fetchStatus) =>
      val fetchOffset = fetchStatus.startOffsetMetadata
      try {
        if (fetchOffset != LogOffsetMetadata.UnknownOffsetMetadata) {
          val replica = replicaManager.getLeaderReplicaIfLocal(
            topicAndPartition.topic, topicAndPartition.partition)
          /* ignore some code */
        }
      } catch {
        /* ignore some code */
        case nle: NotLeaderForPartitionException => // Case A
          debug("Broker is no longer the leader of %s, satisfy %s immediately"
            .format(topicAndPartition, fetchMetadata))
          return forceComplete()
      }
  }
  /* ignore some code */
}
{noformat}
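The early exit matters because a `return` inside the closure passed to foreach aborts the whole enclosing method, so partitions after the failing one are never examined. A minimal, self-contained sketch of that control flow (hypothetical names, not Kafka code):

```scala
object EarlyExitDemo {
  // checkAll mimics tryComplete: it walks the partitions in order and, like
  // the `return forceComplete()` in Case A, bails out of the entire method
  // as soon as one partition reports "notLeader".
  def checkAll(partitions: Seq[String]): List[String] = {
    val visited = scala.collection.mutable.ListBuffer[String]()
    partitions.foreach { p =>
      if (p == "notLeader") return visited.toList // non-local return: exits checkAll
      visited += p
    }
    visited.toList
  }

  def main(args: Array[String]): Unit = {
    // "p2" is never visited because "notLeader" triggers the early return
    println(checkAll(Seq("p0", "notLeader", "p2"))) // prints List(p0)
  }
}
```

In Scala, the `return` in the closure is compiled to a `NonLocalReturnControl` throw that unwinds out of `foreach`, which is exactly why one bad partition can short-circuit the check of every partition after it.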
When it encounters a NotLeaderForPartitionException (Case A), it invokes
forceComplete(), which in turn invokes onComplete(), implemented as below:
{noformat}
override def onComplete() {
  val logReadResults = replicaManager.readFromLocalLog(
    fetchMetadata.fetchOnlyLeader,
    fetchMetadata.fetchOnlyCommitted,
    fetchMetadata.fetchPartitionStatus.mapValues(status => status.fetchInfo))
  val fetchPartitionData = logReadResults.mapValues(result =>
    FetchResponsePartitionData(result.errorCode, result.hw, result.info.messageSet))
  responseCallback(fetchPartitionData)
}
{noformat}
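For context, forceComplete() follows the delayed-operation pattern: the first caller to complete the operation triggers onComplete() exactly once, so once Case A fires for one partition the whole fetch request is answered immediately. A minimal sketch of that at-most-once contract (a simplified, hypothetical class, not Kafka's actual DelayedOperation):

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

// Sketch of the delayed-operation contract: forceComplete() flips a flag
// atomically, so onComplete() runs at most once even if several callers
// race to complete the same operation.
abstract class DelayedOp {
  private val completed = new AtomicBoolean(false)
  def onComplete(): Unit
  def forceComplete(): Boolean =
    if (completed.compareAndSet(false, true)) { onComplete(); true }
    else false
}

object DelayedOpDemo {
  val calls = new AtomicInteger(0)
  val op: DelayedOp = new DelayedOp {
    def onComplete(): Unit = { calls.incrementAndGet(); () }
  }

  def main(args: Array[String]): Unit = {
    println(op.forceComplete()) // prints true  (first completion wins)
    println(op.forceComplete()) // prints false (already completed)
    println(calls.get)          // prints 1
  }
}
```

The point for this bug report is that completing the operation early is irreversible: after Case A, no later satisfaction check for the remaining partitions in the request will ever run.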
So I think it exits tryComplete early because of this one partition, which
means the partitions later in the request are never checked for satisfaction
before the response is returned to the fetching broker, and that causes some
producers and consumers to fail for a long time. I don't know whether this is
correct.
> ReplicaFetcherThread should not log errors when leadership changes
> ------------------------------------------------------------------
>
> Key: KAFKA-3536
> URL: https://issues.apache.org/jira/browse/KAFKA-3536
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Stig Rohde Døssing
> Priority: Minor
>
> When there is a leadership change, ReplicaFetcherThread will spam the log
> with errors similar to the log snippet below.
> {noformat}
> [ReplicaFetcherThread-0-2], Error for partition [ticketupdate,7] to broker
> 2:class kafka.common.NotLeaderForPartitionException
> (kafka.server.ReplicaFetcherThread)
> {noformat}
> ReplicaFetcherThread/AbstractFetcherThread should log those exceptions at a
> lower log level, since they don't actually indicate an error.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)