[ https://issues.apache.org/jira/browse/KAFKA-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430429#comment-17430429 ]
Luke Chen commented on KAFKA-13370:
-----------------------------------

[~20100g], thanks for reporting the issue. The PR is ready now: [https://github.com/apache/kafka/pull/11413]

Comments are welcome. Thanks.

> Offset commit failure percentage metric is not computed correctly (regression)
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-13370
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13370
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect, metrics
>    Affects Versions: 2.8.0
>         Environment: Confluent Platform Helm Chart (v6.2.0)
>            Reporter: Vincent Giroux
>            Assignee: Luke Chen
>            Priority: Minor
>             Fix For: 2.8.0
>
>
> There seems to have been a regression in the way the offset-commit-* metrics
> are calculated for *source* Kafka Connect connectors since version 2.8.0.
> Before this version, any timeout or interruption while trying to commit
> offsets for source connectors (e.g. MM2 MirrorSourceConnector) would get
> correctly flagged as an offset commit failure (i.e., the
> *offset-commit-failure-percentage* metric would be non-zero). Since
> version 2.8.0, these errors are counted as successes.
> After digging through the code, the commit where this bug was introduced
> appears to be this one:
> [https://github.com/apache/kafka/commit/047ad654da7903f3903760b0e6a6a58648ca7715]
> I believe that removing the boolean *success* argument from the *recordCommit*
> method of the *WorkerTask* class (an argument deemed redundant because of the
> presence of the Throwable *error* argument) and relying only on the presence
> of a non-null error to decide whether a commit succeeded or failed might be
> a mistake. This is because in the *commitOffsets* method of the
> *WorkerSourceTask* class, there are multiple cases where an exception object
> is either not available or is not passed to the *recordCommitFailure* method,
> e.g.:
> * *Timeout #1*:
> [https://github.com/apache/kafka/blob/2.8/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L519]
> * *Timeout #2*:
> [https://github.com/apache/kafka/blob/2.8/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L584]
> * *Interruption*:
> [https://github.com/apache/kafka/blob/2.8/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L529]
> * *Unserializable offset*:
> [https://github.com/apache/kafka/blob/2.8/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L562]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
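
For illustration only, the failure mode described in the report can be sketched in Java. This is not the Kafka Connect code: CommitStats, recordErrorOnly, recordExplicit and failureRatio are hypothetical names, and the ratio below is a simplification, not the way the real metric is computed. It only shows why inferring the outcome from a possibly-null error can count a timeout as a success.

```java
public class CommitMetricsSketch {

    // Hypothetical stand-in for a commit metrics recorder (assumption, not a Kafka class).
    static final class CommitStats {
        private int successes;
        private int failures;

        // Shape described in the report for 2.8.0: outcome inferred from the error alone.
        void recordErrorOnly(Throwable error) {
            if (error == null) {
                successes++;
            } else {
                failures++;
            }
        }

        // Shape described in the report before 2.8.0: the caller states the outcome.
        void recordExplicit(boolean success, Throwable error) {
            if (success) {
                successes++;
            } else {
                failures++;
            }
        }

        // Simplified failure ratio in [0, 1]; 0 when nothing has been recorded.
        double failureRatio() {
            int total = successes + failures;
            return total == 0 ? 0.0 : (double) failures / total;
        }
    }

    // A commit that timed out with no exception object available, error-only recording.
    public static double timeoutErrorOnly() {
        CommitStats stats = new CommitStats();
        stats.recordErrorOnly(null); // the null error makes the failure look like a success
        return stats.failureRatio();
    }

    // The same timeout, recorded with an explicit success flag.
    public static double timeoutExplicitFlag() {
        CommitStats stats = new CommitStats();
        stats.recordExplicit(false, null); // failure is counted even without an exception
        return stats.failureRatio();
    }

    public static void main(String[] args) {
        System.out.println("error-only:    " + timeoutErrorOnly());
        System.out.println("explicit flag: " + timeoutExplicitFlag());
    }
}
```

In this sketch the error-only variant reports no failures for the timeout, while the explicit-flag variant reports it as a failure, which matches the behaviour change the reporter observed.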