[
https://issues.apache.org/jira/browse/KAFKA-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804755#comment-15804755
]
ASF GitHub Bot commented on KAFKA-4441:
---------------------------------------
GitHub user edoardocomar opened a pull request:
https://github.com/apache/kafka/pull/2325
KAFKA-4441 Monitoring incorrect during topic creation and deletion
The OfflinePartitionsCount and PreferredReplicaImbalanceCount metrics now
check whether a topic is being deleted.
Added an integration test that polls the metrics while topics are being
created and deleted.
Developed with @mimaison
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/edoardocomar/kafka KAFKA-4441
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/2325.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2325
----
commit a793d249e2653255eb62a0d1b9c4a2b99c917b11
Author: Mickael Maison <[email protected]>
Date: 2016-12-15T13:26:51Z
KAFKA-4441 Monitoring incorrect during topic creation and deletion
OfflinePartitionsCount PreferredReplicaImbalanceCount metrics check for
topic being deleted
Added integration test which polls the metrics while topics are being
created and deleted
Developed with @mimaison
----
> Kafka Monitoring is incorrect during rapid topic creation and deletion
> ----------------------------------------------------------------------
>
> Key: KAFKA-4441
> URL: https://issues.apache.org/jira/browse/KAFKA-4441
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.0.0, 0.10.0.1
> Reporter: Tom Crayford
> Assignee: Edoardo Comar
>
> Kafka reports several metrics derived from the state of partitions:
> UnderReplicatedPartitions
> PreferredReplicaImbalanceCount
> OfflinePartitionsCount
> All of these metrics fire when topics are rapidly created and deleted in a
> tight loop, even though the only partitions responsible belong to topics that
> are undergoing creation or deletion and the cluster is otherwise stable.
> According to the source code, topic deletion goes through an asynchronous
> state machine:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/TopicDeletionManager.scala#L35.
> However, the metrics do not know about the progress of this state machine:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/KafkaController.scala#L185
>
> I believe the fix is relatively simple: the metrics need to know whether a
> topic is currently undergoing creation or deletion, and should only include
> topics that are "stable".
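The proposed fix can be illustrated with a minimal Java sketch. It is not the code from the pull request: the class `StableTopicMetrics`, the `PartitionState` type, and the `NO_LEADER` sentinel are hypothetical names chosen for illustration, and the sketch assumes that a partition counts as offline when it has no leader and as imbalanced when its leader differs from its preferred replica. The only idea taken from the issue is that partitions of topics undergoing deletion are filtered out before counting.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class StableTopicMetrics {
    // Hypothetical sentinel meaning "this partition currently has no leader".
    static final int NO_LEADER = -1;

    // Hypothetical snapshot of one partition's state as seen by the controller.
    static final class PartitionState {
        final String topic;
        final int partition;
        final int leader;          // current leader broker id, or NO_LEADER
        final int preferredLeader; // broker id of the preferred replica

        PartitionState(String topic, int partition, int leader, int preferredLeader) {
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
            this.preferredLeader = preferredLeader;
        }
    }

    // Counts leaderless partitions, skipping topics that are being deleted.
    static long offlinePartitionsCount(List<PartitionState> partitions,
                                       Set<String> topicsBeingDeleted) {
        return partitions.stream()
                .filter(p -> !topicsBeingDeleted.contains(p.topic))
                .filter(p -> p.leader == NO_LEADER)
                .count();
    }

    // Counts partitions not led by their preferred replica, skipping topics being deleted.
    static long preferredReplicaImbalanceCount(List<PartitionState> partitions,
                                               Set<String> topicsBeingDeleted) {
        return partitions.stream()
                .filter(p -> !topicsBeingDeleted.contains(p.topic))
                .filter(p -> p.leader != p.preferredLeader)
                .count();
    }

    public static void main(String[] args) {
        List<PartitionState> partitions = Arrays.asList(
                new PartitionState("orders", 0, 1, 1),        // healthy
                new PartitionState("orders", 1, 2, 1),        // led by a non-preferred replica
                new PartitionState("tmp", 0, NO_LEADER, 3));  // leaderless, topic mid-deletion
        Set<String> deleting = Collections.singleton("tmp");

        System.out.println(offlinePartitionsCount(partitions, deleting));
        System.out.println(preferredReplicaImbalanceCount(partitions, deleting));
    }
}
```

Passing an empty set for `topicsBeingDeleted` reproduces the behaviour described in the issue, where the leaderless partition of the topic being deleted is counted by both metrics.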