[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.
[ https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389844#comment-17389844 ] Raj commented on KAFKA-2729:
---
Hi [~junrao], we just hit this in production as well, although I was able to resolve it by restarting only the broker that reported the errors, rather than the controller or the whole cluster. Kafka version: 2.3.1. I can confirm the events are identical to what [~l0co] explained above:
* ZK session disconnected on broker 5
* Replica fetchers stopped on the other brokers
* ZK connection re-established on broker 5 after a few seconds
* Broker 5 came back online, started reporting "Cached zkVersion [130] not equal to...", and shrunk ISRs to only itself

As it didn't recover automatically, I restarted the broker after 30 minutes and it then went back to normal. I did see that the controller tried to send correct metadata to broker 5, but it was rejected due to an epoch inconsistency.
{noformat}
ERROR [KafkaApi-5] Error when handling request: clientId=21, correlationId=2, api=UPDATE_METADATA, body={controller_id=21,controller_epoch=53,broker_epoch=223338313060,topic_states=[{topic-a,partition_states=[{partition=0,controller_epoch=53,leader=25,leader_epoch=70,isr=[25,17],zk_version=131,replicas=[5,25,17],offline_replicas=[]}...
...
java.lang.IllegalStateException: Epoch 223338313060 larger than current broker epoch 223338311791
	at kafka.server.KafkaApis.isBrokerEpochStale(KafkaApis.scala:2612)
	at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:194)
	at kafka.server.KafkaApis.handle(KafkaApis.scala:117)
	at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
	at java.base/java.lang.Thread.run(Thread.java:834)
...
[2021-07-29 11:07:30,210] INFO [Partition topic-a-0 broker=5] Cached zkVersion [130] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
...
{noformat}
Preferred leader election error as seen on the controller:
{noformat}
[2021-07-29 11:11:57,432] ERROR [Controller id=21] Error completing preferred replica leader election for partition topic-a-0 (kafka.controller.KafkaController)
kafka.common.StateChangeFailedException: Failed to elect leader for partition topic-a-0 under strategy PreferredReplicaPartitionLeaderElectionStrategy
	at kafka.controller.ZkPartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:381)
	at kafka.controller.ZkPartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:378)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at kafka.controller.ZkPartitionStateMachine.doElectLeaderForPartitions(PartitionStateMachine.scala:378)
	at kafka.controller.ZkPartitionStateMachine.electLeaderForPartitions(PartitionStateMachine.scala:305)
	at kafka.controller.ZkPartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:215)
	at kafka.controller.ZkPartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:145)
	at kafka.controller.KafkaController.kafka$controller$KafkaController$$onPreferredReplicaElection(KafkaController.scala:646)
	at kafka.controller.KafkaController$$anonfun$checkAndTriggerAutoLeaderRebalance$3.apply(KafkaController.scala:995)
	at kafka.controller.KafkaController$$anonfun$checkAndTriggerAutoLeaderRebalance$3.apply(KafkaController.scala:976)
	at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
	at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
	at kafka.controller.KafkaController.checkAndTriggerAutoLeaderRebalance(KafkaController.scala:976)
	at kafka.controller.KafkaController.processAutoPreferredReplicaLeaderElection(KafkaController.scala:1004)
	at kafka.controller.KafkaController.process(KafkaController.scala:1564)
	at kafka.controller.QueuedEvent.process(ControllerEventManager.scala:53)
	at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:137)
	at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:137)
	at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:137)
	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
	at kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:136)
	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
{noformat}
After the restart of broker-5, it was able to take back leadership of the desired partitions. Kindly let me know if
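For context on the IllegalStateException above: the broker compares the broker epoch carried in a controller request against the epoch of its own current ZooKeeper registration. The sketch below is hypothetical (not the actual Kafka source; only the method name and the exception message are borrowed from the stack trace above) but shows the shape of that check:

```java
// Hypothetical sketch of the broker-epoch staleness check seen in the trace above.
// In Kafka the broker epoch is derived from the broker's ZooKeeper registration,
// so it changes every time the broker re-registers after a session loss.
public class BrokerEpochCheck {

    /**
     * A controller request carrying an epoch older than the broker's current
     * registration is stale (the controller acted on an outdated view) and is
     * dropped. A request epoch NEWER than the broker's own epoch should be
     * impossible, so it is treated as an invariant violation, matching the
     * IllegalStateException in the log above.
     */
    static boolean isBrokerEpochStale(long epochInRequest, long currentBrokerEpoch) {
        if (epochInRequest > currentBrokerEpoch) {
            throw new IllegalStateException("Epoch " + epochInRequest
                    + " larger than current broker epoch " + currentBrokerEpoch);
        }
        return epochInRequest < currentBrokerEpoch;
    }

    public static void main(String[] args) {
        // Controller acted on an older registration: the request is stale.
        System.out.println(isBrokerEpochStale(100L, 200L)); // true
        // Epochs match: the request is current and is processed.
        System.out.println(isBrokerEpochStale(200L, 200L)); // false
    }
}
```

In the log above the request epoch (223338313060) was larger than the broker's cached epoch (223338311791), i.e. broker 5's view of its own registration was behind the controller's, so the check threw instead of merely dropping the request, and the broker kept rejecting the corrected metadata until it was restarted.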
[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer
[ https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166089#comment-17166089 ] Nitika Raj commented on KAFKA-7878:
---
We faced similar problems, but tracing back through the logs we found the actual issue to be https://issues.apache.org/jira/browse/KAFKA-9385 and thus https://issues.apache.org/jira/browse/KAFKA-9184, which has been resolved in AK 2.3.2/2.4.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 1.0.1, 2.0.1
> Reporter: Loïc Monney
> Priority: Major
>
> *Assumption*
> 1. DNS is not available for a few minutes
> 2. Consumer group rebalances
> 3. Client is no longer able to resolve DNS entries and fails
> 4. Task seems already registered, so at the next rebalance the task will fail due to *Task already exists in this worker* and the only way to recover is to restart the connect process
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from bootstrap.servers as DNS resolution failed for kafka.xxx.net (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask\{id=xxx-22} Task failed initialization and will not be started. (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
> at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
> at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
> at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
> at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:615)
> at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:596)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
> ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
> at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
> at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:709)
> ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for rebalance (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, groupId=xxx-cluster] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, groupId=xxx-cluster] Successfully joined group with generation 29 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: Assignment\{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], taskIds=[xxx-22]} (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28
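The root failure in the trace above is bootstrap-address validation: if none of the configured bootstrap.servers hostnames resolves, the consumer constructor throws ConfigException outright rather than retrying. The sketch below is hypothetical (the real logic lives in org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses; a resolver predicate stands in for a real DNS lookup such as InetAddress.getByName so the example runs without network access):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class BootstrapValidation {

    /**
     * Keep only the bootstrap servers whose hostnames resolve; if none do
     * (e.g. DNS is down for a few minutes, as in this report), fail fast.
     * IllegalArgumentException stands in for Kafka's ConfigException here.
     */
    static List<String> parseAndValidateAddresses(List<String> bootstrapServers,
                                                  Predicate<String> resolves) {
        List<String> usable = new ArrayList<>();
        for (String server : bootstrapServers) {
            String host = server.substring(0, server.lastIndexOf(':'));
            if (resolves.test(host)) {
                usable.add(server);
            } else {
                // Mirrors: WARN Removing server ... from bootstrap.servers
                System.out.println("WARN Removing server " + server
                        + " from bootstrap.servers as DNS resolution failed for " + host);
            }
        }
        if (usable.isEmpty()) {
            // Mirrors: ConfigException: No resolvable bootstrap urls given in bootstrap.servers
            throw new IllegalArgumentException(
                    "No resolvable bootstrap urls given in bootstrap.servers");
        }
        return usable;
    }

    public static void main(String[] args) {
        // With DNS down, every lookup fails and consumer construction aborts.
        try {
            parseAndValidateAddresses(List.of("kafka.xxx.net:9093"), host -> false);
        } catch (IllegalArgumentException e) {
            System.out.println("Consumer construction fails: " + e.getMessage());
        }
    }
}
```

Per the report above, the task had already been registered with the worker before this constructor failure, so the subsequent rebalance attempts to start it again and fails with "Task already exists in this worker".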
[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raj updated KAFKA-9172:
---
Labels: metrics (was: )

> Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks
> ---
>
> Key: KAFKA-9172
> URL: https://issues.apache.org/jira/browse/KAFKA-9172
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 2.1.1
> Reporter: Raj
> Priority: Major
> Labels: metrics
>
> Kafka Connect exposes various metrics via JMX. We observed that sometimes a few of the sink-task-metrics MBeans are deleted just after the workers rebalance all the tasks.
> I don't see any log entries about those sink-task-metrics MBeans being registered at the same time, but I do see the following WARN log at that moment:
> {code:java}
> 2019-11-11 20:58:09 WARN AppInfoParser:66 - Error registering AppInfo mbean
> javax.management.InstanceAlreadyExistsException: kafka.consumer:type=app-info,id=ResiliencyRestartJob90
> at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
> at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
> at org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:62)
> at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:784)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:481)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:140)
> at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:865)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:110)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:880)
> at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:876)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
>
> Please let me know if you need any additional information.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
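The InstanceAlreadyExistsException above is a plain JMX constraint: an MBean name must be unique per MBean server, so when two consumers are created with the same client.id, both try to register the same kafka.consumer:type=app-info,id=... bean and the second registration fails. A minimal standalone reproduction (the bean class and its interface are hypothetical stand-ins, unrelated to Kafka's own AppInfoParser code):

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class DuplicateMBeanDemo {
    // Standard MBean convention: the interface must be named <ClassName>MBean.
    public interface AppInfoMBean { String getVersion(); }
    public static class AppInfo implements AppInfoMBean {
        public String getVersion() { return "2.1.1"; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Same ObjectName as in the WARN log above.
        ObjectName name = new ObjectName("kafka.consumer:type=app-info,id=ResiliencyRestartJob90");

        server.registerMBean(new AppInfo(), name); // first registration succeeds
        try {
            server.registerMBean(new AppInfo(), name); // same name: collides
        } catch (InstanceAlreadyExistsException e) {
            // This is the condition AppInfoParser logs as a WARN when client ids collide.
            System.out.println("duplicate: " + e.getMessage());
        }
    }
}
```

This is also why the WARN shows up around rebalances: a task restarted on the same worker can try to register its consumer's app-info bean while the previous registration is still present.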
[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raj updated KAFKA-9172:
---
Component/s: (was: metrics)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raj updated KAFKA-9172:
---
Summary: Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks (was: Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972199#comment-16972199 ] Raj commented on KAFKA-9172:
---
I think this is a similar ticket and I am facing the same issue with Kafka Connect: https://issues.apache.org/jira/browse/KAFKA-3992

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raj updated KAFKA-9172:
---
Component/s: metrics

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks
Raj created KAFKA-9172:
--
Summary: Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks
Key: KAFKA-9172
URL: https://issues.apache.org/jira/browse/KAFKA-9172
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 2.1.1
Reporter: Raj

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks
[ https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raj updated KAFKA-9172:
---
Description: unchanged apart from removing a stray "//" marker at the start of the {code:java} block (full text quoted above)