[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2021-07-29 Thread Raj (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389844#comment-17389844
 ] 

Raj commented on KAFKA-2729:


Hi [~junrao] ,

We just hit this in production as well, although I was able to resolve it by 
restarting only the broker that reported the errors, as opposed to the 
controller or the whole cluster.

Kafka version: 2.3.1

I can confirm the events are identical to what [~l0co] explained above:
 * ZK session disconnected on broker 5
 * Replica fetchers stopped on the other brokers
 * ZK connection re-established on broker 5 after a few seconds
 * Broker 5 came back online, started reporting "Cached zkVersion [130] 
not equal to..." errors, and shrank its ISRs to only itself

As it didn't recover automatically, I restarted the broker after 30 minutes, 
and it then went back to normal.
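The stuck state in the last bullet is, at heart, a conditional write that can never succeed: the ISR update is a compare-and-set against the znode's version, and a stale cached version makes every attempt a no-op while nothing refreshes the cache. A toy sketch of that loop (illustrative only, not Kafka's implementation; the version numbers are the ones from this report):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ZkVersionCasSketch {
    // Toy stand-in for the znode stat version stored in ZooKeeper.
    static final AtomicInteger zkVersion = new AtomicInteger(131);

    // The ISR update only goes through if the broker's cached version still
    // matches ZooKeeper's; otherwise it is skipped, and since nothing refreshes
    // the stale cache, every retry fails the same way.
    static boolean updateIsr(int cachedVersion) {
        if (!zkVersion.compareAndSet(cachedVersion, cachedVersion + 1)) {
            System.out.println("Cached zkVersion [" + cachedVersion
                    + "] not equal to that in zookeeper, skip updating ISR");
            return false;
        }
        System.out.println("ISR updated, new zkVersion " + zkVersion.get());
        return true;
    }

    public static void main(String[] args) {
        updateIsr(130); // broker 5's stale cache: skipped, indefinitely
        updateIsr(131); // fresh cache (e.g. after a restart): succeeds
    }
}
```

Restarting the broker effectively re-reads the znode, which is why the manual restart cleared it.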

I did see the controller try to send correct metadata to broker 5, but the 
request was rejected due to a broker epoch inconsistency.
{noformat}
ERROR [KafkaApi-5] Error when handling request: clientId=21, correlationId=2, 
api=UPDATE_METADATA, 
body={controller_id=21,controller_epoch=53,broker_epoch=223338313060,topic_states=[{topic-a,partition_states=[{partition=0,controller_epoch=53,leader=25,leader_epoch=70,isr=[25,17],zk_version=131,replicas=[5,25,17],offline_replicas=[]}...
...
java.lang.IllegalStateException: Epoch 223338313060 larger than current broker 
epoch 223338311791
at kafka.server.KafkaApis.isBrokerEpochStale(KafkaApis.scala:2612)
at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:194)
at kafka.server.KafkaApis.handle(KafkaApis.scala:117)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
at java.base/java.lang.Thread.run(Thread.java:834)
...
...
...
[2021-07-29 11:07:30,210] INFO [Partition topic-a-0 broker=5] Cached zkVersion 
[130] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)
...

{noformat}
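The UPDATE_METADATA rejection above comes from a sanity check on broker epochs. A hedged sketch of that check, simplified from the method named in the trace (kafka.server.KafkaApis.isBrokerEpochStale); this is illustrative, not Kafka's actual code:

```java
public class BrokerEpochCheckSketch {
    // A controller request should never carry a broker epoch newer than the
    // broker's own current epoch; if it does, the broker's view is behind and
    // the request is rejected with IllegalStateException, as in the log above.
    static boolean isBrokerEpochStale(long currentBrokerEpoch,
                                      long requestBrokerEpoch) {
        if (requestBrokerEpoch > currentBrokerEpoch) {
            throw new IllegalStateException("Epoch " + requestBrokerEpoch
                    + " larger than current broker epoch " + currentBrokerEpoch);
        }
        // An older epoch in the request means the request itself is stale.
        return requestBrokerEpoch < currentBrokerEpoch;
    }

    public static void main(String[] args) {
        try {
            // Values taken from the log excerpt above.
            isBrokerEpochStale(223338311791L, 223338313060L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```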
 

Preferred leader election error, as seen on the controller:
{noformat}
[2021-07-29 11:11:57,432] ERROR [Controller id=21] Error completing preferred 
replica leader election for partition topic-a-0 
(kafka.controller.KafkaController)
kafka.common.StateChangeFailedException: Failed to elect leader for partition 
topic-a-0 under strategy PreferredReplicaPartitionLeaderElectionStrategy
at 
kafka.controller.ZkPartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:381)
at 
kafka.controller.ZkPartitionStateMachine$$anonfun$doElectLeaderForPartitions$3.apply(PartitionStateMachine.scala:378)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at 
kafka.controller.ZkPartitionStateMachine.doElectLeaderForPartitions(PartitionStateMachine.scala:378)
at 
kafka.controller.ZkPartitionStateMachine.electLeaderForPartitions(PartitionStateMachine.scala:305)
at 
kafka.controller.ZkPartitionStateMachine.doHandleStateChanges(PartitionStateMachine.scala:215)
at 
kafka.controller.ZkPartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:145)
at 
kafka.controller.KafkaController.kafka$controller$KafkaController$$onPreferredReplicaElection(KafkaController.scala:646)
at 
kafka.controller.KafkaController$$anonfun$checkAndTriggerAutoLeaderRebalance$3.apply(KafkaController.scala:995)
at 
kafka.controller.KafkaController$$anonfun$checkAndTriggerAutoLeaderRebalance$3.apply(KafkaController.scala:976)
at 
scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:221)
at 
scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:428)
at 
kafka.controller.KafkaController.checkAndTriggerAutoLeaderRebalance(KafkaController.scala:976)
at 
kafka.controller.KafkaController.processAutoPreferredReplicaLeaderElection(KafkaController.scala:1004)
at kafka.controller.KafkaController.process(KafkaController.scala:1564)
at kafka.controller.QueuedEvent.process(ControllerEventManager.scala:53)
at 
kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:137)
at 
kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:137)
at 
kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:137)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at 
kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:136)
at 
kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89){noformat}
 

After the restart of broker 5, it was able to take back leadership of the 
desired partitions.

 

Kindly let me know if 

[jira] [Commented] (KAFKA-7878) Connect Task already exists in this worker when failed to create consumer

2020-07-27 Thread Nitika Raj (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166089#comment-17166089
 ] 

Nitika Raj commented on KAFKA-7878:
---

We faced similar problems, but tracking back through the logs we found the 
actual issue to be https://issues.apache.org/jira/browse/KAFKA-9385 and thus 
https://issues.apache.org/jira/browse/KAFKA-9184, which has been resolved in 
AK 2.3.2/2.4.

> Connect Task already exists in this worker when failed to create consumer
> -
>
> Key: KAFKA-7878
> URL: https://issues.apache.org/jira/browse/KAFKA-7878
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 1.0.1, 2.0.1
>Reporter: Loïc Monney
>Priority: Major
>
> *Assumption*
> 1. DNS is unavailable for a few minutes
> 2. The consumer group rebalances
> 3. The client can no longer resolve the DNS entries and fails
> 4. The task still appears registered, so at the next rebalance it fails with 
> *Task already exists in this worker*, and the only way to recover is to 
> restart the Connect process
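Steps 1 and 3 correspond to the bootstrap-address validation seen in the stack trace below: hosts that fail DNS resolution are dropped, and if none remain the consumer constructor fails. A rough sketch of that behavior (illustrative only, not Kafka's ClientUtils code; `no-such-host.invalid` is a deliberately unresolvable placeholder, and Kafka throws ConfigException where this sketch uses IllegalArgumentException):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class BootstrapResolveSketch {
    // Drop hosts that fail DNS resolution; if nothing is left, fail the way
    // the trace below does.
    static List<String> resolvableHosts(List<String> hosts) {
        List<String> ok = new ArrayList<>();
        for (String host : hosts) {
            try {
                InetAddress.getByName(host); // triggers a DNS lookup
                ok.add(host);
            } catch (UnknownHostException e) {
                System.out.println("Removing server " + host
                        + " from bootstrap.servers as DNS resolution failed");
            }
        }
        if (ok.isEmpty()) {
            throw new IllegalArgumentException(
                    "No resolvable bootstrap urls given in bootstrap.servers");
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(resolvableHosts(
                List.of("localhost", "no-such-host.invalid")));
    }
}
```

With every host unresolvable during the outage, the consumer never gets constructed, yet the task remains registered with the worker, producing step 4.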
> *Real log entries*
> * Distributed cluster running one connector on top of Kubernetes
> * Connect 2.0.1
> * kafka-connect-hdfs 5.0.1
> {noformat}
> [2019-01-28 13:31:25,914] WARN Removing server kafka.xxx.net:9093 from 
> bootstrap.servers as DNS resolution failed for kafka.xxx.net 
> (org.apache.kafka.clients.ClientUtils:56)
> [2019-01-28 13:31:25,915] ERROR WorkerSinkTask{id=xxx-22} Task failed 
> initialization and will not be started. 
> (org.apache.kafka.connect.runtime.WorkerSinkTask:142)
> org.apache.kafka.connect.errors.ConnectException: Failed to create consumer
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:476)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:139)
>  at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:873)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:111)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:888)
>  at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:884)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:799)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:615)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:596)
>  at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:474)
>  ... 10 more
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable 
> bootstrap urls given in bootstrap.servers
>  at 
> org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:66)
>  at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:709)
>  ... 13 more
> [2019-01-28 13:31:25,925] INFO Finished starting connectors and tasks 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:868)
> [2019-01-28 13:31:25,926] INFO Rebalance started 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1239)
> [2019-01-28 13:31:25,927] INFO Stopping task xxx-22 
> (org.apache.kafka.connect.runtime.Worker:555)
> [2019-01-28 13:31:26,021] INFO Finished stopping tasks in preparation for 
> rebalance 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1269)
> [2019-01-28 13:31:26,021] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] (Re-)joining group 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:509)
> [2019-01-28 13:31:30,746] INFO [Worker clientId=connect-1, 
> groupId=xxx-cluster] Successfully joined group with generation 29 
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:473)
> [2019-01-28 13:31:30,746] INFO Joined group and got assignment: 
> Assignment{error=0, leader='connect-1-05961f03-52a7-4c02-acc2-0f1fb021692e', 
> leaderUrl='http://192.168.46.59:8083/', offset=32, connectorIds=[], 
> taskIds=[xxx-22]} 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder:1217)
> [2019-01-28 13:31:30,747] INFO Starting connectors and tasks using config 
> offset 32 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:858)
> [2019-01-28 

[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj updated KAFKA-9172:
---
Labels: metrics  (was: )

> Kafka Connect JMX :  sink task metrics are missing in some cases after 
> rebalancing of the tasks
> ---
>
> Key: KAFKA-9172
> URL: https://issues.apache.org/jira/browse/KAFKA-9172
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.1.1
>Reporter: Raj
>Priority: Major
>  Labels: metrics
>
> Kafka Connect exposes various metrics via JMX. We observed that sometimes a 
> few of the sink task metrics MBeans get deleted just after the workers 
> rebalance all the tasks.
> Also, I don't see any logs showing the sink-task-metrics MBeans being 
> registered at that time, but I do see a similar WARN log at the same time:
>  
> {code:java}
> 2019-11-11 20:58:09 WARN  AppInfoParser:66 - Error registering AppInfo mbean
> javax.management.InstanceAlreadyExistsException: 
> kafka.consumer:type=app-info,id=ResiliencyRestartJob90
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at 
> org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:62)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:784)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.createConsumer(WorkerSinkTask.java:481)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.initialize(WorkerSinkTask.java:140)
>   at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:452)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:865)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:110)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:880)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:876)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> Please ask me if you need any additional information.
>  
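The InstanceAlreadyExistsException in the quoted trace is ordinary JMX behavior: registering a second MBean under an ObjectName that is already taken fails, which happens when a new consumer is created before the old one's MBean is unregistered. A small reproduction (the ObjectName mirrors the one in the log; the Demo MBean is a made-up placeholder):

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class MBeanClashSketch {
    // Minimal standard MBean, just enough to register.
    public interface DemoMBean { int getValue(); }
    public static class Demo implements DemoMBean {
        public int getValue() { return 42; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical ObjectName mimicking the one in the log.
        ObjectName name = new ObjectName(
                "kafka.consumer:type=app-info,id=ResiliencyRestartJob90");
        server.registerMBean(new Demo(), name);
        try {
            // Second registration under the same name, as happens when a new
            // consumer reuses the client.id of one that was never cleaned up.
            server.registerMBean(new Demo(), name);
        } catch (InstanceAlreadyExistsException e) {
            System.out.println("Error registering AppInfo mbean: "
                    + e.getMessage());
        }
    }
}
```

This suggests the warning is a symptom of a consumer being recreated under an already-registered id during the rebalance, which could plausibly race with the sink-task metrics registration.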



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj updated KAFKA-9172:
---
Component/s: (was: metrics)






[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj updated KAFKA-9172:
---
Summary: Kafka Connect JMX :  sink task metrics are missing in some cases 
after rebalancing of the tasks  (was: Kafka Connect JMX : source & sink task 
metrics are missing in some cases after rebalancing of the tasks)






[jira] [Commented] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972199#comment-16972199
 ] 

Raj commented on KAFKA-9172:


I think this is a similar ticket and I am facing the same issue with Kafka 
Connect  : https://issues.apache.org/jira/browse/KAFKA-3992






[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj updated KAFKA-9172:
---
Component/s: metrics






[jira] [Created] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)
Raj created KAFKA-9172:
--

 Summary: Kafka Connect JMX : source & sink task metrics are 
missing in some cases after rebalancing of the tasks
 Key: KAFKA-9172
 URL: https://issues.apache.org/jira/browse/KAFKA-9172
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.1.1
Reporter: Raj






[jira] [Updated] (KAFKA-9172) Kafka Connect JMX : source & sink task metrics are missing in some cases after rebalancing of the tasks

2019-11-12 Thread Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raj updated KAFKA-9172:
---
Description: 