[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition
[ https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873850#comment-16873850 ] muchl commented on KAFKA-7697: -- [~rsivaram] Thank you very much. [KAFKA-7538 | https://issues.apache.org/jira/browse/KAFKA-7538] This seems to be the problem I encountered. I will focus on KAFKA-7538. > Possible deadlock in kafka.cluster.Partition > > > Key: KAFKA-7697 > URL: https://issues.apache.org/jira/browse/KAFKA-7697 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Gian Merlino >Assignee: Rajini Sivaram >Priority: Blocker > Fix For: 2.2.0, 2.1.1 > > Attachments: 2.1.1-hangs.log, 322.tdump, kafka.log, kafka_jstack.txt, > threaddump.txt > > > After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up > within a few minutes (by "locked up" I mean that all request handler threads > were busy, and other brokers reported that they couldn't communicate with > it). I restarted it a few times and it did the same thing each time. After > downgrading to 0.10.2.0, the broker was stable. I attached a threaddump.txt > from the last attempt on 2.1.0 that shows lots of kafka-request-handler- > threads trying to acquire the leaderIsrUpdateLock lock in > kafka.cluster.Partition. > It jumps out that there are two threads that already have some read lock > (can't tell which one) and are trying to acquire a second one (on two > different read locks: 0x000708184b88 and 0x00070821f188): > kafka-request-handler-1 and kafka-request-handler-4. Both are handling a > produce request, and in the process of doing so, are calling > Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the > same time, both of those locks have writers from other threads waiting on > them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks > appear to have writers that hold them (if only because no threads in the dump > are deep enough in inWriteLock to indicate that). > ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over > readers. Is it possible that kafka-request-handler-1 and > kafka-request-handler-4 are each trying to read-lock the partition that is > currently locked by the other one, and they're both parked waiting for > kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they > never will, because the former two threads own read locks and aren't giving > them up? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
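The failure mode described in the last paragraph of the report can be reproduced outside Kafka. The sketch below is not Kafka code (the lock and thread names only echo the report); it shows the same pattern with plain java.util.concurrent: a nonfair ReentrantReadWriteLock queues an incoming reader behind an already-waiting writer, so two threads that each hold one read lock and then request the other's lock park forever once a writer is queued on each lock.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockDeadlockDemo {
    // Two "partition" locks, nonfair like leaderIsrUpdateLock in kafka.cluster.Partition.
    static final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock(false);
    static final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock(false);

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch readersIn = new CountDownLatch(2);

        Thread handler1 = new Thread(() -> {
            lockA.readLock().lock();          // holds a read lock on A...
            readersIn.countDown();
            pause(500);                       // ...while writers queue up on both locks
            lockB.readLock().lock();          // parks behind the writer waiting on B
        }, "handler-1");

        Thread handler4 = new Thread(() -> {
            lockB.readLock().lock();          // mirror image: holds B, will need A
            readersIn.countDown();
            pause(500);
            lockA.readLock().lock();          // parks behind the writer waiting on A
        }, "handler-4");

        handler1.start();
        handler4.start();
        readersIn.await();

        // The writers can never acquire the write locks (read locks are held and never
        // released), but their presence at the head of each wait queue is what blocks
        // the second read-lock acquisitions above.
        new Thread(() -> lockA.writeLock().lock(), "writer-A").start();
        new Thread(() -> lockB.writeLock().lock(), "writer-B").start();

        handler1.join(5000);
        System.out.println(handler1.isAlive() ? "deadlocked, as described" : "completed");
        System.exit(0);                       // the demo never unwinds on its own
    }

    static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
{code}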
[jira] [Commented] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed
[ https://issues.apache.org/jira/browse/KAFKA-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873734#comment-16873734 ] songyingshuan commented on KAFKA-8604: -- The exception was thrown from here:
{code:scala}
def delete() {
  val deletedLog = log.delete()
  val deletedIndex = index.delete()
  val deletedTimeIndex = timeIndex.delete()
  val deletedTxnIndex = txnIndex.delete()
  if (!deletedLog && log.file.exists) // from here
    throw new IOException("Delete of log " + log.file.getName + " failed.")
  if (!deletedIndex && index.file.exists)
    throw new IOException("Delete of index " + index.file.getName + " failed.")
  if (!deletedTimeIndex && timeIndex.file.exists)
    throw new IOException("Delete of time index " + timeIndex.file.getName + " failed.")
  if (!deletedTxnIndex && txnIndex.file.exists)
    throw new IOException("Delete of transaction index " + txnIndex.file.getName + " failed.")
}
{code}
This means deletedLog == false && log.file.exists == true, i.e. the delete call on the log file failed while the file was still on disk. But we checked the partition directory and the file does not exist there. Very confusing! > kafka log dir was marked as offline because of deleting segments of > __consumer_offsets failed > - > > Key: KAFKA-8604 > URL: https://issues.apache.org/jira/browse/KAFKA-8604 > Project: Kafka > Issue Type: Bug > Components: log cleaner >Affects Versions: 1.0.1 >Reporter: songyingshuan >Priority: Major > Attachments: error-logs.log > > > We encountered a problem in our product env without any foresight. When kafka > broker trying to clean __consumer_offsets-38 (and only happents to this > partition), the log shows > it failed, and marking the whole disk/log dir offline, and this leads to a > negative impact on some normal partitions (because of the ISR list of those > partitions decrease). > we had to restart the broker server to reuse the disk/dir which was marked as > offline. BUT!! this problem occurs periodically with the same reason so we > have to restart broker periodically. > we read some source code of kafka-1.0.1, but cannot make sure why this > happends. And The cluster status had been good until this problem suddenly > attacked us. > the error log is something like this : > > {code:java} > 2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index > /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted > 2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while > deleting segments for __consumer_offsets-38 in dir /data6/kafka/data > java.io.IOException: Delete of log .log.deleted failed. 
> at kafka.log.LogSegment.delete(LogSegment.scala:496) > at > kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595) > at > kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599) > at > kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception > in scheduled task 'delete-file' > org.apache.kafka.common.errors.KafkaStorageException: Error while deleting > segments for __consumer_offsets-38 in dir /data6/kafka/data > Caused by: java.io.IOException: Delete of log > .log.deleted failed. > at kafka.log.LogSegment.delete(LogSegment.scala:496) > at > kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log
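The confusion in the comment above comes down to what the two calls in that check can actually observe. The following is a standalone illustration of the java.io.File semantics the check relies on; it is not Kafka code and does not explain the root cause. File.delete() returns true only when that particular call removed the file, and File.exists() is a separate, non-atomic check, so the state seen at the moment the IOException is raised can differ from what a later inspection of the directory shows.

{code:java}
import java.io.File;
import java.io.IOException;

public class DeleteCheckDemo {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".log.deleted");

        System.out.println("first delete:  " + f.delete());   // true: this call removed it
        System.out.println("second delete: " + f.delete());   // false: nothing left to remove
        System.out.println("exists:        " + f.exists());   // false

        // LogSegment.delete() applies two separate observations:
        //   if (!deletedLog && log.file.exists) throw new IOException(...)
        // Each call reports only its own instant. A delete that fails transiently
        // (for example an I/O error) while the file is still visible raises the
        // exception, even if the file disappears before anyone inspects the
        // directory afterwards.
        boolean deleted = f.delete();
        if (!deleted && f.exists())
            throw new IOException("Delete of log " + f.getName() + " failed.");
        System.out.println("no exception: delete=" + deleted + ", exists=" + f.exists());
    }
}
{code}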
[jira] [Updated] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed
[ https://issues.apache.org/jira/browse/KAFKA-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] songyingshuan updated KAFKA-8604: - Attachment: error-logs.log > kafka log dir was marked as offline because of deleting segments of > __consumer_offsets failed > - > > Key: KAFKA-8604 > URL: https://issues.apache.org/jira/browse/KAFKA-8604 > Project: Kafka > Issue Type: Bug > Components: log cleaner >Affects Versions: 1.0.1 >Reporter: songyingshuan >Priority: Major > Attachments: error-logs.log > > > We encountered a problem in our product env without any foresight. When kafka > broker trying to clean __consumer_offsets-38 (and only happents to this > partition), the log shows > it failed, and marking the whole disk/log dir offline, and this leads to a > negative impact on some normal partitions (because of the ISR list of those > partitions decrease). > we had to restart the broker server to reuse the disk/dir which was marked as > offline. BUT!! this problem occurs periodically with the same reason so we > have to restart broker periodically. > we read some source code of kafka-1.0.1, but cannot make sure why this > happends. And The cluster status had been good until this problem suddenly > attacked us. > the error log is something like this : > > {code:java} > 2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index > /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted > 2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while > deleting segments for __consumer_offsets-38 in dir /data6/kafka/data > java.io.IOException: Delete of log .log.deleted failed. > at kafka.log.LogSegment.delete(LogSegment.scala:496) > at > kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595) > at > kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599) > at > kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception > in scheduled task 'delete-file' > org.apache.kafka.common.errors.KafkaStorageException: Error while deleting > segments for __consumer_offsets-38 in dir /data6/kafka/data > Caused by: java.io.IOException: Delete of log > .log.deleted failed. 
> at kafka.log.LogSegment.delete(LogSegment.scala:496) > at > kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595) > at > kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599) > at > kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) > at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-06-25 00:11:26,268 INFO kafka.server.ReplicaManager: [ReplicaManager > broker=1001] Stopping serving replicas in dir /data6/kafka/data > {code} > and we also find that the cleaning of __c
[jira] [Created] (KAFKA-8605) Warn users when they have same connector in their plugin-path more than once
Cyrus Vafadari created KAFKA-8605: - Summary: Warn users when they have same connector in their plugin-path more than once Key: KAFKA-8605 URL: https://issues.apache.org/jira/browse/KAFKA-8605 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: Cyrus Vafadari Right now it is very easy to have multiple copies of the same connector in the plugin-path and not realize it. This can be problematic if a user is adding dependencies into the plugin, or accidentally using the wrong version of the connector. An unintrusive improvement would be to log a warning if the same connector appears in the plugin-path more than once -- This message was sent by Atlassian JIRA (v7.6.3#76005)
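The warning being proposed is essentially a grouping pass over whatever the plugin scanner already discovers. A rough sketch of the idea follows; the class and method names are invented for illustration and do not correspond to the existing Connect plugin-loading code.

{code:java}
import java.util.List;
import java.util.Map;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper, not part of Kafka Connect: "locationsByConnector" stands in for
// whatever the plugin scanner can report (connector class name -> plugin-path entries
// it was found under).
public class DuplicatePluginCheck {
    private static final Logger log = LoggerFactory.getLogger(DuplicatePluginCheck.class);

    static void warnOnDuplicates(Map<String, List<String>> locationsByConnector) {
        for (Map.Entry<String, List<String>> e : locationsByConnector.entrySet()) {
            if (e.getValue().size() > 1) {
                // One warning per duplicated connector, listing every location it was found in,
                // so mismatched versions or copied dependencies are visible at startup.
                log.warn("Connector {} was found {} times on the plugin path: {}",
                        e.getKey(), e.getValue().size(), e.getValue());
            }
        }
    }
}
{code}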
[jira] [Commented] (KAFKA-7077) KIP-318: Make Kafka Connect Source idempotent
[ https://issues.apache.org/jira/browse/KAFKA-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873440#comment-16873440 ] Jose A. Iñigo commented on KAFKA-7077: -- Hi [~stephane.maa...@gmail.com], are there any plans to address this in the near future? > KIP-318: Make Kafka Connect Source idempotent > - > > Key: KAFKA-7077 > URL: https://issues.apache.org/jira/browse/KAFKA-7077 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Affects Versions: 2.0.0 >Reporter: Stephane Maarek >Assignee: Stephane Maarek >Priority: Major > > KIP Link: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-318%3A+Make+Kafka+Connect+Source+idempotent -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8597) Give access to the Dead Letter Queue APIs to Kafka Connect Developers
[ https://issues.apache.org/jira/browse/KAFKA-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873387#comment-16873387 ] Randall Hauch commented on KAFKA-8597: -- Although I'll have to think about this more, my initial thought is that we won't want to expose the existing DLQ API (which is currently not public) to the connectors. We may want to consider having a method on the SinkTaskContext that connectors can use to record a failed record, though the underlying guarantees provided by that approach (using a producer) complicate the guarantees provided through the consumer offset management. A significant disadvantage of any extension of the existing API is that any connector that depends on that new API would be constrained in the Kafka Connect versions into which it can be installed. For example, let's imagine that we add this API in AK 2.4.0; any connector that uses this API would only be able to be used in Kafka Connect 2.4.0, and would not work in any earlier versions. > Give access to the Dead Letter Queue APIs to Kafka Connect Developers > - > > Key: KAFKA-8597 > URL: https://issues.apache.org/jira/browse/KAFKA-8597 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Reporter: Andrea Santurbano >Priority: Major > Fix For: 2.4.0 > > > Would be cool to have the chance to have access to the DLQ APIs in order to > enable us (developers) to use that. > For instance, if someone uses JSON as message format with no schema and it's > trying to import some data into a table, and the JSON contains a null value > for a NON-NULL table field, so we want to move that event to the DLQ. > Thanks a lot! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
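To make the trade-off above concrete, one hypothetical shape for such a SinkTaskContext extension is sketched below. This is not an existing Connect API; the report method is invented for illustration, and the other methods are listed only to show where it would sit. (Later Kafka Connect releases did add an errant-record reporter in this general area, but no such public API existed at the time of this discussion.)

{code:java}
import java.util.Map;
import java.util.Set;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical sketch only: a SinkTaskContext-like interface with one added method.
// The existing methods shown are a subset of the real org.apache.kafka.connect.sink.SinkTaskContext.
public interface DeadLetterAwareSinkTaskContext {

    Map<String, String> configs();
    void offset(Map<TopicPartition, Long> offsets);
    Set<TopicPartition> assignment();

    /**
     * Hypothetical: ask the framework to route a record the connector could not process
     * to the configured dead letter queue. As noted above, this write would go through a
     * producer while offsets are committed through the consumer, so the framework would
     * have to define what is guaranteed before the corresponding offset is committed --
     * and any connector calling it could only run on Connect versions that provide it.
     */
    void report(SinkRecord failedRecord, Throwable reason);
}
{code}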
[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users
[ https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873383#comment-16873383 ] Manikumar commented on KAFKA-6945: -- Yes, that PR should have working code, but rebasing it onto the latest trunk will produce a lot of conflicts. > Add support to allow users to acquire delegation tokens for other users > --- > > Key: KAFKA-6945 > URL: https://issues.apache.org/jira/browse/KAFKA-6945 > Project: Kafka > Issue Type: Sub-task >Reporter: Manikumar >Assignee: Viktor Somogyi-Vass >Priority: Major > Labels: needs-kip > > Currently, we only allow a user to create delegation token for that user > only. > We should allow users to acquire delegation tokens for other users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
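For reference, the request in this ticket amounts to letting the authenticated principal name a different owner when creating a token. A sketch of what that could look like through the admin client is below; the owner(...) option is the hypothetical part relative to this ticket (it is what the KIP would add), the broker address and user names are placeholders, and the ACL/permission model it would require is not sketched.

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.CreateDelegationTokenOptions;
import org.apache.kafka.clients.admin.CreateDelegationTokenResult;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.common.security.token.delegation.DelegationToken;

public class TokenForOtherUser {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder address

        try (AdminClient admin = AdminClient.create(props)) {
            CreateDelegationTokenOptions options = new CreateDelegationTokenOptions()
                    .renewers(Collections.singletonList(
                            new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "renewer-user")))
                    // Hypothetical for this ticket: request a token whose owner is another user.
                    .owner(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "other-user"));

            CreateDelegationTokenResult result = admin.createDelegationToken(options);
            DelegationToken token = result.delegationToken().get();
            System.out.println("token id: " + token.tokenInfo().tokenId());
        }
    }
}
{code}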
[jira] [Updated] (KAFKA-8597) Give access to the Dead Letter Queue APIs to Kafka Connect Developers
[ https://issues.apache.org/jira/browse/KAFKA-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-8597: - Fix Version/s: 2.4.0 > Give access to the Dead Letter Queue APIs to Kafka Connect Developers > - > > Key: KAFKA-8597 > URL: https://issues.apache.org/jira/browse/KAFKA-8597 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Reporter: Andrea Santurbano >Priority: Major > Fix For: 2.4.0 > > > Would be cool to have the chance to have access to the DLQ APIs in order to > enable us (developers) to use that. > For instance, if someone uses JSON as message format with no schema and it's > trying to import some data into a table, and the JSON contains a null value > for a NON-NULL table field, so we want to move that event to the DLQ. > Thanks a lot! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users
[ https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873374#comment-16873374 ] Viktor Somogyi-Vass commented on KAFKA-6945: Found it, I probably asked early :) https://github.com/omkreddy/kafka/commits/KAFKA-6945-GET-DELEGATION-TOKEN > Add support to allow users to acquire delegation tokens for other users > --- > > Key: KAFKA-6945 > URL: https://issues.apache.org/jira/browse/KAFKA-6945 > Project: Kafka > Issue Type: Sub-task >Reporter: Manikumar >Assignee: Viktor Somogyi-Vass >Priority: Major > Labels: needs-kip > > Currently, we only allow a user to create delegation token for that user > only. > We should allow users to acquire delegation tokens for other users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users
[ https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873350#comment-16873350 ] Viktor Somogyi-Vass commented on KAFKA-6945: Thanks Manikumar! Do you perhaps have a WIP code for this that can be continued? > Add support to allow users to acquire delegation tokens for other users > --- > > Key: KAFKA-6945 > URL: https://issues.apache.org/jira/browse/KAFKA-6945 > Project: Kafka > Issue Type: Sub-task >Reporter: Manikumar >Assignee: Viktor Somogyi-Vass >Priority: Major > Labels: needs-kip > > Currently, we only allow a user to create delegation token for that user > only. > We should allow users to acquire delegation tokens for other users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed
songyingshuan created KAFKA-8604: Summary: kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed Key: KAFKA-8604 URL: https://issues.apache.org/jira/browse/KAFKA-8604 Project: Kafka Issue Type: Bug Components: log cleaner Affects Versions: 1.0.1 Reporter: songyingshuan We encountered a problem in our production environment without any warning. When the kafka broker tries to clean __consumer_offsets-38 (and this only happens to this partition), the log shows it failed, the whole disk/log dir is marked offline, and this has a negative impact on some normal partitions (because the ISR lists of those partitions shrink). We had to restart the broker to reuse the disk/dir that was marked offline. BUT this problem recurs periodically for the same reason, so we have to restart the broker periodically. We read some of the kafka-1.0.1 source code, but cannot determine why this happens. The cluster status had been good until this problem suddenly hit us. The error log is something like this: {code:java} 2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted 2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data java.io.IOException: Delete of log .log.deleted failed. at kafka.log.LogSegment.delete(LogSegment.scala:496) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) at kafka.log.Log.maybeHandleIOException(Log.scala:1669) at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595) at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599) at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception in scheduled task 'delete-file' org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data Caused by: java.io.IOException: Delete of log .log.deleted failed. 
at kafka.log.LogSegment.delete(LogSegment.scala:496) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596) at kafka.log.Log.maybeHandleIOException(Log.scala:1669) at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595) at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599) at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 2019-06-25 00:11:26,268 INFO kafka.server.ReplicaManager: [ReplicaManager broker=1001] Stopping serving replicas in dir /data6/kafka/data {code} and we also find that the cleaning of __consumer_offsets-38 is so frequently that almost 85% of output log is related. something like this : {code:java} 2019-06-25 20:29:03,222 INFO kafka.log.OffsetIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/008457474982.index.deleted 2019-06-25 20:29:03,222 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/008457474982.timeindex.deleted 2019-06-25 20:29:03,295 INFO kafka.log.Log: Deleting segme
[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition
[ https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873214#comment-16873214 ] Rajini Sivaram commented on KAFKA-7697: --- [~muchl] This Jira addressed the deadlock which is fixed in 2.1.1. There is a separate Jira to reduce lock contention (https://issues.apache.org/jira/browse/KAFKA-7538) which could be the issue you ran into in 2.1.1. > Possible deadlock in kafka.cluster.Partition > > > Key: KAFKA-7697 > URL: https://issues.apache.org/jira/browse/KAFKA-7697 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Gian Merlino >Assignee: Rajini Sivaram >Priority: Blocker > Fix For: 2.2.0, 2.1.1 > > Attachments: 2.1.1-hangs.log, 322.tdump, kafka.log, kafka_jstack.txt, > threaddump.txt > > > After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up > within a few minutes (by "locked up" I mean that all request handler threads > were busy, and other brokers reported that they couldn't communicate with > it). I restarted it a few times and it did the same thing each time. After > downgrading to 0.10.2.0, the broker was stable. I attached a threaddump.txt > from the last attempt on 2.1.0 that shows lots of kafka-request-handler- > threads trying to acquire the leaderIsrUpdateLock lock in > kafka.cluster.Partition. > It jumps out that there are two threads that already have some read lock > (can't tell which one) and are trying to acquire a second one (on two > different read locks: 0x000708184b88 and 0x00070821f188): > kafka-request-handler-1 and kafka-request-handler-4. Both are handling a > produce request, and in the process of doing so, are calling > Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the > same time, both of those locks have writers from other threads waiting on > them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks > appear to have writers that hold them (if only because no threads in the dump > are deep enough in inWriteLock to indicate that). > ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over > readers. Is it possible that kafka-request-handler-1 and > kafka-request-handler-4 are each trying to read-lock the partition that is > currently locked by the other one, and they're both parked waiting for > kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they > never will, because the former two threads own read locks and aren't giving > them up? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users
[ https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viktor Somogyi-Vass reassigned KAFKA-6945: -- Assignee: Viktor Somogyi-Vass (was: Manikumar) > Add support to allow users to acquire delegation tokens for other users > --- > > Key: KAFKA-6945 > URL: https://issues.apache.org/jira/browse/KAFKA-6945 > Project: Kafka > Issue Type: Sub-task >Reporter: Manikumar >Assignee: Viktor Somogyi-Vass >Priority: Major > Labels: needs-kip > > Currently, we only allow a user to create delegation token for that user > only. > We should allow users to acquire delegation tokens for other users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5181) Add a new request into admin client to manually setting the committed offsets
[ https://issues.apache.org/jira/browse/KAFKA-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873039#comment-16873039 ] Khaireddine Rezgui commented on KAFKA-5181: --- Hi [~guozhang], can I take this task? > Add a new request into admin client to manually setting the committed offsets > - > > Key: KAFKA-5181 > URL: https://issues.apache.org/jira/browse/KAFKA-5181 > Project: Kafka > Issue Type: Bug >Reporter: Guozhang Wang >Priority: Major > Labels: needs-kip > > In the old consumer, where offsets are stored in ZK, a common operation is to > manually set the committed offsets for consumers by writing directly to ZK. > In the new consumer we remove the ZK dependency because it is risky to let > any client talk directly to ZK, which is a multi-tenant service. However > this common operation "shortcut" is lost because of that change. > We should add this functionality back to Kafka, and now with the admin client > we could add it there. The client only needs to find the corresponding group > coordinator and send the offset commit request to it. The handling logic on > the broker side should not change. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
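For context on what the requested shortcut looks like from the client side: later Kafka releases added exactly this to the admin client as alterConsumerGroupOffsets (the admin client finds the group coordinator and sends it an OffsetCommit request, as described above). The sketch below assumes a client version that provides that method; topic, group, and broker address are placeholders, and the group generally has to be empty for the commit to be accepted.

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ManuallySetCommittedOffset {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder address

        try (AdminClient admin = AdminClient.create(props)) {
            // Rewind my-group on my-topic partition 0 to offset 42.
            Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                    new TopicPartition("my-topic", 0), new OffsetAndMetadata(42L));

            admin.alterConsumerGroupOffsets("my-group", offsets).all().get();
        }
    }
}
{code}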
[jira] [Resolved] (KAFKA-8596) Kafka topic pre-creation error message needs to be passed to application as an exception
[ https://issues.apache.org/jira/browse/KAFKA-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gurudatt Kulkarni resolved KAFKA-8596. -- Resolution: Duplicate > Kafka topic pre-creation error message needs to be passed to application as > an exception > > > Key: KAFKA-8596 > URL: https://issues.apache.org/jira/browse/KAFKA-8596 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 2.1.1 >Reporter: Ashish Vyas >Priority: Minor > > If I don't have a topic pre-created, I get an error log that reads "is > unknown yet during rebalance, please make sure they have been pre-created > before starting the Streams application." Ideally I expect an exception to be > thrown here that I can catch in my application and decide what I want to do. > > Without this, my app keeps running and the actual functionality doesn't work, > making it time-consuming to debug. I want to stop the application right at > this point. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
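As background on the gap being described: in the affected versions an application can already register hooks on KafkaStreams for stream-thread failures, so stopping on failure looks roughly like the sketch below (application id, broker address, and topics are placeholders). The report's point is that a missing source topic is only logged during rebalance, so it does not surface through such hooks, which is why an explicit exception is being requested.

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class FailFastStreamsApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "fail-fast-app");   // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");                  // placeholder topics

        KafkaStreams streams = new KafkaStreams(builder.build(), props);

        // Fires when a stream thread dies with an exception; it does not fire for
        // conditions that are only logged, which is the gap this ticket describes.
        streams.setUncaughtExceptionHandler((thread, throwable) -> {
            System.err.println("Stream thread " + thread.getName() + " failed: " + throwable);
            System.exit(1);   // fail fast instead of limping along
        });

        streams.start();
    }
}
{code}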