[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition

2019-06-26 Thread muchl (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873850#comment-16873850
 ] 

muchl commented on KAFKA-7697:
--

[~rsivaram] Thank you very much.
[KAFKA-7538 | https://issues.apache.org/jira/browse/KAFKA-7538] seems to be the 
problem I encountered. I will focus on KAFKA-7538.


> Possible deadlock in kafka.cluster.Partition
> 
>
> Key: KAFKA-7697
> URL: https://issues.apache.org/jira/browse/KAFKA-7697
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gian Merlino
>Assignee: Rajini Sivaram
>Priority: Blocker
> Fix For: 2.2.0, 2.1.1
>
> Attachments: 2.1.1-hangs.log, 322.tdump, kafka.log, kafka_jstack.txt, 
> threaddump.txt
>
>
> After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up 
> within a few minutes (by "locked up" I mean that all request handler threads 
> were busy, and other brokers reported that they couldn't communicate with 
> it). I restarted it a few times and it did the same thing each time. After 
> downgrading to 0.10.2.0, the broker was stable. I attached a threaddump.txt 
> from the last attempt on 2.1.0 that shows lots of kafka-request-handler- 
> threads trying to acquire the leaderIsrUpdateLock lock in 
> kafka.cluster.Partition.
> It jumps out that there are two threads that already have some read lock 
> (can't tell which one) and are trying to acquire a second one (on two 
> different read locks: 0x000708184b88 and 0x00070821f188): 
> kafka-request-handler-1 and kafka-request-handler-4. Both are handling a 
> produce request, and in the process of doing so, are calling 
> Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the 
> same time, both of those locks have writers from other threads waiting on 
> them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks 
> appear to have writers that hold them (if only because no threads in the dump 
> are deep enough in inWriteLock to indicate that).
> ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over 
> readers. Is it possible that kafka-request-handler-1 and 
> kafka-request-handler-4 are each trying to read-lock the partition that is 
> currently locked by the other one, and they're both parked waiting for 
> kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they 
> never will, because the former two threads own read locks and aren't giving 
> them up?
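
To make the suspected cycle concrete, here is a minimal standalone Java sketch (not 
Kafka code; the thread names and the two locks only mirror the thread dump for 
illustration) of how two readers and two queued writers on a pair of nonfair 
ReentrantReadWriteLocks can park each other forever:

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Two "request handler" threads each hold a read lock on one partition's lock and then
// try to read-lock the other partition's lock. Because a writer is already queued on each
// lock, the nonfair ReentrantReadWriteLock parks the incoming reader behind that writer,
// and the writer in turn waits for the reader that is already inside -- all four threads
// block forever.
public class RwLockDeadlockSketch {
    static final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock(); // stands in for 0x...184b88
    static final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock(); // stands in for 0x...21f188

    static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        Thread handler1 = new Thread(() -> {
            lockA.readLock().lock();   // holds a read lock on A
            sleep(200);                // let the writers queue up
            lockB.readLock().lock();   // parks behind the writer queued on B
        }, "kafka-request-handler-1");

        Thread handler4 = new Thread(() -> {
            lockB.readLock().lock();   // holds a read lock on B
            sleep(200);
            lockA.readLock().lock();   // parks behind the writer queued on A
        }, "kafka-request-handler-4");

        Thread writerA = new Thread(() -> lockA.writeLock().lock(), "kafka-request-handler-2");
        Thread writerB = new Thread(() -> lockB.writeLock().lock(), "kafka-scheduler-6");

        handler1.start();
        handler4.start();
        sleep(100);        // both readers now hold their first lock
        writerA.start();   // queues a writer on A (blocked by handler1's read lock)
        writerB.start();   // queues a writer on B (blocked by handler4's read lock)
        // At this point all four threads are deadlocked, matching the attached thread dump.
    }
}
{code}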



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed

2019-06-26 Thread songyingshuan (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873734#comment-16873734
 ] 

songyingshuan commented on KAFKA-8604:
--

The exception was thrown from here:
 
{code:scala}
  def delete() {
    val deletedLog = log.delete()
    val deletedIndex = index.delete()
    val deletedTimeIndex = timeIndex.delete()
    val deletedTxnIndex = txnIndex.delete()
    if (!deletedLog && log.file.exists) // from here
      throw new IOException("Delete of log " + log.file.getName + " failed.")
    if (!deletedIndex && index.file.exists)
      throw new IOException("Delete of index " + index.file.getName + " failed.")
    if (!deletedTimeIndex && timeIndex.file.exists)
      throw new IOException("Delete of time index " + timeIndex.file.getName + " failed.")
    if (!deletedTxnIndex && txnIndex.file.exists)
      throw new IOException("Delete of transaction index " + txnIndex.file.getName + " failed.")
  }
{code}

This means deletedLog == false && log.file.exists == true, i.e. deleting the log 
file failed and the file is still on disk. But when we checked the partition 
directory, the file was not there. Very confusing!
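
For what it's worth, here is a small standalone Java sketch (an assumption about the 
failure mode, not Kafka's code) of why the boolean returned by File.delete() plus a 
separate exists() check can paint a confusing picture: delete() returns false both when 
the file is already gone and when deletion genuinely fails, and the follow-up exists() 
check races with anything else that removes the same file, so what you observe on disk 
later need not match what the check saw.

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DeleteRaceSketch {
    public static void main(String[] args) throws IOException {
        File segment = new File("/tmp/000000.log.deleted");  // hypothetical segment file

        boolean deleted = segment.delete();     // false if missing OR if the delete failed
        if (!deleted && segment.exists()) {     // same non-atomic two-step check as LogSegment.delete
            // Anything that removes the file between these two calls changes the outcome.
            throw new IOException("Delete of log " + segment.getName() + " failed.");
        }

        // NIO separates the two cases: deleteIfExists() returns false for a missing file
        // and throws an IOException only when the deletion really failed.
        boolean existed = Files.deleteIfExists(segment.toPath());
        System.out.println("file existed and was deleted: " + existed);
    }
}
{code}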


> kafka log dir was marked as offline because of deleting segments of 
> __consumer_offsets failed
> -
>
> Key: KAFKA-8604
> URL: https://issues.apache.org/jira/browse/KAFKA-8604
> Project: Kafka
>  Issue Type: Bug
>  Components: log cleaner
>Affects Versions: 1.0.1
>Reporter: songyingshuan
>Priority: Major
> Attachments: error-logs.log
>
>
> We hit this problem in our production environment without any warning. When the 
> kafka broker tries to clean __consumer_offsets-38 (it only happens for this 
> partition), the log shows that the deletion failed, the whole disk/log dir is 
> marked offline, and this has a negative impact on some healthy partitions 
> (their ISR lists shrink).
> We had to restart the broker to bring the offline disk/dir back into use, but 
> the problem recurs periodically for the same reason, so we have to restart the 
> broker periodically.
> We read some of the kafka-1.0.1 source code but cannot tell why this happens. 
> The cluster had been healthy until this problem suddenly hit us.
> The error log looks like this:
>  
> {code:java}
> 2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted
> 2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
> java.io.IOException: Delete of log .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
> at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
> at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception in scheduled task 'delete-file'
> org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
> Caused by: java.io.IOException: Delete of log .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log

[jira] [Updated] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed

2019-06-26 Thread songyingshuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

songyingshuan updated KAFKA-8604:
-
Attachment: error-logs.log

> kafka log dir was marked as offline because of deleting segments of 
> __consumer_offsets failed
> -
>
> Key: KAFKA-8604
> URL: https://issues.apache.org/jira/browse/KAFKA-8604
> Project: Kafka
>  Issue Type: Bug
>  Components: log cleaner
>Affects Versions: 1.0.1
>Reporter: songyingshuan
>Priority: Major
> Attachments: error-logs.log
>
>
> We hit this problem in our production environment without any warning. When the 
> kafka broker tries to clean __consumer_offsets-38 (it only happens for this 
> partition), the log shows that the deletion failed, the whole disk/log dir is 
> marked offline, and this has a negative impact on some healthy partitions 
> (their ISR lists shrink).
> We had to restart the broker to bring the offline disk/dir back into use, but 
> the problem recurs periodically for the same reason, so we have to restart the 
> broker periodically.
> We read some of the kafka-1.0.1 source code but cannot tell why this happens. 
> The cluster had been healthy until this problem suddenly hit us.
> The error log looks like this:
>  
> {code:java}
> 2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted
> 2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
> java.io.IOException: Delete of log .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
> at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
> at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception in scheduled task 'delete-file'
> org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
> Caused by: java.io.IOException: Delete of log .log.deleted failed.
> at kafka.log.LogSegment.delete(LogSegment.scala:496)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
> at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
> at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
> at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
> at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2019-06-25 00:11:26,268 INFO kafka.server.ReplicaManager: [ReplicaManager broker=1001] Stopping serving replicas in dir /data6/kafka/data
> {code}
>  and we also find that the cleaning of __c

[jira] [Created] (KAFKA-8605) Warn users when they have same connector in their plugin-path more than once

2019-06-26 Thread Cyrus Vafadari (JIRA)
Cyrus Vafadari created KAFKA-8605:
-

 Summary: Warn users when they have same connector in their 
plugin-path more than once
 Key: KAFKA-8605
 URL: https://issues.apache.org/jira/browse/KAFKA-8605
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Cyrus Vafadari


Right now it is very easy to have multiple copies of the same connector in the 
plugin-path and not realize it.

This can be problematic if a user is adding dependencies into the plugin, or 
accidentally using the wrong version of the connector.

An unobtrusive improvement would be to log a warning if the same connector 
appears in the plugin-path more than once.
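
As a rough illustration of what such a check could look like, the sketch below (plain 
Java, hypothetical paths, not Connect's actual plugin-scanning code) walks a plugin 
path, uses jar file names as a stand-in for discovered plugins, and logs a warning 
whenever the same name shows up under more than one location:

{code:java}
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicatePluginCheck {
    public static void main(String[] args) {
        // Hypothetical plugin.path entries
        List<File> pluginPath = List.of(new File("/opt/connect/plugins"),
                                        new File("/usr/share/java/connect-plugins"));
        Map<String, File> seen = new HashMap<>();
        for (File dir : pluginPath) {
            File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
            if (jars == null) continue;  // directory missing or unreadable
            for (File jar : jars) {
                File previous = seen.putIfAbsent(jar.getName(), dir);
                if (previous != null) {
                    // The proposed improvement: surface the duplicate instead of silently picking one copy.
                    System.err.printf("WARN: plugin '%s' found in both %s and %s%n",
                                      jar.getName(), previous, dir);
                }
            }
        }
    }
}
{code}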



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7077) KIP-318: Make Kafka Connect Source idempotent

2019-06-26 Thread JIRA


[ 
https://issues.apache.org/jira/browse/KAFKA-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873440#comment-16873440
 ] 

Jose A. Iñigo commented on KAFKA-7077:
--

Hi [~stephane.maa...@gmail.com], are there any plans to address this in the 
near future?

> KIP-318: Make Kafka Connect Source idempotent
> -
>
> Key: KAFKA-7077
> URL: https://issues.apache.org/jira/browse/KAFKA-7077
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 2.0.0
>Reporter: Stephane Maarek
>Assignee: Stephane Maarek
>Priority: Major
>
> KIP Link: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-318%3A+Make+Kafka+Connect+Source+idempotent



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8597) Give access to the Dead Letter Queue APIs to Kafka Connect Developers

2019-06-26 Thread Randall Hauch (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873387#comment-16873387
 ] 

Randall Hauch commented on KAFKA-8597:
--

Although I'll have to think about this more, my initial thought is that we 
won't want to expose the existing DLQ API (which is currently not public) to 
the connectors. We may want to consider having a method on the SinkTaskContext 
that connectors can use to record a failed record, though the underlying 
guarantees provided by that approach (using a producer) complicate the 
guarantees provided through the consumer offset management.

A significant disadvantage of any extension of the existing API is that a connector 
that depends on the new API would be constrained in which Kafka Connect versions it 
can be installed into. For example, if we add this API in AK 2.4.0, any connector 
that uses it could only be deployed on Kafka Connect 2.4.0 or later and would not 
work in any earlier version.
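
To make the trade-off concrete, here is a purely hypothetical sketch of what such a 
hook might look like; this method does not exist on SinkTaskContext in any released 
version, and a connector compiled against it would fail with NoSuchMethodError on older 
Connect runtimes, which is exactly the compatibility constraint described above:

{code:java}
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical API sketch only -- not part of the public Kafka Connect API.
public interface FailedRecordReporter {
    /**
     * Ask the framework to route the given record to its configured error handling
     * (e.g. the dead letter queue) instead of failing the whole task.
     */
    void reportFailedRecord(SinkRecord record, Throwable error);
}
{code}

A sink task's put() could then catch a conversion error for a single record and hand it 
to the reporter rather than throwing and failing the task.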

> Give access to the Dead Letter Queue APIs to Kafka Connect Developers
> -
>
> Key: KAFKA-8597
> URL: https://issues.apache.org/jira/browse/KAFKA-8597
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Andrea Santurbano
>Priority: Major
> Fix For: 2.4.0
>
>
> It would be great to have access to the DLQ APIs so that we (connector 
> developers) can use them.
> For instance, if someone uses JSON as the message format with no schema and is 
> trying to import data into a table, and the JSON contains a null value for a 
> NON-NULL table column, we want to move that event to the DLQ.
> Thanks a lot!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users

2019-06-26 Thread Manikumar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873383#comment-16873383
 ] 

Manikumar commented on KAFKA-6945:
--

Yes, that PR should have working code, but rebasing it onto the latest trunk will 
produce a lot of conflicts.

> Add support to allow users to acquire delegation tokens for other users
> ---
>
> Key: KAFKA-6945
> URL: https://issues.apache.org/jira/browse/KAFKA-6945
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Manikumar
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: needs-kip
>
> Currently, a user can only create a delegation token for themselves.
> We should allow users to acquire delegation tokens on behalf of other users.
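
For illustration, here is a sketch of how this might surface on the client side. 
AdminClient.createDelegationToken() and CreateDelegationTokenOptions.renewers() are 
existing APIs; the commented-out owner(...) option is only the proposed addition (it 
does not exist at the time of writing) that would let an authorized user request a 
token on behalf of another principal:

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.CreateDelegationTokenOptions;
import org.apache.kafka.clients.admin.CreateDelegationTokenResult;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.common.security.token.delegation.DelegationToken;

public class TokenForAnotherUser {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");  // hypothetical broker
        try (AdminClient admin = AdminClient.create(props)) {
            CreateDelegationTokenOptions options = new CreateDelegationTokenOptions()
                .renewers(Collections.singletonList(new KafkaPrincipal("User", "bob")));
            // options.owner(new KafkaPrincipal("User", "alice"));  // proposed, not yet available
            CreateDelegationTokenResult result = admin.createDelegationToken(options);
            DelegationToken token = result.delegationToken().get();
            System.out.println("created token " + token.tokenInfo().tokenId());
        }
    }
}
{code}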



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8597) Give access to the Dead Letter Queue APIs to Kafka Connect Developers

2019-06-26 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8597:
-
Fix Version/s: 2.4.0

> Give access to the Dead Letter Queue APIs to Kafka Connect Developers
> -
>
> Key: KAFKA-8597
> URL: https://issues.apache.org/jira/browse/KAFKA-8597
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Andrea Santurbano
>Priority: Major
> Fix For: 2.4.0
>
>
> It would be great to have access to the DLQ APIs so that we (connector 
> developers) can use them.
> For instance, if someone uses JSON as the message format with no schema and is 
> trying to import data into a table, and the JSON contains a null value for a 
> NON-NULL table column, we want to move that event to the DLQ.
> Thanks a lot!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users

2019-06-26 Thread Viktor Somogyi-Vass (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873374#comment-16873374
 ] 

Viktor Somogyi-Vass commented on KAFKA-6945:


Found it, I probably asked too early :)
https://github.com/omkreddy/kafka/commits/KAFKA-6945-GET-DELEGATION-TOKEN

> Add support to allow users to acquire delegation tokens for other users
> ---
>
> Key: KAFKA-6945
> URL: https://issues.apache.org/jira/browse/KAFKA-6945
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Manikumar
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: needs-kip
>
> Currently, a user can only create a delegation token for themselves.
> We should allow users to acquire delegation tokens on behalf of other users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users

2019-06-26 Thread Viktor Somogyi-Vass (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873350#comment-16873350
 ] 

Viktor Somogyi-Vass commented on KAFKA-6945:


Thanks Manikumar! Do you perhaps have WIP code for this that could be continued?

> Add support to allow users to acquire delegation tokens for other users
> ---
>
> Key: KAFKA-6945
> URL: https://issues.apache.org/jira/browse/KAFKA-6945
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Manikumar
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: needs-kip
>
> Currently, a user can only create a delegation token for themselves.
> We should allow users to acquire delegation tokens on behalf of other users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8604) kafka log dir was marked as offline because of deleting segments of __consumer_offsets failed

2019-06-26 Thread songyingshuan (JIRA)
songyingshuan created KAFKA-8604:


 Summary: kafka log dir was marked as offline because of deleting 
segments of __consumer_offsets failed
 Key: KAFKA-8604
 URL: https://issues.apache.org/jira/browse/KAFKA-8604
 Project: Kafka
  Issue Type: Bug
  Components: log cleaner
Affects Versions: 1.0.1
Reporter: songyingshuan


We hit this problem in our production environment without any warning. When the kafka 
broker tries to clean __consumer_offsets-38 (it only happens for this partition), the 
log shows that the deletion failed, the whole disk/log dir is marked offline, and this 
has a negative impact on some healthy partitions (their ISR lists shrink).
We had to restart the broker to bring the offline disk/dir back into use, but the 
problem recurs periodically for the same reason, so we have to restart the broker 
periodically.

We read some of the kafka-1.0.1 source code but cannot tell why this happens. The 
cluster had been healthy until this problem suddenly hit us.

The error log looks like this:

 
{code:java}
2019-06-25 00:11:26,241 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/012855596978.timeindex.deleted
2019-06-25 00:11:26,258 ERROR kafka.server.LogDirFailureChannel: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
java.io.IOException: Delete of log .log.deleted failed.
at kafka.log.LogSegment.delete(LogSegment.scala:496)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-06-25 00:11:26,265 ERROR kafka.utils.KafkaScheduler: Uncaught exception in scheduled task 'delete-file'
org.apache.kafka.common.errors.KafkaStorageException: Error while deleting segments for __consumer_offsets-38 in dir /data6/kafka/data
Caused by: java.io.IOException: Delete of log .log.deleted failed.
at kafka.log.LogSegment.delete(LogSegment.scala:496)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply$mcV$sp(Log.scala:1596)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
at kafka.log.Log$$anonfun$kafka$log$Log$$deleteSeg$1$1.apply(Log.scala:1596)
at kafka.log.Log.maybeHandleIOException(Log.scala:1669)
at kafka.log.Log.kafka$log$Log$$deleteSeg$1(Log.scala:1595)
at kafka.log.Log$$anonfun$kafka$log$Log$$asyncDeleteSegment$1.apply$mcV$sp(Log.scala:1599)
at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-06-25 00:11:26,268 INFO kafka.server.ReplicaManager: [ReplicaManager broker=1001] Stopping serving replicas in dir /data6/kafka/data
{code}
We also find that the cleaning of __consumer_offsets-38 happens so frequently that 
almost 85% of the output log relates to it, something like this:

 
{code:java}
2019-06-25 20:29:03,222 INFO kafka.log.OffsetIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/008457474982.index.deleted
2019-06-25 20:29:03,222 INFO kafka.log.TimeIndex: Deleting index /data6/kafka/data/__consumer_offsets-38/008457474982.timeindex.deleted
2019-06-25 20:29:03,295 INFO kafka.log.Log: Deleting segme

[jira] [Commented] (KAFKA-7697) Possible deadlock in kafka.cluster.Partition

2019-06-26 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873214#comment-16873214
 ] 

Rajini Sivaram commented on KAFKA-7697:
---

[~muchl] This Jira addressed the deadlock, which is fixed in 2.1.1. There is a 
separate Jira to reduce lock contention 
(https://issues.apache.org/jira/browse/KAFKA-7538), which could be the issue you 
ran into in 2.1.1.

> Possible deadlock in kafka.cluster.Partition
> 
>
> Key: KAFKA-7697
> URL: https://issues.apache.org/jira/browse/KAFKA-7697
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gian Merlino
>Assignee: Rajini Sivaram
>Priority: Blocker
> Fix For: 2.2.0, 2.1.1
>
> Attachments: 2.1.1-hangs.log, 322.tdump, kafka.log, kafka_jstack.txt, 
> threaddump.txt
>
>
> After upgrading a fairly busy broker from 0.10.2.0 to 2.1.0, it locked up 
> within a few minutes (by "locked up" I mean that all request handler threads 
> were busy, and other brokers reported that they couldn't communicate with 
> it). I restarted it a few times and it did the same thing each time. After 
> downgrading to 0.10.2.0, the broker was stable. I attached a threaddump.txt 
> from the last attempt on 2.1.0 that shows lots of kafka-request-handler- 
> threads trying to acquire the leaderIsrUpdateLock lock in 
> kafka.cluster.Partition.
> It jumps out that there are two threads that already have some read lock 
> (can't tell which one) and are trying to acquire a second one (on two 
> different read locks: 0x000708184b88 and 0x00070821f188): 
> kafka-request-handler-1 and kafka-request-handler-4. Both are handling a 
> produce request, and in the process of doing so, are calling 
> Partition.fetchOffsetSnapshot while trying to complete a DelayedFetch. At the 
> same time, both of those locks have writers from other threads waiting on 
> them (kafka-request-handler-2 and kafka-scheduler-6). Neither of those locks 
> appear to have writers that hold them (if only because no threads in the dump 
> are deep enough in inWriteLock to indicate that).
> ReentrantReadWriteLock in nonfair mode prioritizes waiting writers over 
> readers. Is it possible that kafka-request-handler-1 and 
> kafka-request-handler-4 are each trying to read-lock the partition that is 
> currently locked by the other one, and they're both parked waiting for 
> kafka-request-handler-2 and kafka-scheduler-6 to get write locks, which they 
> never will, because the former two threads own read locks and aren't giving 
> them up?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-6945) Add support to allow users to acquire delegation tokens for other users

2019-06-26 Thread Viktor Somogyi-Vass (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass reassigned KAFKA-6945:
--

Assignee: Viktor Somogyi-Vass  (was: Manikumar)

> Add support to allow users to acquire delegation tokens for other users
> ---
>
> Key: KAFKA-6945
> URL: https://issues.apache.org/jira/browse/KAFKA-6945
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Manikumar
>Assignee: Viktor Somogyi-Vass
>Priority: Major
>  Labels: needs-kip
>
> Currently, we only allow a user to create delegation token for that user 
> only. 
> We should allow users to acquire delegation tokens for other users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-5181) Add a new request into admin client to manually setting the committed offsets

2019-06-26 Thread Khaireddine Rezgui (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873039#comment-16873039
 ] 

Khaireddine Rezgui commented on KAFKA-5181:
---

Hi [~guozhang], can I take this task?

> Add a new request into admin client to manually setting the committed offsets
> -
>
> Key: KAFKA-5181
> URL: https://issues.apache.org/jira/browse/KAFKA-5181
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: needs-kip
>
> In the old consumer, where offsets are stored in ZK, a common operation is to 
> manually set the committed offsets for consumers by writing directly to ZK.
> In the new consumer we removed the ZK dependency because it is risky to let any 
> client talk directly to ZK, which is a multi-tenant service. However, this common 
> operational "shortcut" was lost with that change.
> We should add this functionality back to Kafka, and now with the admin client we 
> could add it there. The client only needs to find the corresponding group 
> coordinator and send the offset commit request to it. The handling logic on 
> the broker side should not change.
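
In the meantime, the usual workaround exercises the same mechanism the proposed admin 
request would wrap: a throwaway consumer configured with the target group.id, using 
assign() so it never joins a rebalance, simply sends an OffsetCommit to the group 
coordinator. A sketch with hypothetical topic/group names (the group must have no 
active members, otherwise the coordinator rejects the commit):

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ManuallySetCommittedOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // hypothetical broker
        props.put("group.id", "my-group");                // group whose offset we rewrite
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singleton(tp));   // no group membership / rebalance
            Map<TopicPartition, OffsetAndMetadata> offsets =
                Collections.singletonMap(tp, new OffsetAndMetadata(42L));
            consumer.commitSync(offsets);                 // commit offset 42 for my-group
        }
    }
}
{code}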



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8596) Kafka topic pre-creation error message needs to be passed to application as an exception

2019-06-26 Thread Gurudatt Kulkarni (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gurudatt Kulkarni resolved KAFKA-8596.
--
Resolution: Duplicate

> Kafka topic pre-creation error message needs to be passed to application as 
> an exception
> 
>
> Key: KAFKA-8596
> URL: https://issues.apache.org/jira/browse/KAFKA-8596
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.1.1
>Reporter: Ashish Vyas
>Priority: Minor
>
> If I don't have a topic pre-created, I get an error log that reads "is unknown yet 
> during rebalance, please make sure they have been pre-created before starting the 
> Streams application." Ideally I would expect an exception to be thrown here that I 
> can catch in my application and then decide what to do.
> Without this, my app keeps running while the actual functionality doesn't work, 
> which makes it time-consuming to debug. I want to stop the application right at 
> this point.
>  
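
Until something like that exists, one way to fail fast (a sketch under the assumption 
that the source topic names are known up front; topic and broker names below are 
hypothetical) is to verify the topics with the AdminClient before starting the Streams 
application and abort otherwise:

{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;

public class PreflightTopicCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        Set<String> required = new HashSet<>(Arrays.asList("input-topic", "other-topic"));

        try (AdminClient admin = AdminClient.create(props)) {
            required.removeAll(admin.listTopics().names().get());
        }
        if (!required.isEmpty()) {
            // Stop right here instead of letting the Streams app spin without doing any work.
            throw new IllegalStateException("Missing source topics: " + required);
        }
        // ...otherwise build the Topology and start KafkaStreams as usual.
    }
}
{code}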



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)