[jira] (KAFKA-4094) Fix importance labels for Kafka Server config

2024-06-03 Thread Abhi (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-4094 ]


Abhi deleted comment on KAFKA-4094:
-

was (Author: JIRAUSER305362):
Hi [~jkreps]
Could you describe this issue in more detail and also mention which component it belongs to?

As I understand it, you want the server configs categorized like this:

HIGH - broker.id, log.dirs, num.partitions

MEDIUM - log.segment.bytes, log.cleanup.policy

LOW - message.max.bytes, replica.fetch.max.bytes
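For background on how these labels are wired in: each broker config is declared in Kafka's `ConfigDef` with an importance of HIGH, MEDIUM, or LOW, and the config docs are generated from those declarations. A minimal Python sketch of filtering configs by label (the level assignments below are assumptions for illustration, not Kafka's actual values):

```python
# Illustrative importance registry; the names mirror real broker configs,
# but the level assignments here are assumptions for this sketch.
CONFIG_IMPORTANCE = {
    "broker.id": "HIGH",
    "log.dirs": "HIGH",
    "num.partitions": "HIGH",
    "log.segment.bytes": "MEDIUM",
    "log.cleanup.policy": "MEDIUM",
    "message.max.bytes": "LOW",
    "replica.fetch.max.bytes": "LOW",
}

def configs_at(level):
    """Return the config names labeled with the given importance level."""
    return sorted(k for k, v in CONFIG_IMPORTANCE.items() if v == level)

# The configs a new operator must think about first:
print(configs_at("HIGH"))  # ['broker.id', 'log.dirs', 'num.partitions']
```

The point of the ticket is that a filter like `configs_at("HIGH")` is only useful if the labels are assigned sparingly.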

> Fix importance labels for Kafka Server config
> -
>
> Key: KAFKA-4094
> URL: https://issues.apache.org/jira/browse/KAFKA-4094
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jay Kreps
>Priority: Major
>  Labels: newbie
>
> We have > 100 server configs. The importance label is meant to help people 
> navigate this in a sane way. The intention is something like the following:
> HIGH - things you must think about and set
> MEDIUM - things you don't necessarily need to set but that you might want to 
> tune
> LOW - things you probably don't need to set
> Currently we have a gazillion things marked as high including very subtle 
> tuning params and also things marked as deprecated (which probably should be 
> its own importance level). This makes it really hard for people to figure out 
> which configurations to actually learn about and use.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-4094) Fix importance labels for Kafka Server config

2024-06-03 Thread Abhi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851571#comment-17851571
 ] 

Abhi edited comment on KAFKA-4094 at 6/3/24 10:27 AM:
--

Hi [~jkreps]
Could you describe this issue in more detail and also mention which component it belongs to?

As I understand it, you want the server configs categorized like this:

HIGH - broker.id, log.dirs, num.partitions

MEDIUM - log.segment.bytes, log.cleanup.policy

LOW - message.max.bytes, replica.fetch.max.bytes


was (Author: JIRAUSER305362):
Hi [~jkreps]
Could you describe this issue in more detail and also mention which component it belongs to?

As I understand it, you want the server configs categorized like this:

HIGH - broker.id, log.dirs, num.partitions

MEDIUM - log.segment.bytes, log.cleanup.policy

LOW - message.max.bytes, replica.fetch.max.bytes



[jira] [Commented] (KAFKA-4094) Fix importance labels for Kafka Server config

2024-06-03 Thread Abhi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851571#comment-17851571
 ] 

Abhi commented on KAFKA-4094:
-

Hi [~jkreps]
Could you describe this issue in more detail and also mention which component it belongs to?

As I understand it, you want the server configs categorized like this:

HIGH - broker.id, log.dirs, num.partitions

MEDIUM - log.segment.bytes, log.cleanup.policy

LOW - message.max.bytes, replica.fetch.max.bytes



[jira] [Commented] (KAFKA-8046) Shutdown broker because all log dirs in /tmp/kafka-logs have failed

2020-07-11 Thread Abhi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155988#comment-17155988
 ] 

Abhi commented on KAFKA-8046:
-

I also saw the same exception in kafka_2.12-2.3.0.

[2020-07-11 02:50:03,621] ERROR Error while reading checkpoint file /local/kafka/data/replication-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.nio.charset.MalformedInputException: Input length = 1
    at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
    at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.base/java.io.InputStreamReader.read(InputStreamReader.java:185)
    at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
    at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
    at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:90)
    at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
    at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
    at kafka.cluster.Partition.$anonfun$getOrCreateReplica$1(Partition.scala:204)
    at kafka.utils.Pool$$anon$1.apply(Pool.scala:61)
    at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705)
    at kafka.utils.Pool.getAndMaybePut(Pool.scala:60)
    at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:198)
    at kafka.cluster.Partition.$anonfun$makeLeader$3(Partition.scala:376)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at scala.collection.TraversableLike.map(TraversableLike.scala:237)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
    at scala.collection.AbstractTraversable.map(Traversable.scala:108)
    at kafka.cluster.Partition.$anonfun$makeLeader$1(Partition.scala:376)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
    at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:261)
    at kafka.cluster.Partition.makeLeader(Partition.scala:370)
    at kafka.server.ReplicaManager.$anonfun$makeLeaders$5(ReplicaManager.scala:1188)
    at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
    at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
    at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
    at kafka.server.ReplicaManager.makeLeaders(ReplicaManager.scala:1186)
    at kafka.server.ReplicaManager.becomeLeaderOrFollower(ReplicaManager.scala:1098)
    at kafka.server.KafkaApis.handleLeaderAndIsrRequest(KafkaApis.scala:198)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:115)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
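The `MalformedInputException` above means the checkpoint file was not valid UTF-8 when the broker tried to read it back, typically because the file was truncated or corrupted on disk. The failure mode can be mimicked outside the JVM with a short Python sketch (an analogy to Java's strict `CharsetDecoder` behavior, not Kafka code):

```python
# A checkpoint-like payload whose trailing UTF-8 multi-byte sequence is
# cut short, as a truncated file might contain. Strict decoding fails,
# mirroring Java's MalformedInputException.
corrupt = "0 0\n".encode("utf-8") + b"\xe2\x82"  # truncated 3-byte sequence

try:
    corrupt.decode("utf-8")
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    print("decode failed:", exc.reason)

assert raised
```

This is why a bad byte in an otherwise tiny checkpoint file can mark a whole log directory as failed: the reader has no lenient fallback path.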


> Shutdown broker because all log dirs in /tmp/kafka-logs have failed
> ---
>
> Key: KAFKA-8046
> URL: https://issues.apache.org/jira/browse/KAFKA-8046
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: centos 7
>Reporter: jaren
>Priority: Major
>
> Kafka stops working every few days. Here are some of the logs.
> ERROR Error while reading checkpoint file /tmp/kafka-logs/cleaner-offset-checkpoint (kafka.server.LogDirFailureChannel)
> java.io.FileNotFoundException: /tmp/kafka-logs/cleaner-offset-checkpoint (No such file or directory)
>  at java.io.FileInputStream.open0(Native Method)
>  at java.io.FileInputStream.open(FileInputStream.java:195)
>  at java.io.FileInputStream.<init>(FileInputStream.java:138)
>  at kafka.server.checkpoints.CheckpointFile.liftedTree2$1(CheckpointFile.scala:87)
>  at kafka.server.checkpoints.CheckpointFile.read(CheckpointFile.scala:86)
>  at kafka.server.checkpoints.OffsetCheckpointFile.read(OffsetCheckpointFile.scala:61)
>  at kafka.log.LogCleanerManager$$anonfun$allCleanerCheckpoints$1$$anonfun$apply$1.apply(LogCleanerManager.scala:89)
>  at kafka.log.LogCleanerManager$$anonfun$allCleanerCheckpoints$1$$an
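A recurring pattern behind this ticket's `FileNotFoundException` is keeping `log.dirs` under `/tmp`, where OS tmp-cleaning (for example systemd-tmpfiles on CentOS 7) can delete checkpoint files out from under a running broker. A hedged remedy sketch for `server.properties` (the path below is only an example; choose a persistent disk on your host):

```properties
# server.properties: keep Kafka data off /tmp so tmp cleaners
# cannot remove log segments or checkpoint files while the broker runs.
log.dirs=/var/lib/kafka/data
```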

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-05-31 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16852787#comment-16852787
 ] 

Abhi commented on KAFKA-7925:
-

Hi [~rsivaram] [~enothereska],

Is there any update on this? When will this patch be available in Kafka release?

Thanks,
Abhi

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.2.0, 2.1.1
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1, Kafka v2.2.0
>Reporter: Abhi
>Priority: Critical
> Attachments: jira-server.log-1, jira-server.log-2, jira-server.log-3, 
> jira-server.log-4, jira-server.log-5, jira-server.log-6, 
> jira_prod.producer.log, threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup that is not 
> yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod 2048u sock 0,7 0t0 30012546 protocol
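When diagnosing a listing like the `lsof` output above, it helps to tally the TCP connection states rather than scan line by line. A small Python sketch (it assumes the state appears in trailing parentheses, as in standard `lsof` output; the sample lines are abbreviated):

```python
import re
from collections import Counter

def tally_states(lsof_lines):
    """Count TCP states such as ESTABLISHED or CLOSE_WAIT from lsof output
    lines; lines without a trailing (STATE) field are skipped."""
    states = Counter()
    for line in lsof_lines:
        m = re.search(r"\((\w+)\)\s*$", line)
        if m:
            states[m.group(1)] += 1
    return states

sample = [
    "java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)",
    "java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP a:36866->b:9092 (FIN_WAIT2)",
    "java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP a:36882->b:9092 (ESTABLISHED)",
]
print(dict(tally_states(sample)))  # {'LISTEN': 1, 'FIN_WAIT2': 1, 'ESTABLISHED': 1}
```

A large and growing CLOSE_WAIT count, as reported in this ticket, indicates the peer closed but the local process never called close() on its side.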

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-04-02 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807858#comment-16807858
 ] 

Abhi commented on KAFKA-7925:
-

[~enothereska] I added v2.2 as well in the description. Do you have any updates 
on when the next version is planned and whether this fix will be available in it?


[jira] [Updated] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-04-02 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7925:

Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1, Kafka v2.2.0  (was: Java 
11, Kafka v2.1.0, Kafka v2.1.1)


[jira] [Updated] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-04-02 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7925:

Affects Version/s: 2.2.0


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-22 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798931#comment-16798931
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram] Which release can I expect the fix in, and what are the expected 
dates for it? I am asking since this is a blocker for our deployment; until 
then we will probably have to continue using our locally patched Kafka version.

> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02
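The hot-thread hunt described in the quoted report can be sketched as follows. `top -H` reports per-thread CPU usage with decimal thread ids, while a thread dump identifies each thread by a hex `nid`, so a quick conversion links the two; the TID value below is illustrative, not taken from the attached dump.

```shell
# Sketch (illustrative TID, not from the attached threadump20190212.txt):
# `top -H -p <broker pid>` lists per-thread CPU usage with decimal thread ids;
# the jstack / kill -3 thread dump identifies threads by a hex `nid`.
# Converting the hot TID to hex lets you grep the dump for the matching
# kafka-network-thread stack.
tid=144321                      # decimal TID as shown by `top -H`
printf 'nid=0x%x\n' "$tid"      # hex nid to search for in the thread dump
```

With the hex nid in hand, `grep nid=0x233c1 threadump20190212.txt` (or the equivalent for your dump file) shows what that thread was doing.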

[jira] [Commented] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-03-22 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798929#comment-16798929
 ] 

Abhi commented on KAFKA-7982:
-

Hi,

Any updates on this? 

Thanks!

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and due to this 
> the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
>  at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
>  at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
>  at kafka.network.Processor.poll(SocketServer.scala:689)
>  at kafka.network.Processor.run(SocketServer.scala:594)
>  at java.base/java.lang.Thread.run(Thread.java:834)
>  [2019-

[jira] [Created] (KAFKA-8146) WARNING: An illegal reflective access operation has occurred

2019-03-22 Thread Abhi (JIRA)
Abhi created KAFKA-8146:
---

 Summary: WARNING: An illegal reflective access operation has 
occurred
 Key: KAFKA-8146
 URL: https://issues.apache.org/jira/browse/KAFKA-8146
 Project: Kafka
  Issue Type: Bug
  Components: clients, core
Affects Versions: 2.1.1
 Environment: Java 11
Kafka v2.1.1
Reporter: Abhi


Hi,
I am running Kafka v2.1.1 and see the warnings below at startup of the server 
and clients. What is the cause of these warnings, and how can they be avoided 
or fixed?


*Client side:*
WARNING: Illegal reflective access by 
org.apache.kafka.common.network.SaslChannelBuilder 
(file:/local/kafka/kafka_installation/kafka_2.12-2.1.1/libs/kafka-clients-2.1.1.jar)
 to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of 
org.apache.kafka.common.network.SaslChannelBuilder
WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
WARNING: All illegal access operations will be denied in a future release


*Server side:*
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by 
org.apache.zookeeper.server.util.KerberosUtil 
(file:/local/kafka/kafka_installation/kafka_2.12-2.1.1/libs/zookeeper-3.4.13.jar)
 to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of 
org.apache.zookeeper.server.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
WARNING: All illegal access operations will be denied in a future release
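For the Kerberos-related warning above, one common workaround on Java 9+ is to open the offending JDK package explicitly. The flag below is a sketch under the assumption that the reflective access is to `sun.security.krb5` in module `java.security.jgss`, as the warning text states; verify it against your JVM and startup scripts before relying on it.

```shell
# Sketch (assumption: the reflective access is to sun.security.krb5 in module
# java.security.jgss, per the warning text). Opening that package to unnamed
# modules suppresses the warning; KAFKA_OPTS is picked up by the standard
# Kafka start scripts before the broker or client JVM is launched.
KAFKA_OPTS="--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
export KAFKA_OPTS
echo "$KAFKA_OPTS"
```

This silences the warning rather than removing the reflective access itself; the underlying access only goes away once the library stops using the internal API.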





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-20 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796961#comment-16796961
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram] 
Any updates on whether this will be available in the Kafka v2.2.0 release?


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-11 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789251#comment-16789251
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram] [~ijuma] This issue is a blocker for our deployment. Can this be 
included in the next v2.2.0 release?


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-08 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787862#comment-16787862
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram]
I am not seeing the org.apache.kafka.common.errors.UnknownServerException with 
the producer anymore, and haven't noticed the 100% CPU usage so far with the 
patch. Can we please get this patch included in the v2.2.0 release? And when is 
the v2.2.0 release scheduled?


[jira] [Created] (KAFKA-8071) Specify default partitions and replication factor for regex based topics in kafka

2019-03-08 Thread Abhi (JIRA)
Abhi created KAFKA-8071:
---

 Summary: Specify default partitions and replication factor for 
regex based topics in kafka
 Key: KAFKA-8071
 URL: https://issues.apache.org/jira/browse/KAFKA-8071
 Project: Kafka
  Issue Type: New Feature
  Components: controller
Affects Versions: 2.1.1
Reporter: Abhi


Is it possible to specify a different default partition count and replication 
factor for topics matching foo.*? If not, what is the best way to achieve this 
from the producer's point of view?

I know of the KafkaAdmin utils, but topic creation would then happen on the 
producer, and for security reasons I don't want to give the user running the 
producer admin permissions on the metadata stored in ZooKeeper.
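In the absence of broker-side per-pattern defaults, one hedged approach is a small wrapper on an operator host that maps topic names to creation settings and feeds them to `kafka-topics.sh`, so the producer user never needs creation rights. All names and numbers below are illustrative assumptions, not part of the original request.

```shell
# Hypothetical helper (names and numbers are illustrative): pick per-pattern
# creation defaults so topics matching foo.* get different settings from the
# cluster default. The output could feed kafka-topics.sh --create
# --partitions / --replication-factor, run by an operator account rather
# than the producer user.
topic_defaults() {
  case "$1" in
    foo.*) echo "partitions=12 replication=3" ;;   # foo.* topics
    *)     echo "partitions=1 replication=2"  ;;   # everything else
  esac
}
topic_defaults "foo.orders"   # pattern match
topic_defaults "barbaz"       # fallback
```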





[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-06 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785656#comment-16785656
 ] 

Abhi commented on KAFKA-7925:
-

>> What version of producers/consumers are you using?
I am using this client jar for the producer and consumer: 
kafka_2.12-2.1.1/libs/kafka-clients-2.1.1.jar

>>The simplest way to figure out what the error was would be turn on request 
>>logging on the broker. And then you can see the request and responses in 
>>detail. In config/log4j.properties, change this line to TRACE instead of 
>>WARN, and you should find the trace in kafka-request.log in the log directory.
I will try this and update here soon.
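The suggested log4j change can be sketched as below. The exact property line is an assumption based on the stock `config/log4j.properties` shipped with Kafka and should be checked against your file; the snippet builds a throwaway copy just to show the edit.

```shell
# Sketch (assumption: the stock config contains a WARN line like the one
# created here). Flipping kafka.request.logger to TRACE makes full requests
# and responses land in kafka-request.log; a broker restart is needed for
# the change to take effect.
cfg=$(mktemp)
echo 'log4j.logger.kafka.request.logger=WARN, requestAppender' > "$cfg"
sed 's/kafka.request.logger=WARN/kafka.request.logger=TRACE/' "$cfg"
rm -f "$cfg"
```

Remember to revert to WARN afterwards; TRACE request logging is verbose enough to fill disks quickly on a busy broker.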



[jira] [Updated] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7925:

Attachment: jira-server.log-6
jira-server.log-5
jira-server.log-4
jira-server.log-3
jira-server.log-2
jira-server.log-1
jira_prod.producer.log

> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod
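The report above attributes the CPU usage to specific kafka-network-thread-* threads. A minimal sketch of how that attribution can be reproduced on Linux, assuming the attached threadump20190212.txt is a standard jstack-style dump (the PID 144319 is taken from the lsof listing above; the TID in step 2 is illustrative): `top -H` reports decimal thread IDs, while a thread dump records them as hex "nid=0x..." values, so the ID must be converted before searching the dump.

```shell
# Sketch: map a hot thread from `top -H` to its entry in a jstack dump.
# Assumes Linux; the PID/TID values are illustrative, taken from the report.

pid=144319            # broker JVM pid (from the lsof listing above)

# 1. List the JVM's busiest threads; the PID column holds decimal TIDs:
#      top -H -b -n 1 -p "$pid" | head -n 20

# 2. A thread dump prints thread IDs as hex ("nid=0x..."), so convert:
tid=144319            # substitute a hot TID from step 1
nid=$(printf 'nid=0x%x' "$tid")
echo "$nid"           # prints nid=0x233bf for TID 144319

# 3. Locate that thread (and its stack) in the attached dump:
#      grep -A 15 -i "$nid" threadump20190212.txt
```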

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784669#comment-16784669
 ] 

Abhi commented on KAFKA-7925:
-

I tried the test again. I started the producer application to publish messages 
on 40 topics sequentially using the same producer, and got the 
org.apache.kafka.common.errors.UnknownServerException again. This time all the 
servers were running properly without any exceptions.
I have uploaded the producer log and all brokers' server.log files (at debug 
level) to the issue.

Getting the same exception when running the consumer group command:
env KAFKA_LOG4J_CONFIG=/u/choudhab/kafka/log4j.properties 
KAFKA_OPTS="-Dsun.security.jgss.native=true 
-Dsun.security.jgss.lib=/usr/libexec/libgsswrap.so 
-Djavax.security.auth.useSubjectCredsOnly=false 
-Djava.security.auth.login.config=/u/bansalp/kafka/producer_jaas.conf" 
/proj/tools/infra/apache/kafka/kafka_2.12-2.1.1/bin/kafka-consumer-groups.sh 
--bootstrap-server mwkafka-prod-01.nyc:9092 --command-config 
/u/choudhab/kafka/command_config  --list
Error: Executing consumer group command failed due to 
org.apache.kafka.common.errors.UnknownServerException: Error listing groups on 
mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.UnknownServerException: Error listing groups on 
mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
at 
kafka.admin.ConsumerGroupCommand$ConsumerGroupService.listGroups(ConsumerGroupCommand.scala:132)
at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:58)
at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
Caused by: org.apache.kafka.common.errors.UnknownServerException: Error listing 
groups on mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)
[2019-03-05 12:13:15,120] WARN [Principal=null]: TGT renewal thread has been 
interrupted and will exit. 
(org.apache.kafka.common.security.kerberos.KerberosLogin)



[jira] [Comment Edited] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784456#comment-16784456
 ] 

Abhi edited comment on KAFKA-7925 at 3/5/19 1:54 PM:
-

[~rsivaram]
Did you run a clean build using the instructions Manikumar posted above? We 
want to make sure that all the jars you are running with came from that build 
to avoid NoClassDefFoundError. The client exceptions could be related to that.
>> All brokers were using JARs from [~omkreddy]'s build (using the 
>> instructions). There was no NoClassDefFoundError at the startup. Note that 
>> only one broker saw these exceptions and they all use same configuration and 
>> jars

>>Were all the brokers running with the build including the PR for a day? And 
>>during this time, were the clients always failing? Are the clients also 
>>running with the build including the PR?
Yes, the brokers ran fine with the PR build. The client doesn't always fail. When 
I run the producer with 1 or 2 topics for some time, it works as expected, but 
when I go to 40 topics, it fails after sending messages to the first 15 topics 
(this happens sequentially).

The NoClassDefFoundError exceptions went away with a restart of that 
particular broker.

I will give this another go just to make sure no other issue is affecting the 
observations.



was (Author: xabhi):
[~rsivaram]
Did you run a clean build using the instructions Manikumar posted above? We 
want to make sure that all the jars you are running with came from that build 
to avoid NoClassDefFoundError. The client exceptions could be related to that.
>> All brokers were using JARs from [~omkreddy]'s build (using the 
>> instructions). There was no NoClassDefFoundError at the startup. Note that 
>> only one broker saw these exceptions and they all use same configuration and 
>> jars

>>Were all the brokers running with the build including the PR for a day? And 
>>during this time, were the clients always failing? Are the clients also 
>>running with the build including the PR?
Yes the brokers ran fine with PR build. The client doesn't always fail. When I 
run producer with 1 or 2 topics for some time, it works as expected but when i 
go to 40 topics, it fails after sending messages to first 15 topics (this 
happens sequentially).

I can give this another go if you want.




[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784403#comment-16784403
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram]
While looking at the broker logs, I found multiple java.lang.NoClassDefFoundError 
exceptions in one of the server logs. I am not sure if this is related to this 
issue or a totally separate one. These exceptions started appearing after the 
server had been running fine for a day, and I don't see such exceptions on the 
other servers.

[2019-03-03 03:25:33,167] ERROR [ReplicaFetcher replicaId=4, leaderId=5, 
fetcherId=3] Error due to (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.KafkaException: Error processing data for partition 
fps.pese.desim_se-0 offset 202742
at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:338)
at scala.Option.foreach(Option.scala:274)
at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:296)
at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:295)
at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$5(AbstractFetcherThread.scala:295)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at 
kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:295)
at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:132)
at 
kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:131)
at scala.Option.foreach(Option.scala:274)
at 
kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:131)
at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
Caused by: java.lang.NoClassDefFoundError: kafka/log/BatchMetadata
at kafka.log.Log.$anonfun$append$2(Log.scala:925)
at kafka.log.Log.maybeHandleIOException(Log.scala:2013)
at kafka.log.Log.append(Log.scala:827)
at kafka.log.Log.appendAsFollower(Log.scala:807)
at 
kafka.cluster.Partition.$anonfun$doAppendRecordsToFollowerOrFutureReplica$1(Partition.scala:708)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:259)
at 
kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:699)
at 
kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:715)
at 
kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:157)
at 
kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:307)
... 16 more
Caused by: java.lang.ClassNotFoundException: kafka.log.BatchMetadata
at 
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 27 more

Other such exceptions included java.lang.NoClassDefFoundError: 
kafka/common/LongRef, java.lang.NoClassDefFoundError: kafka/log/LogAppendInfo$, 
java.lang.NoClassDefFoundError: kafka/api/LeaderAndIsr, 
java.lang.NoClassDefFoundError: org/apache/zookeeper/proto/SetWatches etc.
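One plausible cause of NoClassDefFoundError for classes that ship inside the Kafka (and ZooKeeper) jars is the jars under libs/ being replaced while the JVM is running: classes are loaded lazily, so later loads fail even though startup was clean, and a restart fixes it. A hedged sketch for checking that every broker runs identical jars; the function name and directory path are illustrative, not from this report:

```shell
# Sketch: fingerprint a Kafka libs/ directory so builds can be compared
# across brokers. One digest per jar, then one digest over the sorted
# list, so the result is order-independent and easy to eyeball.
fingerprint_libs() {
  ( cd "$1" && sha256sum -- *.jar | sort | sha256sum | cut -d' ' -f1 )
}

# Run on each broker and compare the single output line, e.g.:
#   fingerprint_libs /proj/tools/infra/apache/kafka/kafka_2.12-2.1.1/libs
# Differing digests between brokers (or before and after a restart)
# point to mismatched jars rather than a broker bug.
```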


Do you need server logs for all brokers? The logs are >300MB in size; is it 
okay to upload them here on JIRA? I can upload the client logs, as they are 
small. I will try to extract the logs for the test duration and upload them 
here soon.







[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784365#comment-16784365
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram]

I am also seeing a similar exception when running the consumer group command 
(with the setup running your patch). I don't get this exception on the other 
Kafka setup that is running v2.1.1.

env KAFKA_LOG4J_CONFIG=log4j.properties 
KAFKA_OPTS="-Dsun.security.jgss.native=true 
-Dsun.security.jgss.lib=/usr/libexec/libgsswrap.so 
-Djavax.security.auth.useSubjectCredsOnly=false 
-Djava.security.auth.login.config=jaas.conf" 
/proj/tools/infra/apache/kafka/kafka_2.12-2.1.1/bin/kafka-consumer-groups.sh 
--bootstrap-server mwkafka-prod-01.nyc:9092 --command-config command_config  
--list  
Error: Executing consumer group command failed due to 
org.apache.kafka.common.errors.UnknownServerException: Error listing groups on 
mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.UnknownServerException: Error listing groups on 
mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
at 
kafka.admin.ConsumerGroupCommand$ConsumerGroupService.listGroups(ConsumerGroupCommand.scala:132)
at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:58)
at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
Caused by: org.apache.kafka.common.errors.UnknownServerException: Error listing 
groups on mwkafka-prod-01.dr.xxx.com:9092 (id: 3 rack: dr.xxx.com)



[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784298#comment-16784298
 ] 

Abhi commented on KAFKA-7925:
-

Hi, I deployed the patch in my setup, but now I am getting the below exception 
when trying to publish messages. The exception is received on the client side. 
I did not see any errors or warnings around this time (2019-03-05 04:16:25) in 
the server logs.

[2019-03-05 04:16:25,146] ERROR Uncaught exception in thread 
'kafka-producer-network-thread | test_prod': 
(org.apache.kafka.common.utils.KafkaThread)
deshaw.common.util.ApplicationDeath: 
org.apache.kafka.common.errors.UnknownServerException: The server experienced 
an unexpected error when processing the request.
at 
deshaw.kafka.test.KafkaTestProducer$ProducerCallback.onCompletion(KafkaTestProducer.java:475)
at 
org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1304)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:717)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:685)
at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:635)
at 
org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:557)
at 
org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
at 
org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:786)
at 
org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
at 
org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:557)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:549)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:311)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:235)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.UnknownServerException: The server 
experienced an unexpected error when processing the request.
[2019-03-05 04:17:25,144] DEBUG [Producer clientId=test_prod] Exception 
occurred during message send: (org.apache.kafka.clients.producer.KafkaProducer)

On a side note, after deploying this patch, I also observed a lot of connection 
disconnects:

[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)
[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)

[2019-03-05 01:27:57,386] DEBUG [Controller id=1, targetBrokerId=4] Connection 
with mwkafka-prod-02.dr.deshaw.com/10.218.247.23 disconnected 
(org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.k

[jira] [Comment Edited] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784298#comment-16784298
 ] 

Abhi edited comment on KAFKA-7925 at 3/5/19 10:35 AM:
--

Hi, I deployed the patch in my setup, but now I am getting the below exception 
when trying to publish messages. The exception is received on the client side. 
I did not see any errors or warnings around this time (2019-03-05 04:16:25) in 
the server logs.

[2019-03-05 04:16:25,146] ERROR Uncaught exception in thread 
'kafka-producer-network-thread | test_prod': 
(org.apache.kafka.common.utils.KafkaThread)
common.util.ApplicationDeath: 
org.apache.kafka.common.errors.UnknownServerException: The server experienced 
an unexpected error when processing the request.
at 
xxx.kafka.test.KafkaTestProducer$ProducerCallback.onCompletion(KafkaTestProducer.java:475)
at 
org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1304)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:717)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:685)
at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:635)
at 
org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:557)
at 
org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
at 
org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:786)
at 
org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
at 
org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:557)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:549)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:311)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:235)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.UnknownServerException: The server 
experienced an unexpected error when processing the request.
[2019-03-05 04:17:25,144] DEBUG [Producer clientId=test_prod] Exception 
occurred during message send: (org.apache.kafka.clients.producer.KafkaProducer)

On a side note, after deploying this patch, I also observed a lot of connection 
disconnects:

[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)
[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)

[2019-03-05 01:27:57,386] DEBUG [Controller id=1, targetBrokerId=4] Connection 
with mwkafka-prod-02.dr.xxx.com/10.218.247.23 disconnected 
(org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selecto

[jira] [Comment Edited] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-03-05 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784298#comment-16784298
 ] 

Abhi edited comment on KAFKA-7925 at 3/5/19 10:38 AM:
--

Hi, I deployed the patch in my setup, but now I am getting the exception below 
when trying to publish messages to 40 topics using the same producer. The 
exception is received on the client side. I did not see any errors or warnings 
around this time (2019-03-05 04:16:25) in the server logs.

[2019-03-05 04:16:25,146] ERROR Uncaught exception in thread 
'kafka-producer-network-thread | test_prod': 
(org.apache.kafka.common.utils.KafkaThread)
common.util.ApplicationDeath: 
org.apache.kafka.common.errors.UnknownServerException: The server experienced 
an unexpected error when processing the request.
at 
xxx.kafka.test.KafkaTestProducer$ProducerCallback.onCompletion(KafkaTestProducer.java:475)
at 
org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1304)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:230)
at 
org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:196)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:717)
at 
org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:685)
at 
org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:635)
at 
org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:557)
at 
org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
at 
org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:786)
at 
org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109)
at 
org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:557)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:549)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:311)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:235)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.errors.UnknownServerException: The server 
experienced an unexpected error when processing the request.
[2019-03-05 04:17:25,144] DEBUG [Producer clientId=test_prod] Exception 
occurred during message send: (org.apache.kafka.clients.producer.KafkaProducer)

On a side note, after deploying this patch, I also observed a lot of connection 
disconnects:

[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)
[2019-03-05 01:23:39,553] DEBUG [SocketServer brokerId=1] Connection with 
/10.219.26.10 disconnected (org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.network.Selector.attemptRead(Selector.java:640)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:561)
at org.apache.kafka.common.network.Selector.poll(Selector.java:472)
at kafka.network.Processor.poll(SocketServer.scala:830)
at kafka.network.Processor.run(SocketServer.scala:730)
at java.base/java.lang.Thread.run(Thread.java:834)

[2019-03-05 01:27:57,386] DEBUG [Controller id=1, targetBrokerId=4] Connection 
with mwkafka-prod-02.dr.xxx.com/10.218.247.23 disconnected 
(org.apache.kafka.common.network.Selector)
java.io.EOFException
at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
at 
org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
at 
org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
at 
org.apache.kafka.common.net

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-28 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781333#comment-16781333
 ] 

Abhi commented on KAFKA-7925:
-

[~rsivaram] Thanks Rajini for looking at this issue. Yes, you can close the 
other two JIRAs. I will be happy to test the PR. Can you tell me how to check 
out the PR code into my local kafka checkout?

Should I clone https://github.com/apache/kafka.git (though I am not sure how 
to check out the PR code from there), or should I go ahead and clone 
https://github.com/rajinisivaram/kafka.git?
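The PR-checkout steps being asked about follow GitHub's `refs/pull/<N>/head` convention. A minimal sketch; the PR number 1234 is a placeholder (not this ticket's actual PR), and a throwaway local "upstream" repo stands in for github.com/apache/kafka so the commands run anywhere:

```shell
set -eu
tmp=$(mktemp -d)

# Stand-in for the GitHub repo: one commit plus a PR head ref, published
# the way GitHub advertises pull requests (refs/pull/<N>/head).
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "trunk"
git -C "$tmp/upstream" update-ref refs/pull/1234/head HEAD

# The part that answers the question: clone the main repo (for the real
# thing, https://github.com/apache/kafka.git), then fetch the PR ref into
# a local branch and check it out.
git clone -q "$tmp/upstream" "$tmp/kafka"
git -C "$tmp/kafka" fetch -q origin pull/1234/head:pr-1234
git -C "$tmp/kafka" checkout -q pr-1234
```

Cloning the contributor's fork (https://github.com/rajinisivaram/kafka.git) and checking out the PR branch there works equally well.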

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.1.1
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1
>Reporter: Abhi
>Priority: Critical
> Attachments: threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup, which is 
> not yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in the CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka verison:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-

[jira] [Commented] (KAFKA-8008) Clients unable to connect and replicas are not able to connect to each other

2019-02-27 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779637#comment-16779637
 ] 

Abhi commented on KAFKA-8008:
-

One of the kafka network threads is stuck in the below state:

"kafka-network-thread-1-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2" #81 
prio=5 os_prio=0 cpu=4839838.11ms elapsed=33760.37s allocated=59G 
defined_classes=36 tid=0x00007fb046d12800 nid=0x100fa runnable  [0x7faee8df8000]
   java.lang.Thread.State: RUNNABLE
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1928)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at java.util.HashMap$TreeNode.find(java.base@11.0.2/HashMap.java:1924)
at 
java.util.HashMap$TreeNode.putTreeVal(java.base@11.0.2/HashMap.java:2043)
at java.util.HashMap.putVal(java.base@11.0.2/HashMap.java:633)
at java.util.HashMap.put(java.base@11.0.2/HashMap.java:607)
at java.util.HashSet.add(java.base@11.0.2/HashSet.java:220)
at 
javax.security.auth.Subject$ClassSet.populateSet(java.base@11.0.2/Subject.java:1518)
at 
javax.security.auth.Subject$ClassSet.&lt;init&gt;(java.base@11.0.2/Subject.java:1472)
- locked <0x0006c8a9b970> (a java.util.Collections$SynchronizedSet)
at 
javax.security.auth.Subject.getPrivateCredentials(java.base@11.0.2/Subject.java:764)
at 
sun.security.jgss.GSSUtil$1.run(java.security.jgss@11.0.2/GSSUtil.java:336)
at 
sun.security.jgss.GSSUtil$1.run(java.security.jgss@11.0.2/GSSUtil.java:328)
at java.security.AccessController.doPrivileged(java.base@11.0.2/Native 
Method)
at 
sun.security.jgss.GSSUtil.searchSubject(java.security.jgss@11.0.2/GSSUtil.java:328)
at 
sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(java.security.jgss@11.0.2/NativeGSSFactory.java:53)
at 
sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(java.security.jgss@11.0.2/NativeGSSFactory.java:116)
at 
sun.security.jgss.GSSManagerImpl.getCredentialElement(java.security.jgss@11.0.2/GSSManagerImpl.java:187)
at 
sun.security.jgss.GSSCredentialImpl.add(java.security.jgss@11.0.2/GSSCredentialImpl.java:439)
at 
sun.security.jgss.GSSCredentialImpl.&lt;init&gt;(java.security.jgss@11.0.2/GSSCredentialImpl.java:74)
at 
sun.security.jgss.GSSManagerImpl.createCredential(java.security.jgss@11.0.2/GSSManagerImpl.java:148)
at 
com.sun.security.sasl.gsskerb.GssKrb5Server.&lt;init&gt;(jdk.security.jgss@11.0.2/GssKrb5Server.java:108)
at 
com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(jdk.security.jgss@11.0.2/FactoryImpl.java:85)
at 
javax.security.sasl.Sasl.createSaslServer(java.security.sasl@11.0.2/Sasl.java:537)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$1(SaslServerAuthenticator.java:212)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator$$Lambda$970/0x0008006a7840.run(Unknown Source)
at java.security.AccessController.doPrivileged(java.base@11.0.2/Native 
Method)
at javax.security.auth.Subject.doAs(java.base@11.0.2/Subject.java:423)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
at 
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
at kafka.network.Processor.poll(SocketServer.scala:689)
at kafka.network.Processor.run(SocketServer.scala:594)
at java.lang.Thread.run(java.base@11.0.2/Thread.java:834)
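A side note on reading the dump above: the `nid` field jstack prints is the Linux thread id in hex, so it can be converted back to decimal and matched against per-thread CPU listings such as `top -H -p <broker-pid>`. A minimal conversion sketch using the nid from this dump:

```shell
# The jstack 'nid' is the Linux thread id (tid) in hex; converting it to
# decimal lets it be matched against the TID column of `top -H` output
# to confirm which OS thread is burning CPU.
nid=0x100fa          # from the thread dump above
printf '%d\n' "$nid" # -> 65786
```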



> Clients unable to connect and replicas are not able to connect to each other
> 
>
> Key: KAFKA-8008
> URL: https://issues.apache.org/jira/browse

[jira] [Updated] (KAFKA-8008) Clients unable to connect and replicas are not able to connect to each other

2019-02-27 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-8008:

Description: 
Hi,

I upgraded to Kafka v2.1.1 recently and am seeing the below exceptions on all 
the servers. The kafka-network-thread-1-ListenerName threads are all consuming 
full CPU cycles, and lots of TCP connections are stuck in the CLOSE_WAIT state.

My broker setup uses Kerberos authentication with 
-Dsun.security.jgss.native=true.

I am not sure how to handle this. Will increasing the Kafka network thread 
count help, if that is possible?

Does this seem like a bug? I am happy to help in any way I can, as this issue 
is blocking our production usage and I would like to get it resolved as early 
as possible.
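For reference, the thread count being asked about is controlled by the broker's `num.network.threads` setting in `server.properties` (the default is 3). A sketch of the relevant lines; the values are illustrative, not a recommendation for this issue:

```properties
# server.properties (illustrative values, not a fix recommendation)
num.network.threads=6
num.io.threads=8
```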


*server.log snippet from one of the servers:*
[2019-02-27 00:00:02,948] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Built full fetch (sessionId=1488865423, epoch=INITIAL) for node 2 
with 3 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Initiating connection to node mwkafka-prod-02.nyc.foo.com:9092 
(id: 2 rack: null) using address mwkafka-prod-02.nyc.foo.com/10.219.247.26 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
SEND_APIVERSIONS_REQUEST 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG Creating SaslClient: 
client=kafka/mwkafka-prod-01.nyc.foo@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-02.nyc.foo.com;mechs=[GSSAPI]
 (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
SO_TIMEOUT = 0 to node 2 (org.apache.kafka.common.network.Selector)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
RECEIVE_APIVERSIONS_RESPONSE 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Completed connection to node 2. Ready. 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:03,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=0] Built full fetch (sessionId=2039987243, epoch=INITIAL) for node 5 
with 0 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] INFO [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error sending fetch request (sessionId=397037945, epoch=INITIAL) 
to node 5: java.net.SocketTimeoutException: Failed to connect within 3 ms. 
(org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] WARN [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error in response for fetch request (type=FetchRequest, 
replicaId=1, maxWait=1, minBytes=1, maxBytes=10485760, 
fetchData={reddyvel-159-0=(fetchOffset=3173198, logStartOffset=3173198, 
maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-331-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-newtp-5-64-0=(fetchOffset=8936, 
logStartOffset=8936, maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
reddyvel-tp9-78-0=(fetchOffset=247943, logStartOffset=247943, maxBytes=1048576, 
currentLeaderEpoch=Optional[19]), reddyvel-tp3-58-0=(fetchOffset=264495, 
logStartOffset=264495, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
fps.trsy.fe_prvt-0=(fetchOffset=24, logStartOffset=8, maxBytes=1048576, 
currentLeaderEpoch=Optional[3]), reddyvel-7-0=(fetchOffset=3173199, 
logStartOffset=3173199, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-298-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.guas.peeq.fe_marb_us-0=(fetchOffset=2, 
logStartOffset=2, maxBytes=1048576, currentLeaderEpoch=Optional[6]), 
reddyvel-108-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-988-0=(fetchOffset=3173185, 
logStartOffset=3173185, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-111-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-409-0=(fetchOffset=3173194, 
logStartOffset=3173194, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-104-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.priveq.reins-0=(fetchOffset=12, 
logStartOffset=6, maxBytes=1048576, currentLeaderEpoch=Optional[5]), 
reddyvel-353-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-tp10-63-0=(fetchOffset=220652, 
logStartOffset=220652, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
reddyvel-newtp-5-86-0=(fetchOffset=8935, logStartOffset=8935, maxBytes=1048576, 

[jira] [Updated] (KAFKA-8008) Clients unable to connect and replicas are not able to connect to each other

2019-02-27 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-8008:

Description: 
Hi,

I upgraded to Kafka v2.1.1 recently and am seeing the below exceptions on all 
the servers. The kafka-network-thread-1-ListenerName threads are all consuming 
full CPU cycles, and lots of TCP connections are stuck in the CLOSE_WAIT state.

My broker setup uses Kerberos authentication with 
-Dsun.security.jgss.native=true.

I am not sure how to handle this. Will increasing the Kafka network thread 
count help, if that is possible?

Does this seem like a bug? I am happy to help in any way I can, as this issue 
is blocking our production usage and I would like to get it resolved as early 
as possible.


*server.log snippet from one of the servers:*
[2019-02-27 00:00:02,948] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Built full fetch (sessionId=1488865423, epoch=INITIAL) for node 2 
with 3 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Initiating connection to node mwkafka-prod-02.nyc.foo.com:9092 
(id: 2 rack: null) using address mwkafka-prod-02.nyc.foo.com/10.219.247.26 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
SEND_APIVERSIONS_REQUEST 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG Creating SaslClient: 
client=kafka/mwkafka-prod-01.nyc.foo@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-02.nyc.foo.com;mechs=[GSSAPI]
 (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
SO_TIMEOUT = 0 to node 2 (org.apache.kafka.common.network.Selector)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
RECEIVE_APIVERSIONS_RESPONSE 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Completed connection to node 2. Ready. 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:03,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=0] Built full fetch (sessionId=2039987243, epoch=INITIAL) for node 5 
with 0 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] INFO [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error sending fetch request (sessionId=397037945, epoch=INITIAL) 
to node 5: java.net.SocketTimeoutException: Failed to connect within 3 ms. 
(org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] WARN [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error in response for fetch request (type=FetchRequest, 
replicaId=1, maxWait=1, minBytes=1, maxBytes=10485760, 
fetchData={reddyvel-159-0=(fetchOffset=3173198, logStartOffset=3173198, 
maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-331-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-newtp-5-64-0=(fetchOffset=8936, 
logStartOffset=8936, maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
reddyvel-tp9-78-0=(fetchOffset=247943, logStartOffset=247943, maxBytes=1048576, 
currentLeaderEpoch=Optional[19]), reddyvel-tp3-58-0=(fetchOffset=264495, 
logStartOffset=264495, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
fps.trsy.fe_prvt-0=(fetchOffset=24, logStartOffset=8, maxBytes=1048576, 
currentLeaderEpoch=Optional[3]), reddyvel-7-0=(fetchOffset=3173199, 
logStartOffset=3173199, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-298-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.guas.peeq.fe_marb_us-0=(fetchOffset=2, 
logStartOffset=2, maxBytes=1048576, currentLeaderEpoch=Optional[6]), 
reddyvel-108-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-988-0=(fetchOffset=3173185, 
logStartOffset=3173185, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-111-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-409-0=(fetchOffset=3173194, 
logStartOffset=3173194, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-104-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.priveq.reins-0=(fetchOffset=12, 
logStartOffset=6, maxBytes=1048576, currentLeaderEpoch=Optional[5]), 
reddyvel-353-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-tp10-63-0=(fetchOffset=220652, 
logStartOffset=220652, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
reddyvel-newtp-5-86-0=(fetchOffset=8935, logStartOffset=8935, maxBytes=1048576, 

[jira] [Created] (KAFKA-8008) Clients unable to connect and replicas are not able to connect to each other

2019-02-27 Thread Abhi (JIRA)
Abhi created KAFKA-8008:
---

 Summary: Clients unable to connect and replicas are not able to 
connect to each other
 Key: KAFKA-8008
 URL: https://issues.apache.org/jira/browse/KAFKA-8008
 Project: Kafka
  Issue Type: Bug
  Components: controller, core
Affects Versions: 2.1.1, 2.1.0
 Environment: Java 11
Reporter: Abhi


Hi,

I upgraded to Kafka v2.1.1 in the hope that 
https://issues.apache.org/jira/browse/KAFKA-7925 would be fixed. However, I am 
still seeing a similar issue in my kafka cluster.

I am seeing the same exceptions on all the servers. The 
kafka-network-thread-1-ListenerName threads are all consuming full CPU cycles, 
and lots of TCP connections are stuck in the CLOSE_WAIT state.

My broker setup uses Kerberos authentication with 
-Dsun.security.jgss.native=true.

I am not sure how to handle this. Will increasing the Kafka network thread 
count help, if that is possible?

Does this seem like a bug? I am happy to help in any way I can, as this issue 
is blocking our production usage and I would like to get it resolved as early 
as possible.


*server.log snippet from one of the servers:*
[2019-02-27 00:00:02,948] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Built full fetch (sessionId=1488865423, epoch=INITIAL) for node 2 
with 3 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Initiating connection to node mwkafka-prod-02.nyc.foo.com:9092 
(id: 2 rack: null) using address mwkafka-prod-02.nyc.foo.com/10.219.247.26 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
SEND_APIVERSIONS_REQUEST 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG Creating SaslClient: 
client=kafka/mwkafka-prod-01.nyc.foo@unix.foo.com;service=kafka;serviceHostname=mwkafka-prod-02.nyc.foo.com;mechs=[GSSAPI]
 (org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 166400, 
SO_TIMEOUT = 0 to node 2 (org.apache.kafka.common.network.Selector)
[2019-02-27 00:00:02,949] DEBUG Set SASL client state to 
RECEIVE_APIVERSIONS_RESPONSE 
(org.apache.kafka.common.security.authenticator.SaslClientAuthenticator)
[2019-02-27 00:00:02,949] DEBUG [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=3] Completed connection to node 2. Ready. 
(org.apache.kafka.clients.NetworkClient)
[2019-02-27 00:00:03,007] DEBUG [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=0] Built full fetch (sessionId=2039987243, epoch=INITIAL) for node 5 
with 0 partition(s). (org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] INFO [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error sending fetch request (sessionId=397037945, epoch=INITIAL) 
to node 5: java.net.SocketTimeoutException: Failed to connect within 3 ms. 
(org.apache.kafka.clients.FetchSessionHandler)
[2019-02-27 00:00:03,317] WARN [ReplicaFetcher replicaId=1, leaderId=5, 
fetcherId=1] Error in response for fetch request (type=FetchRequest, 
replicaId=1, maxWait=1, minBytes=1, maxBytes=10485760, 
fetchData={reddyvel-159-0=(fetchOffset=3173198, logStartOffset=3173198, 
maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-331-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-newtp-5-64-0=(fetchOffset=8936, 
logStartOffset=8936, maxBytes=1048576, currentLeaderEpoch=Optional[18]), 
reddyvel-tp9-78-0=(fetchOffset=247943, logStartOffset=247943, maxBytes=1048576, 
currentLeaderEpoch=Optional[19]), reddyvel-tp3-58-0=(fetchOffset=264495, 
logStartOffset=264495, maxBytes=1048576, currentLeaderEpoch=Optional[19]), 
fps.trsy.fe_prvt-0=(fetchOffset=24, logStartOffset=8, maxBytes=1048576, 
currentLeaderEpoch=Optional[3]), reddyvel-7-0=(fetchOffset=3173199, 
logStartOffset=3173199, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-298-0=(fetchOffset=3173197, logStartOffset=3173197, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.guas.peeq.fe_marb_us-0=(fetchOffset=2, 
logStartOffset=2, maxBytes=1048576, currentLeaderEpoch=Optional[6]), 
reddyvel-108-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-988-0=(fetchOffset=3173185, 
logStartOffset=3173185, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-111-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), reddyvel-409-0=(fetchOffset=3173194, 
logStartOffset=3173194, maxBytes=1048576, currentLeaderEpoch=Optional[23]), 
reddyvel-104-0=(fetchOffset=3173198, logStartOffset=3173198, maxBytes=1048576, 
currentLeaderEpoch=Optional[23]), fps.priveq.reins-0=(fetchOff

[jira] [Updated] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-27 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7925:

Affects Version/s: 2.1.1

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.1.1
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1
>Reporter: Abhi
>Priority: Critical
> Attachments: threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our Kafka setup, which is 
> not yet open to clients. It is now becoming a blocker for our deployment.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod 2048u sock 0,7 0t0 30012546 protocol: TCP
> java 144319 kafkagod 2049u IPv4 30005418 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39686 (CLOSE_WAIT)
> java 144319 kafkagod 2050u IPv4 30009977 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34552 (ESTABLISHED)
> java 144319 kafkagod 2060u sock 0,7 0t0 30003439 protocol: TCP
> java 14431

[jira] [Updated] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-27 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7925:

Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1  (was: Java 11, Kafka 
v2.1.0)

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1
>Reporter: Abhi
>Priority: Critical
> Attachments: threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our Kafka setup, which is 
> not yet open to clients. It is now becoming a blocker for our deployment.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod 2048u sock 0,7 0t0 30012546 protocol: TCP
> java 144319 kafkagod 2049u IPv4 30005418 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39686 (CLOSE_WAIT)
> java 144319 kafkagod 2050u IPv4 30009977 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34552 (ESTABLISHED)
> java 144319 kafkagod 2060u 

[jira] [Comment Edited] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-26 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1623#comment-1623
 ] 

Abhi edited comment on KAFKA-7982 at 2/26/19 9:38 AM:
--

The two different principals are a copy-paste mistake from my rough draft; the 
Kafka server is using only one principal, of type 
kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com


*Jaas config file*
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!


was (Author: xabhi):
*Jaas config file*
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and because 
> of this the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthentic
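
The ConcurrentModificationException in the trace above is the standard fail-fast behaviour of java.util iterators: the Subject's LinkedList-backed credential set is iterated (Subject$SecureSet$1.next) while another thread, here the Kerberos re-login, structurally modifies it. A minimal self-contained sketch of that underlying failure mode, using a plain LinkedList rather than any Kafka or JAAS code:

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

public class CmeSketch {
    // Returns true when modifying a LinkedList mid-iteration raises
    // ConcurrentModificationException, mirroring the Subject credential
    // set being mutated by a re-login while an authenticator walks it.
    static boolean triggersCme() {
        List<String> credentials = new LinkedList<>(List.of("tgt", "service-ticket"));
        try {
            for (String c : credentials) {
                // Structural modification invalidates the live iterator.
                credentials.add("refreshed-" + c);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME triggered: " + triggersCme());
    }
}
```

The same fail-fast check (LinkedList$ListItr.checkForComodification) is the first frame of the trace above, fired from inside Subject's credential set.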

[jira] [Comment Edited] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-26 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1623#comment-1623
 ] 

Abhi edited comment on KAFKA-7982 at 2/26/19 8:29 AM:
--

*Jaas config file: *
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!


was (Author: xabhi):
*Jaas config file:*
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and because 
> of this the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.au

[jira] [Comment Edited] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-26 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1623#comment-1623
 ] 

Abhi edited comment on KAFKA-7982 at 2/26/19 8:30 AM:
--

*Jaas config file*
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!


was (Author: xabhi):
*Jaas config file: *
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and because 
> of this the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.aut

[jira] [Commented] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-26 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1623#comment-1623
 ] 

Abhi commented on KAFKA-7982:
-

*Jaas config file:*
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/local/apps/kafkatst-kafka/config/kafka_server.keytab"
principal="kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com";
};

Client {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true;
};

What logs do you want to see - server.logs, kafka-authorizer, state-change or 
controller.log?

Thanks!

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and because 
> of this the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenti

[jira] [Updated] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-22 Thread Abhi (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-7982:

Description: 
Hi,

I am getting the following warnings in server.log continuously, and because of 
this the client consumer is not able to consume messages.

[2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
which there is no open connection, connection id 
10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
 [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
which there is no open connection, connection id 
10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)

I also noticed that before these warnings started to appear, the following 
concurrent modification exception occurred for the same IP address:

[2019-02-20 09:01:11,175] INFO Initiating logout for 
kafka/u-kafkatst-kafkadev-1.sd@unix.com 
(org.apache.kafka.common.security.kerberos.KerberosLogin)
 [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error from 
/10.219.25.239; closing connection (org.apache.kafka.common.network.Selector)
 java.util.ConcurrentModificationException
 at 
java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
 at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
 at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
 at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
 at java.base/java.security.AccessController.doPrivileged(Native Method)
 at 
java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
 at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
 at 
java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
 at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
 at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
 at java.base/java.security.AccessController.doPrivileged(Native Method)
 at java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
 at 
java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
 at 
java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
 at 
java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
 at 
java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
 at 
java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
 at 
java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
 at 
jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
 at 
jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
 at java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
 at java.base/java.security.AccessController.doPrivileged(Native Method)
 at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
 at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
 at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
 at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
 at kafka.network.Processor.poll(SocketServer.scala:689)
 at kafka.network.Processor.run(SocketServer.scala:594)
 at java.base/java.lang.Thread.run(Thread.java:834)
 [2019-02-22 00:18:29,439] INFO Initiating re-login for 
kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com 
(org.apache.kafka.common.security.kerberos.KerberosLogin)
 [2019-02-22 00:18:29,440] WARN [SocketServer brokerId=1] Unexpected error from 
/10.219.25.239; closing connection (org.apache.kafka.common.network.Selector)
 org.apache.kafka.common.KafkaException: Principal could not be determined from 
Subject, this may be a transient failure due to Kerberos re-login
 at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.firstPrincipal(SaslClientAuthenticator.java:435)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:177)
 at 
org.apache.kafka.common.security.authenticat
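The fail-fast frames at the top of the trace above (`LinkedList$ListItr.checkForComodification`) can be reproduced outside Kafka. The following is a minimal, single-threaded sketch (illustrative only, not the broker's actual code path) of the same failure mode: a `LinkedList` being structurally modified while an iterator is walking it, which in the broker corresponds to the Kerberos re-login thread mutating the `Subject`'s credential set while a network thread iterates it.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

public class CmeDemo {
    public static void main(String[] args) {
        // LinkedList's iterator is fail-fast: any structural modification of
        // the list from outside the iterator trips checkForComodification(),
        // the same frame seen at the top of the broker stack trace.
        List<String> creds = new LinkedList<>(List.of("tgt-1", "tgt-2", "tgt-3"));
        boolean threw = false;
        try {
            for (String c : creds) {
                creds.remove(c); // structural modification during iteration
            }
        } catch (ConcurrentModificationException e) {
            threw = true;
        }
        System.out.println(threw); // prints "true"
    }
}
```

In the broker the modification comes from a second thread rather than the loop body, so the exception is intermittent rather than deterministic, but the mechanism is the same.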

[jira] [Created] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-02-22 Thread Abhi (JIRA)
Abhi created KAFKA-7982:
---

 Summary: ConcurrentModificationException and Continuous warnings 
"Attempting to send response via channel for which there is no open connection"
 Key: KAFKA-7982
 URL: https://issues.apache.org/jira/browse/KAFKA-7982
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.1.1
Reporter: Abhi


Hi,

I am getting the following warnings in server.log continuously, and because of 
this the client consumer is not able to consume messages.

[2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
which there is no open connection, connection id 
10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
[2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
which there is no open connection, connection id 
10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)

I also noticed that before these warnings started to appear, the following 
concurrent modification exception occurred for the same IP address:

[2019-02-20 09:01:11,175] INFO Initiating logout for 
kafka/u-kafkatst-kafkadev-1.sd@unix.com 
(org.apache.kafka.common.security.kerberos.KerberosLogin)
[2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error from 
/10.219.25.239; closing connection (org.apache.kafka.common.network.Selector)
java.util.ConcurrentModificationException
at 
java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
at 
java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
at 
java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at 
java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
at 
java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
at 
java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at 
java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
at 
java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
at 
java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
at 
java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
at 
java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
at 
java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
at 
java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
at 
jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
at 
jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
at 
java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
at 
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
at kafka.network.Processor.poll(SocketServer.scala:689)
at kafka.network.Processor.run(SocketServer.scala:594)
at java.base/java.lang.Thread.run(Thread.java:834)
[2019-02-22 00:18:29,439] INFO Initiating re-login for 
kafka/u-kafkatst-kafkadev-1.sd.deshaw@unix.deshaw.com 
(org.apache.kafka.common.security.kerberos.KerberosLogin)
[2019-02-22 00:18:29,440] WARN [SocketServer brokerId=1] Unexpected error from 
/10.219.25.239; closing connection (org.

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-22 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775061#comment-16775061
 ] 

Abhi commented on KAFKA-7925:
-

Any updates on this?

From the thread dump, I see that 
'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0' has 
locked '0x0006ca1c9a80' and doesn't seem to be making progress. The other 
network threads 
'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1' and 
'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2' are 
waiting to lock '0x0006ca1c9a80'.

This is causing no new connection requests to be accepted. Can you please check 
this?



> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0
> Environment: Java 11, Kafka v2.1.0
>Reporter: Abhi
>Priority: Critical
> Attachments: threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup that is not 
> yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 

[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-14 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768321#comment-16768321
 ] 

Abhi commented on KAFKA-7925:
-

I keep seeing the exception below in all server logs. No clients are able to 
connect to the Kafka brokers; they keep timing out. Could anyone please help 
with this issue or provide a workaround?


java.net.SocketTimeoutException: Failed to connect within 3 ms
at 
kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:93)
at 
kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190)
at 
kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241)
at 
kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130)
at 
kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129)
at scala.Option.foreach(Option.scala:257)
at 
kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129)
at 
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
[2019-02-14 09:20:00,617] INFO [ReplicaFetcher replicaId=1, leaderId=6, 
fetcherId=0] Error sending fetch request (sessionId=841897464, epoch=INITIAL) 
to node 6: java.net.SocketTimeoutException: Failed to connect within 3 ms. 
(org.apache.kafka.clients.FetchSessionHandler)


> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0
> Environment: Java 11, Kafka v2.1.0
>Reporter: Abhi
>Priority: Critical
> Attachments: threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup that is not 
> yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mw

[jira] [Created] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-02-13 Thread Abhi (JIRA)
Abhi created KAFKA-7925:
---

 Summary: Constant 100% cpu usage by all kafka brokers
 Key: KAFKA-7925
 URL: https://issues.apache.org/jira/browse/KAFKA-7925
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.1.0
 Environment: Java 11, Kafka v2.1.0
Reporter: Abhi
 Attachments: threadump20190212.txt

Hi,

I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
without any clients connected to any broker.

This is a bug that we have seen multiple times in our kafka setup that is not 
yet open to clients. It is becoming a blocker for our deployment now.

I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
below). In thread usage, I am seeing these threads 
'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
 taking up more than 90% of the cpu time in a 60s interval.

I have attached a thread dump of one of the brokers in the cluster.

*Java version:*

openjdk 11.0.2 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

*Kafka version:* v2.1.0

 

*connections:*

java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
java 144319 kafkagod 2048u sock 0,7 0t0 30012546 protocol: TCP
java 144319 kafkagod 2049u IPv4 30005418 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39686 (CLOSE_WAIT)
java 144319 kafkagod 2050u IPv4 30009977 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34552 (ESTABLISHED)
java 144319 kafkagod 2060u sock 0,7 0t0 30003439 protocol: TCP
java 144319 kafkagod 2061u IPv4 30012906 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51862 (ESTABLISHED)
java 144319 kafkagod 2069u IPv4 30005642 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34570 (ESTABLISHED)
java 144319 kafkagod 2073u sock 0,7 0t0 30003440 protocol: TCP
java 144319 kafkagod 2086u IPv4 30005644 0t0 TCP 
mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:5187

[jira] [Commented] (KAFKA-7812) Deadlock in SaslServerAuthenticator related threads

2019-01-29 Thread Abhi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754787#comment-16754787
 ] 

Abhi commented on KAFKA-7812:
-

Hi [~rsivaram], yes, I am running the servers with 
`sun.security.jgss.native=true`. There weren't a huge number of connections, 
but I was running a performance test when this happened, so a lot of Kafka 
message requests would have been in flight.
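For context, native GSS is enabled through a JVM system property at broker startup. A hypothetical launch sketch is below; the file paths and the use of KAFKA_OPTS are illustrative assumptions about this deployment, not prescriptive.

```shell
# Illustrative only: enable the native GSS-API provider, the setting under
# which this Subject-credential race is being reported. Paths are examples.
export KAFKA_OPTS="-Dsun.security.jgss.native=true \
  -Djava.security.krb5.conf=/etc/krb5.conf \
  -Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf"
bin/kafka-server-start.sh config/server.properties
```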

> Deadlock in SaslServerAuthenticator related threads
> ---
>
> Key: KAFKA-7812
> URL: https://issues.apache.org/jira/browse/KAFKA-7812
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Abhi
>Priority: Major
> Attachments: threaddump-100cpu-tbd-broker-5
>
>
> I am encountering a deadlock situation in the SaslServerAuthenticator code 
> path, where one thread is waiting for a monitor object locked by another 
> thread.
> +*Thread 1:*+
> "kafka-network-thread-5-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0" #66 
> prio=5 os_prio=0 tid=0x7fe131e17000 nid=0x78f7 runnable 
> [{color:#d04437}*0x7fde287ed000*{color}]
>  java.lang.Thread.State: RUNNABLE
>  at java.util.HashMap$TreeNode.find(HashMap.java:1865)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.find(HashMap.java:1861)
>  at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
>  at java.util.HashMap.putVal(HashMap.java:637)
>  at java.util.HashMap.put(HashMap.java:611)
>  at java.util.HashSet.add(HashSet.java:219)
>  at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1418)
>  at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
>  - *{color:#f79232}locked <0x00068893aae8>{color}* (a 
> java.util.Collections$SynchronizedSet)
>  at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
>  at sun.security.jgss.GSSUtil$1.run(GSSUtil.java:343)
>  at sun.security.jgss.GSSUtil$1.run(GSSUtil.java:335)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:335)
>  at 
> sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
>  at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
>  at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
>  at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
>  at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at javax.security.sasl.Sasl.createSaslServer(Sasl.java:524)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator$2.run(SaslServerAuthenticator.java:215)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator$2.run(SaslServerAuthenticator.java:213)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:213)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:162)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:443)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:253)
>  at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:127)
>  at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:487)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
>  at kafka.network.Processor.poll(SocketServer.scala:678)
>  at kafka.network.Processor.run(SocketServer.scala:583)
>  at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers:
>  - None
> *+Thread 2:+*
> "kafka-network-thread-5-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2" #68 
> prio=5 os_prio=0 tid=0x7fe131e1a800 nid=0x78f9 waiting for monitor entry 
> [{color:#d04437}*0x7fde277ed000*{color}]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at java.util.Collections$SynchronizedCollection.add(Collections.java:2035)
>  - {color:#

[jira] [Created] (KAFKA-7812) Deadlock in SaslServerAuthenticator related threads

2019-01-11 Thread Abhi (JIRA)
Abhi created KAFKA-7812:
---

 Summary: Deadlock in SaslServerAuthenticator related threads
 Key: KAFKA-7812
 URL: https://issues.apache.org/jira/browse/KAFKA-7812
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.0.0
Reporter: Abhi
 Attachments: threaddump-100cpu-tbd-broker-5

I am encountering a deadlock situation in the SaslServerAuthenticator code 
path, where one thread is waiting for a monitor object locked by another thread.

+*Thread 1:*+

"kafka-network-thread-5-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0" #66 
prio=5 os_prio=0 tid=0x7fe131e17000 nid=0x78f7 runnable 
[{color:#d04437}*0x7fde287ed000*{color}]
 java.lang.Thread.State: RUNNABLE
 at java.util.HashMap$TreeNode.find(HashMap.java:1865)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.find(HashMap.java:1861)
 at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:1979)
 at java.util.HashMap.putVal(HashMap.java:637)
 at java.util.HashMap.put(HashMap.java:611)
 at java.util.HashSet.add(HashSet.java:219)
 at javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1418)
 at javax.security.auth.Subject$ClassSet.<init>(Subject.java:1372)
 - *{color:#f79232}locked <0x00068893aae8>{color}* (a 
java.util.Collections$SynchronizedSet)
 at javax.security.auth.Subject.getPrivateCredentials(Subject.java:767)
 at sun.security.jgss.GSSUtil$1.run(GSSUtil.java:343)
 at sun.security.jgss.GSSUtil$1.run(GSSUtil.java:335)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:335)
 at 
sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
 at 
sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
 at 
sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:193)
 at sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:427)
 at sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:62)
 at sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:154)
 at com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
 at 
com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
 at javax.security.sasl.Sasl.createSaslServer(Sasl.java:524)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator$2.run(SaslServerAuthenticator.java:215)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator$2.run(SaslServerAuthenticator.java:213)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:213)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:162)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:443)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:253)
 at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:127)
 at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:487)
 at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
 at kafka.network.Processor.poll(SocketServer.scala:678)
 at kafka.network.Processor.run(SocketServer.scala:583)
 at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
 - None

*+Thread 2:+*

"kafka-network-thread-5-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2" #68 
prio=5 os_prio=0 tid=0x7fe131e1a800 nid=0x78f9 waiting for monitor entry 
[{color:#d04437}*0x7fde277ed000*{color}]
 java.lang.Thread.State: BLOCKED (on object monitor)
 at java.util.Collections$SynchronizedCollection.add(Collections.java:2035)
 - {color:#f79232}*waiting to lock <0x00068893aae8>*{color} (a 
java.util.Collections$SynchronizedSet)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:206)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:162)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:443)
 at 
org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:253)
 at org.apache.kafka.common.network.KafkaChanne

[jira] [Commented] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-11 Thread Abhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285796#comment-16285796
 ] 

Abhi commented on KAFKA-6337:
-

[~omkreddy] Please suggest 

> Error for partition [__consumer_offsets,15] to broker
> -
>
> Key: KAFKA-6337
> URL: https://issues.apache.org/jira/browse/KAFKA-6337
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: Windows running Kafka(0.10.2.0)
> 3 ZK Instances running on 3 different Windows Servers, 7 Kafka Broker nodes 
> running on single windows machine with different disk for logs directory.
>Reporter: Abhi
>Priority: Blocker
>  Labels: windows
>
> Hello,
> I have been running Kafka (0.10.2.0) on Windows for the past year.
> Of late, however, there have been unusual broker issues that I have observed 4-5 
> times in the last 4 months.
> Kafka setup config:
> 3 ZK instances running on 3 different Windows servers, and 7 Kafka broker nodes 
> running on a single Windows machine with a different disk for each logs directory.
> My Kafka has 2 topics with 50 partitions each and a replication factor of 3.
> My partition selection logic: each message has a unique ID, the partition is 
> (unique ID % 50), and the Kafka producer API is then called to route the message 
> to that specific topic partition.
> Each broker's properties look like this:
> {{broker.id=0
> port:9093
> num.network.threads=3
> num.io.threads=8
> socket.send.buffer.bytes=102400
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> offsets.retention.minutes=360
> advertised.host.name=1.1.1.2
> advertised.port:9093
> # A comma separated list of directories under which to store log files
> log.dirs=C:\\kafka_2.10-0.10.2.0-SNAPSHOT\\data\\kafka-logs
> num.partitions=1
> num.recovery.threads.per.data.dir=1
> log.retention.minutes=360
> log.segment.bytes=52428800
> log.retention.check.interval.ms=30
> log.cleaner.enable=true
> log.cleanup.policy=delete
> log.cleaner.min.cleanable.ratio=0.5
> log.cleaner.backoff.ms=15000
> log.segment.delete.delay.ms=6000
> auto.create.topics.enable=false
> zookeeper.connect=1.1.1.2:2181,1.1.1.3:2182,1.1.1.4:2183
> zookeeper.connection.timeout.ms=6000
> }}
> Of late, a unique failure has been cropping up on the Kafka broker nodes:
> _[2017-12-02 02:47:40,024] ERROR [ReplicaFetcherThread-0-4], Error for 
> partition [__consumer_offsets,15] to broker 
> 4:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server 
> is not the leader for that topic-partition. 
> (kafka.server.ReplicaFetcherThread)_
> The entire server.log is filled with these entries, and it has grown very 
> large. Please help me understand under what circumstances this can occur and 
> what measures I need to take.
> This is the third time in the last three Saturdays that I have faced a 
> similar issue.
> Courtesy
> Abhi



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-11 Thread Abhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285765#comment-16285765
 ] 

Abhi commented on KAFKA-6337:
-

Of late this error has been observed very frequently in my production 
environment, whereas the traffic to my Kafka cluster has been constant from 
day 1.

During the error above there is indeed a service disruption: my Kafka producer 
fails to produce to particular partitions, and those messages keep piling up 
in the Kafka producer's internal queue.

Do you mean increasing the "*zookeeper.connection.timeout.ms=6000*" property 
in each broker's server.properties file?

And do you suggest a value? Is there any science to choosing the timeout: 
30 seconds, 1 minute, 5 minutes?

Courtesy 
Abhi
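
For illustration only (the 30-second value below is an example, not a tested recommendation), raising the timeout would look like this in each broker's server.properties:

```properties
# Illustrative values only -- tune to your environment.
# Raising these lets the broker's ZooKeeper session survive longer
# GC pauses or disk stalls before the session is expired.
zookeeper.connection.timeout.ms=30000
zookeeper.session.timeout.ms=30000
```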




[jira] [Commented] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-11 Thread Abhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285700#comment-16285700
 ] 

Abhi commented on KAFKA-6337:
-

Any pointers?



[jira] [Comment Edited] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-11 Thread Abhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284675#comment-16284675
 ] 

Abhi edited comment on KAFKA-6337 at 12/11/17 10:01 AM:


Server.log when the error started coming

// Server.log
[2017-12-09 03:10:49,947] ERROR [KafkaApi-1] Error when handling request 
{controller_id=0,controller_epoch=1,partition_states=[{topic=LIVE,partition=31,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVE,partition=9,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=27,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=19,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVEOLD,partition=10,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVEOLD,partition=32,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=13,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVE,partition=17,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=5,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVEOLD,partition=18,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVEOLD,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVE,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVE,partition=30,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVEOLD,partition=4,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=48,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVE,partition=44,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=40,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=LIV
EOLD,partition=31,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVE,partition=16,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVE,partition=38,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=34,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVEOLD,partition=17,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVEOLD,partition=39,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=26,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]},{topic=LIVE,partition=24,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVE,partition=2,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=20,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=__consumer_offsets,partition=12,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVEOLD,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVEOLD,partition=25,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=LIVE,partition=10,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=6,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVEOLD,partition=11,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=__consumer_offsets,partition=47,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVEOLD,partition=38,controller_epoch=1,leader=1,leader_epoch
=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=41,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=LIVE,partition=23,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVE,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=33,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVEOLD,partition=24,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,

[jira] [Updated] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-11 Thread Abhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhi updated KAFKA-6337:

Priority: Blocker  (was: Major)



[jira] [Commented] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-09 Thread Abhi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284675#comment-16284675
 ] 

Abhi commented on KAFKA-6337:
-

{{ 
{controller_id=0,controller_epoch=1,partition_states=[{topic=LIVETOPIC,partition=31,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVETOPIC,partition=9,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=27,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=19,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=10,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPICOLD,partition=32,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=13,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVETOPIC,partition=17,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=5,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=18,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVETOPICOLD,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPIC,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVETOPIC,partition=30,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=4,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=48,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVETOPIC,partition=44,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=40,controller_epoch=1,leader=1,leader_ep
och=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=LIVETOPICOLD,partition=31,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVETOPIC,partition=16,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPIC,partition=38,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=34,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVETOPICOLD,partition=17,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=39,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=26,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]},{topic=LIVETOPIC,partition=24,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVETOPIC,partition=2,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=20,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=__consumer_offsets,partition=12,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPICOLD,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=25,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=LIVETOPIC,partition=10,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=6,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVETOPICOLD,partition=11,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=__consumer_offsets,partition=47,controller_epoch=1,leader=1,leader_epoch=1
,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=38,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=41,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=LIVETOPIC,partition=23,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPIC,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=33,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVETOPICOLD,partition=24,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]},{topic=LIVETOPI

[jira] [Created] (KAFKA-6337) Error for partition [__consumer_offsets,15] to broker

2017-12-08 Thread Abhi (JIRA)
Abhi created KAFKA-6337:
---

 Summary: Error for partition [__consumer_offsets,15] to broker
 Key: KAFKA-6337
 URL: https://issues.apache.org/jira/browse/KAFKA-6337
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.2.0
 Environment: Windows running Kafka(0.10.2.0)
3 ZK Instances running on 3 different Windows Servers, 7 Kafka Broker nodes 
running on single windows machine with different disk for logs directory.
Reporter: Abhi


Hello,

I have been running Kafka (0.10.2.0) on Windows for the past year.

Of late, however, there have been unusual broker issues that I have observed 
4-5 times in the last 4 months.

Kafka setup config:

3 ZK instances running on 3 different Windows servers, and 7 Kafka broker nodes 
running on a single Windows machine with a different disk for each logs directory.

My Kafka has 2 topics with 50 partitions each and a replication factor of 3.

My partition selection logic: each message has a unique ID, the partition is 
(unique ID % 50), and the Kafka producer API is then called to route the 
message to that specific topic partition.
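
The routing just described can be sketched as follows (hypothetical helper, not the reporter's code):

```java
// Hypothetical sketch of the modulo routing described above:
// partition = uniqueId % 50, passed explicitly to the producer.
public class ModPartitioner {
    static final int NUM_PARTITIONS = 50;  // matches the 50-partition topics

    static int partitionFor(long uniqueId) {
        // Math.floorMod keeps the result in [0, 50) even for negative ids.
        return (int) Math.floorMod(uniqueId, (long) NUM_PARTITIONS);
    }

    public static void main(String[] args) {
        if (partitionFor(137L) != 37) throw new AssertionError();
        if (partitionFor(50L) != 0) throw new AssertionError();
        if (partitionFor(-1L) != 49) throw new AssertionError();
    }
}
```

With the 0.10.x producer, the computed partition would be supplied as `new ProducerRecord<>(topic, partitionFor(id), key, value)`.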

Each broker's properties look like this:

{{broker.id=0
port:9093
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
offsets.retention.minutes=360
advertised.host.name=1.1.1.2
advertised.port:9093
# A comma separated list of directories under which to store log files
log.dirs=C:\\kafka_2.10-0.10.2.0-SNAPSHOT\\data\\kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.retention.minutes=360
log.segment.bytes=52428800
log.retention.check.interval.ms=30
log.cleaner.enable=true
log.cleanup.policy=delete
log.cleaner.min.cleanable.ratio=0.5
log.cleaner.backoff.ms=15000
log.segment.delete.delay.ms=6000
auto.create.topics.enable=false
zookeeper.connect=1.1.1.2:2181,1.1.1.3:2182,1.1.1.4:2183
zookeeper.connection.timeout.ms=6000
}}
Of late, a unique failure has been cropping up on the Kafka broker nodes:
_[2017-12-02 02:47:40,024] ERROR [ReplicaFetcherThread-0-4], Error for 
partition [__consumer_offsets,15] to broker 
4:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)_

The entire server.log is filled with these entries, and it has grown very 
large. Please help me understand under what circumstances this can occur and 
what measures I need to take.

This is the third time in the last three Saturdays that I have faced a 
similar issue.

Courtesy
Abhi
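
Not from the original report: NotLeaderForPartitionException is a retriable error raised while partition leadership moves between replicas, so a common client-side mitigation is to let the producer refresh metadata and retry through the leader change. Illustrative 0.10.x producer settings (values are examples only, not a verified fix):

```properties
# Illustrative only: allow the producer to refresh metadata and retry
# sends that fail with retriable errors such as
# NotLeaderForPartitionException, instead of failing immediately.
retries=10
retry.backoff.ms=500
```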


