[jira] [Commented] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2020-01-27 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024792#comment-17024792
 ] 

Pradeep Bansal commented on KAFKA-7982:
---

Can somebody please help with this issue?

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and due to this 
> the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
>  at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
>  at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
>  at kafka.network.Processor.poll(SocketServer.scala:689)
>  at kafka.network.Processor.run(SocketServer.scala:594)
>  at java.base/java.lang.Thread.run(Thread.java:834)
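
The trace above shows the Kerberos logout (which mutates the JAAS Subject's 
private credential set) racing with SASL server creation, which iterates that same 
set. As a point of reference only, here is a minimal self-contained Java sketch 
(hypothetical names, not Kafka code) that reproduces the same failure mode on a 
LinkedList iterated by one thread while another thread modifies it:

{code:java}
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Minimal sketch: iterate a list on one thread while another mutates it, which is
// the same failure mode as the Subject credential iteration in the trace above.
public class ComodificationDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> credentials = new LinkedList<>();
        for (int i = 0; i < 100_000; i++) {
            credentials.add("ticket-" + i);
        }

        // "Logout" thread: clears the credentials, mutating the list.
        Thread logout = new Thread(credentials::clear);

        // "Authenticator" thread: iterates the credentials, as
        // Subject.getPrivateCredentials does during SASL server creation.
        Thread authenticator = new Thread(() -> {
            try {
                for (Iterator<String> it = credentials.iterator(); it.hasNext(); ) {
                    it.next(); // fails fast if the list changed underneath us
                }
            } catch (java.util.ConcurrentModificationException e) {
                System.out.println("Reproduced: " + e); // may take a few runs to hit the race
            }
        });

        authenticator.start();
        logout.start();
        authenticator.join();
        logout.join();
    }
}
{code}

Synchronizing the logout and the credential iteration on a common lock avoids the 
race; whether that is the right fix inside KerberosLogin is for the maintainers to 
judge.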

[jira] [Commented] (KAFKA-9048) Improve scalability in number of partitions in replica fetcher

2019-12-09 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992214#comment-16992214
 ] 

Pradeep Bansal commented on KAFKA-9048:
---

When is this change planned for a Kafka release?

> Improve scalability in number of partitions in replica fetcher
> --
>
> Key: KAFKA-9048
> URL: https://issues.apache.org/jira/browse/KAFKA-9048
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Lucas Bradstreet
>Assignee: Guozhang Wang
>Priority: Major
>
> https://issues.apache.org/jira/browse/KAFKA-9039 
> ([https://github.com/apache/kafka/pull/7443]) improves the performance of the 
> replica fetcher (at both small and large numbers of partitions), but it does 
> not improve its complexity or scalability in the number of partitions.
> I took a profile using async-profiler for the 1000 partition JMH replica 
> fetcher benchmark. The big remaining culprits are:
>  * ~18% looking up logStartOffset
>  * ~45% FetchSessionHandler$Builder.add
>  * ~19% FetchSessionHandler$Builder.build
> *Suggestions*
>  # The logStartOffset is looked up for every partition on each doWork pass. 
> This requires a hashmap lookup even though the logStartOffset changes rarely. 
> If the replica fetcher could be notified of updates to the logStartOffset, 
> then we could reduce the overhead to a function of the number of updates to 
> the logStartOffset instead of O(n) on each pass. 
>  # The use of FetchSessionHandler means that we maintain a partitionStates 
> hashmap in the replica fetcher, and a sessionPartitions hashmap in the 
> FetchSessionHandler. On each incremental fetch session pass, we need to 
> reconcile these two hashmaps to determine which partitions were added/updated 
> and which partitions were removed. This reconciliation process is especially 
> expensive, requiring multiple passes over the fetching partitions, and 
> hashmap removes and puts for most partitions. The replica fetcher could be 
> smarter by maintaining a fetch session *updated* hashmap containing 
> FetchRequest.PartitionData(s) directly, as well as a *removed* partitions list, 
> so that these do not need to be regenerated and reconciled on each fetch pass 
> (a rough sketch of this bookkeeping follows the description below).
>  # maybeTruncate requires an O(n) pass over the elements in partitionStates 
> even if there are no partitions in truncating state. If we can maintain some 
> additional state about whether truncating partitions exist in 
> partitionStates, or if we could separate these states into a separate data 
> structure, we would not need to iterate across all partitions on every doWork 
> pass. I’ve seen clusters where this work takes about 0.5%-1% of CPU, which is 
> minor but will become more substantial as the number of partitions increases.
> If we can achieve 1 and 2, the complexity will be improved from a function of 
> the number of partitions to the number of partitions with updated fetch 
> offsets/log start offsets between each fetch. In general, a minority of 
> partitions will have changes in these between fetches, so this should improve 
> the average case complexity greatly.
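
Suggestion 2 above essentially replaces the per-pass reconciliation of 
partitionStates and sessionPartitions with incremental bookkeeping of what changed. 
A rough Java sketch of that idea (class and method names are illustrative, not the 
actual ReplicaFetcherThread or FetchSessionHandler code):

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: track per-partition fetch state incrementally so that building
// the next incremental fetch request costs O(#changes) rather than O(#partitions).
public class IncrementalFetchState {
    // Hypothetical stand-in for FetchRequest.PartitionData.
    public static final class PartitionData {
        final long fetchOffset;
        final long logStartOffset;
        PartitionData(long fetchOffset, long logStartOffset) {
            this.fetchOffset = fetchOffset;
            this.logStartOffset = logStartOffset;
        }
    }

    private final Map<String, PartitionData> sessionPartitions = new HashMap<>();
    // Deltas accumulated since the last fetch request was built.
    private final Map<String, PartitionData> updated = new HashMap<>();
    private final Set<String> removed = new HashSet<>();

    // Called when the fetch offset or log start offset of a partition changes.
    public void onPartitionUpdated(String topicPartition, long fetchOffset, long logStartOffset) {
        PartitionData data = new PartitionData(fetchOffset, logStartOffset);
        sessionPartitions.put(topicPartition, data);
        updated.put(topicPartition, data);
        removed.remove(topicPartition);
    }

    // Called when a partition is no longer fetched (e.g. reassigned away).
    public void onPartitionRemoved(String topicPartition) {
        if (sessionPartitions.remove(topicPartition) != null) {
            removed.add(topicPartition);
        }
        updated.remove(topicPartition);
    }

    // The next fetch request is built from the accumulated deltas only; no full
    // reconciliation pass over every fetching partition is needed.
    public Map<String, PartitionData> drainUpdated() {
        Map<String, PartitionData> toSend = new HashMap<>(updated);
        updated.clear();
        return toSend;
    }

    public Set<String> drainRemoved() {
        Set<String> toForget = new HashSet<>(removed);
        removed.clear();
        return toForget;
    }
}
{code}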



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9064) Observing transient issue with kinit command

2019-12-09 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992191#comment-16992191
 ] 

Pradeep Bansal commented on KAFKA-9064:
---

Can somebody please help with resolving this?

> Observing transient issue with kinit command
> 
>
> Key: KAFKA-9064
> URL: https://issues.apache.org/jira/browse/KAFKA-9064
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
>
> I have specified the kinit command to be skinit. While this works fine most of 
> the time, I sometimes see the exception below, where it doesn't respect the 
> provided kinit command and uses the default value. Can this be handled?
>  
> [2019-02-19 10:20:07,862] WARN [Principal=null]: Could not renew TGT due to 
> problem running shell command: '/usr/bin/kinit -R'. Exiting refresh thread. 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
> org.apache.kafka.common.utils.Shell$ExitCodeException: kinit: Matching 
> credential not found (filename: /tmp/krb5cc_25012_76850_sshd_w6VpLC8R0Y) while 
> renewing credentials
>  at org.apache.kafka.common.utils.Shell.runCommand(Shell.java:130)
>  at org.apache.kafka.common.utils.Shell.run(Shell.java:76)
>  at org.apache.kafka.common.utils.Shell$ShellCommandExecutor.execute(Shell.java:204)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:268)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:255)
>  at org.apache.kafka.common.security.kerberos.KerberosLogin.lambda$login$10(KerberosLogin.java:212)
>  at java.base/java.lang.Thread.run(Thread.java:834)
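
For reference, the command Kafka's Kerberos refresh thread shells out to is taken 
from the sasl.kerberos.kinit.cmd client/broker property. A minimal Java consumer 
sketch showing where that setting goes (broker address, topic and the skinit path 
are placeholders); the report above is that this setting is not always honored, 
which this sketch does not attempt to reproduce:

{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Minimal sketch: point the Kerberos ticket refresh thread at a custom kinit
// binary via sasl.kerberos.kinit.cmd (placeholders throughout).
public class CustomKinitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "demo-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        // Kerberos/SASL settings; the refresh thread runs this command with -R.
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "GSSAPI");
        props.put("sasl.kerberos.service.name", "kafka");
        props.put("sasl.kerberos.kinit.cmd", "/usr/local/bin/skinit"); // custom kinit wrapper

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("demo-topic"));
            consumer.poll(Duration.ofSeconds(1));
        }
    }
}
{code}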



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-12-04 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988524#comment-16988524
 ] 

Pradeep Bansal commented on KAFKA-7925:
---

Is there any update? It seems like 2.3.0 and 2.3.1 are already out and they don't 
have this fix. In which version will this fix be available?

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.2.0, 2.1.1
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1, Kafka v2.2.0
>Reporter: Abhi
>Priority: Critical
> Attachments: jira-server.log-1, jira-server.log-2, jira-server.log-3, 
> jira-server.log-4, jira-server.log-5, jira-server.log-6, 
> jira_prod.producer.log, threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup that is not 
> yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod 2048u sock 0,7 0t0 

[jira] [Commented] (KAFKA-7982) ConcurrentModificationException and Continuous warnings "Attempting to send response via channel for which there is no open connection"

2019-12-04 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988520#comment-16988520
 ] 

Pradeep Bansal commented on KAFKA-7982:
---

This is affecting my setup as well. Is there any update on when this fix will 
be available?

> ConcurrentModificationException and Continuous warnings "Attempting to send 
> response via channel for which there is no open connection"
> ---
>
> Key: KAFKA-7982
> URL: https://issues.apache.org/jira/browse/KAFKA-7982
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Abhi
>Priority: Major
>
> Hi,
> I am getting the following warnings in server.log continuously, and due to this 
> the client consumer is not able to consume messages.
> [2019-02-20 10:26:30,312] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35248-6259 (kafka.network.Processor)
>  [2019-02-20 10:26:56,760] WARN Attempting to send response via channel for 
> which there is no open connection, connection id 
> 10.218.27.45:9092-10.219.25.239:35604-6261 (kafka.network.Processor)
> I also noticed that before these warnings started to appear, the following 
> concurrent modification exception occurred for the same IP address:
> [2019-02-20 09:01:11,175] INFO Initiating logout for 
> kafka/u-kafkatst-kafkadev-1.sd@unix.com 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
>  [2019-02-20 09:01:11,176] WARN [SocketServer brokerId=1] Unexpected error 
> from /10.219.25.239; closing connection 
> (org.apache.kafka.common.network.Selector)
>  java.util.ConcurrentModificationException
>  at 
> java.base/java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:970)
>  at java.base/java.util.LinkedList$ListItr.next(LinkedList.java:892)
>  at java.base/javax.security.auth.Subject$SecureSet$1.next(Subject.java:1096)
>  at java.base/javax.security.auth.Subject$ClassSet$1.run(Subject.java:1501)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.base/javax.security.auth.Subject$ClassSet.populateSet(Subject.java:1499)
>  at java.base/javax.security.auth.Subject$ClassSet.<init>(Subject.java:1472)
>  at 
> java.base/javax.security.auth.Subject.getPrivateCredentials(Subject.java:764)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:336)
>  at java.security.jgss/sun.security.jgss.GSSUtil$1.run(GSSUtil.java:328)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at 
> java.security.jgss/sun.security.jgss.GSSUtil.searchSubject(GSSUtil.java:328)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredFromSubject(NativeGSSFactory.java:53)
>  at 
> java.security.jgss/sun.security.jgss.wrapper.NativeGSSFactory.getCredentialElement(NativeGSSFactory.java:116)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.getCredentialElement(GSSManagerImpl.java:187)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.add(GSSCredentialImpl.java:439)
>  at 
> java.security.jgss/sun.security.jgss.GSSCredentialImpl.<init>(GSSCredentialImpl.java:74)
>  at 
> java.security.jgss/sun.security.jgss.GSSManagerImpl.createCredential(GSSManagerImpl.java:148)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.GssKrb5Server.<init>(GssKrb5Server.java:108)
>  at 
> jdk.security.jgss/com.sun.security.sasl.gsskerb.FactoryImpl.createSaslServer(FactoryImpl.java:85)
>  at 
> java.security.sasl/javax.security.sasl.Sasl.createSaslServer(Sasl.java:537)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.lambda$createSaslKerberosServer$12(SaslServerAuthenticator.java:212)
>  at java.base/java.security.AccessController.doPrivileged(Native Method)
>  at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslKerberosServer(SaslServerAuthenticator.java:211)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.createSaslServer(SaslServerAuthenticator.java:164)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.handleKafkaRequest(SaslServerAuthenticator.java:450)
>  at 
> org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:248)
>  at 
> org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:132)
>  at 
> org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:532)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:467)
>  at kafka.network.Processor.poll(SocketServer.scala:689)
>  at kafka.network.Processor.run(SocketServer.scala:594)
>  

[jira] [Updated] (KAFKA-9064) Observing transient issue with kinit command

2019-11-26 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9064:
--
Issue Type: Bug  (was: Improvement)

> Observing transient issue with kinit command
> 
>
> Key: KAFKA-9064
> URL: https://issues.apache.org/jira/browse/KAFKA-9064
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
>
> I have specified the kinit command to be skinit. While this works fine most of 
> the time, I sometimes see the exception below, where it doesn't respect the 
> provided kinit command and uses the default value. Can this be handled?
>  
> [2019-02-19 10:20:07,862] WARN [Principal=null]: Could not renew TGT due to 
> problem running shell command: '/usr/bin/kinit -R'. Exiting refresh thread. 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
> org.apache.kafka.common.utils.Shell$ExitCodeException: kinit: Matching 
> credential not found (filename: /tmp/krb5cc_25012_76850_sshd_w6VpLC8R0Y) while 
> renewing credentials
>  at org.apache.kafka.common.utils.Shell.runCommand(Shell.java:130)
>  at org.apache.kafka.common.utils.Shell.run(Shell.java:76)
>  at org.apache.kafka.common.utils.Shell$ShellCommandExecutor.execute(Shell.java:204)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:268)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:255)
>  at org.apache.kafka.common.security.kerberos.KerberosLogin.lambda$login$10(KerberosLogin.java:212)
>  at java.base/java.lang.Thread.run(Thread.java:834)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9064) Observing transient issue with kinit command

2019-11-11 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9064:
--
Summary: Observing transient issue with kinit command  (was: Observe 
transient issue with kinit command)

> Observing transient issue with kinit command
> 
>
> Key: KAFKA-9064
> URL: https://issues.apache.org/jira/browse/KAFKA-9064
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
>
> I have specified the kinit command to be skinit. While this works fine most of 
> the time, I sometimes see the exception below, where it doesn't respect the 
> provided kinit command and uses the default value. Can this be handled?
>  
> [2019-02-19 10:20:07,862] WARN [Principal=null]: Could not renew TGT due to 
> problem running shell command: '/usr/bin/kinit -R'. Exiting refresh thread. 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
> org.apache.kafka.common.utils.Shell$ExitCodeException: kinit: Matching 
> credential not found (filename: /tmp/krb5cc_25012_76850_sshd_w6VpLC8R0Y) while 
> renewing credentials
>  at org.apache.kafka.common.utils.Shell.runCommand(Shell.java:130)
>  at org.apache.kafka.common.utils.Shell.run(Shell.java:76)
>  at org.apache.kafka.common.utils.Shell$ShellCommandExecutor.execute(Shell.java:204)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:268)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:255)
>  at org.apache.kafka.common.security.kerberos.KerberosLogin.lambda$login$10(KerberosLogin.java:212)
>  at java.base/java.lang.Thread.run(Thread.java:834)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9064) Observe transient issue with kinit command

2019-11-11 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9064:
--
Summary: Observe transient issue with kinit command  (was: Observe 
transient issue with kinit cimmand)

> Observe transient issue with kinit command
> --
>
> Key: KAFKA-9064
> URL: https://issues.apache.org/jira/browse/KAFKA-9064
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
>
> I have specified the kinit command to be skinit. While this works fine most of 
> the time, I sometimes see the exception below, where it doesn't respect the 
> provided kinit command and uses the default value. Can this be handled?
>  
> [2019-02-19 10:20:07,862] WARN [Principal=null]: Could not renew TGT due to 
> problem running shell command: '/usr/bin/kinit -R'. Exiting refresh thread. 
> (org.apache.kafka.common.security.kerberos.KerberosLogin)
> org.apache.kafka.common.utils.Shell$ExitCodeException: kinit: Matching 
> credential not found (filename: /tmp/krb5cc_25012_76850_sshd_w6VpLC8R0Y) while 
> renewing credentials
>  at org.apache.kafka.common.utils.Shell.runCommand(Shell.java:130)
>  at org.apache.kafka.common.utils.Shell.run(Shell.java:76)
>  at org.apache.kafka.common.utils.Shell$ShellCommandExecutor.execute(Shell.java:204)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:268)
>  at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:255)
>  at org.apache.kafka.common.security.kerberos.KerberosLogin.lambda$login$10(KerberosLogin.java:212)
>  at java.base/java.lang.Thread.run(Thread.java:834)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer throughput drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Description: 
5 broker cluster

Topic partitions  =  1

Replication factor = 3

Ack mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started the throughput test with 1 topic; the number of topics present in 
the cluster at that time was 1 (excluding the 1000 already existing topics)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

 

What could be causing the producer throughput to drop just from adding more topics?

!image-2019-10-18-10-22-40-372.png!

  was:
5 broker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

 

What could be causing performance drop 

!image-2019-10-18-10-22-40-372.png!


> KAfka producer throughput drops with number of topics even when producer is 
> producing on one topic
> --
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 broker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ack mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started the throughput test with 1 topic; the number of topics present in 
> the cluster at that time was 1 (excluding the 1000 already existing topics)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
>  
> What could be causing the producer throughput to drop just from adding more 
> topics?
> !image-2019-10-18-10-22-40-372.png!
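
For context, a stripped-down Java sketch of the kind of single-topic load described 
above (asynchronous sends, 100-byte values, acks=all). The broker address and topic 
name are placeholders, and this is not the harness actually used for the numbers in 
this ticket:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative load generator: asynchronous sends of 100-byte messages to a
// single topic with acks=all, roughly matching the test setup described above.
public class SingleTopicLoadTest {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder
        props.put("acks", "all");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        byte[] value = new byte[100]; // 100-byte message

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            long end = System.currentTimeMillis() + 60_000L; // run for one minute
            long sent = 0;
            while (System.currentTimeMillis() < end) {
                // Asynchronous send; the callback only logs errors.
                producer.send(new ProducerRecord<>("load-topic", value), (metadata, exception) -> {
                    if (exception != null) {
                        exception.printStackTrace();
                    }
                });
                sent++;
            }
            producer.flush();
            System.out.println("Sent " + sent + " records");
        }
    }
}
{code}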



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka throughput performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Summary: KAfka throughput performance drops with number of topics even when 
producer is producing on one topic  (was: KAfka producer performance drops with 
number of topics even when producer is producing on one topic)

> KAfka throughput performance drops with number of topics even when producer 
> is producing on one topic
> -
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 broker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ac mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
>  
> What could be causing performance drop 
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer throughput drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Summary: KAfka producer throughput drops with number of topics even when 
producer is producing on one topic  (was: KAfka throughput performance drops 
with number of topics even when producer is producing on one topic)

> KAfka producer throughput drops with number of topics even when producer is 
> producing on one topic
> --
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 broker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ac mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
>  
> What could be causing performance drop 
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Description: 
5 broker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

 

What could be causing performance drop 

!image-2019-10-18-10-22-40-372.png!

  was:
5 broker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!


> KAfka producer performance drops with number of topics even when producer is 
> producing on one topic
> ---
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 broker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ac mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
>  
> What could be causing performance drop 
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9064) Observe transient issue with kinit cimmand

2019-10-17 Thread Pradeep Bansal (Jira)
Pradeep Bansal created KAFKA-9064:
-

 Summary: Observe transient issue with kinit cimmand
 Key: KAFKA-9064
 URL: https://issues.apache.org/jira/browse/KAFKA-9064
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 2.2.1
Reporter: Pradeep Bansal


I have specified the kinit command to be skinit. While this works fine most of 
the time, I sometimes see the exception below, where it doesn't respect the 
provided kinit command and uses the default value. Can this be handled?

 
[2019-02-19 10:20:07,862] WARN [Principal=null]: Could not renew TGT due to 
problem running shell command: '/usr/bin/kinit -R'. Exiting refresh thread. 
(org.apache.kafka.common.security.kerberos.KerberosLogin)
org.apache.kafka.common.utils.Shell$ExitCodeException: kinit: Matching 
credential not found (filename: /tmp/krb5cc_25012_76850_sshd_w6VpLC8R0Y) while 
renewing credentials
 at org.apache.kafka.common.utils.Shell.runCommand(Shell.java:130)
 at org.apache.kafka.common.utils.Shell.run(Shell.java:76)
 at org.apache.kafka.common.utils.Shell$ShellCommandExecutor.execute(Shell.java:204)
 at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:268)
 at org.apache.kafka.common.utils.Shell.execCommand(Shell.java:255)
 at org.apache.kafka.common.security.kerberos.KerberosLogin.lambda$login$10(KerberosLogin.java:212)
 at java.base/java.lang.Thread.run(Thread.java:834)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Description: 
5 broker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!

  was:
5 borker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!


> KAfka producer performance drops with number of topics even when producer is 
> producing on one topic
> ---
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 broker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ac mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Description: 
5 borker cluster

Topic partitions  =  1

Replication factor = 3

Ac mode = all

Send type = Asynchronous

Message size  =  100 bytes

Log compaction  =  Enabled
 # We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!

  was:
# We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!


> KAfka producer performance drops with number of topics even when producer is 
> producing on one topic
> ---
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> 5 borker cluster
> Topic partitions  =  1
> Replication factor = 3
> Ac mode = all
> Send type = Asynchronous
> Message size  =  100 bytes
> Log compaction  =  Enabled
>  # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9063) KAfka performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)
Pradeep Bansal created KAFKA-9063:
-

 Summary: KAfka performance drops with number of topics even when 
producer is producing on one topic
 Key: KAFKA-9063
 URL: https://issues.apache.org/jira/browse/KAFKA-9063
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 2.2.1
Reporter: Pradeep Bansal
 Attachments: image-2019-10-18-10-22-40-372.png

# We started throughput test with 1 topic and number of topics present in 
cluster at that time were 1 (excluding already existing topics 1000)
 # We left it to run for about 2 hours (The mean throughput we observed during 
this period was 54k msgs/sec)
 # After this, we started creating 10,000 topics one by one using a script
 # We noted throughput values after creating 100 topics and after 
200,300,400…so on till 10,000 were created
 # After all 10,000 topics were created we left test to run for another 1 hr.

 During the entire duration, we were producing only on a single topic.

!image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9063) KAfka producer performance drops with number of topics even when producer is producing on one topic

2019-10-17 Thread Pradeep Bansal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-9063:
--
Summary: KAfka producer performance drops with number of topics even when 
producer is producing on one topic  (was: KAfka performance drops with number 
of topics even when producer is producing on one topic)

> KAfka producer performance drops with number of topics even when producer is 
> producing on one topic
> ---
>
> Key: KAFKA-9063
> URL: https://issues.apache.org/jira/browse/KAFKA-9063
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.2.1
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: image-2019-10-18-10-22-40-372.png
>
>
> # We started throughput test with 1 topic and number of topics present in 
> cluster at that time were 1 (excluding already existing topics 1000)
>  # We left it to run for about 2 hours (The mean throughput we observed 
> during this period was 54k msgs/sec)
>  # After this, we started creating 10,000 topics one by one using a script
>  # We noted throughput values after creating 100 topics and after 
> 200,300,400…so on till 10,000 were created
>  # After all 10,000 topics were created we left test to run for another 1 hr.
>  During the entire duration, we were producing only on a single topic.
> !image-2019-10-18-10-22-40-372.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9036) Allow kafka auth acls with match pattern

2019-10-14 Thread Pradeep Bansal (Jira)
Pradeep Bansal created KAFKA-9036:
-

 Summary: Allow kafka auth acls with match pattern
 Key: KAFKA-9036
 URL: https://issues.apache.org/jira/browse/KAFKA-9036
 Project: Kafka
  Issue Type: Improvement
  Components: admin, security
Affects Versions: 2.2.1
Reporter: Pradeep Bansal


I am using the bin/kafka-acls.sh script to grant consumer/producer permissions on 
a topic. 

 

Currently this allows only a literal match or a prefixed match while adding ACLs. 
It doesn't allow any regex and gives this error message when the 
resource-pattern-type specified is match:

A '--resource-pattern-type' value of 'MATCH' is not valid when adding acls.

 

Can this support be added to Kafka ACLs?
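
Until a MATCH pattern type is supported for adding ACLs, the closest available 
options are LITERAL and PREFIXED patterns. A hedged Java AdminClient sketch that 
grants READ on every topic sharing a prefix (principal, prefix and bootstrap 
address are placeholders):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

// Sketch: add a PREFIXED ACL via the AdminClient, the closest alternative to a
// MATCH pattern when granting access to a whole family of topics.
public class AddPrefixedAcl {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ResourcePattern pattern =
                new ResourcePattern(ResourceType.TOPIC, "orders-", PatternType.PREFIXED);
            AccessControlEntry entry =
                new AccessControlEntry("User:app1", "*", AclOperation.READ, AclPermissionType.ALLOW);

            admin.createAcls(Collections.singleton(new AclBinding(pattern, entry)))
                 .all()
                 .get(); // block until the ACL has been created
        }
    }
}
{code}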



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-7925) Constant 100% cpu usage by all kafka brokers

2019-09-25 Thread Pradeep Bansal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937461#comment-16937461
 ] 

Pradeep Bansal commented on KAFKA-7925:
---

Do you have any updates on when this fix will be available?
 

> Constant 100% cpu usage by all kafka brokers
> 
>
> Key: KAFKA-7925
> URL: https://issues.apache.org/jira/browse/KAFKA-7925
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.2.0, 2.1.1
> Environment: Java 11, Kafka v2.1.0, Kafka v2.1.1, Kafka v2.2.0
>Reporter: Abhi
>Priority: Critical
> Attachments: jira-server.log-1, jira-server.log-2, jira-server.log-3, 
> jira-server.log-4, jira-server.log-5, jira-server.log-6, 
> jira_prod.producer.log, threadump20190212.txt
>
>
> Hi,
> I am seeing constant 100% cpu usage on all brokers in our kafka cluster even 
> without any clients connected to any broker.
> This is a bug that we have seen multiple times in our kafka setup that is not 
> yet open to clients. It is becoming a blocker for our deployment now.
> I am seeing a lot of connections to other brokers in CLOSE_WAIT state (see 
> below). In thread usage, I am seeing these threads 
> 'kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-0,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-1,kafka-network-thread-6-ListenerName(SASL_PLAINTEXT)-SASL_PLAINTEXT-2'
>  taking up more than 90% of the cpu time in a 60s interval.
> I have attached a thread dump of one of the brokers in the cluster.
> *Java version:*
> openjdk 11.0.2 2019-01-15
> OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
> OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)
> *Kafka version:* v2.1.0
>  
> *connections:*
> java 144319 kafkagod 88u IPv4 3063266 0t0 TCP *:35395 (LISTEN)
> java 144319 kafkagod 89u IPv4 3063267 0t0 TCP *:9144 (LISTEN)
> java 144319 kafkagod 104u IPv4 3064219 0t0 TCP 
> mwkafka-prod-02.tbd:47292->mwkafka-zk-prod-05.tbd:2181 (ESTABLISHED)
> java 144319 kafkagod 2003u IPv4 3055115 0t0 TCP *:9092 (LISTEN)
> java 144319 kafkagod 2013u IPv4 7220110 0t0 TCP 
> mwkafka-prod-02.tbd:60724->mwkafka-zk-prod-04.dr:2181 (ESTABLISHED)
> java 144319 kafkagod 2020u IPv4 30012904 0t0 TCP 
> mwkafka-prod-02.tbd:38988->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2021u IPv4 30012961 0t0 TCP 
> mwkafka-prod-02.tbd:58420->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2027u IPv4 30015723 0t0 TCP 
> mwkafka-prod-02.tbd:58398->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2028u IPv4 30015630 0t0 TCP 
> mwkafka-prod-02.tbd:36248->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2030u IPv4 30015726 0t0 TCP 
> mwkafka-prod-02.tbd:39012->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2031u IPv4 30013619 0t0 TCP 
> mwkafka-prod-02.tbd:38986->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2032u IPv4 30015604 0t0 TCP 
> mwkafka-prod-02.tbd:36246->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2033u IPv4 30012981 0t0 TCP 
> mwkafka-prod-02.tbd:36924->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2034u IPv4 30012967 0t0 TCP 
> mwkafka-prod-02.tbd:39036->mwkafka-prod-02.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2035u IPv4 30012898 0t0 TCP 
> mwkafka-prod-02.tbd:36866->mwkafka-prod-01.dr:9092 (FIN_WAIT2)
> java 144319 kafkagod 2036u IPv4 30004729 0t0 TCP 
> mwkafka-prod-02.tbd:36882->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2037u IPv4 30004914 0t0 TCP 
> mwkafka-prod-02.tbd:58426->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2038u IPv4 30015651 0t0 TCP 
> mwkafka-prod-02.tbd:36884->mwkafka-prod-01.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2039u IPv4 30012966 0t0 TCP 
> mwkafka-prod-02.tbd:58422->mwkafka-prod-01.nyc:9092 (ESTABLISHED)
> java 144319 kafkagod 2040u IPv4 30005643 0t0 TCP 
> mwkafka-prod-02.tbd:36252->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2041u IPv4 30012944 0t0 TCP 
> mwkafka-prod-02.tbd:36286->mwkafka-prod-02.dr:9092 (ESTABLISHED)
> java 144319 kafkagod 2042u IPv4 30012973 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.nyc:51924 (ESTABLISHED)
> java 144319 kafkagod 2043u sock 0,7 0t0 30012463 protocol: TCP
> java 144319 kafkagod 2044u IPv4 30012979 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-01.dr:39994 (ESTABLISHED)
> java 144319 kafkagod 2045u IPv4 30012899 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.nyc:34548 (ESTABLISHED)
> java 144319 kafkagod 2046u sock 0,7 0t0 30003437 protocol: TCP
> java 144319 kafkagod 2047u IPv4 30012980 0t0 TCP 
> mwkafka-prod-02.tbd:9092->mwkafka-prod-02.dr:38120 (ESTABLISHED)
> java 144319 kafkagod 2048u sock 0,7 0t0 30012546 protocol: TCP
> java 144319 kafkagod 2049u IPv4 30005418 0t0 TCP 
> 

[jira] [Created] (KAFKA-7891) Reduce time to start kafka server with clean state

2019-01-31 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7891:
-

 Summary: Reduce time to start kafka server  with clean state
 Key: KAFKA-7891
 URL: https://issues.apache.org/jira/browse/KAFKA-7891
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 2.1.0
Reporter: Pradeep Bansal


I am using Kafka 2.1.0 and have a 6-broker cluster. In this setup I had a scenario 
where the leader of a topic (with replication factor 3 and min insync replicas of 2) 
went down and we lost its data. When starting this broker with a fresh state, it 
took around 45 minutes to catch up with the replicas (data size of 190 GB).

 

Is it expected to take such a long time to recover? Are there configurations 
with which this can be optimized?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7879) Data directory size decreases every few minutes when producer is sending large amount of data

2019-01-29 Thread Pradeep Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754794#comment-16754794
 ] 

Pradeep Bansal commented on KAFKA-7879:
---

I am using df -h to get the size of the partition itself. There is no other process 
writing (or removing) a huge amount of data that could explain this pattern.

> Data directory size decreases every few minutes when producer is sending 
> large amount of data
> -
>
> Key: KAFKA-7879
> URL: https://issues.apache.org/jira/browse/KAFKA-7879
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Pradeep Bansal
>Priority: Major
>
> I am running a Kafka cluster with 6 brokers and have set retention hours to 24 
> hours and retention bytes to 5 GB.
>  
> I have set retention bytes to 250 GB in the topic configuration.
>  
> Now I am producing messages in async mode with 1000-byte messages at a very 
> high frequency. I am seeing that the Kafka data directory size increases, but 
> every 5 minutes it decreases by some percentage (in my observation it 
> increases by 40 GB and then reduces to 20 GB, so every 5 minutes we are seeing 
> an increase of 20 GB instead of 40 GB).
>  
> Is there any extra configuration we need to set to avoid this data loss, or is 
> there some sort of compression going on here?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7879) Data directory size decreases every few minutes when producer is sending large amount of data

2019-01-28 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7879:
-

 Summary: Data directory size decreases every few minutes when 
producer is sending large amount of data
 Key: KAFKA-7879
 URL: https://issues.apache.org/jira/browse/KAFKA-7879
 Project: Kafka
  Issue Type: Improvement
Reporter: Pradeep Bansal


I am running a Kafka cluster with 6 brokers and have set retention hours to 24 hours 
and retention bytes to 5 GB.

 

I have set retention bytes to 250 GB in the topic configuration.

 

Now I am producing messages in async mode with 1000-byte messages at a very high 
frequency. I am seeing that the Kafka data directory size increases, but every 5 
minutes it decreases by some percentage (in my observation it increases by 40 GB 
and then reduces to 20 GB, so every 5 minutes we are seeing an increase of 20 GB 
instead of 40 GB).

 

Is there any extra configuration we need to set to avoid this data loss, or is 
there some sort of compression going on here?
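
One observation that may help narrow this down: the five-minute cadence matches the 
broker's default log.retention.check.interval.ms of 300000 ms, i.e. the periodic 
retention/cleanup pass. A small Java sketch (topic name and bootstrap address are 
placeholders) for reading the effective topic configuration to confirm which 
retention and cleanup settings actually apply:

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

// Sketch: print the effective retention-related configuration of a topic so the
// observed cleanup behaviour can be compared against what is actually configured.
public class ShowRetentionConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "mytopic");
            Config config = admin.describeConfigs(Collections.singleton(topic))
                                 .all()
                                 .get()
                                 .get(topic);
            for (String key : new String[] {"retention.bytes", "retention.ms",
                                            "segment.bytes", "cleanup.policy"}) {
                System.out.println(key + " = " + config.get(key).value());
            }
        }
    }
}
{code}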



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7871) Getting TimeoutException in Kafka Producer when using batch size 0

2019-01-25 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7871:
-

 Summary: Getting TimeoutException in Kafka Producer when using 
batch size 0
 Key: KAFKA-7871
 URL: https://issues.apache.org/jira/browse/KAFKA-7871
 Project: Kafka
  Issue Type: Improvement
Reporter: Pradeep Bansal


I am getting the exception below in the Kafka producer when using a batch size of 0.

My message size is 100 bytes. I am sending messages continuously to mytopic. 

org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
mytopic: 30001 ms has passed since last append
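
A batch.size of 0 disables batching entirely, so every record becomes its own 
batch; whether that alone explains the expiry above is not established in this 
ticket, but these are the producer settings involved. The values below are 
illustrative, not tuning advice, and the broker address and topic are placeholders:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch of the batching and timeout settings related to the expiry above.
public class BatchingProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");

        // batch.size = 0 disables batching: every record is sent as its own batch.
        // A non-zero batch.size with a small linger.ms lets the sender group records
        // instead of queueing one tiny request per record.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);

        // Records that cannot be delivered in time fail with a TimeoutException
        // similar to the one quoted above (delivery.timeout.ms exists in 2.1+).
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("mytopic", "payload"));
            producer.flush();
        }
    }
}
{code}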



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7546) Java implementation for Authorizer

2018-10-31 Thread Pradeep Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670169#comment-16670169
 ] 

Pradeep Bansal commented on KAFKA-7546:
---

[~omkreddy] this is really helpful. Thanks.

 

One more query here: what ACLs need to be added to allow 
inter-broker communication?

> Java implementation for Authorizer
> --
>
> Key: KAFKA-7546
> URL: https://issues.apache.org/jira/browse/KAFKA-7546
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: AuthorizerImpl.PNG
>
>
> I am using Kafka with authentication and authorization. I wanted to plug in my 
> own implementation of Authorizer which doesn't use ZooKeeper but instead keeps 
> the permission mapping in a SQL database. Is it possible to write the Authorizer 
> code in Java?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7546) Java implementation for Authorizer

2018-10-25 Thread Pradeep Bansal (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16663483#comment-16663483
 ] 

Pradeep Bansal commented on KAFKA-7546:
---

Thanks [~mgharat] for the insights. I have not worked with Scala before, and the 
[Authorizer|https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/core/src/main/scala/kafka/security/auth/Authorizer.scala]
 interface is in Scala. Will I be able to write a Java class for this interface 
like below?

 

!AuthorizerImpl.PNG!

 

Thanks,

Pradeep

> Java implementation for Authorizer
> --
>
> Key: KAFKA-7546
> URL: https://issues.apache.org/jira/browse/KAFKA-7546
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: AuthorizerImpl.PNG
>
>
> I am using Kafka with authentication and authorization. I wanted to plug in my 
> own implementation of Authorizer which doesn't use ZooKeeper but instead keeps 
> the permission mapping in a SQL database. Is it possible to write the Authorizer 
> code in Java?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7546) Java implementation for Authorizer

2018-10-25 Thread Pradeep Bansal (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Bansal updated KAFKA-7546:
--
Attachment: AuthorizerImpl.PNG

> Java implementation for Authorizer
> --
>
> Key: KAFKA-7546
> URL: https://issues.apache.org/jira/browse/KAFKA-7546
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: AuthorizerImpl.PNG
>
>
> I am using Kafka with authentication and authorization. I wanted to plug in my 
> own implementation of Authorizer which doesn't use ZooKeeper but instead keeps 
> the permission mapping in a SQL database. Is it possible to write the Authorizer 
> code in Java?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7547) Avoid relogin in kafka if connection is already established.

2018-10-25 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7547:
-

 Summary: Avoid relogin in kafka if connection is already 
established.
 Key: KAFKA-7547
 URL: https://issues.apache.org/jira/browse/KAFKA-7547
 Project: Kafka
  Issue Type: Improvement
  Components: security
Reporter: Pradeep Bansal


I am new to Kafka and maybe there are already ways to do what I need. I 
didn't find one so far, hence I thought I would post it here.

Currently, I observe that Kafka periodically tries to renew the Kerberos token 
using the kinit -R command. I found that I can set 
sasl.kerberos.min.time.before.relogin and change the default from 1 minute to at 
most 1 day. But in my case I am not clear on why the renewal is even required.

 

If it is not really required, is there a way to turn it off?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7546) Java implementation for Authorizer

2018-10-25 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7546:
-

 Summary: Java implementation for Authorizer
 Key: KAFKA-7546
 URL: https://issues.apache.org/jira/browse/KAFKA-7546
 Project: Kafka
  Issue Type: Improvement
  Components: security
Reporter: Pradeep Bansal


I am using kafka with authentication and authorization. I wanted to plugin my 
own implementation of Authorizer which doesn't use zookeeper instead has 
permission mapping in SQL database. Is it possible to write Authorizer code in 
Java?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7545) Use auth_to_local rules from krb5.conf

2018-10-24 Thread Pradeep Bansal (JIRA)
Pradeep Bansal created KAFKA-7545:
-

 Summary: Use auth_to_local rules from krb5.conf
 Key: KAFKA-7545
 URL: https://issues.apache.org/jira/browse/KAFKA-7545
 Project: Kafka
  Issue Type: Improvement
  Components: security
Reporter: Pradeep Bansal


Currently I have to replicate all auth_to_local rules from my krb5.conf and 
pass them to sasl.kerberos.principal.to.local.rules to make them work. This is 
causing a maintenance issue.

 

It would be very helpful if Kafka could read the auth_to_local rules from 
krb5.conf directly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)