[jira] [Updated] (KAFKA-3703) Selector.close() doesn't complete outgoing writes

2016-09-02 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3703:
--
Description: 
Outgoing writes may be discarded when a connection is closed. For instance, 
when running a producer with acks=0, an application that writes data and then 
closes the producer would expect all writes to complete if there are no 
errors. But close() simply closes the channel and socket, which could result 
in outgoing data being discarded.

This is also an issue for consumers that use commitAsync to commit offsets. 
Closing the consumer may result in commits being discarded because the writes 
have not completed before close().

  was:Outgoing writes may be discarded when a connection is closed. For 
instance, when running a producer with acks=0, an application that writes data 
and then closes the producer would expect all writes to complete if there are 
no errors. But close() simply closes the channel and socket, which could 
result in outgoing data being discarded.


> Selector.close() doesn't complete outgoing writes
> -
>
> Key: KAFKA-3703
> URL: https://issues.apache.org/jira/browse/KAFKA-3703
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Outgoing writes may be discarded when a connection is closed. For instance, 
> when running a producer with acks=0, an application that writes data and 
> then closes the producer would expect all writes to complete if there are no 
> errors. But close() simply closes the channel and socket, which could result 
> in outgoing data being discarded.
> This is also an issue for consumers that use commitAsync to commit offsets. 
> Closing the consumer may result in commits being discarded because the 
> writes have not completed before close().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3703) Selector.close() doesn't complete outgoing writes

2016-09-02 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3703:
--
Affects Version/s: 0.10.0.1
  Component/s: clients
  Summary: Selector.close() doesn't complete outgoing writes  (was: 
PlaintextTransportLayer.close() doesn't complete outgoing writes)

> Selector.close() doesn't complete outgoing writes
> -
>
> Key: KAFKA-3703
> URL: https://issues.apache.org/jira/browse/KAFKA-3703
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Outgoing writes may be discarded when a connection is closed. For instance, 
> when running a producer with acks=0, an application that writes data and 
> then closes the producer would expect all writes to complete if there are no 
> errors. But close() simply closes the channel and socket, which could result 
> in outgoing data being discarded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4077) Backdate validity of certificates in system tests to cope with clock skew

2016-09-01 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4077:
--
Status: Patch Available  (was: Open)

> Backdate validity of certificates in system tests to cope with clock skew
> -
>
> Key: KAFKA-4077
> URL: https://issues.apache.org/jira/browse/KAFKA-4077
> Project: Kafka
>  Issue Type: Bug
>  Components: system tests
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Minor
>
> This covers the scenario described by [~ewencp] in 
> https://github.com/apache/kafka/pull/1483, where tests failed with 
> java.security.cert.CertificateNotYetValidException. Certificates are created 
> on the host and copied to VMs, and hence should tolerate a small amount of 
> clock skew. Backdate the certificate start date to fix this.
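For illustration, a minimal sketch of backdating a test certificate, assuming 
the keystore is generated with keytool (whose -genkeypair command accepts a 
relative -startdate such as -1d); all paths, passwords and names are 
placeholders:

{code}
import java.io.IOException;

public class BackdatedTestCert {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Generate a self-signed test certificate whose validity starts one
        // day in the past, so a small clock skew between the host creating
        // the certificate and the VM using it cannot trigger
        // CertificateNotYetValidException.
        Process p = new ProcessBuilder(
                "keytool", "-genkeypair",
                "-keystore", "test.keystore.jks",
                "-storepass", "test1234",
                "-keypass", "test1234",
                "-alias", "localhost",
                "-dname", "CN=localhost",
                "-keyalg", "RSA",
                "-startdate", "-1d",    // backdate validity by one day
                "-validity", "365")
                .inheritIO()
                .start();
        System.exit(p.waitFor());
    }
}
{code}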



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-08-23 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3218:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[~omkreddy] Yes, this has been fixed under KAFKA-3680, marking as resolved.

> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
>Assignee: Rajini Sivaram
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.<clinit>(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> A possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However, with 
> this fix there is still a problem at AbstractConfig.java:206, where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743
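A minimal sketch of the fallback strategy described above (a hypothetical 
helper, not the actual KAFKA-3680 patch): try the context classloader first, 
then fall back to the classloader that loaded the current class:

{code}
public final class ClassLoading {
    private ClassLoading() { }

    // Resolve a class via the thread context classloader when one is set,
    // falling back to the defining classloader (the pre-KAFKA-2295
    // behaviour) when the context classloader cannot find it, e.g. in OSGi.
    public static Class<?> loadClass(String name) throws ClassNotFoundException {
        ClassLoader ccl = Thread.currentThread().getContextClassLoader();
        if (ccl != null) {
            try {
                return Class.forName(name, true, ccl);
            } catch (ClassNotFoundException e) {
                // fall through to the defining classloader
            }
        }
        return Class.forName(name, true, ClassLoading.class.getClassLoader());
    }
}
{code}

As noted above, this alone would not solve the instanceof mismatch if the two 
classloaders each load their own copy of the class.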



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4079) Document quota configuration changes from KIP-55

2016-08-23 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-4079:
-

 Summary: Document quota configuration changes from KIP-55
 Key: KAFKA-4079
 URL: https://issues.apache.org/jira/browse/KAFKA-4079
 Project: Kafka
  Issue Type: Task
  Components: config
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.1.0


Document the configuration changes introduced for KIP-55 in KAFKA-3492



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4077) Backdate validity of certificates in system tests to cope with clock skew

2016-08-23 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-4077:
-

 Summary: Backdate validity of certificates in system tests to cope 
with clock skew
 Key: KAFKA-4077
 URL: https://issues.apache.org/jira/browse/KAFKA-4077
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
Priority: Minor


This covers the scenario described by [~ewencp] in 
https://github.com/apache/kafka/pull/1483, where tests failed with 
java.security.cert.CertificateNotYetValidException. Certificates are created on 
the host and copied to VMs, and hence should tolerate a small amount of clock 
skew. Backdate the certificate start date to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4051) Strange behavior during rebalance when turning the OS clock back

2016-08-22 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4051:
--
Status: Patch Available  (was: Open)

Agree with [~onurkaraman] that this is an unusual scenario. But the fix is a 
small change that can be seen as a step towards improving Kafka's resilience 
to wall-clock time changes. The PR avoids the need for a broker restart in 
this case, but perhaps [~ibarra] can provide more context on the problem 
scenario.

> Strange behavior during rebalance when turning the OS clock back
> 
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
>Reporter: Gabriel Ibarra
>Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka 
> server enters a loop and the rebalance cannot be completed until the system 
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own all 
> the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; this forces a 
> rebalance.
> After these actions the Kafka server logs constantly display the messages 
> below, and after a while both consumers stop receiving messages. This 
> condition lasts at least as long as the interval the clock was turned back 
> (1 hour in this example); after that, Kafka starts working again.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems may have NTP enabled, or an administrator may change 
> the system clock (date/time), it is important to handle clock changes 
> safely. Currently the only safe procedure is:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server back up.
> However, this approach only works when the change is performed by an 
> administrator, not by NTP. Also, on many systems shutting down the Kafka 
> server risks losing information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4019) LogCleaner should grow read/write buffer to max message size for the topic

2016-08-22 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4019:
--
Status: Patch Available  (was: Open)

> LogCleaner should grow read/write buffer to max message size for the topic
> --
>
> Key: KAFKA-4019
> URL: https://issues.apache.org/jira/browse/KAFKA-4019
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
>
> Currently, LogCleaner.growBuffers() only grows the buffer up to the default 
> max message size. However, since the max message size can be customized at 
> the topic level, LogCleaner should allow the buffer to grow up to the max 
> message size allowed by the topic. Otherwise, the cleaner will get stuck on 
> a large message.
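A minimal sketch of the intended behaviour (the doubling strategy and names 
are assumptions, not the actual LogCleaner code):

{code}
import java.nio.ByteBuffer;

public class CleanerBufferGrowth {
    // Grow the cleaner's buffer towards the per-topic limit instead of the
    // broker-wide default, so a topic with a larger max.message.bytes cannot
    // wedge the cleaner on one oversized message.
    static ByteBuffer grow(ByteBuffer buffer, int topicMaxMessageSize) {
        if (buffer.capacity() >= topicMaxMessageSize)
            throw new IllegalStateException(
                    "Message exceeds topic max message size " + topicMaxMessageSize);
        int newSize = Math.min(buffer.capacity() * 2, topicMaxMessageSize);
        ByteBuffer grown = ByteBuffer.allocate(newSize);
        buffer.flip();
        grown.put(buffer);  // preserve bytes already read
        return grown;
    }
}
{code}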



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4066) NullPointerException in Kafka consumer due to unsafe access to findCoordinatorFuture

2016-08-22 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-4066:
--
Status: Patch Available  (was: Open)

> NullPointerException in Kafka consumer due to unsafe access to 
> findCoordinatorFuture
> 
>
> Key: KAFKA-4066
> URL: https://issues.apache.org/jira/browse/KAFKA-4066
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> {quote}
> java.lang.NullPointerException
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:164)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:193)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:245)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:993)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:959)
>   at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:100)
>   at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:120)
>   at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:75)
>   at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>   at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {quote}
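The trace alone doesn't show the race, but the summary points at a 
check-then-act hazard on a field another thread can null out. A hedged sketch 
of that pattern and the usual fix (type and field names are assumptions based 
on the issue title, not Kafka's actual code):

{code}
import java.util.concurrent.Future;

public class CoordinatorLookup {
    // Set to null by another thread once the coordinator lookup completes.
    private volatile Future<Void> findCoordinatorFuture;

    void pollUnsafe() {
        // Unsafe: the field can become null between the check and the second
        // read, throwing NullPointerException.
        if (findCoordinatorFuture != null && !findCoordinatorFuture.isDone())
            findCoordinatorFuture.cancel(true);
    }

    void pollSafe() {
        // Safe: read the field once into a local and use only the local.
        Future<Void> future = findCoordinatorFuture;
        if (future != null && !future.isDone())
            future.cancel(true);
    }
}
{code}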



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4066) NullPointerException in Kafka consumer due to unsafe access to findCoordinatorFuture

2016-08-19 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-4066:
-

 Summary: NullPointerException in Kafka consumer due to unsafe 
access to findCoordinatorFuture
 Key: KAFKA-4066
 URL: https://issues.apache.org/jira/browse/KAFKA-4066
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.1.0


{quote}
java.lang.NullPointerException
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:164)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:193)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:245)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:993)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:959)
at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:100)
at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:120)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:75)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4051) Strange behavior during rebalance when turning the OS clock back

2016-08-19 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427840#comment-15427840
 ] 

Rajini Sivaram commented on KAFKA-4051:
---

[~ijuma] [~gwenshap] Thank you both for the feedback. I will try it out 
locally, and run performance tests. I can test on Linux and Mac.

> Strange behavior during rebalance when turning the OS clock back
> 
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
>Reporter: Gabriel Ibarra
>Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka 
> server enters a loop and the rebalance cannot be completed until the system 
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own all 
> the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; this forces a 
> rebalance.
> After these actions the Kafka server logs constantly display the messages 
> below, and after a while both consumers stop receiving messages. This 
> condition lasts at least as long as the interval the clock was turned back 
> (1 hour in this example); after that, Kafka starts working again.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems may have NTP enabled, or an administrator may change 
> the system clock (date/time), it is important to handle clock changes 
> safely. Currently the only safe procedure is:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server back up.
> However, this approach only works when the change is performed by an 
> administrator, not by NTP. Also, on many systems shutting down the Kafka 
> server risks losing information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4051) Strange behavior during rebalance when turning the OS clock back

2016-08-18 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426290#comment-15426290
 ] 

Rajini Sivaram commented on KAFKA-4051:
---

[~ijuma] As you have pointed out, there are inevitably going to be issues since 
Kafka uses System.currentTimeMillis in so many places. But typically, you would 
expect a single clock change to cause one expiry, after which everything 
continues to work with the changed clock (e.g. a producer metadata request 
expires, and the retry works since it uses the updated clock). The issue in 
this JIRA is that the broker doesn't recover until the wall clock reaches the 
previously set time. I imagine changing the clock back by an hour is an 
uncommon scenario, but the impact is quite big if it does happen. If we are 
fixing this issue, it will be useful to have a system test to check that Kafka 
continues to function after a major clock change.

> Strange behavior during rebalance when turning the OS clock back
> 
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
>Reporter: Gabriel Ibarra
>Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka 
> server enters a loop and the rebalance cannot be completed until the system 
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own all 
> the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; this forces a 
> rebalance.
> After these actions the Kafka server logs constantly display the messages 
> below, and after a while both consumers stop receiving messages. This 
> condition lasts at least as long as the interval the clock was turned back 
> (1 hour in this example); after that, Kafka starts working again.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems may have NTP enabled, or an administrator may change 
> the system clock (date/time), it is important to handle clock changes 
> safely. Currently the only safe procedure is:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server back up.
> However, this approach only works when the change is performed by an 
> administrator, not by NTP. Also, on many systems shutting down the Kafka 
> server risks losing information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4051) Strange behavior during rebalance when turning the OS clock back

2016-08-18 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426279#comment-15426279
 ] 

Rajini Sivaram commented on KAFKA-4051:
---

I think the issue is around the handling of timer tasks in Kafka. While task 
expiry is set using {{System.currentTimeMillis}}, which can move backwards (as 
reported in this JIRA), the internal timer in Kafka's TimingWheel used to 
handle expiry is monotonically increasing and merely starts from 
{{System.currentTimeMillis}}. This mismatch causes tasks to expire immediately 
until {{System.currentTimeMillis}} catches up with the internal timer.

As [~ijuma] has pointed out on the mailing list, Kafka uses 
{{System.currentTimeMillis}} in a lot of places and switching to 
{{System.nanoTime}} everywhere could impact performance. We have a few choices 
for fixing this JIRA (in increasing order of complexity); a sketch of option 1 
follows the list.

# We could switch to {{System.nanoTime}} for TimerTasks alone, to fix the 
issue with delayed tasks reported here
# It may be possible to change the timer implementation to recover better when 
wall-clock time moves backwards
# Replace {{System.currentTimeMillis}} with {{System.nanoTime}} in time 
comparisons throughout the Kafka code

I am inclined to do 1) and run performance tests, but am interested in what 
others think.
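A minimal sketch of why option 1 helps (illustrative only, not Kafka's Timer 
implementation): a deadline derived from {{System.nanoTime}} is unaffected by 
wall-clock changes:

{code}
import java.util.concurrent.TimeUnit;

public class MonotonicDeadline {
    private final long deadlineNanos;

    MonotonicDeadline(long timeoutMs) {
        // System.nanoTime() is monotonic: it never jumps backwards when the
        // wall clock (System.currentTimeMillis) is changed by NTP or an admin.
        deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    }

    boolean expired() {
        // Compare via subtraction so the check stays correct across overflow.
        return System.nanoTime() - deadlineNanos >= 0;
    }

    public static void main(String[] args) throws InterruptedException {
        MonotonicDeadline d = new MonotonicDeadline(100);
        Thread.sleep(150);
        System.out.println(d.expired());  // true, even if the OS clock moved
    }
}
{code}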


> Strange behavior during rebalance when turning the OS clock back
> 
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
>Reporter: Gabriel Ibarra
>Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka 
> server enters a loop and the rebalance cannot be completed until the system 
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own all 
> the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; this forces a 
> rebalance.
> After these actions the Kafka server logs constantly display the messages 
> below, and after a while both consumers stop receiving messages. This 
> condition lasts at least as long as the interval the clock was turned back 
> (1 hour in this example); after that, Kafka starts working again.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems may have NTP enabled, or an administrator may change 
> the system clock (date/time), it is important to handle clock changes 
> safely. Currently the only safe procedure is:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server back up.
> However, this approach only works when the change is performed by an 
> administrator, not by NTP. Also, on many systems shutting down the Kafka 
> server risks losing information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (KAFKA-4051) Strange behavior during rebalance when turning the OS clock back

2016-08-18 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-4051:
-

Assignee: Rajini Sivaram

> Strange behavior during rebalance when turning the OS clock back
> 
>
> Key: KAFKA-4051
> URL: https://issues.apache.org/jira/browse/KAFKA-4051
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: OS: Ubuntu 14.04 - 64bits
>Reporter: Gabriel Ibarra
>Assignee: Rajini Sivaram
>
> If a rebalance is performed after turning the OS clock back, the Kafka 
> server enters a loop and the rebalance cannot be completed until the system 
> clock returns to the previous date/time.
> Steps to Reproduce:
> - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will own all 
> the partitions.
> - Turn the system (OS) clock back, for instance by 1 hour.
> - Start a new consumer for TOPIC_NAME using the same group id; this forces a 
> rebalance.
> After these actions the Kafka server logs constantly display the messages 
> below, and after a while both consumers stop receiving messages. This 
> condition lasts at least as long as the interval the clock was turned back 
> (1 hour in this example); after that, Kafka starts working again.
> [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 2 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 3 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP generation 1 
> is dead and removed (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 0 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group 
> GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group GROUP_NAME with old generation 1 (kafka.coordinator.GroupCoordinator)
> [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME 
> generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> Because some systems may have NTP enabled, or an administrator may change 
> the system clock (date/time), it is important to handle clock changes 
> safely. Currently the only safe procedure is:
> 1- Shut down the Kafka server.
> 2- Change the date/time.
> 3- Start the Kafka server back up.
> However, this approach only works when the change is performed by an 
> administrator, not by NTP. Also, on many systems shutting down the Kafka 
> server risks losing information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3492) support quota based on authenticated user name

2016-08-17 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3492:
--
Fix Version/s: 0.10.1.0
   Status: Patch Available  (was: Open)

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> Currently, quotas are based on the client.id set in the client 
> configuration, which can easily be changed. Ideally, quotas should be set on 
> the authenticated user name. We will need a KIP proposal/discussion on this 
> first.
> Details are in KIP-55: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-4019) LogCleaner should grow read/write buffer to max message size for the topic

2016-08-17 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-4019:
-

Assignee: Rajini Sivaram

> LogCleaner should grow read/write buffer to max message size for the topic
> --
>
> Key: KAFKA-4019
> URL: https://issues.apache.org/jira/browse/KAFKA-4019
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
>
> Currently, LogCleaner.growBuffers() only grows the buffer up to the default 
> max message size. However, since the max message size can be customized at 
> the topic level, LogCleaner should allow the buffer to grow up to the max 
> message size allowed by the topic. Otherwise, the cleaner will get stuck on 
> a large message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-4054) Quota related metrics and sensors are never deleted

2016-08-17 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-4054.
---
Resolution: Not A Problem

As pointed out by [~ijuma] in KAFKA-3980, metrics and sensors are expired.

> Quota related metrics and sensors are never deleted
> ---
>
> Key: KAFKA-4054
> URL: https://issues.apache.org/jira/browse/KAFKA-4054
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Rajini Sivaram
>
> Metrics and sensors used for quotas are never deleted. If random client-ids 
> are used by clients, this could result in a lot of unused metrics and sensors 
> in the broker.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3980) JmxReporter uses excessive memory causing OutOfMemoryException

2016-08-17 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424492#comment-15424492
 ] 

Rajini Sivaram commented on KAFKA-3980:
---

[~ijuma] Thank you, I will close KAFKA-4054 since the expiry addresses the 
issue I was worried about. Perhaps this JIRA needs more investigation.

> JmxReporter uses excessive memory causing OutOfMemoryException
> --
>
> Key: KAFKA-3980
> URL: https://issues.apache.org/jira/browse/KAFKA-3980
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Andrew Jorgensen
>
> I have some nodes in a Kafka cluster that occasionally run out of memory 
> when I restart the producers. I took heap dumps from both a recently 
> restarted Kafka node, which weighed in at about 20 MB, and a node that has 
> been running for 2 months, which is using over 700 MB of memory. Looking at 
> the heap dump, it looks like the JmxReporter is holding on to metrics and 
> causing them to build up over time. 
> !http://imgur.com/N6Cd0Ku.png!
> !http://imgur.com/kQBqA2j.png!
> The ultimate problem this causes is that restarting the producers can cause 
> a node to hit a Java heap space exception and OOM. The nodes then fail to 
> start up correctly and write -1 as the leader for the partitions they were 
> responsible for, effectively resetting the offset and rendering those 
> partitions unavailable. The Kafka process then needs to be restarted in 
> order to re-assign the node to the partitions it owns.
> I have a few questions:
> 1. I am not quite sure why there are so many client id entries in that 
> JmxReporter map.
> 2. Is there a way to have the JmxReporter release metrics after a set amount 
> of time, or a way to turn off certain high-cardinality metrics like these?
> I can provide any logs or heap dumps if more information is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3980) JmxReporter uses excessive memory causing OutOfMemoryException

2016-08-17 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424402#comment-15424402
 ] 

Rajini Sivaram commented on KAFKA-3980:
---

[~omkreddy] Thanks for the link. I was searching for an open defect, but had 
missed this one.

> JmxReporter uses excessive memory causing OutOfMemoryException
> --
>
> Key: KAFKA-3980
> URL: https://issues.apache.org/jira/browse/KAFKA-3980
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Andrew Jorgensen
>
> I have some nodes in a Kafka cluster that occasionally run out of memory 
> when I restart the producers. I took heap dumps from both a recently 
> restarted Kafka node, which weighed in at about 20 MB, and a node that has 
> been running for 2 months, which is using over 700 MB of memory. Looking at 
> the heap dump, it looks like the JmxReporter is holding on to metrics and 
> causing them to build up over time. 
> !http://imgur.com/N6Cd0Ku.png!
> !http://imgur.com/kQBqA2j.png!
> The ultimate problem this causes is that restarting the producers can cause 
> a node to hit a Java heap space exception and OOM. The nodes then fail to 
> start up correctly and write -1 as the leader for the partitions they were 
> responsible for, effectively resetting the offset and rendering those 
> partitions unavailable. The Kafka process then needs to be restarted in 
> order to re-assign the node to the partitions it owns.
> I have a few questions:
> 1. I am not quite sure why there are so many client id entries in that 
> JmxReporter map.
> 2. Is there a way to have the JmxReporter release metrics after a set amount 
> of time, or a way to turn off certain high-cardinality metrics like these?
> I can provide any logs or heap dumps if more information is needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4055) Add system tests for secure quotas

2016-08-17 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-4055:
-

 Summary: Add system tests for secure quotas
 Key: KAFKA-4055
 URL: https://issues.apache.org/jira/browse/KAFKA-4055
 Project: Kafka
  Issue Type: Test
  Components: system tests
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.1.0


Add system tests for quotas for authenticated users and <user, client-id> 
combinations (corresponding to KIP-55). Implementation is being done under 
KAFKA-3492.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4054) Quota related metrics and sensors are never deleted

2016-08-17 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-4054:
-

 Summary: Quota related metrics and sensors are never deleted
 Key: KAFKA-4054
 URL: https://issues.apache.org/jira/browse/KAFKA-4054
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.0.1
Reporter: Rajini Sivaram


Metrics and sensors used for quotas are never deleted. If random client-ids are 
used by clients, this could result in a lot of unused metrics and sensors in 
the broker.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3972) kafka java consumer poll returns 0 records after seekToBeginning

2016-07-19 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383832#comment-15383832
 ] 

Rajini Sivaram commented on KAFKA-3972:
---

The code invokes {{seekToBeginning}} on the assigned partitions before any 
partitions have been assigned to the consumer, so there are no partitions on 
which the seek is performed. You can provide a {{ConsumerRebalanceListener}} to 
{{subscribe()}} to get notified of partition assignment, or {{poll}} to wait 
for partition assignment before invoking {{seekToBeginning}}. A minimal sketch 
of the listener approach follows.
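(Illustrative only: broker addresses, group id and topic are taken from the 
properties quoted in the report; the API is the 0.10 Java consumer.)

{code}
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekToBeginningExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka01:9092,kafka02:9092,kafka03:9092");
        props.put("group.id", "dhcp1");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("dhcp"),
                new ConsumerRebalanceListener() {
                    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    }
                    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                        // Partitions are assigned by now, so the seek takes effect.
                        consumer.seekToBeginning(partitions);
                    }
                });
        while (true)
            System.out.println(consumer.poll(1000).count());  // poll drives assignment
    }
}
{code}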


> kafka java consumer poll returns 0 records after seekToBeginning
> 
>
> Key: KAFKA-3972
> URL: https://issues.apache.org/jira/browse/KAFKA-3972
> Project: Kafka
>  Issue Type: Task
>  Components: consumer
>Affects Versions: 0.10.0.0
> Environment: docker image elasticsearch:latest, kafka scala version 
> 2.11, kafka version 0.10.0.0
>Reporter: don caldwell
>  Labels: kafka, polling
>
> kafkacat successfully returns rows for the topic, but the following Java 
> source reliably fails to produce rows. I suspect that I am missing something 
> simple in my setup, but I have been unable to find a way out. I am using the 
> current Docker and using docker network commands to connect the processes in 
> my cluster. The properties are:
> bootstrap.servers: kafka01:9092,kafka02:9092,kafka03:9092
> group.id: dhcp1
> topic: dhcp
> enable.auto.commit: false
> auto.commit.interval.ms: 1000
> session.timeout.ms 3
> key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
> value.deserializer: org.apache.kafka.common.serialization.StringDeserializer
> The Kafka consumer follows. One thing I find curious is that, although I 
> seem to successfully make the call to seekToBeginning(), when I print offsets 
> on failure I get large offsets for all partitions, although I had expected 
> them to be 0 or at least some small number.
> Here is the code:
> import org.apache.kafka.clients.consumer.ConsumerConfig;
> import org.apache.kafka.clients.consumer.ConsumerRecord;
> import org.apache.kafka.clients.consumer.ConsumerRecords;
> import org.apache.kafka.clients.consumer.KafkaConsumer;
> import org.apache.kafka.common.errors.TimeoutException;
> import org.apache.kafka.common.protocol.types.SchemaException;
> import org.apache.kafka.common.KafkaException;
> import org.apache.kafka.common.Node;
> import org.apache.kafka.common.PartitionInfo;
> import org.apache.kafka.common.TopicPartition;
> import java.io.FileInputStream;
> import java.io.FileNotFoundException;
> import java.io.IOException;
> import java.lang.Integer;
> import java.lang.System;
> import java.lang.Thread;
> import java.lang.InterruptedException;
> import java.util.Arrays;
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
> import java.util.Map;
> import java.util.Properties;
> public class KConsumer {
> private Properties prop;
> private String topic;
> private Integer polln;
> private KafkaConsumer<String, String> consumer;
> private String[] pna = {ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
> ConsumerConfig.GROUP_ID_CONFIG,
> ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
> ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,
> ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG,
> ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
> ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG};
> public KConsumer(String pf) throws FileNotFoundException,
> IOException {
> this.setProperties(pf);
> this.newClient();
> }
> public void setProperties(String p) throws FileNotFoundException,
> IOException {
> this.prop = new Properties();
> this.prop.load(new FileInputStream(p));
> this.topic = this.prop.getProperty("topic");
> this.polln = new Integer(this.prop.getProperty("polln"));
> }
> public void setTopic(String t) {
> this.topic = t;
> }
> public String getTopic() {
> return this.topic;
> }
> public void newClient() {
> System.err.println("creating consumer");
> Properties kp = new Properties();
> for(String p : pna) {
> String v = this.prop.getProperty(p);
> if(v != null) {
> kp.put(p, v);
> }
> }
> //this.consumer = new KafkaConsumer<>(this.prop);
> this.consumer = new KafkaConsumer<>(kp);
> //this.consumer.subscribe(Collections.singletonList(this.topic));
> System.err.println("subscribing to " + this.topic);
> this.consumer.subscribe(Arrays.asList(this.topic));
> //this.seekToBeginning();
> }
> public 

[jira] [Updated] (KAFKA-3680) Make Java client classloading more flexible

2016-07-07 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3680:
--
Fix Version/s: 0.10.1.0
   Status: Patch Available  (was: Open)

> Make Java client classloading more flexible
> ---
>
> Key: KAFKA-3680
> URL: https://issues.apache.org/jira/browse/KAFKA-3680
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> JIRA corresponding to 
> [KIP-60|https://cwiki.apache.org/confluence/display/KAFKA/KIP-60+-+Make+Java+client+classloading+more+flexible]
>  to enable classloading of default classes and custom classes to work in 
> different classloading environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3871) Transient test failure: testSimpleConsumption

2016-06-20 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15339361#comment-15339361
 ] 

Rajini Sivaram commented on KAFKA-3871:
---

[~ijuma] Yes, the root cause looks the same. And the stack trace in stdout 
shows a failed auto-create topic that couldn't have come from this test, since 
this one failed in its setup before any messages were sent. I had a quick look 
through all the tests in core and couldn't find any that leave producers open 
(except some which may leave producers open when the test fails - but in this 
case there were no other failures).

> Transient test failure: testSimpleConsumption
> -
>
> Key: KAFKA-3871
> URL: https://issues.apache.org/jira/browse/KAFKA-3871
> Project: Kafka
>  Issue Type: Sub-task
>  Components: unit tests
>Affects Versions: 0.10.0.0
>Reporter: Ismael Juma
>  Labels: transient-unit-test-failure
>
> {code}
> kafka.common.TopicExistsException: Topic "topic" already exists.
>   at 
> kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:420)
>   at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:404)
>   at kafka.utils.TestUtils$.createTopic(TestUtils.scala:237)
>   at kafka.api.BaseConsumerTest.setUp(BaseConsumerTest.scala:63)
> {code}
> The following was printed to standard out:
> {code}
> [2016-06-16 23:57:44,161] ERROR [KafkaApi-0] Error when handling request 
> {topics=[topic]} (kafka.server.KafkaApis:103)
> kafka.admin.AdminOperationException: replication factor: 1 larger than 
> available brokers: 0
>   at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
>   at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
>   at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$createTopic(KafkaApis.scala:629)
>   at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:670)
>   at kafka.server.KafkaApis$$anonfun$29.apply(KafkaApis.scala:666)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at 
> scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:47)
>   at scala.collection.SetLike$class.map(SetLike.scala:93)
>   at scala.collection.AbstractSet.map(Set.scala:47)
>   at kafka.server.KafkaApis.getTopicMetadata(KafkaApis.scala:666)
>   at 
> kafka.server.KafkaApis.handleTopicMetadataRequest(KafkaApis.scala:727)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:79)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:744)
> {code}
> Build link: https://builds.apache.org/job/kafka-trunk-jdk7/1371/
> [~rsivaram], do you think the root cause is the same as KAFKA-3217?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3799) Turn on endpoint validation in SSL system tests

2016-06-09 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3799:
--
Affects Version/s: 0.10.0.0
   Status: Patch Available  (was: Open)

> Turn on endpoint validation in SSL system tests
> ---
>
> Key: KAFKA-3799
> URL: https://issues.apache.org/jira/browse/KAFKA-3799
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.10.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Endpoint validation is off by default, and currently the system tests run 
> without it. It would be better to run the system tests with endpoint 
> validation turned on. KAFKA-3665 will enable validation by default as well.
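For reference, a minimal sketch of what turning validation on looks like in a 
client configuration ({{ssl.endpoint.identification.algorithm=HTTPS}} enables 
hostname verification; truststore path and password are placeholders):

{code}
import java.util.Properties;

public class SslClientProps {
    static Properties sslProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/path/to/truststore.jks");  // placeholder
        props.put("ssl.truststore.password", "test1234");                 // placeholder
        // Endpoint validation: the broker certificate's CN/SAN must match
        // the host name the client connected to.
        props.put("ssl.endpoint.identification.algorithm", "HTTPS");
        return props;
    }
}
{code}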



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3562) Null Pointer Exception Found when delete topic and Using New Producer

2016-06-09 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3562:
--
Status: Patch Available  (was: Open)

> Null Pointer Exception Found when delete topic and Using New Producer
> -
>
> Key: KAFKA-3562
> URL: https://issues.apache.org/jira/browse/KAFKA-3562
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.9.0.0
>Reporter: Pengwei
>Assignee: Rajini Sivaram
>Priority: Minor
> Fix For: 0.10.0.1
>
>
> Exception in thread “Thread-2” java.lang.NullPointerException
> at 
> org.apache.kafka.clients.producer.internals.DefaultPartitioner.partition(DefaultPartitioner.java:70)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.partition(KafkaProducer.java:687)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:432)
> at 
> com.huawei.kafka.internal.remove.ProducerMsgThread.run(ProducerMsgThread.java:36)
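The trace alone doesn't prove the cause, but a plausible shape of the failure 
(a hedged sketch; the actual DefaultPartitioner code may differ): while a 
topic is being deleted, cached cluster metadata can report no partitions for 
it, and partitioning code that assumes a non-null partition list throws:

{code}
import java.util.List;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class DeletedTopicNpe {
    // Suspected failure mode: during topic deletion the metadata for the
    // topic disappears, so the partition list may be null or empty and any
    // size()/modulo arithmetic on it fails.
    static int numPartitions(Cluster cluster, String topic) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        return partitions.size();  // NullPointerException if partitions == null
    }
}
{code}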



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3562) Null Pointer Exception Found when delete topic and Using New Producer

2016-06-07 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-3562:
-

Assignee: Rajini Sivaram

> Null Pointer Exception Found when delete topic and Using New Producer
> -
>
> Key: KAFKA-3562
> URL: https://issues.apache.org/jira/browse/KAFKA-3562
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Pengwei
>Assignee: Rajini Sivaram
>Priority: Minor
> Fix For: 0.10.0.1
>
>
> Exception in thread “Thread-2” java.lang.NullPointerException
> at 
> org.apache.kafka.clients.producer.internals.DefaultPartitioner.partition(DefaultPartitioner.java:70)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.partition(KafkaProducer.java:687)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:432)
> at 
> com.huawei.kafka.internal.remove.ProducerMsgThread.run(ProducerMsgThread.java:36)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3799) Turn on endpoint validation in SSL system tests

2016-06-07 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3799:
-

 Summary: Turn on endpoint validation in SSL system tests
 Key: KAFKA-3799
 URL: https://issues.apache.org/jira/browse/KAFKA-3799
 Project: Kafka
  Issue Type: Test
  Components: system tests
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Endpoint validation is off by default, and currently the system tests run 
without it. It would be better to run the system tests with endpoint 
validation turned on. KAFKA-3665 will enable validation by default as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3792) Fix log spam in clients when topic doesn't exist

2016-06-05 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3792:
-

 Summary: Fix log spam in clients when topic doesn't exist
 Key: KAFKA-3792
 URL: https://issues.apache.org/jira/browse/KAFKA-3792
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


The Kafka producer logs an error message on every metadata retry when an 
application attempts to publish to a topic that doesn't exist. A Kafka consumer 
using manual assignment logs error messages forever if the topic doesn't exist.

See discussion in the PR for KAFKA-2948 
(https://github.com/apache/kafka/pull/645) for details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3751) Add support for SASL mechanism SCRAM-SHA-256

2016-05-24 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3751:
-

 Summary: Add support for SASL mechanism SCRAM-SHA-256 
 Key: KAFKA-3751
 URL: https://issues.apache.org/jira/browse/KAFKA-3751
 Project: Kafka
  Issue Type: Improvement
  Components: security
Affects Versions: 0.10.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Salted Challenge Response Authentication Mechanism (SCRAM) provides secure 
authentication and is increasingly being adopted as an alternative to 
Digest-MD5, which is now obsolete. SCRAM is described in the RFC 
[https://tools.ietf.org/html/rfc5802]. It will be good to add support for 
SCRAM-SHA-256 ([https://tools.ietf.org/html/rfc7677]) as a SASL mechanism for 
Kafka.
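
A hedged sketch of what client configuration might look like once support 
is added, assuming the SASL mechanism name follows RFC 7677:
{code}
import java.util.Properties;

public class ScramClientConfig {
    public static Properties clientProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SASL_SSL");
        // assumption: mechanism name as registered by RFC 7677
        props.put("sasl.mechanism", "SCRAM-SHA-256");
        return props;
    }
}
{code}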



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3728) inconsistent behavior of Consumer.poll() when assigned vs subscribed

2016-05-19 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291108#comment-15291108
 ] 

Rajini Sivaram commented on KAFKA-3728:
---

EndToEndAuthorizationTest has a misconfigured __consumer_offsets topic, created 
with one replica when the broker is configured with MinInsyncReplicas set to 3. 
The test passes because it performs assign() rather than subscribe() and does 
not commit any offsets (basically doesn't use the offsets topic). Changing 
assign() to subscribe() causes the test to use the offsets topic, resulting in 
the error above.
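
For reference, a minimal sketch of the two consumption styles (illustrative 
Java rather than the Scala of the test):
{code}
import java.util.Collections;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignVsSubscribe {
    static void demo(KafkaConsumer<byte[], byte[]> consumer) {
        // manual assignment: no group coordination, so the misconfigured
        // __consumer_offsets topic is never touched and the test passes
        consumer.assign(Collections.singletonList(new TopicPartition("topic", 0)));

        // subscription: joins a consumer group, which uses __consumer_offsets
        // and surfaces the in-sync replica error during SyncGroup
        consumer.subscribe(Collections.singletonList("topic"));
    }
}
{code}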

> inconsistent behavior of Consumer.poll() when assigned vs subscribed
> 
>
> Key: KAFKA-3728
> URL: https://issues.apache.org/jira/browse/KAFKA-3728
> Project: Kafka
>  Issue Type: Bug
>Reporter: Edoardo Comar
>
> A consumer that is manually assigned a topic-partition is able to consume 
> messages that a consumer that subscribes to the topic can not.
> To reproduce : take the test 
> EndToEndAuthorizationTest.testProduceConsume 
> (eg the SaslSslEndToEndAuthorizationTest implementation)
>  
> it passes ( = messages are consumed) 
> if the consumer is assigned the single topic-partition
>   consumers.head.assign(List(tp).asJava)
> but fails 
> if the consumer subscribes to the topic - changing the line to :
>   consumers.head.subscribe(List(topic).asJava)
> The failure when subscribed shows this error about synchronization:
>  org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: 
> Messages are rejected since there are fewer in-sync replicas than required.
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:455)
> The test passes in both cases (subscribe and assign) with the setting
>   this.serverConfig.setProperty(KafkaConfig.MinInSyncReplicasProp, "1")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3703) PlaintextTransportLayer.close() doesn't complete outgoing writes

2016-05-11 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3703:
-

 Summary: PlaintextTransportLayer.close() doesn't complete outgoing 
writes
 Key: KAFKA-3703
 URL: https://issues.apache.org/jira/browse/KAFKA-3703
 Project: Kafka
  Issue Type: Bug
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Outgoing writes may be discarded when a connection is closed. For instance, 
when running a producer with acks=0, a producer that writes data and closes the 
producer would expect to see all writes to complete if there are no errors. But 
close() simply closes the channel and socket which could result in outgoing 
data being discarded.
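
A minimal sketch of the pattern that can lose data, assuming a broker at 
localhost:9092 (a hypothetical address for illustration):
{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FireAndForget {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "0");
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("test", "msg".getBytes()));
        // close() closes the channel and socket; buffered socket writes that
        // have not yet completed can be discarded with acks=0
        producer.close();
    }
}
{code}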



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3702) SslTransportLayer.close() does not shutdown gracefully

2016-05-11 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3702:
-

 Summary: SslTransportLayer.close() does not shutdown gracefully
 Key: KAFKA-3702
 URL: https://issues.apache.org/jira/browse/KAFKA-3702
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


The warning "Failed to send SSL Close message" occurs very frequently when SSL 
connections are closed. Close should write outbound data and shutdown 
gracefully.
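
For reference, a minimal sketch of a graceful TLS shutdown with SSLEngine, 
assuming a blocking SocketChannel; this is illustrative only, not Kafka's 
actual SslTransportLayer code:
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import javax.net.ssl.SSLEngine;

public class GracefulTlsClose {
    static void close(SSLEngine engine, SocketChannel channel) throws IOException {
        ByteBuffer netOut = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
        engine.closeOutbound(); // queue the close_notify alert
        while (!engine.isOutboundDone()) {
            engine.wrap(ByteBuffer.allocate(0), netOut); // produce alert bytes
            netOut.flip();
            while (netOut.hasRemaining())
                channel.write(netOut); // flush the alert before closing
            netOut.clear();
        }
        channel.close();
    }
}
{code}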



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3689) ERROR Processor got uncaught exception. (kafka.network.Processor)

2016-05-11 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280019#comment-15280019
 ] 

Rajini Sivaram commented on KAFKA-3689:
---

[~ijuma] PR for the {{PlaintextTransportLayer.close()}} issue is here: 
https://github.com/apache/kafka/pull/1368. Can you take a look? Thank you.

> ERROR Processor got uncaught exception. (kafka.network.Processor)
> -
>
> Key: KAFKA-3689
> URL: https://issues.apache.org/jira/browse/KAFKA-3689
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 0.9.0.1
> Environment: ubuntu 14.04,
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.2)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> 3 broker cluster (all 3 servers identical -  Intel Xeon E5-2670 @2.6GHz, 
> 8cores, 16 threads 64 GB RAM & 1 TB Disk)
> Kafka Cluster is managed by 3 server ZK cluster (these servers are different 
> from Kafka broker servers). All 6 servers are connected via 10G switch. 
> Producers run from external servers.
>Reporter: Buvaneswari Ramanan
>Assignee: Jun Rao
>Priority: Minor
> Fix For: 0.10.1.0, 0.9.0.1, 0.10.0.0, 0.11.0.0, 0.10.0.1, 0.9.0.2
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> As per Ismael Juma's suggestion in email thread to us...@kafka.apache.org 
> with the same subject, I am creating this bug report.
> The following error occurs in one of the brokers in our 3 broker cluster, 
> which serves about 8000 topics. These topics are single partitioned with a 
> replication factor = 3. Each topic gets data at a low rate  – 200 bytes per 
> sec.  Leaders are balanced across the topics.
> Producers run from external servers (4 Ubuntu servers with same config as the 
> brokers), each producing to 2000 topics utilizing kafka-python library.
> This error message occurs repeatedly in one of the servers. Between the hours 
> of 10:30am and 1:30pm on 5/9/16, there were about 10 Million such 
> occurrences. This was right after a cluster restart.
> This is not the first time we got this error in this broker. In those 
> instances, error occurred hours / days after cluster restart.
> =
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count 
> for address with no connections, address: /X.Y.Z.144 (actual network address 
> masked)
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
> at scala.collection.AbstractMap.getOrElse(Map.scala:59)
> at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
> at scala.collection.Iterator$class.foreach(Iterator.scala:742)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.network.Processor.run(SocketServer.scala:445)
> at java.lang.Thread.run(Thread.java:745)
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count 
> for address with no connections, address: /X.Y.Z.144
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
> at scala.collection.AbstractMap.getOrElse(Map.scala:59)
> at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
> at scala.collection.Iterator$class.foreach(Iterator.scala:742)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.network.Processor.run(SocketServer.scala:445)
> at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3689) ERROR Processor got uncaught exception. (kafka.network.Processor)

2016-05-11 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279945#comment-15279945
 ] 

Rajini Sivaram commented on KAFKA-3689:
---

[~buvana.rama...@nokia.com] Were there any exceptions/errors in the logs before 
the first occurrence of this sequence of errors? Am I right in assuming that 
you are using PLAINTEXT? [~ijuma] Looking at the code, I am not sure if an 
issue with {{SocketServer}} alone would cause what looks like a tight loop of 
errors. The errors seem too close to each other to correspond to new 
connections each time. If one connection resulted in a loop of millions of 
errors, I wonder whether {{Selector}} got itself into an inconsistent state.  
Since {{Selector.disconnected}} is cleared for every poll, and addition of 
entries to {{Selector.disconnected}} is always accompanied by a 
{{channel.close}} which cancels the key, I wonder whether there was some close 
exception that resulted in this state. {{PlaintextTransportLayer.close()}} 
doesn't cancel the key if the socket close throws an IOException, which we 
should probably fix anyway. But since an exception would have been logged if 
that was the case, it will be good to know if there were any errors in the logs 
prior to this exception sequence.
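
A minimal sketch of the close-path fix suggested above (not the actual 
PlaintextTransportLayer code): cancel the selection key even if closing the 
socket throws.
{code}
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

public class CloseAlwaysCancelsKey {
    static void close(SelectionKey key, SocketChannel channel) throws IOException {
        try {
            channel.socket().close();
            channel.close();
        } finally {
            key.cancel(); // runs even when the socket close throws IOException
        }
    }
}
{code}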

> ERROR Processor got uncaught exception. (kafka.network.Processor)
> -
>
> Key: KAFKA-3689
> URL: https://issues.apache.org/jira/browse/KAFKA-3689
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 0.9.0.1
> Environment: ubuntu 14.04,
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.2)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> 3 broker cluster (all 3 servers identical -  Intel Xeon E5-2670 @2.6GHz, 
> 8cores, 16 threads 64 GB RAM & 1 TB Disk)
> Kafka Cluster is managed by 3 server ZK cluster (these servers are different 
> from Kafka broker servers). All 6 servers are connected via 10G switch. 
> Producers run from external servers.
>Reporter: Buvaneswari Ramanan
>Assignee: Jun Rao
>Priority: Minor
> Fix For: 0.10.1.0, 0.9.0.1, 0.10.0.0, 0.11.0.0, 0.10.0.1, 0.9.0.2
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> As per Ismael Juma's suggestion in email thread to us...@kafka.apache.org 
> with the same subject, I am creating this bug report.
> The following error occurs in one of the brokers in our 3 broker cluster, 
> which serves about 8000 topics. These topics are single partitioned with a 
> replication factor = 3. Each topic gets data at a low rate  – 200 bytes per 
> sec.  Leaders are balanced across the topics.
> Producers run from external servers (4 Ubuntu servers with same config as the 
> brokers), each producing to 2000 topics utilizing kafka-python library.
> This error message occurs repeatedly in one of the servers. Between the hours 
> of 10:30am and 1:30pm on 5/9/16, there were about 10 Million such 
> occurrences. This was right after a cluster restart.
> This is not the first time we got this error in this broker. In those 
> instances, error occurred hours / days after cluster restart.
> =
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count 
> for address with no connections, address: /X.Y.Z.144 (actual network address 
> masked)
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)
> at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
> at scala.collection.AbstractMap.getOrElse(Map.scala:59)
> at kafka.network.ConnectionQuotas.dec(SocketServer.scala:564)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:450)
> at 
> kafka.network.Processor$$anonfun$run$13.apply(SocketServer.scala:445)
> at scala.collection.Iterator$class.foreach(Iterator.scala:742)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> at kafka.network.Processor.run(SocketServer.scala:445)
> at java.lang.Thread.run(Thread.java:745)
> [2016-05-09 10:38:43,932] ERROR Processor got uncaught exception. 
> (kafka.network.Processor)
> java.lang.IllegalArgumentException: Attempted to decrease connection count 
> for address with no connections, address: /X.Y.Z.144
> at 
> kafka.network.ConnectionQuotas$$anonfun$9.apply(SocketServer.scala:565)

[jira] [Commented] (KAFKA-3688) Unable to start broker with sasl.mechanism.inter.broker.protocol=PLAIN

2016-05-10 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278613#comment-15278613
 ] 

Rajini Sivaram commented on KAFKA-3688:
---

[~ecomar] As [~ijuma] has said, you need to specify _username_ and _password_ 
for the server to create client connections. For the broker, this needs to be 
in _KafkaServer_. _KafkaClient_ is the context used for clients and 
_KafkaServer_ is the context used for brokers.
 BTW, you don't need to specify _serviceName_ when you are using PLAIN.

> Unable to start broker with sasl.mechanism.inter.broker.protocol=PLAIN
> --
>
> Key: KAFKA-3688
> URL: https://issues.apache.org/jira/browse/KAFKA-3688
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Edoardo Comar
>
> Starting a single broker with the following configuration :
>  
> server.properties:
> listeners=SASL_PLAINTEXT://:9093
> sasl.enabled.mechanisms=PLAIN
> security.inter.broker.protocol=SASL_PLAINTEXT
> sasl.mechanism.inter.broker.protocol=PLAIN
> jaas.conf:
> KafkaServer {
>   org.apache.kafka.common.security.plain.PlainLoginModule required
>   serviceName="kafka"
>   user_edo1="edo1pwd"
>   user_edo2="edo2pwd"
>   user_superkuser="wotever";
> };
> KafkaClient {
>   org.apache.kafka.common.security.plain.PlainLoginModule required
>   serviceName="kafka"
> username="superkuser"
> password="wotever";
> };
> results in a broker startup failure “Failed to create SaslClient with 
> mechanism PLAIN” (see stack trace below).
> Note that this configuration was attempted to try working around the issue
> https://issues.apache.org/jira/browse/KAFKA-3687 
> (unable to use ACLs with security.inter.broker.protocol=PLAIN).
> [2016-05-10 16:54:10,730] INFO Failed to create channel due to  
> (org.apache.kafka.common.network.SaslChannelBuilder)
> org.apache.kafka.common.KafkaException: Failed to configure 
> SaslClientAuthenticator
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:124)
>   at 
> org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:102)
>   at org.apache.kafka.common.network.Selector.connect(Selector.java:177)
>   at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:498)
>   at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:159)
>   at 
> kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:59)
>   at 
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:232)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:181)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:180)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> Caused by: org.apache.kafka.common.KafkaException: Failed to create 
> SaslClient with mechanism PLAIN
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslClient(SaslClientAuthenticator.java:139)
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.configure(SaslClientAuthenticator.java:122)
>   ... 9 more
> Caused by: javax.security.sasl.SaslException: Cannot get userid/password 
> [Caused by javax.security.auth.callback.UnsupportedCallbackException: Could 
> not login: the client is being asked for a password, but the Kafka client 
> code does not currently support obtaining a password from the user.]
>   at 
> com.sun.security.sasl.ClientFactoryImpl.getUserInfo(ClientFactoryImpl.java:157)
>   at 
> com.sun.security.sasl.ClientFactoryImpl.createSaslClient(ClientFactoryImpl.java:94)
>   at javax.security.sasl.Sasl.createSaslClient(Sasl.java:372)
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$1.run(SaslClientAuthenticator.java:135)
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator$1.run(SaslClientAuthenticator.java:1)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslClient(SaslClientAuthenticator.java:130)
>   ... 10 more
> Caused by: javax.security.auth.callback.UnsupportedCallbackException: Could 
> not login: the client is being asked for a password, but the Kafka client 
> code does not currently support obtaining a password from the user.
>   at 
> org.apache.kafka.common.security.authenticator.SaslClientCallbackHandler.handle(SaslClientCallbackHandler.java:73)
>   at 
> com.sun.security.sasl.ClientFactoryImpl.getUserInfo(

[jira] [Commented] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-05-09 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276311#comment-15276311
 ] 

Rajini Sivaram commented on KAFKA-3218:
---

Created 
[KIP-60|https://cwiki.apache.org/confluence/display/KAFKA/KIP-60+-+Make+Java+client+classloading+more+flexible]
 to fix the default classloading described in this JIRA, as well as the 
loading of custom classes.
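
A minimal sketch of the workaround noted in the issue report below, assuming 
a producer created from hypothetical {{props}}: clear the context classloader 
so class lookups fall back to the classloader that loaded the Kafka classes.
{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class OsgiWorkaround {
    static KafkaProducer<byte[], byte[]> createProducer(Properties props) {
        ClassLoader saved = Thread.currentThread().getContextClassLoader();
        Thread.currentThread().setContextClassLoader(null);
        try {
            return new KafkaProducer<>(props);
        } finally {
            Thread.currentThread().setContextClassLoader(saved);
        }
    }
}
{code}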

> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
>Assignee: Rajini Sivaram
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> Possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However with 
> this fix there is still a problem at AbstractConfig.java:206,  where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3680) Make Java client classloading more flexible

2016-05-09 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3680:
-

 Summary: Make Java client classloading more flexible
 Key: KAFKA-3680
 URL: https://issues.apache.org/jira/browse/KAFKA-3680
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.10.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


JIRA corresponding to 
[KIP-60|https://cwiki.apache.org/confluence/display/KAFKA/KIP-60+-+Make+Java+client+classloading+more+flexible]
 to enable classloading of default classes and custom classes to work in 
different classloading environments.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3662) Failure in kafka.network.SocketServerTest.tooBigRequestIsRejected

2016-05-09 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3662:
--
Status: Patch Available  (was: Open)

> Failure in kafka.network.SocketServerTest.tooBigRequestIsRejected
> 
>
> Key: KAFKA-3662
> URL: https://issues.apache.org/jira/browse/KAFKA-3662
> Project: Kafka
>  Issue Type: Sub-task
>  Components: unit tests
>Affects Versions: 0.10.0.0
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
>
> Saw this transient failure:
> {code}
> Error Message
> java.net.SocketException: Broken pipe
> Stacktrace
> java.net.SocketException: Broken pipe
>   at java.net.SocketOutputStream.socketWrite0(Native Method)
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:138)
>   at java.io.DataOutputStream.writeShort(DataOutputStream.java:168)
>   at kafka.network.SocketServerTest.sendRequest(SocketServerTest.scala:65)
>   at 
> kafka.network.SocketServerTest.tooBigRequestIsRejected(SocketServerTest.scala:153)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:112)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:364)
>   at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
>   at 
> org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPo

[jira] [Assigned] (KAFKA-3662) Failure in kafka.network.SocketServerTest.tooBigRequestIsRejected

2016-05-09 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-3662:
-

Assignee: Rajini Sivaram

> Failure in kafka.network.SocketServerTest.tooBigRequestIsRejected
> 
>
> Key: KAFKA-3662
> URL: https://issues.apache.org/jira/browse/KAFKA-3662
> Project: Kafka
>  Issue Type: Sub-task
>  Components: unit tests
>Affects Versions: 0.10.0.0
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
>
> Saw this transient failure:
> {code}
> Error Message
> java.net.SocketException: Broken pipe
> Stacktrace
> java.net.SocketException: Broken pipe
>   at java.net.SocketOutputStream.socketWrite0(Native Method)
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:138)
>   at java.io.DataOutputStream.writeShort(DataOutputStream.java:168)
>   at kafka.network.SocketServerTest.sendRequest(SocketServerTest.scala:65)
>   at 
> kafka.network.SocketServerTest.tooBigRequestIsRejected(SocketServerTest.scala:153)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:112)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>   at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:364)
>   at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
>   at 
> org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExe

[jira] [Commented] (KAFKA-3665) Default ssl.endpoint.identification.algorithm should be https

2016-05-09 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276118#comment-15276118
 ] 

Rajini Sivaram commented on KAFKA-3665:
---

[~ijuma] [~junrao] I don't believe HTTPS validates client hostname/IP address 
and Kafka definitely doesn't. For client requests that go through a VIP for the 
initial request, each broker's certificate would need SubjectAltNames with the 
VIP as well as its advertised host name.

> Default ssl.endpoint.identification.algorithm should be https
> -
>
> Key: KAFKA-3665
> URL: https://issues.apache.org/jira/browse/KAFKA-3665
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.10.0.0
>
>
> The default `ssl.endpoint.identification.algorithm` is `null` which is not a 
> secure default (man in the middle attacks are possible).
> We should probably use `https` instead. A more conservative alternative would 
> be to update the documentation instead of changing the default.
> A paper on the topic (thanks to Ryan Pridgeon for the reference): 
> http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3665) Default ssl.endpoint.identification.algorithm should be https

2016-05-06 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273917#comment-15273917
 ] 

Rajini Sivaram commented on KAFKA-3665:
---

[~ijuma] I agree that secure installations of Kafka would always turn on 
hostname verification, but this change could potentially catch people out. 
Perhaps it doesn't matter because this is a major release? The documentation 
should say how to disable it - by setting 
ssl.endpoint.identification.algorithm to an empty string?
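
A hedged sketch of that opt-out in client configuration, assuming an empty 
string is accepted as the "disabled" value, as suggested above:
{code}
import java.util.Properties;

public class DisableHostnameVerification {
    public static Properties clientProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // assumption: an empty value turns endpoint identification off
        props.put("ssl.endpoint.identification.algorithm", "");
        return props;
    }
}
{code}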

> Default ssl.endpoint.identification.algorithm should be https
> -
>
> Key: KAFKA-3665
> URL: https://issues.apache.org/jira/browse/KAFKA-3665
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Ismael Juma
> Fix For: 0.10.0.0
>
>
> The default `ssl.endpoint.identification.algorithm` is `null` which is not a 
> secure default (man in the middle attacks are possible).
> We should probably use `https` instead. A more conservative alternative would 
> be to update the documentation instead of changing the default.
> A paper on the topic (thanks to Ryan Pridgeon for the reference): 
> http://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3647) Unable to set a ssl provider

2016-05-04 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270550#comment-15270550
 ] 

Rajini Sivaram commented on KAFKA-3647:
---

{{ssl.provider}} is the name of the provider, not the qualified name of the 
class. Hence you would set {{SunEC}} rather than {{sun.security.ec.SunEC}}. But 
having said that, {{ssl.provider}} is used to create {{SSLContext}} for the 
configured TLS protocol and I don't think {{SunEC}} provides that. The providers 
for key manager and trust manager are not configurable in Kafka at the moment.
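
For reference, a minimal sketch of what {{ssl.provider}} feeds into, using 
{{SunJSSE}} as an example provider name (an assumption for illustration; 
SunEC only supplies EC key and signature algorithms, not an SSLContext):
{code}
import javax.net.ssl.SSLContext;

public class ProviderCheck {
    public static void main(String[] args) throws Exception {
        // the second argument is the provider *name*, not a class name;
        // this is the lookup that ssl.provider drives
        SSLContext ctx = SSLContext.getInstance("TLSv1.2", "SunJSSE");
        System.out.println(ctx.getProvider().getName()); // prints SunJSSE
    }
}
{code}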

> Unable to set a ssl provider
> 
>
> Key: KAFKA-3647
> URL: https://issues.apache.org/jira/browse/KAFKA-3647
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.1
> Environment: Centos, OracleJRE 8, Vagrant
>Reporter: Elvar
>Priority: Minor
>
> When defining a ssl provider Kafka does not start because the provider was 
> not found.
> {code}
> [2016-05-02 13:48:48,252] FATAL [Kafka Server 11], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: 
> org.apache.kafka.common.KafkaException: 
> java.security.NoSuchProviderException: no such provider: sun.security.ec.SunEC
> at 
> org.apache.kafka.common.network.SslChannelBuilder.configure(SslChannelBuilder.java:44)
> {code}
> To test
> {code}
> /bin/kafka-server-start /etc/kafka/server.properties --override 
> ssl.provider=sun.security.ec.SunEC
> {code}
> This is stopping us from talking to Kafka with SSL from Go programs because 
> no common cipher suites are available.
> Using sslscan this is available from Kafka
> {code}
>  Supported Server Cipher(s):
>Accepted  TLSv1  256 bits  DHE-DSS-AES256-SHA
>Accepted  TLSv1  128 bits  DHE-DSS-AES128-SHA
>Accepted  TLSv1  128 bits  EDH-DSS-DES-CBC3-SHA
>Accepted  TLS11  256 bits  DHE-DSS-AES256-SHA
>Accepted  TLS11  128 bits  DHE-DSS-AES128-SHA
>Accepted  TLS11  128 bits  EDH-DSS-DES-CBC3-SHA
>Accepted  TLS12  256 bits  DHE-DSS-AES256-GCM-SHA384
>Accepted  TLS12  256 bits  DHE-DSS-AES256-SHA256
>Accepted  TLS12  256 bits  DHE-DSS-AES256-SHA
>Accepted  TLS12  128 bits  DHE-DSS-AES128-GCM-SHA256
>Accepted  TLS12  128 bits  DHE-DSS-AES128-SHA256
>Accepted  TLS12  128 bits  DHE-DSS-AES128-SHA
>Accepted  TLS12  128 bits  EDH-DSS-DES-CBC3-SHA
>  Preferred Server Cipher(s):
>SSLv2  0 bits(NONE)
>TLSv1  256 bits  DHE-DSS-AES256-SHA
>TLS11  256 bits  DHE-DSS-AES256-SHA
>TLS12  256 bits  DHE-DSS-AES256-GCM-SHA384
> {code}
> From the Golang documentation these are avilable there
> {code}
> TLS_RSA_WITH_RC4_128_SHAuint16 = 0x0005
> TLS_RSA_WITH_3DES_EDE_CBC_SHA   uint16 = 0x000a
> TLS_RSA_WITH_AES_128_CBC_SHAuint16 = 0x002f
> TLS_RSA_WITH_AES_256_CBC_SHAuint16 = 0x0035
> TLS_RSA_WITH_AES_128_GCM_SHA256 uint16 = 0x009c
> TLS_RSA_WITH_AES_256_GCM_SHA384 uint16 = 0x009d
> TLS_ECDHE_ECDSA_WITH_RC4_128_SHAuint16 = 0xc007
> TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHAuint16 = 0xc009
> TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHAuint16 = 0xc00a
> TLS_ECDHE_RSA_WITH_RC4_128_SHA  uint16 = 0xc011
> TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA uint16 = 0xc012
> TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA  uint16 = 0xc013
> TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA  uint16 = 0xc014
> TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256   uint16 = 0xc02f
> TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 uint16 = 0xc02b
> TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384   uint16 = 0xc030
> TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 uint16 = 0xc02c
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3652) Return error response for unsupported version of ApiVersionsRequest

2016-05-04 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3652:
--
Description: 
Discussion is in the PR https://github.com/apache/kafka/pull/1286. 
Don't fail authentication (for SASL) or break connections (normal operation) 
when an unsupported version of ApiVersionsRequest is received. Instead, 
return an error response so that the client can retry with an earlier 
version of the request.

  was:The current implementation allows multiple ApiVersionsRequests prior to 
SaslHandshakeRequest. Restrict to only one. Discussion is in the PR 
https://github.com/apache/kafka/pull/1286. 

Summary: Return error response for unsupported version of 
ApiVersionsRequest  (was: Allow only one ApiVersionsRequest before SASL 
handshake)

> Return error response for unsupported version of ApiVersionsRequest
> ---
>
> Key: KAFKA-3652
> URL: https://issues.apache.org/jira/browse/KAFKA-3652
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Discussion is in the PR https://github.com/apache/kafka/pull/1286. 
> Don't fail authentication (for SASL) or break connections (normal operation) 
> when an unsupported version of ApiVersionsRequest is received. Instead, 
> return an error response so that the client can retry with an earlier 
> version of the request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3652) Allow only one ApiVersionsRequest before SASL handshake

2016-05-03 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3652:
-

 Summary: Allow only one ApiVersionsRequest before SASL handshake
 Key: KAFKA-3652
 URL: https://issues.apache.org/jira/browse/KAFKA-3652
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.0.0


The current implementation allows multiple ApiVersionsRequests prior to 
SaslHandshakeRequest. Restrict to only one. Discussion is in the PR 
https://github.com/apache/kafka/pull/1286. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3634) Add ducktape tests for upgrade with SASL

2016-04-29 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264450#comment-15264450
 ] 

Rajini Sivaram commented on KAFKA-3634:
---

[~ijuma] The upgrade test from 0.9.0.x was being run only with PLAINTEXT. I 
have added SASL/GSSAPI (I don't think this upgrade is being tested elsewhere).

> Add ducktape tests for upgrade with SASL
> 
>
> Key: KAFKA-3634
> URL: https://issues.apache.org/jira/browse/KAFKA-3634
> Project: Kafka
>  Issue Type: Test
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Add SASL upgrade tests (moved out of KAFKA-2693):
>   - 0.9.0.x to 0.10.0 with GSSAPI as inter-broker SASL mechanism
>   - Rolling upgrade with change in SASL mechanism



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3634) Add ducktape tests for upgrade with SASL

2016-04-29 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3634:
--
Status: Patch Available  (was: Open)

> Add ducktape tests for upgrade with SASL
> 
>
> Key: KAFKA-3634
> URL: https://issues.apache.org/jira/browse/KAFKA-3634
> Project: Kafka
>  Issue Type: Test
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Add SASL upgrade tests (moved out of KAFKA-2693):
>   - 0.9.0.x to 0.10.0 with GSSAPI as inter-broker SASL mechanism
>   - Rolling upgrade with change in SASL mechanism



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2693) Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms

2016-04-28 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261893#comment-15261893
 ] 

Rajini Sivaram commented on KAFKA-2693:
---

Upgrade tests will be added under KAFKA-3634.

> Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms
> ---
>
> Key: KAFKA-2693
> URL: https://issues.apache.org/jira/browse/KAFKA-2693
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> KAFKA-2644 runs sanity test, replication tests and benchmarks with SASL using 
> mechanism GSSAPI. For SASL/PLAIN, run sanity test and replication tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3634) Add ducktape tests for upgrade with SASL

2016-04-28 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3634:
-

 Summary: Add ducktape tests for upgrade with SASL
 Key: KAFKA-3634
 URL: https://issues.apache.org/jira/browse/KAFKA-3634
 Project: Kafka
  Issue Type: Test
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.0.0


Add SASL upgrade tests (moved out of KAFKA-2693):
  - 0.9.0.x to 0.10.0 with GSSAPI as inter-broker SASL mechanism
  - Rolling upgrade with change in SASL mechanism



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2693) Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms

2016-04-28 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2693:
--
Status: Patch Available  (was: Open)

> Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms
> ---
>
> Key: KAFKA-2693
> URL: https://issues.apache.org/jira/browse/KAFKA-2693
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> KAFKA-2644 runs sanity test, replication tests and benchmarks with SASL using 
> mechanism GSSAPI. For SASL/PLAIN, run sanity test and replication tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3618) Handle ApiVersionRequest before SASL handshake

2016-04-27 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3618:
--
Status: Patch Available  (was: Open)

> Handle ApiVersionRequest before SASL handshake
> --
>
> Key: KAFKA-3618
> URL: https://issues.apache.org/jira/browse/KAFKA-3618
> Project: Kafka
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Handle ApiVersionRequest in SaslServer authenticator before 
> SaslHandshakeRequest to enable clients to obtain handshake request version 
> from the server. This should be implemented after KAFKA-3307 which adds 
> support for version requests after authentication and KAFKA-3149 which adds 
> handshake requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3618) Handle ApiVersionRequest before SASL handshake

2016-04-27 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260826#comment-15260826
 ] 

Rajini Sivaram commented on KAFKA-3618:
---

[~ijuma] Thank you, I have submitted a PR 
(https://github.com/apache/kafka/pull/1276), though I'm not sure why it 
hasn't been linked to this JIRA.

> Handle ApiVersionRequest before SASL handshake
> --
>
> Key: KAFKA-3618
> URL: https://issues.apache.org/jira/browse/KAFKA-3618
> Project: Kafka
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Handle ApiVersionRequest in SaslServer authenticator before 
> SaslHandshakeRequest to enable clients to obtain handshake request version 
> from the server. This should be implemented after KAFKA-3307 which adds 
> support for version requests after authentication and KAFKA-3149 which adds 
> handshake requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3618) Handle ApiVersionRequest before SASL handshake

2016-04-27 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260412#comment-15260412
 ] 

Rajini Sivaram commented on KAFKA-3618:
---

Server-side code for handling ApiVersionsRequest before SASL authentication is 
here: https://github.com/rajinisivaram/kafka/tree/KAFKA-3618. This is built on 
top of the PRs in KAFKA-3307 (for ApiVersionsRequest) and KAFKA-3617 (for unit 
tests). Will rebase and submit PR when those two are merged. Please let me know 
if I should submit a PR earlier to make it easier to review.

> Handle ApiVersionRequest before SASL handshake
> --
>
> Key: KAFKA-3618
> URL: https://issues.apache.org/jira/browse/KAFKA-3618
> Project: Kafka
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Handle ApiVersionRequest in SaslServer authenticator before 
> SaslHandshakeRequest to enable clients to obtain handshake request version 
> from the server. This should be implemented after KAFKA-3307 which adds 
> support for version requests after authentication and KAFKA-3149 which adds 
> handshake requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2658) Implement SASL/PLAIN

2016-04-27 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-2658.
---
Resolution: Duplicate

This was done under KAFKA-3149.

> Implement SASL/PLAIN
> 
>
> Key: KAFKA-2658
> URL: https://issues.apache.org/jira/browse/KAFKA-2658
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> KAFKA-1686 supports SASL/Kerberos using GSSAPI. We should enable more SASL 
> mechanisms. SASL/PLAIN would enable a simpler use of SASL, which along with 
> SSL provides a secure Kafka that uses username/password for client 
> authentication.
> SASL/PLAIN protocol and its uses are described in 
> [https://tools.ietf.org/html/rfc4616]. It is supported in Java.
> This should be implemented after KAFKA-1686. This task should also hopefully 
> enable simpler unit testing of the SASL code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3618) Handle ApiVersionRequest before SASL handshake

2016-04-27 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260201#comment-15260201
 ] 

Rajini Sivaram commented on KAFKA-3618:
---

[~junrao] [~ijuma] I have the server-side code ready (built on top of the PR in 
KAFKA-3307), but wasn't sure if we also want the Java client to request API 
versions in the SASL client authenticator. At the moment, there is no real 
reason to make the client changes since there is only one version of 
SaslHandshakeRequest and hence a version request before the handshake doesn't 
add much value. But if required for 0.10.0, I can work on those changes now (on 
a different PR since the client changes will be built on top of the PR from 
KAFKA-3600). Thanks.

> Handle ApiVersionRequest before SASL handshake
> --
>
> Key: KAFKA-3618
> URL: https://issues.apache.org/jira/browse/KAFKA-3618
> Project: Kafka
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Handle ApiVersionRequest in SaslServer authenticator before 
> SaslHandshakeRequest to enable clients to obtain handshake request version 
> from the server. This should be implemented after KAFKA-3307 which adds 
> support for version requests after authentication and KAFKA-3149 which adds 
> handshake requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3617) Add unit tests for SASL authentication

2016-04-27 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3617:
--
Status: Patch Available  (was: Open)

> Add unit tests for SASL authentication
> --
>
> Key: KAFKA-3617
> URL: https://issues.apache.org/jira/browse/KAFKA-3617
> Project: Kafka
>  Issue Type: Test
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Add unit tests for SASL in the clients project using SASL/PLAIN 
> implementation. Include tests for authentication failures, handshake request 
> flow etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2693) Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms

2016-04-25 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256679#comment-15256679
 ] 

Rajini Sivaram commented on KAFKA-2693:
---

Include tests for
- SASL/PLAIN
- Multiple mechanisms - different mechanisms for replication and clients
- Upgrade test for GSSAPI (0.9.0.x to 0.10.0) with SASL for replication
- Rolling upgrade with change in SASL mechanism

> Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms
> ---
>
> Key: KAFKA-2693
> URL: https://issues.apache.org/jira/browse/KAFKA-2693
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> KAFKA-2644 runs sanity test, replication tests and benchmarks with SASL using 
> mechanism GSSAPI. For SASL/PLAIN, run sanity test and replication tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3618) Handle ApiVersionRequest before SASL handshake

2016-04-25 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3618:
-

 Summary: Handle ApiVersionRequest before SASL handshake
 Key: KAFKA-3618
 URL: https://issues.apache.org/jira/browse/KAFKA-3618
 Project: Kafka
  Issue Type: Task
  Components: security
Affects Versions: 0.9.0.1
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.0.0


Handle ApiVersionRequest in SaslServer authenticator before 
SaslHandshakeRequest to enable clients to obtain handshake request version from 
the server. This should be implemented after KAFKA-3307 which adds support for 
version requests after authentication and KAFKA-3149 which adds handshake 
requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3617) Add unit tests for SASL authentication

2016-04-25 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3617:
-

 Summary: Add unit tests for SASL authentication
 Key: KAFKA-3617
 URL: https://issues.apache.org/jira/browse/KAFKA-3617
 Project: Kafka
  Issue Type: Test
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.0.0


Add unit tests for SASL in the clients project using SASL/PLAIN implementation. 
Include tests for authentication failures, handshake request flow etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2693) Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms

2016-04-25 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2693:
--
Fix Version/s: 0.10.0.0
  Summary: Run relevant ducktape tests with SASL/PLAIN and multiple 
mechanisms  (was: Run relevant ducktape tests with SASL/PLAIN)

> Run relevant ducktape tests with SASL/PLAIN and multiple mechanisms
> ---
>
> Key: KAFKA-2693
> URL: https://issues.apache.org/jira/browse/KAFKA-2693
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> KAFKA-2644 runs sanity test, replication tests and benchmarks with SASL using 
> mechanism GSSAPI. For SASL/PLAIN, run sanity test and replication tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3586) Quotas do not revert to default values when override config is deleted

2016-04-19 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3586:
--
Status: Patch Available  (was: Open)

> Quotas do not revert to default values when override config is deleted
> --
>
> Key: KAFKA-3586
> URL: https://issues.apache.org/jira/browse/KAFKA-3586
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> When quota override configuration for a clientId is updated (e.g. using 
> kafka-configs.sh), the quotas of the client are dynamically updated by the 
> broker. But when the quota override configuration is deleted, the changes are 
> ignored (until broker restart). It would make sense for the client's quotas 
> to revert to the default quota configuration when the override is removed 
> from the config.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3586) Quotas do not revert to default values when override config is deleted

2016-04-19 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3586:
-

 Summary: Quotas do not revert to default values when override 
config is deleted
 Key: KAFKA-3586
 URL: https://issues.apache.org/jira/browse/KAFKA-3586
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.1
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


When quota override configuration for a clientId is updated (e.g. using 
kafka-configs.sh), the quotas of the client are dynamically updated by the 
broker. But when the quota override configuration is deleted, the changes are 
ignored (until broker restart). It would make sense for the client's quotas to 
revert to the default quota configuration when the override is removed from the 
config.
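
For illustration, a minimal sketch of the expected fallback behaviour (all 
names are hypothetical; the real handling lives in the broker's dynamic config 
change handler):

{code:java}
import java.util.Properties;

public class QuotaConfigHandler {
    private final long defaultProducerByteRate;

    public QuotaConfigHandler(long defaultProducerByteRate) {
        this.defaultProducerByteRate = defaultProducerByteRate;
    }

    // Invoked when the clientId's override config changes; a deleted override
    // arrives as an empty Properties and should revert to the default quota
    // instead of being ignored.
    public long effectiveProducerQuota(Properties overrides) {
        String value = overrides.getProperty("producer_byte_rate");
        return value != null ? Long.parseLong(value) : defaultProducerByteRate;
    }
}
{code}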



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3492) support quota based on authenticated user name

2016-04-18 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245888#comment-15245888
 ] 

Rajini Sivaram commented on KAFKA-3492:
---

[~junrao] [~aauradkar] [~thesquelched] I have created KIP-55 with a proposal. I 
have used rates rather than percentage for client quotas so that the units are 
consistent everywhere. Comments are welcome. Thank you.

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.
> Details are in KIP-55: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3492) support quota based on authenticated user name

2016-04-18 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3492:
--
Description: 
Currently, quota is based on the client.id set in the client configuration, 
which can be changed easily. Ideally, quota should be set on the authenticated 
user name. We will need to have a KIP proposal/discussion on this first.

Details are in KIP-55: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users

  was:Currently, quota is based on the client.id set in the client 
configuration, which can be changed easily. Ideally, quota should be set on the 
authenticated user name. We will need to have a KIP proposal/discussion on this 
first.


> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.
> Details are in KIP-55: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3548) Locale is not handled properly in kafka-consumer

2016-04-14 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3548:
--
Status: Patch Available  (was: Open)

> Locale is not handled properly in kafka-consumer
> 
>
> Key: KAFKA-3548
> URL: https://issues.apache.org/jira/browse/KAFKA-3548
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1
>Reporter: Tanju Cataltepe
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> If the JVM locale language is Turkish, which has a different upper case for 
> the lower case letter i, the result is a runtime error caused by 
> org.apache.kafka.clients.consumer.OffsetResetStrategy. More specifically, an 
> enum lookup key *EARLİEST* is generated which does not match the constant 
> *EARLIEST* (note the _dotted capital i_).
> If the locale for the JVM is explicitly set to en_US, the example runs as 
> expected.
> A sample error log is below:
> {noformat}
> [akka://ReactiveKafka/user/$a] Failed to construct kafka consumer
> akka.actor.ActorInitializationException: exception during creation
> at akka.actor.ActorInitializationException$.apply(Actor.scala:172)
> at akka.actor.ActorCell.create(ActorCell.scala:606)
> at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:461)
> at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)
> at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)
> at akka.dispatch.Mailbox.run(Mailbox.scala:223)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:648)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:542)
> at 
> com.softwaremill.react.kafka.ReactiveKafkaConsumer.consumer$lzycompute(ReactiveKafkaConsumer.scala:31)
> at 
> com.softwaremill.react.kafka.ReactiveKafkaConsumer.consumer(ReactiveKafkaConsumer.scala:30)
> at 
> com.softwaremill.react.kafka.KafkaActorPublisher.<init>(KafkaActorPublisher.scala:17)
> at 
> com.softwaremill.react.kafka.ReactiveKafka$$anonfun$consumerActorProps$1.apply(ReactiveKafka.scala:270)
> at 
> com.softwaremill.react.kafka.ReactiveKafka$$anonfun$consumerActorProps$1.apply(ReactiveKafka.scala:270)
> at 
> akka.actor.TypedCreatorFunctionConsumer.produce(IndirectActorProducer.scala:87)
> at akka.actor.Props.newActor(Props.scala:214)
> at akka.actor.ActorCell.newActor(ActorCell.scala:562)
> at akka.actor.ActorCell.create(ActorCell.scala:588)
> ... 7 more
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.kafka.clients.consumer.OffsetResetStrategy.EARLİEST
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.kafka.clients.consumer.OffsetResetStrategy.valueOf(OffsetResetStrategy.java:15)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:588)
> ... 17 more
> {noformat}
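
For illustration, a minimal, self-contained Java sketch of the pitfall above 
(the enum mirrors org.apache.kafka.clients.consumer.OffsetResetStrategy, but 
the class itself is hypothetical); uppercasing the config value with an 
explicit Locale.ROOT avoids the dotted capital İ:

{code:java}
import java.util.Locale;

public class LocaleDemo {
    // Stand-in for org.apache.kafka.clients.consumer.OffsetResetStrategy.
    enum OffsetResetStrategy { LATEST, EARLIEST, NONE }

    public static void main(String[] args) {
        Locale.setDefault(new Locale("tr", "TR")); // simulate a Turkish JVM locale
        String configValue = "earliest";
        try {
            // Locale-sensitive uppercasing maps "i" to the dotted capital "İ",
            // so the lookup key becomes "EARLİEST" and valueOf() throws.
            OffsetResetStrategy.valueOf(configValue.toUpperCase());
        } catch (IllegalArgumentException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
        // Locale-insensitive uppercasing produces "EARLIEST" and succeeds.
        System.out.println(OffsetResetStrategy.valueOf(configValue.toUpperCase(Locale.ROOT)));
    }
}
{code}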



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3548) Locale is not handled properly in kafka-consumer

2016-04-13 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239431#comment-15239431
 ] 

Rajini Sivaram commented on KAFKA-3548:
---

[~nehanarkhede] Is anyone working on this issue? Otherwise, I can submit a PR. 
Thanks.

> Locale is not handled properly in kafka-consumer
> 
>
> Key: KAFKA-3548
> URL: https://issues.apache.org/jira/browse/KAFKA-3548
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1
>Reporter: Tanju Cataltepe
>Assignee: Neha Narkhede
> Fix For: 0.10.0.0
>
>
> If the JVM locale language is Turkish, which has a different upper case for 
> the lower case letter i, the result is a runtime error caused by 
> org.apache.kafka.clients.consumer.OffsetResetStrategy. More specifically, an 
> enum lookup key *EARLİEST* is generated which does not match the constant 
> *EARLIEST* (note the _dotted capital i_).
> If the locale for the JVM is explicitly set to en_US, the example runs as 
> expected.
> A sample error log is below:
> {noformat}
> [akka://ReactiveKafka/user/$a] Failed to construct kafka consumer
> akka.actor.ActorInitializationException: exception during creation
> at akka.actor.ActorInitializationException$.apply(Actor.scala:172)
> at akka.actor.ActorCell.create(ActorCell.scala:606)
> at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:461)
> at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)
> at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)
> at akka.dispatch.Mailbox.run(Mailbox.scala:223)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.common.KafkaException: Failed to construct kafka 
> consumer
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:648)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:542)
> at 
> com.softwaremill.react.kafka.ReactiveKafkaConsumer.consumer$lzycompute(ReactiveKafkaConsumer.scala:31)
> at 
> com.softwaremill.react.kafka.ReactiveKafkaConsumer.consumer(ReactiveKafkaConsumer.scala:30)
> at 
> com.softwaremill.react.kafka.KafkaActorPublisher.<init>(KafkaActorPublisher.scala:17)
> at 
> com.softwaremill.react.kafka.ReactiveKafka$$anonfun$consumerActorProps$1.apply(ReactiveKafka.scala:270)
> at 
> com.softwaremill.react.kafka.ReactiveKafka$$anonfun$consumerActorProps$1.apply(ReactiveKafka.scala:270)
> at 
> akka.actor.TypedCreatorFunctionConsumer.produce(IndirectActorProducer.scala:87)
> at akka.actor.Props.newActor(Props.scala:214)
> at akka.actor.ActorCell.newActor(ActorCell.scala:562)
> at akka.actor.ActorCell.create(ActorCell.scala:588)
> ... 7 more
> Caused by: java.lang.IllegalArgumentException: No enum constant 
> org.apache.kafka.clients.consumer.OffsetResetStrategy.EARLİEST
> at java.lang.Enum.valueOf(Enum.java:238)
> at 
> org.apache.kafka.clients.consumer.OffsetResetStrategy.valueOf(OffsetResetStrategy.java:15)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:588)
> ... 17 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3149) Extend SASL implementation to support more mechanisms

2016-04-13 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3149:
--
Fix Version/s: 0.10.0.0
   Status: Patch Available  (was: Open)

Updated based on the discussions in the mailing threads around KIP-43 and 
KIP-35.

> Extend SASL implementation to support more mechanisms
> -
>
> Key: KAFKA-3149
> URL: https://issues.apache.org/jira/browse/KAFKA-3149
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Make SASL implementation more configurable to enable integration with 
> existing authentication servers.
> Details are in KIP-43: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-43%3A+Kafka+SASL+enhancements]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3492) support quota based on authenticated user name

2016-04-08 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-3492:
-

Assignee: Rajini Sivaram

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>Assignee: Rajini Sivaram
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3492) support quota based on authenticated user name

2016-04-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232612#comment-15232612
 ] 

Rajini Sivaram commented on KAFKA-3492:
---

[~aauradkar] Thank you.

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3492) support quota based on authenticated user name

2016-04-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232324#comment-15232324
 ] 

Rajini Sivaram commented on KAFKA-3492:
---

[~aauradkar] Are you planning to work on this? If not, I will be happy to 
submit a KIP.

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3488) commitAsync() fails if metadata update creates new SASL/SSL connection

2016-04-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232311#comment-15232311
 ] 

Rajini Sivaram commented on KAFKA-3488:
---

[~guozhang] Since requests from the consumer tend to be small, the impact of 
retrying sends shouldn't be too bad. When sends to a destination are unblocked, 
it would typically be possible to send another batch of buffered requests.

> commitAsync() fails if metadata update creates new SASL/SSL connection
> --
>
> Key: KAFKA-3488
> URL: https://issues.apache.org/jira/browse/KAFKA-3488
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> Sasl/SslConsumerTest.testSimpleConsumption() fails intermittently with a 
> failure in {{commitAsync()}}. The exception stack trace shows:
> {quote}
> kafka.api.SaslPlaintextConsumerTest.testSimpleConsumption FAILED
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
>   at 
> kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
> {quote}
> I have recreated this with some additional trace. The tests run with a very 
> small metadata expiry interval, triggering metadata updates quite often. If a 
> metadata request immediately following a {{commitAsync()}} call creates a new 
> SSL/SASL connection, {{ConsumerNetworkClient.poll}} returns to process the 
> connection handshake packets. Since {{ConsumerNetworkClient.poll}} discards 
> all unsent packets before returning from poll, this can result in the failure 
> of the commit - the callback is invoked with {{SendFailedException}}.
> I understand that {{ConsumerNetworkClient.poll()}} discards unsent packets 
> rather than buffer them to keep the code simple. And perhaps it is ok to fail 
> {{commitAsync}} occasionally since the callback does indicate that the caller 
> should retry. But it feels like an unnecessary limitation that requires error 
> handling in client applications when there are no real failures and makes it 
> much harder to test reliably. As special handling to fix issues like 
> KAFKA-3412, KAFKA-2672 adds more complexity to the code anyway, and because 
> it is much harder to debug failures that affect only SSL/SASL, it may be 
> worth considering improving this behaviour.
> I will see if I can submit a PR for the specific issue I was seeing with the 
> impact of handshakes on {{commitAsync()}}, but I will be interested in views 
> on improving the logic in {{ConsumerNetworkClient}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3517) Document configuration of SASL/PLAIN and multiple mechanisms

2016-04-06 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3517:
-

 Summary: Document configuration of SASL/PLAIN and multiple 
mechanisms
 Key: KAFKA-3517
 URL: https://issues.apache.org/jira/browse/KAFKA-3517
 Project: Kafka
  Issue Type: Task
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.10.0.0


Update the security section of the documentation to cover the newly supported 
mechanisms: how to specify and plug in PlainLoginModule, and what it takes to 
enable multiple mechanisms on the broker side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3498) Transient failure in kafka.api.SslConsumerTest.testSimpleConsumption

2016-04-04 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223759#comment-15223759
 ] 

Rajini Sivaram commented on KAFKA-3498:
---

This is a duplicate of KAFKA-3488.

> Transient failure in kafka.api.SslConsumerTest.testSimpleConsumption
> 
>
> Key: KAFKA-3498
> URL: https://issues.apache.org/jira/browse/KAFKA-3498
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>  Labels: newbie
>
> {code}
> Stacktrace
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
>   at 
> kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:49)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:106)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:360)
>   at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
>   at 
> org.gradle.internal.concurrent.StoppableExe

[jira] [Commented] (KAFKA-3488) commitAsync() fails if metadata update creates new SASL/SSL connection

2016-03-31 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220416#comment-15220416
 ] 

Rajini Sivaram commented on KAFKA-3488:
---

[~hachikuji] Thank you for the feedback. I was thinking along the lines of 2) 
since it felt like the simplest change. I will take a look at the commits to 
see if I can reuse the code. I will look into 1) as well. Thanks.

> commitAsync() fails if metadata update creates new SASL/SSL connection
> --
>
> Key: KAFKA-3488
> URL: https://issues.apache.org/jira/browse/KAFKA-3488
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> Sasl/SslConsumerTest.testSimpleConsumption() fails intermittently with a 
> failure in {{commitAsync()}}. The exception stack trace shows:
> {quote}
> kafka.api.SaslPlaintextConsumerTest.testSimpleConsumption FAILED
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:631)
>   at 
> kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
>   at 
> kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
> {quote}
> I have recreated this with some additional trace. The tests run with a very 
> small metadata expiry interval, triggering metadata updates quite often. If a 
> metadata request immediately following a {{commitAsync()}} call creates a new 
> SSL/SASL connection, {{ConsumerNetworkClient.poll}} returns to process the 
> connection handshake packets. Since {{ConsumerNetworkClient.poll}} discards 
> all unsent packets before returning from poll, this can result in the failure 
> of the commit - the callback is invoked with {{SendFailedException}}.
> I understand that {{ConsumerNetworkClient.poll()}} discards unsent packets 
> rather than buffer them to keep the code simple. And perhaps it is ok to fail 
> {{commitAsync}} occasionally since the callback does indicate that the caller 
> should retry. But it feels like an unnecessary limitation that requires error 
> handling in client applications when there are no real failures and makes it 
> much harder to test reliably. As special handling to fix issues like 
> KAFKA-3412, KAFKA-2672 adds more complexity to the code anyway, and because 
> it is much harder to debug failures that affect only SSL/SASL, it may be 
> worth considering improving this behaviour.
> I will see if I can submit a PR for the specific issue I was seeing with the 
> impact of handshakes on {{commitAsync()}}, but I will be interested in views 
> on improving the logic in {{ConsumerNetworkClient}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2910) Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl

2016-03-31 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-2910:
--
Fix Version/s: (was: 0.10.1.0)
   0.10.0.0

> Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl
> 
>
> Key: KAFKA-2910
> URL: https://issues.apache.org/jira/browse/KAFKA-2910
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> {code}
> java.lang.SecurityException: zkEnableSecureAcls is true, but the verification 
> of the JAAS login file failed.
>   at kafka.server.KafkaServer.initZk(KafkaServer.scala:265)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
>   at kafka.utils.TestUtils$.createServer(TestUtils.scala:143)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> kafka.integration.KafkaServerTestHarness$class.setUp(KafkaServerTestHarness.scala:66)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$IntegrationTestHarness$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.IntegrationTestHarness$class.setUp(IntegrationTestHarness.scala:58)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$EndToEndAuthorizationTest$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.EndToEndAuthorizationTest$class.setUp(EndToEndAuthorizationTest.scala:141)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.setUp(SslEndToEndAuthorizationTest.scala:24)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:50)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.g

[jira] [Commented] (KAFKA-2910) Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl

2016-03-31 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219967#comment-15219967
 ] 

Rajini Sivaram commented on KAFKA-2910:
---

[~ijuma] Yes, I will try out a fix, do some testing and then submit a PR. Have 
set the fix version to 0.10.0.0. Thanks.

> Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl
> 
>
> Key: KAFKA-2910
> URL: https://issues.apache.org/jira/browse/KAFKA-2910
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
> Fix For: 0.10.0.0
>
>
> {code}
> java.lang.SecurityException: zkEnableSecureAcls is true, but the verification 
> of the JAAS login file failed.
>   at kafka.server.KafkaServer.initZk(KafkaServer.scala:265)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
>   at kafka.utils.TestUtils$.createServer(TestUtils.scala:143)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> kafka.integration.KafkaServerTestHarness$class.setUp(KafkaServerTestHarness.scala:66)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$IntegrationTestHarness$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.IntegrationTestHarness$class.setUp(IntegrationTestHarness.scala:58)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$EndToEndAuthorizationTest$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.EndToEndAuthorizationTest$class.setUp(EndToEndAuthorizationTest.scala:141)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.setUp(SslEndToEndAuthorizationTest.scala:24)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:50)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>  

[jira] [Commented] (KAFKA-2910) Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl

2016-03-31 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219926#comment-15219926
 ] 

Rajini Sivaram commented on KAFKA-2910:
---

Came across this failure while running the tests locally. I think the issue is 
with tests which don't close Zookeeper clients. Unclosed clients continue to 
attempt to reconnect to the Zookeeper server after the server is shutdown, and 
this loads JAAS configuration. Subsequent tests reuse the JAAS configuration, 
resulting in this transient failure.

One possible sequence causing failure:
# TestA creates Zookeeper server in its setUp
# TestA creates ZkUtils in a test
# TestA shuts down Zookeeper server in its tearDown
# TestA calls {{Configuration.setConfiguration(null)}} in its tearDown to reset 
JAAS config
# TestB starts
# TestA's ZK client sender thread which was not closed attempts to reconnect, 
calling {{Configuration.getConfiguration()}}. JAAS config is now reloaded 
because it was reset by TestA's tearDown. At this point the JAAS config loaded 
is typically an empty config.
# TestB creates JAAS config file and sets System property 
{{java.security.auth.login.config}}
# TestB creates Zookeeper and Kafka servers, expecting 
{{Configuration.getConfiguration()}} to load the config based on the currently 
set System property {{java.security.auth.login.config}}. But since the 
configuration was already loaded in step 6) by TestA before the System 
property was set, JAAS config is not reloaded. TestB setUp fails as a result.
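
A minimal sketch of a teardown ordering that avoids this race (the harness and 
client types are hypothetical stand-ins; the real test utilities are Scala, 
this only shows the shape of the fix):

{code:java}
import javax.security.auth.login.Configuration;
import org.junit.After;

public abstract class SecureTestHarness {
    protected ZkClientLike zkClient;        // hypothetical ZK client wrapper
    protected ZkServerLike zookeeperServer; // hypothetical embedded ZK server

    @After
    public void tearDown() {
        // 1. Close clients first, so no background sender thread can trigger a
        //    reload of an empty JAAS Configuration after it is reset below.
        zkClient.close();
        // 2. Then shut down the server.
        zookeeperServer.shutdown();
        // 3. Finally clear the System property and reset the JAAS configuration.
        System.clearProperty("java.security.auth.login.config");
        Configuration.setConfiguration(null);
    }

    // Hypothetical interfaces standing in for the real ZooKeeper test utilities.
    interface ZkClientLike { void close(); }
    interface ZkServerLike { void shutdown(); }
}
{code}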

> Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl
> 
>
> Key: KAFKA-2910
> URL: https://issues.apache.org/jira/browse/KAFKA-2910
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> {code}
> java.lang.SecurityException: zkEnableSecureAcls is true, but the verification 
> of the JAAS login file failed.
>   at kafka.server.KafkaServer.initZk(KafkaServer.scala:265)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
>   at kafka.utils.TestUtils$.createServer(TestUtils.scala:143)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> kafka.integration.KafkaServerTestHarness$class.setUp(KafkaServerTestHarness.scala:66)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$IntegrationTestHarness$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.IntegrationTestHarness$class.setUp(IntegrationTestHarness.scala:58)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$EndToEndAuthorizationTest$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.EndToEndAuthorizationTest$class.setUp(EndToEndAuthorizationTest.scala:141)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.setUp(SslEndToEndAuthorizationTest.scala:24)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRun

[jira] [Assigned] (KAFKA-2910) Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl

2016-03-31 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-2910:
-

Assignee: Rajini Sivaram

> Failure in kafka.api.SslEndToEndAuthorizationTest.testNoGroupAcl
> 
>
> Key: KAFKA-2910
> URL: https://issues.apache.org/jira/browse/KAFKA-2910
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> {code}
> java.lang.SecurityException: zkEnableSecureAcls is true, but the verification 
> of the JAAS login file failed.
>   at kafka.server.KafkaServer.initZk(KafkaServer.scala:265)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
>   at kafka.utils.TestUtils$.createServer(TestUtils.scala:143)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> kafka.integration.KafkaServerTestHarness$$anonfun$setUp$1.apply(KafkaServerTestHarness.scala:66)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:742)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> kafka.integration.KafkaServerTestHarness$class.setUp(KafkaServerTestHarness.scala:66)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$IntegrationTestHarness$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.IntegrationTestHarness$class.setUp(IntegrationTestHarness.scala:58)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.kafka$api$EndToEndAuthorizationTest$$super$setUp(SslEndToEndAuthorizationTest.scala:24)
>   at 
> kafka.api.EndToEndAuthorizationTest$class.setUp(EndToEndAuthorizationTest.scala:141)
>   at 
> kafka.api.SslEndToEndAuthorizationTest.setUp(SslEndToEndAuthorizationTest.scala:24)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:105)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:56)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:64)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:50)
>   at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.messaging.dispatch.Contex

[jira] [Created] (KAFKA-3488) commitAsync() fails if metadata update creates new SASL/SSL connection

2016-03-31 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3488:
-

 Summary: commitAsync() fails if metadata update creates new 
SASL/SSL connection
 Key: KAFKA-3488
 URL: https://issues.apache.org/jira/browse/KAFKA-3488
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.9.0.1
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Sasl/SslConsumerTest.testSimpleConsumption() fails intermittently with a 
failure in {{commitAsync()}}. The exception stack trace shows:

{quote}
kafka.api.SaslPlaintextConsumerTest.testSimpleConsumption FAILED
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
at 
kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
{quote}

I have recreated this with some additional trace. The tests run with a very 
small metadata expiry interval, triggering metadata updates quite often. If a 
metadata request immediately following a {{commitAsync()}} call creates a new 
SSL/SASL connection, {{ConsumerNetworkClient.poll}} returns to process the 
connection handshake packets. Since {{ConsumerNetworkClient.poll}} discards all 
unsent packets before returning from poll, this can result in the failure of 
the commit - the callback is invoked with {{SendFailedException}}.

I understand that {{ConsumerNetworkClient.poll()}} discards unsent packets 
rather than buffer them to keep the code simple. And perhaps it is ok to fail 
{{commitAsync}} occasionally since the callback does indicate that the caller 
should retry. But it feels like an unnecessary limitation that requires error 
handling in client applications when there are no real failures and makes it 
much harder to test reliably. As special handling to fix issues like 
KAFKA-3412, KAFKA-2672 adds more complexity to the code anyway, and because it 
is much harder to debug failures that affect only SSL/SASL, it may be worth 
considering improving this behaviour.

I will see if I can submit a PR for the specific issue I was seeing with the 
impact of handshakes on {{commitAsync()}}, but I will be interested in views on 
improving the logic in {{ConsumerNetworkClient}}.
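
As a stopgap on the application side, a minimal sketch of the retry that the 
callback contract suggests (the helper and retry policy are illustrative, not 
part of the consumer API):

{code:java}
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.consumer.OffsetCommitCallback;
import org.apache.kafka.common.TopicPartition;

public class RetryingCommit {
    // Re-issue the async commit when the callback reports a failure, e.g. when
    // the unsent request was discarded because poll() returned early to service
    // a new SSL/SASL handshake. Must be invoked from the consumer thread.
    static void commitWithRetry(final KafkaConsumer<?, ?> consumer,
                                final Map<TopicPartition, OffsetAndMetadata> offsets,
                                final int retriesLeft) {
        consumer.commitAsync(offsets, new OffsetCommitCallback() {
            @Override
            public void onComplete(Map<TopicPartition, OffsetAndMetadata> committed,
                                   Exception exception) {
                if (exception != null && retriesLeft > 0)
                    commitWithRetry(consumer, offsets, retriesLeft - 1);
                else if (exception != null)
                    System.err.println("Commit failed after retries: " + exception);
            }
        });
    }
}
{code}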




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3476) -Xloggc is not recognised by IBM java

2016-03-29 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215837#comment-15215837
 ] 

Rajini Sivaram commented on KAFKA-3476:
---

You can export new values for GC and performance opts prior to running Kafka 
scripts with IBM Java. For example:
{quote}
export KAFKA_GC_LOG_OPTS="-Xverbosegclog:$TRACE_DIR/server-gc.log 
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps"
export KAFKA_JVM_PERFORMANCE_OPTS="-server -Djava.awt.headless=true"
{quote}


>  -Xloggc is not recognised by IBM java
> --
>
> Key: KAFKA-3476
> URL: https://issues.apache.org/jira/browse/KAFKA-3476
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, tools
>Affects Versions: 0.9.0.0
>Reporter: Khirod Patra
>
> Getting the below error on an AIX server.
> NOTE: the Java version is:
> --
> java version "1.8.0"
> Java(TM) SE Runtime Environment (build pap6480-20150129_02)
> IBM J9 VM (build 2.8, JRE 1.8.0 AIX ppc64-64 Compressed References 
> 20150116_231420 (JIT enabled, AOT enabled)
> J9VM - R28_Java8_GA_20150116_2030_B231420
> JIT  - tr.r14.java_20150109_82886.02
> GC   - R28_Java8_GA_20150116_2030_B231420_CMPRSS
> J9CL - 20150116_231420)
> JCL - 20150123_01 based on Oracle jdk8u31-b12
> Error:
> ---
> kafka-run-class.sh -name zookeeper -loggc  
> org.apache.zookeeper.server.quorum.QuorumPeerMain 
> ../config/zookeeper.properties
> 
> <verbosegc xmlns="http://www.ibm.com/j9/verbosegc" 
> version="R28_Java8_GA_20150116_2030_B231420_CMPRSS">
> JVMJ9VM007E Command-line option unrecognised: 
> -Xloggc:/home/test_user/containers/kafka_2.11-0.9.0.0/bin/../logs/zookeeper-gc.log
> 
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3346) Rename "Mode" to "SslMode"

2016-03-07 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184063#comment-15184063
 ] 

Rajini Sivaram commented on KAFKA-3346:
---

Since Mode is used by both SSL and SASL, perhaps SslMode is a bit confusing?

> Rename "Mode" to "SslMode"
> --
>
> Key: KAFKA-3346
> URL: https://issues.apache.org/jira/browse/KAFKA-3346
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>
> In the channel builders, the Mode enum is undocumented, so it is unclear that 
> it is used to signify whether the connection is for SSL client or SSL server.
> I suggest renaming to SslMode (although adding documentation will be ok too, 
> if people object to the rename)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3279) SASL implementation checks for unused System property java.security.auth.login.config

2016-02-24 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3279:
--
Status: Patch Available  (was: Open)

> SASL implementation checks for unused System property 
> java.security.auth.login.config
> -
>
> Key: KAFKA-3279
> URL: https://issues.apache.org/jira/browse/KAFKA-3279
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> In many environments (e.g. JEE containers), JAAS configuration may be set 
> using methods other than the System property 
> {{java.security.auth.login.config}}. While Kafka obtains JAAS configuration 
> correctly using {{Configuration.getConfiguration()}}, an exception is thrown 
> if the System property {{java.security.auth.login.config}} is not set, even 
> when the property is never used. There are also misleading error messages 
> which refer to the value of this property, even though it may not be the 
> configuration for which the error is being reported.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3279) SASL implementation checks for unused System property java.security.auth.login.config

2016-02-24 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3279:
-

 Summary: SASL implementation checks for unused System property 
java.security.auth.login.config
 Key: KAFKA-3279
 URL: https://issues.apache.org/jira/browse/KAFKA-3279
 Project: Kafka
  Issue Type: Bug
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


In many environments (e.g. JEE containers), JAAS configuration may be set using 
methods other than the System property {{java.security.auth.login.config}}. 
While Kafka obtains JAAS configuration correctly using 
{{Configuration.getConfiguration()}}, an exception is thrown if the System 
property {{java.security.auth.login.config}} is not set, even when the property 
is never used. There are also misleading error messages which refer to the 
value of this property, even though it may not be the configuration for which 
the error is being reported.
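
For context, a minimal sketch of how such an environment can install a JAAS 
configuration without the System property (the login module and its options 
here are illustrative):

{code:java}
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

public class ProgrammaticJaas {
    public static void main(String[] args) {
        // Install a Configuration directly; Configuration.getConfiguration()
        // then returns it even though java.security.auth.login.config is unset.
        Configuration.setConfiguration(new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                Map<String, Object> options = new HashMap<String, Object>();
                options.put("useKeyTab", "true");
                options.put("keyTab", "/etc/security/kafka.keytab"); // illustrative path
                options.put("principal", "kafka-client@EXAMPLE.COM"); // illustrative principal
                return new AppConfigurationEntry[] {
                    new AppConfigurationEntry(
                        "com.sun.security.auth.module.Krb5LoginModule",
                        AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                        options)
                };
            }
        });
    }
}
{code}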




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3258) BrokerTopicMetrics of deleted topics are never deleted

2016-02-22 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3258:
--
Status: Patch Available  (was: Open)

> BrokerTopicMetrics of deleted topics are never deleted
> --
>
> Key: KAFKA-3258
> URL: https://issues.apache.org/jira/browse/KAFKA-3258
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Per-topic BrokerTopicMetrics generated by brokers are not deleted even when 
> the topic is deleted. This shows misleading metrics in metrics reporters long 
> after a topic is deleted and is also a resource leak.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3258) BrokerTopicMetrics of deleted topics are never deleted

2016-02-22 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3258:
-

 Summary: BrokerTopicMetrics of deleted topics are never deleted
 Key: KAFKA-3258
 URL: https://issues.apache.org/jira/browse/KAFKA-3258
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.1
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Per-topic BrokerTopicMetrics generated by brokers are not deleted even when the 
topic is deleted. This shows misleading metrics in metrics reporters long after 
a topic is deleted and is also a resource leak.
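
A minimal sketch of the kind of cleanup the report calls for (the scan over the 
default Yammer registry and the mbean-name match are illustrative, not the 
actual patch):

{code:java}
import java.util.ArrayList;
import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.MetricName;

public class TopicMetricsCleaner {
    // On topic deletion, unregister every metric tagged with the topic from the
    // default Yammer registry so reporters stop emitting stale values.
    static void removeTopicMetrics(String topic) {
        // Copy the key set to avoid mutating the registry while iterating it.
        for (MetricName name : new ArrayList<MetricName>(
                Metrics.defaultRegistry().allMetrics().keySet())) {
            String mbean = name.getMBeanName();
            if (mbean != null && mbean.contains("topic=" + topic))
                Metrics.defaultRegistry().removeMetric(name);
        }
    }
}
{code}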



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3239) Timing issue in controller metrics on topic delete

2016-02-16 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3239:
--
   Resolution: Cannot Reproduce
Fix Version/s: (was: 0.9.1.0)
   Status: Resolved  (was: Patch Available)

The topic corresponding to the exception was stuck in marked-for-delete mode 
for over a month at the time this exception was thrown.  We have manually 
deleted the topic and unfortunately I don't remember what state it was in 
before the delete. Since there have been fixes for topic delete since this 
topic was marked for delete, I am closing this defect for now. It looks like 
the problem may have been with {{replicas}} being empty, which could also throw 
the same exception. Hopefully, with the fixes for topic delete, we shouldn't 
get into the same state again.

> Timing issue in controller metrics on topic delete
> --
>
> Key: KAFKA-3239
> URL: https://issues.apache.org/jira/browse/KAFKA-3239
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Noticed this exception in our logs:
> {quote}
> java.util.NoSuchElementException: key not found: [sometopic,0]
> at scala.collection.MapLike$class.default(MapLike.scala:228)
> at scala.collection.AbstractMap.default(Map.scala:59)
> at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
> at scala.collection.AbstractTraversable.count(Traversable.scala:104)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
> at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
> at 
> com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
> at 
> com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
> at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
> {quote}
> The exception indicates that the topic was in 
> {{controllerContext.partitionReplicaAssignment}} but not in 
> {{controllerContext.partitionLeadershipInfo}}. This can occur during 
> {{KafkaController.removeTopic()}} since it is not synchronized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3239) Timing issue in controller metrics on topic delete

2016-02-16 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148357#comment-15148357
 ] 

Rajini Sivaram commented on KAFKA-3239:
---

[~ijuma] Yes, you are absolutely right. I will close the PR. I am not sure why 
that exception was thrown though. Will take a look at the calling code.

> Timing issue in controller metrics on topic delete
> --
>
> Key: KAFKA-3239
> URL: https://issues.apache.org/jira/browse/KAFKA-3239
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.9.1.0
>
>
> Noticed this exception in our logs:
> {quote}
> java.util.NoSuchElementException: key not found: [sometopic,0]
> at scala.collection.MapLike$class.default(MapLike.scala:228)
> at scala.collection.AbstractMap.default(Map.scala:59)
> at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
> at scala.collection.AbstractTraversable.count(Traversable.scala:104)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
> at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
> at 
> com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
> at 
> com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
> at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
> {quote}
> The exception indicates that the topic was in 
> {{controllerContext.partitionReplicaAssignment}} but not in 
> {{controllerContext.partitionLeadershipInfo}}. This can occur during 
> {{KafkaController.removeTopic()}} since it is not synchronized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3239) Timing issue in controller metrics on topic delete

2016-02-16 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3239:
--
Status: Patch Available  (was: Open)

> Timing issue in controller metrics on topic delete
> --
>
> Key: KAFKA-3239
> URL: https://issues.apache.org/jira/browse/KAFKA-3239
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> Noticed this exception in our logs:
> {quote}
> java.util.NoSuchElementException: key not found: [sometopic,0]
> at scala.collection.MapLike$class.default(MapLike.scala:228)
> at scala.collection.AbstractMap.default(Map.scala:59)
> at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
> at 
> scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
> at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
> at 
> scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
> at scala.collection.AbstractTraversable.count(Traversable.scala:104)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at 
> kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
> at 
> kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
> at 
> com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
> at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
> at 
> com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
> at 
> com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
> at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
> {quote}
> The exception indicates that the topic was in 
> {{controllerContext.partitionReplicaAssignment}} but not in 
> {{controllerContext.partitionLeadershipInfo}}. This can occur during 
> {{KafkaController.removeTopic()}} since it is not synchronized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3239) Timing issue in controller metrics on topic delete

2016-02-16 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3239:
-

 Summary: Timing issue in controller metrics on topic delete
 Key: KAFKA-3239
 URL: https://issues.apache.org/jira/browse/KAFKA-3239
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.9.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Noticed this exception in our logs:
{quote}
java.util.NoSuchElementException: key not found: [sometopic,0]
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:59)
at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:209)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2$$anonfun$apply$mcI$sp$2.apply(KafkaController.scala:208)
at 
scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:114)
at 
scala.collection.TraversableOnce$$anonfun$count$1.apply(TraversableOnce.scala:113)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at 
scala.collection.TraversableOnce$class.count(TraversableOnce.scala:113)
at scala.collection.AbstractTraversable.count(Traversable.scala:104)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply$mcI$sp(KafkaController.scala:208)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
at 
kafka.controller.KafkaController$$anon$3$$anonfun$value$2.apply(KafkaController.scala:205)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
at 
kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:204)
at 
kafka.controller.KafkaController$$anon$3.value(KafkaController.scala:202)
at 
com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:163)
at 
com.airbnb.metrics.StatsDReporter.processGauge(StatsDReporter.java:37)
at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28)
at 
com.airbnb.metrics.StatsDReporter.sendAMetric(StatsDReporter.java:131)
at 
com.airbnb.metrics.StatsDReporter.sendAllKafkaMetrics(StatsDReporter.java:119)
at com.airbnb.metrics.StatsDReporter.run(StatsDReporter.java:85)
{quote}

The exception indicates that the topic was in 
{{controllerContext.partitionReplicaAssignment}} but not in 
{{controllerContext.partitionLeadershipInfo}}. This can occur during 
{{KafkaController.removeTopic()}} since it is not synchronized.
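A minimal Java sketch of the race, with hypothetical names mirroring the 
controller fields (plain maps standing in for the Scala ones; Scala's 
{{Map.apply}} is what actually throws):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

public class RemoveTopicRaceSketch {

    private final Map<String, int[]> partitionReplicaAssignment = new HashMap<>();
    private final Map<String, Integer> partitionLeadershipInfo = new HashMap<>();

    // Unsynchronized removal: the two maps are updated one after the other,
    // leaving a window where a partition is present in only one of them.
    void removeTopic(String partition) {
        partitionLeadershipInfo.remove(partition);
        // <-- a metrics gauge running here sees the inconsistent state
        partitionReplicaAssignment.remove(partition);
    }

    // Gauge-style read: iterate one map, look entries up in the other.
    int offlinePartitionCount() {
        int count = 0;
        for (String p : partitionReplicaAssignment.keySet()) {
            Integer leader = partitionLeadershipInfo.get(p);
            if (leader == null)
                // Scala's Map.apply throws exactly this when the key is missing.
                throw new NoSuchElementException("key not found: " + p);
            if (leader < 0)
                count++;
        }
        return count;
    }
}
{code}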




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-02-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147565#comment-15147565
 ] 

Rajini Sivaram commented on KAFKA-3218:
---

[~lowerobert] [~joconnor] Have you tried with the PR in 
https://github.com/apache/kafka/pull/888 which loads default config using 
static classloading to overcome this issue?

> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
>Assignee: Rajini Sivaram
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.<clinit>(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> Possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However with 
> this fix there is still a problem at AbstractConfig.java:206,  where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3129) Console Producer/Consumer Issue

2016-02-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147050#comment-15147050
 ] 

Rajini Sivaram commented on KAFKA-3129:
---

At the moment {{PlaintextTransportLayer.close()}} closes the socket channel and 
invalidates the selection key without blocking for the output stream to be shut 
down gracefully. Is this intentional?
{{close()}} could either be a blocking version that shuts down gracefully, or 
remain the non-blocking version with abrupt termination that it is now. 
Implementing {{close()}} with timeouts would be a bigger change.
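For reference, a blocking sketch of what the graceful variant might look like 
(illustrative code, not the actual transport layer): complete any buffered 
writes, shut down the output side so the peer sees a clean EOF, then close.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class GracefulCloseSketch {

    // Illustrative blocking close: finish outgoing writes before closing.
    public static void close(SocketChannel channel, ByteBuffer pendingWrites) throws IOException {
        try {
            // Block (busy-writing on a non-blocking channel) until all buffered
            // outgoing data has been handed to the socket.
            while (pendingWrites != null && pendingWrites.hasRemaining())
                channel.write(pendingWrites);
            // Half-close the connection so the peer can read to EOF cleanly.
            channel.shutdownOutput();
        } finally {
            channel.close();
        }
    }
}
{code}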

> Console Producer/Consumer Issue
> ---
>
> Key: KAFKA-3129
> URL: https://issues.apache.org/jira/browse/KAFKA-3129
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, producer 
>Affects Versions: 0.9.0.0
>Reporter: Vahid Hashemian
>Assignee: Neha Narkhede
> Attachments: kafka-3129.mov, server.log.abnormal.txt, 
> server.log.normal.txt
>
>
> I have been running a simple test case in which I have a text file 
> {{messages.txt}} with 1,000,000 lines (lines contain numbers from 1 to 
> 1,000,000 in ascending order). I run the console consumer like this:
> {{$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test}}
> Topic {{test}} is on 1 partition with a replication factor of 1.
> Then I run the console producer like this:
> {{$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < 
> messages.txt}}
> Then the console starts receiving the messages. And about half the times it 
> goes all the way to 1,000,000. But, in other cases, it stops short, usually 
> at 999,735.
> I tried running another console consumer on another machine and both 
> consumers behave the same way. I can't see anything related to this in the 
> logs.
> I also ran the same experiment with a similar file of 10,000 lines, and am 
> getting a similar behavior. When the consumer does not receive all the 10,000 
> messages it usually stops at 9,864.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-02-09 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3218:
--
Status: Patch Available  (was: Open)

The PR switches the loading of default classes for configuration properties to 
static classloading, so that Kafka classes can be loaded in environments such as 
JEE and OSGi which use different classloading strategies. Most properties 
supplied by the application (e.g. key.serializer) can already be specified as 
class objects rather than class names, avoiding any reliance on Kafka's current 
classloading strategy:
{quote}
properties.put("key.serializer", 
org.apache.kafka.common.serialization.StringSerializer.class)
{quote}

Joe, can you check if this solution works for you?

To enable all features of Kafka to work well in OSGi, all uses of dynamic 
classloading need to be fixed. This needs more work and it would be better to 
do this with a KIP.
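For example, a producer can be configured entirely with class objects, so that 
no class name needs to be resolved through the context classloader (the broker 
address below is a placeholder):

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClassValuedConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // Class objects instead of class-name strings: no dynamic lookup via
        // the thread context classloader is required for these properties.
        props.put("key.serializer", StringSerializer.class);
        props.put("value.serializer", StringSerializer.class);

        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}
{code}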


> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
>Assignee: Rajini Sivaram
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.<clinit>(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> Possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However with 
> this fix there is still a problem at AbstractConfig.java:206,  where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-02-08 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-3218:
-

Assignee: Rajini Sivaram

> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
>Assignee: Rajini Sivaram
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.<clinit>(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> Possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However with 
> this fix there is still a problem at AbstractConfig.java:206,  where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3218) Kafka-0.9.0.0 does not work as OSGi module

2016-02-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137583#comment-15137583
 ] 

Rajini Sivaram commented on KAFKA-3218:
---

Not sure that a configuration option to choose between the current classloader 
and the context classloader is the right approach to support OSGi. While that 
may work in this particular case, where the class being loaded is the default 
partitioner included in Kafka, in the general case you might want to load your 
own partitioner class contained in another bundle. Perhaps it makes sense to 
fix the default case in this JIRA without adding additional config, and raise 
a KIP to support OSGi properly?
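One possible fallback strategy, shown here only as a sketch (hypothetical 
helper, not the PR's code): try the context classloader first and fall back to 
the classloader that loaded the Kafka classes, which in OSGi is the bundle 
classloader.

{code}
public final class ClassLoadingSketch {

    private ClassLoadingSketch() {
    }

    // Hypothetical helper: resolve via the context classloader if set, falling
    // back to the classloader that defined this (Kafka) class on failure.
    public static Class<?> loadClass(String name) throws ClassNotFoundException {
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        if (ctx != null) {
            try {
                return Class.forName(name, true, ctx);
            } catch (ClassNotFoundException e) {
                // Fall through to the defining classloader below.
            }
        }
        return Class.forName(name, true, ClassLoadingSketch.class.getClassLoader());
    }
}
{code}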

> Kafka-0.9.0.0 does not work as OSGi module
> --
>
> Key: KAFKA-3218
> URL: https://issues.apache.org/jira/browse/KAFKA-3218
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
> Environment: Apache Felix OSGi container
> jdk_1.8.0_60
>Reporter: Joe O'Connor
> Attachments: ContextClassLoaderBug.tar.gz
>
>
> KAFKA-2295 changed all Class.forName() calls to use 
> currentThread().getContextClassLoader() instead of the default "classloader 
> that loaded the current class". 
> OSGi loads each module's classes using a separate classloader so this is now 
> broken.
> Steps to reproduce: 
> # install the kafka-clients servicemix OSGi module 0.9.0.0_1
> # attempt to initialize the Kafka producer client from Java code 
> Expected results: 
> - call to "new KafkaProducer()" succeeds
> Actual results: 
> - "new KafkaProducer()" throws ConfigException:
> {quote}Suppressed: java.lang.Exception: Error starting bundle54: 
> Activator start error in bundle com.openet.testcase.ContextClassLoaderBug 
> [54].
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:66)
> ... 12 more
> Caused by: org.osgi.framework.BundleException: Activator start error 
> in bundle com.openet.testcase.ContextClassLoaderBug [54].
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2276)
> at 
> org.apache.felix.framework.Felix.startBundle(Felix.java:2144)
> at 
> org.apache.felix.framework.BundleImpl.start(BundleImpl.java:998)
> at 
> org.apache.karaf.bundle.command.Start.executeOnBundle(Start.java:38)
> at 
> org.apache.karaf.bundle.command.BundlesCommand.doExecute(BundlesCommand.java:64)
> ... 12 more
> Caused by: java.lang.ExceptionInInitializerError
> at 
> org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:156)
> at com.openet.testcase.Activator.start(Activator.java:16)
> at 
> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
> at 
> org.apache.felix.framework.Felix.activateBundle(Felix.java:2226)
> ... 16 more
> *Caused by: org.apache.kafka.common.config.ConfigException: Invalid 
> value org.apache.kafka.clients.producer.internals.DefaultPartitioner for 
> configuration partitioner.class: Class* 
> *org.apache.kafka.clients.producer.internals.DefaultPartitioner could not be 
> found.*
> at 
> org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:255)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:78)
> at 
> org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:94)
> at 
> org.apache.kafka.clients.producer.ProducerConfig.<clinit>(ProducerConfig.java:206)
> {quote}
> Workaround is to call "currentThread().setContextClassLoader(null)" before 
> initializing the kafka producer.
> Possible fix is to catch ClassNotFoundException at ConfigDef.java:247 and 
> retry the Class.forName() call with the default classloader. However with 
> this fix there is still a problem at AbstractConfig.java:206,  where the 
> newInstance() call succeeds but "instanceof" is false because the classes 
> were loaded by different classloaders.
> Testcase attached, see README.txt for instructions.
> See also SM-2743



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2718) Reuse of temporary directories leading to transient unit test failures

2016-02-08 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-2718.
---
Resolution: Fixed

There is another issue with the tests: not directory reuse, but auto-creation 
of topics by unclosed producers when Kafka server ports are reused. KAFKA-3217 
has been raised to address that issue. Closing this JIRA, since no further 
problems with directory reuse have been seen since this fix.

> Reuse of temporary directories leading to transient unit test failures
> --
>
> Key: KAFKA-2718
> URL: https://issues.apache.org/jira/browse/KAFKA-2718
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 0.9.1.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.9.1.0
>
>
> Stack traces in some of the transient unit test failures indicate that 
> temporary directories used for Zookeeper are being reused.
> {quote}
> kafka.common.TopicExistsException: Topic "topic" already exists.
>   at 
> kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:253)
>   at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:237)
>   at kafka.utils.TestUtils$.createTopic(TestUtils.scala:231)
>   at kafka.api.BaseConsumerTest.setUp(BaseConsumerTest.scala:63)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3217) Unit tests which don't close producers auto-create topics in Kafka brokers of other tests when port is reused

2016-02-08 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3217:
-

 Summary: Unit tests which don't close producers auto-create topics 
in Kafka brokers of other tests when port is reused
 Key: KAFKA-3217
 URL: https://issues.apache.org/jira/browse/KAFKA-3217
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Affects Versions: 0.9.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


Consumer tests occasionally fail with the exception:

{quote}
kafka.common.TopicExistsException: Topic "topic" already exists.
at 
kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:261)
at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:245)
at kafka.utils.TestUtils$.createTopic(TestUtils.scala:237)
at kafka.api.BaseConsumerTest.setUp(BaseConsumerTest.scala:65)
{quote}

Recreating this failure with some additional logging showed that a few tests 
which create a topic named "topic" close their Kafka server, but not their 
producer. When the ephemeral port used by the closed Kafka server is reused by 
a Kafka server in a subsequent test, the retries of the previous test's 
producer cause "topic" to be auto-created in the new Kafka server. The consumer 
tests then occasionally fail because the topic has been auto-created before the 
test attempts to create it.
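The natural fix in the affected tests is to close the producer during teardown 
so that no retries outlive the test's broker; a sketch using the 0.9 Java 
producer (the JUnit base class is illustrative):

{code}
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.junit.After;

public abstract class ProducerTestBase {

    protected KafkaProducer<byte[], byte[]> producer;

    @After
    public void tearDown() {
        // Close the producer before the broker goes away so buffered sends and
        // retries cannot reach a later test's broker on the reused port.
        if (producer != null)
            producer.close(5, TimeUnit.SECONDS);
    }
}
{code}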





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3170) Default value of fetch_min_bytes in new consumer is 1024 while doc says it is 1

2016-01-29 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3170:
--
Status: Patch Available  (was: Open)

> Default value of fetch_min_bytes in new consumer is 1024 while doc says it is 
> 1
> ---
>
> Key: KAFKA-3170
> URL: https://issues.apache.org/jira/browse/KAFKA-3170
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>
> FETCH_MIN_BYTES_DOC says:
> {quote}
> The minimum amount of data the server should return for a fetch request. If 
> insufficient data is available the request will wait for that much data to 
> accumulate before answering the request. The default setting of 1 byte means 
> that fetch requests are answered as soon as a single byte of data is 
> available or the fetch request times out waiting for data to arrive. Setting 
> this to something greater than 1 will cause the server to wait for larger 
> amounts of data to accumulate which can improve server throughput a bit at 
> the cost of some additional latency.
> {quote}
> But the default value is actually set to 1024. Either the doc or the value 
> needs to be changed. Perhaps 1 is a better default?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3170) Default value of fetch_min_bytes in new consumer is 1024 while doc says it is 1

2016-01-29 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3170:
-

 Summary: Default value of fetch_min_bytes in new consumer is 1024 
while doc says it is 1
 Key: KAFKA-3170
 URL: https://issues.apache.org/jira/browse/KAFKA-3170
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.9.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


FETCH_MIN_BYTES_DOC says:

{quote}
The minimum amount of data the server should return for a fetch request. If 
insufficient data is available the request will wait for that much data to 
accumulate before answering the request. The default setting of 1 byte means 
that fetch requests are answered as soon as a single byte of data is available 
or the fetch request times out waiting for data to arrive. Setting this to 
something greater than 1 will cause the server to wait for larger amounts of 
data to accumulate which can improve server throughput a bit at the cost of 
some additional latency.
{quote}

But the default value is actually set to 1024. Either the doc or the value 
needs to be changed. Perhaps 1 is a better default?
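Until the code and the doc agree, applications that rely on low-latency fetches 
can pin the value explicitly (broker address and group id below are 
placeholders):

{code}
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FetchMinBytesExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        props.put("key.deserializer", StringDeserializer.class);
        props.put("value.deserializer", StringDeserializer.class);
        // Pin fetch.min.bytes to the documented default of 1 byte so behaviour
        // does not depend on which way the code/doc mismatch is resolved.
        props.put("fetch.min.bytes", 1);

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.close();
    }
}
{code}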



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3169) Kafka broker throws OutOfMemory error with invalid SASL packet

2016-01-29 Thread Rajini Sivaram (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram updated KAFKA-3169:
--
Status: Patch Available  (was: Open)

> Kafka broker throws OutOfMemory error with invalid SASL packet
> --
>
> Key: KAFKA-3169
> URL: https://issues.apache.org/jira/browse/KAFKA-3169
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Critical
> Fix For: 0.9.0.1
>
>
> The receive buffer used in Kafka servers to process SASL packets is 
> unbounded. This can result in brokers crashing with an OutOfMemory error when 
> an invalid SASL packet is received. 
> There is a standard SASL property in Java _javax.security.sasl.maxbuffer_ 
> that can be used to specify buffer size. When properties are added to the 
> Sasl implementation in KAFKA-3149, we can use the standard property to limit 
> receive buffer size. 
> But since this is a potential DoS issue, we should set a reasonable limit in 
> 0.9.0.1. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3169) Kafka broker throws OutOfMemory error with invalid SASL packet

2016-01-29 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-3169:
-

 Summary: Kafka broker throws OutOfMemory error with invalid SASL 
packet
 Key: KAFKA-3169
 URL: https://issues.apache.org/jira/browse/KAFKA-3169
 Project: Kafka
  Issue Type: Bug
  Components: security
Affects Versions: 0.9.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
Priority: Critical


The receive buffer used in Kafka servers to process SASL packets is unbounded. 
This can result in brokers crashing with an OutOfMemory error when an invalid 
SASL packet is received. 

There is a standard SASL property in Java _javax.security.sasl.maxbuffer_ that 
can be used to specify buffer size. When properties are added to the Sasl 
implementation in KAFKA-3149, we can use the standard property to limit receive 
buffer size. 

But since this is a potential DoS issue, we should set a reasonable limit in 
0.9.0.1. 
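A sketch of the kind of bound needed (the limit, exception messages and helper 
are illustrative): read the 4-byte size prefix first and reject the packet 
before allocating its buffer if the size exceeds a configured maximum.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

public class BoundedReceiveSketch {

    // Illustrative cap; a real limit could come from javax.security.sasl.maxbuffer.
    private static final int MAX_RECEIVE_SIZE = 512 * 1024;

    public static ByteBuffer readSizedPacket(ReadableByteChannel channel) throws IOException {
        ByteBuffer sizeBuf = ByteBuffer.allocate(4);
        while (sizeBuf.hasRemaining())
            if (channel.read(sizeBuf) < 0)
                throw new IOException("Connection closed while reading size");
        sizeBuf.flip();
        int size = sizeBuf.getInt();
        // Validate before allocating: a corrupt or malicious size prefix can
        // no longer make the broker allocate an unbounded buffer.
        if (size < 0 || size > MAX_RECEIVE_SIZE)
            throw new IOException("Invalid SASL packet size " + size);
        ByteBuffer payload = ByteBuffer.allocate(size);
        while (payload.hasRemaining())
            if (channel.read(payload) < 0)
                throw new IOException("Connection closed while reading packet");
        payload.flip();
        return payload;
    }
}
{code}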



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

