[jira] [Updated] (KAFKA-1940) Initial checkout and build failing
[ https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daneel Yaitskov updated KAFKA-1940:
-----------------------------------
    Attachment: zinc-upgrade.patch

I tried upgrading the zinc library to 0.3.7 and the issue disappeared. With the patch applied, the Scala tests launch successfully, though 4 of them appear broken. I don't know whether that is related to the upgrade, because before the change the Scala tests were not runnable at all.

{noformat}
kafka.server.LogRecoveryTest > testHWCheckpointWithFailuresMultipleLogSegments FAILED
    junit.framework.AssertionFailedError: Failed to update high watermark for follower after timeout
        at junit.framework.Assert.fail(Assert.java:47)
        at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:619)
        at kafka.server.LogRecoveryTest.testHWCheckpointWithFailuresMultipleLogSegments(LogRecoveryTest.scala:214)

kafka.server.LogRecoveryTest > testHWCheckpointNoFailuresMultipleLogSegments FAILED
    junit.framework.AssertionFailedError: Failed to update high watermark for follower after timeout
        at junit.framework.Assert.fail(Assert.java:47)
        at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:619)
        at kafka.server.LogRecoveryTest.testHWCheckpointNoFailuresMultipleLogSegments(LogRecoveryTest.scala:168)
{noformat}

{noformat}
kafka.producer.AsyncProducerTest > testFailedSendRetryLogic FAILED
    kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:91)
        at kafka.producer.AsyncProducerTest.testFailedSendRetryLogic(AsyncProducerTest.scala:415)

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testNoBroker FAILED
    org.scalatest.junit.JUnitTestFailedError: Should fail with FailedToSendMessageException
        at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:101)
        at org.scalatest.junit.JUnit3Suite.newAssertionFailedException(JUnit3Suite.scala:149)
        at org.scalatest.Assertions$class.fail(Assertions.scala:711)
        at org.scalatest.junit.JUnit3Suite.fail(JUnit3Suite.scala:149)
        at kafka.producer.AsyncProducerTest.testNoBroker(AsyncProducerTest.scala:300)
{noformat}

Initial checkout and build failing
----------------------------------

                 Key: KAFKA-1940
                 URL: https://issues.apache.org/jira/browse/KAFKA-1940
             Project: Kafka
          Issue Type: Bug
          Components: build
    Affects Versions: 0.8.1.2
         Environment: Groovy: 1.8.6
                      Ant: Apache Ant(TM) version 1.9.2 compiled on July 8 2013
                      Ivy: 2.2.0
                      JVM: 1.8.0_25 (Oracle Corporation 25.25-b02)
                      OS: Windows 7 6.1 amd64
            Reporter: Martin Lemanski
              Labels: build
         Attachments: zinc-upgrade.patch

When performing `gradle wrapper` and `gradlew build` as a new developer, I get an exception:

{code}
C:\development\git\kafka>gradlew build --stacktrace
...
FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':core:compileScala'.
> com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesafe/zinc/Setup;
{code}

Details: https://gist.github.com/mlem/ddff83cc8a25b040c157

Current commit:
{code}
C:\development\git\kafka>git rev-parse --verify HEAD
71602de0bbf7727f498a812033027f6cbfe34eb8
{code}

I am evaluating Kafka for my company and wanted to run some tests with it, but couldn't due to this error. I know Gradle can be tricky and it's not easy to set everything up correctly, but this kind of bug turns possible committers/users off.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (KAFKA-2146) adding partition did not find the correct startIndex
[ https://issues.apache.org/jira/browse/KAFKA-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510903#comment-14510903 ]

chenshangan commented on KAFKA-2146:
------------------------------------

I'd like to provide a patch soon.

adding partition did not find the correct startIndex
----------------------------------------------------

                 Key: KAFKA-2146
                 URL: https://issues.apache.org/jira/browse/KAFKA-2146
             Project: Kafka
          Issue Type: Bug
          Components: admin
    Affects Versions: 0.8.2.0
            Reporter: chenshangan
            Priority: Minor
             Fix For: 0.8.3

TopicCommand provides a tool to add partitions to existing topics. It tries to find the startIndex from the existing partitions, and there is a minor flaw in this process: it uses the first partition fetched from ZooKeeper as the start partition, and the first replica id in that partition as the startIndex. First, the first partition fetched from ZooKeeper is not necessarily the start partition; since partition ids begin at zero, the partition with id zero should be the start partition. Second, broker ids do not necessarily begin at 0, so the startIndex is not necessarily the first replica id in the start partition.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
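One plausible reading of the fix described above, sketched as standalone Java (hypothetical names, not the actual TopicCommand code): anchor on the partition with id 0, take its first replica, and convert that broker id into an index within the broker list rather than using the broker id itself.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StartIndexFinder {
    // assignments: partition id -> ordered replica (broker id) list.
    // Uses partition 0 as the start partition, and returns the position of
    // its first replica within the broker list, since broker ids need not
    // begin at 0.
    static int startIndex(List<Integer> brokerIds, Map<Integer, List<Integer>> assignments) {
        List<Integer> partitionZeroReplicas = assignments.get(0);
        if (partitionZeroReplicas == null || partitionZeroReplicas.isEmpty()) {
            throw new IllegalStateException("partition 0 has no replica assignment");
        }
        int index = brokerIds.indexOf(partitionZeroReplicas.get(0));
        if (index < 0) {
            throw new IllegalStateException("replica broker id not in broker list");
        }
        return index;
    }
}
```

For example, with brokers [3, 4, 5, 6] and partition 0 assigned to replicas [4, 5], the startIndex would be 1 (the position of broker 4), not 4.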
Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer
Hi,

Here are the questions I think we should consider:

1. Do we need this at all, given that we have the partition argument in ProducerRecord which gives full control? I think we do need it, because this is a way to plug in a different partitioning strategy at run time and do it in a fairly transparent way.

Yes, we need it if we want to support different partitioning strategies inside Kafka rather than requiring the user to code them externally.

3. Do we need to add the value? I suspect people will have uses for computing something off a few fields in the value to choose the partition. This would be useful in cases where the key was being used for log compaction purposes and did not contain the full information for computing the partition.

I am not entirely sure about this. I guess that most partitioners should not use it. I think it makes it easier to reason about the system if the partitioner only works on the key. However, if the value (and its serialization) are already available, there is not much harm in passing them along.

4. This interface doesn't include either an init() or close() method. It should implement Closeable and Configurable, right?

Right now the only application I can think of for an init() and close() is to read some state information (e.g., load information) that is published on some external distributed storage (e.g., ZooKeeper) by the brokers. It might be useful also for reconfiguration and state migration. I think it's not a very common use case right now, but if the added complexity is not too much, it might be worth having support for these methods.

5. What happens if the user both sets the partition id in the ProducerRecord and sets a partitioner? Does the partition id just get passed in to the partitioner (as sort of implied in this interface)? This is a bit weird, since if you pass in the partition id you kind of expect it to get used, right? Or is it the case that if you specify a partition the partitioner isn't used at all (in which case there is no point in including partition in the Partitioner API)?

The user should be able to override the partitioner on a per-record basis by explicitly setting the partition id. I don't think it makes sense for the partitioners to take hints on the partition. I would even go the extra step, and have default logic that accepts both key and partition id (current interface) and calls partition() only if the partition id is not set. The partition() method does *not* take the partition id as input (only key and value).

Cheers,

--
Gianmarco

Cheers,
-Jay

On Thu, Apr 23, 2015 at 6:55 AM, Sriharsha Chintalapani ka...@harsha.io wrote:

Hi,
Here is the KIP for adding a partitioner interface for producer.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer
There is one open question about how the interface should look. Please take a look and let me know if you prefer one way or the other.

Thanks,
Harsha
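To make the "explicit partition id wins" default logic discussed in this thread concrete, here is a rough standalone sketch. The names and signature are illustrative only, not the actual KIP-22 interface; in particular, whether the value is passed to partition() is exactly open question 3 above.

```java
// Illustrative sketch only; not the actual KIP-22 interface.
interface Partitioner {
    // Choose a partition in [0, numPartitions) from the key.
    int partition(String topic, Object key, int numPartitions);
}

class PartitionResolver {
    // Default logic proposed in the thread: an explicitly set partition id
    // on the record overrides the pluggable partitioner; partition() is
    // invoked only when no partition id was set.
    static int resolve(Integer recordPartition, Partitioner partitioner,
                       String topic, Object key, int numPartitions) {
        if (recordPartition != null) {
            return recordPartition;  // per-record override wins
        }
        return partitioner.partition(topic, key, numPartitions);
    }
}
```

Under this sketch the Partitioner API never sees a partition hint at all, which matches the position that partitioners should not take hints on the partition.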
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Have there been any discussions around separating out authentication and encryption protocols for Kafka endpoints to enable different combinations? In our deployment environment, we would like to use TLS for encryption, but we don't necessarily want to use certificate-based authentication of clients. With the current design, if we want to use an authentication mechanism like SASL/Plain, it looks like we need to define a new security protocol in Kafka which combines SASL/Plain authentication with TLS encryption.

In KIP-12, it looks like the protocols defined are PLAINTEXT (no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos) and SSL (SSL auth/no client auth, SSL encryption). While not all combinations of authentication and encryption protocols are likely to be useful, the ability to combine different mechanisms without modifying Kafka to create combined protocols would make it easier to grow the support for new protocols. I wanted to check if this has already been discussed in the past.

Thank you,

Rajini

On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram rajinisiva...@googlemail.com wrote:

Harsha,

Thank you for the quick response. (Sorry, I had missed sending this reply to the dev-list earlier.)

1. I am not sure what the new server-side code is going to look like after refactoring under KAFKA-1928, but I was assuming that there would be only one Channel implementation shared by both clients and the server. So the ability to run delegated tasks on a different thread would be useful in any case. Even with the server, I imagine the Processor thread is shared by multiple connections with thread affinity for connections, so it might be better not to run potentially long-running delegated tasks on that thread.

2. You may be right that Kafka doesn't need to support renegotiation. The use case I was thinking of was slightly different from the one you described. Periodic renegotiation is sometimes used to refresh encryption keys, especially with ciphers that are weak. Kafka may not have a requirement to support this at the moment.

3. Graceful close needs close-handshake messages to be sent/received to shut down the SSL engine, and this requires managing selection interest based on SSL engine close state. It would be good if the base channel/selector class didn't need to be aware of this.

4. Yes, I agree that the choice is between bringing some selection-related code into the channel or some channel-related code into the selector. We found the code neater with the former when the three cases above were implemented. But it is possible that you can handle it differently with the latter, so I am happy to wait until your patch is ready.

Regards,

Rajini

On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani ka...@harsha.io wrote:

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.

This makes sense. I can change the code to not do it on the network thread. Right now we are doing the handshake as part of the processor (it shouldn't be in the acceptor) and we have multiple processor threads. Do we still see this as an issue if it happens on the same thread as the processor?

--
Harsha
Sent with Airmail

On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (harsh...@fastmail.fm) wrote:

Hi Rajini,
Thanks for the details. I did go through your code. There was a discussion before about not having selector-related code in the channel or extending the selector itself.

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.

This makes sense. I can change the code to not do it on the network thread.

2. *Renegotiation handshake*: During a read operation, handshake status may indicate that renegotiation is required. It will be good to encapsulate this state change (and any knowledge of these SSL-specific state transitions) within SSLChannel. Our experience was that managing keys and state within the SSLChannel rather than in the Selector made this code neater.

Do we even want to support renegotiation? This is a case where a user/client handshakes with the server anonymously but later wants to change
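The NEED_TASK recommendation discussed in point 1 can be sketched roughly as follows (illustrative only, not the actual Kafka SSLChannel code): drain SSLEngine's delegated tasks and hand them to an executor, so the network thread is never blocked by potentially long-running handshake work.

```java
import javax.net.ssl.SSLEngine;
import java.util.concurrent.ExecutorService;

class DelegatedTaskRunner {
    // Drains SSLEngine.getDelegatedTask() and submits each task to an
    // executor, as the SSLEngine javadoc recommends when the handshake
    // status is NEED_TASK. Returns the number of tasks submitted.
    static int runDelegatedTasks(SSLEngine engine, ExecutorService executor) {
        int submitted = 0;
        Runnable task;
        while ((task = engine.getDelegatedTask()) != null) {
            executor.submit(task);
            submitted++;
        }
        return submitted;
    }
}
```

In a real channel implementation, the handshake loop would re-check SSLEngine.getHandshakeStatus() after the submitted tasks complete before continuing to wrap/unwrap.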
[jira] [Created] (KAFKA-2146) adding partition did not find the correct startIndex
chenshangan created KAFKA-2146:
----------------------------------

             Summary: adding partition did not find the correct startIndex
                 Key: KAFKA-2146
                 URL: https://issues.apache.org/jira/browse/KAFKA-2146
             Project: Kafka
          Issue Type: Bug
          Components: admin
    Affects Versions: 0.8.2.0
            Reporter: chenshangan
            Priority: Minor
             Fix For: 0.8.3

TopicCommand provides a tool to add partitions to existing topics. It tries to find the startIndex from the existing partitions, and there is a minor flaw in this process: it uses the first partition fetched from ZooKeeper as the start partition, and the first replica id in that partition as the startIndex. First, the first partition fetched from ZooKeeper is not necessarily the start partition; since partition ids begin at zero, the partition with id zero should be the start partition. Second, broker ids do not necessarily begin at 0, so the startIndex is not necessarily the first replica id in the start partition.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Harsha,

Thank you for the quick response. (Sorry, I had missed sending this reply to the dev-list earlier.)

1. I am not sure what the new server-side code is going to look like after refactoring under KAFKA-1928, but I was assuming that there would be only one Channel implementation shared by both clients and the server. So the ability to run delegated tasks on a different thread would be useful in any case. Even with the server, I imagine the Processor thread is shared by multiple connections with thread affinity for connections, so it might be better not to run potentially long-running delegated tasks on that thread.

2. You may be right that Kafka doesn't need to support renegotiation. The use case I was thinking of was slightly different from the one you described. Periodic renegotiation is sometimes used to refresh encryption keys, especially with ciphers that are weak. Kafka may not have a requirement to support this at the moment.

3. Graceful close needs close-handshake messages to be sent/received to shut down the SSL engine, and this requires managing selection interest based on SSL engine close state. It would be good if the base channel/selector class didn't need to be aware of this.

4. Yes, I agree that the choice is between bringing some selection-related code into the channel or some channel-related code into the selector. We found the code neater with the former when the three cases above were implemented. But it is possible that you can handle it differently with the latter, so I am happy to wait until your patch is ready.

Regards,

Rajini

On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani ka...@harsha.io wrote:

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.

This makes sense. I can change the code to not do it on the network thread. Right now we are doing the handshake as part of the processor (it shouldn't be in the acceptor) and we have multiple processor threads. Do we still see this as an issue if it happens on the same thread as the processor?

--
Harsha
Sent with Airmail

On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (harsh...@fastmail.fm) wrote:

Hi Rajini,
Thanks for the details. I did go through your code. There was a discussion before about not having selector-related code in the channel or extending the selector itself.

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.

This makes sense. I can change the code to not do it on the network thread.

2. *Renegotiation handshake*: During a read operation, handshake status may indicate that renegotiation is required. It will be good to encapsulate this state change (and any knowledge of these SSL-specific state transitions) within SSLChannel. Our experience was that managing keys and state within the SSLChannel rather than in the Selector made this code neater.

Do we even want to support renegotiation? This is a case where a user/client handshakes with the server anonymously but later wants to change and present their identity and establish a new SSL session. In our case, producers or consumers either present their identity (two-way auth) or not. Since these are long-running processes, I don't see that there might be a case where they initially establish the session and later present their identity.

3. *Graceful shutdown of the SSL connections*: Our experience was that we could encapsulate all of the logic for shutting down SSLEngine gracefully within SSLChannel when the selection key and state are owned and managed by SSLChannel.

Can't this be done when channel.close() is called? Any reason to own the selection key?

4. *And finally a minor point:* We found that by managing selection key and selection interests within SSLChannel, the protocol-independent Selector didn't need the concept of handshake at all and all channel state management and handshake-related code could be held in protocol-specific classes. This may be worth taking into consideration since it makes it easier for common network layer code to be maintained without any understanding of the details of individual security protocols. The only thing network code(
[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest
[ https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daneel Yaitskov updated KAFKA-2105:
-----------------------------------
    Attachment: guard-from-null.patch

The patch checks that the argument of partitionsFor is not null.

NullPointerException in client on MetadataRequest
-------------------------------------------------

                 Key: KAFKA-2105
                 URL: https://issues.apache.org/jira/browse/KAFKA-2105
             Project: Kafka
          Issue Type: Bug
          Components: clients
    Affects Versions: 0.8.2.1
            Reporter: Roger Hoover
            Priority: Minor
         Attachments: guard-from-null.patch

With the new producer, if you accidentally pass null to KafkaProducer.partitionsFor(null), it will cause the I/O thread to throw an NPE.

{noformat}
Uncaught error in kafka producer I/O thread:
java.lang.NullPointerException
    at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174)
    at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176)
    at org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55)
    at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81)
    at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218)
    at org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35)
    at org.apache.kafka.common.requests.RequestSend.init(RequestSend.java:29)
    at org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369)
    at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
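The fix described is essentially a precondition check. A minimal sketch in the spirit of the attached patch (not the actual Kafka code; names are illustrative): validate the topic on the caller's thread so the failure surfaces there, instead of letting a null reach serialization on the I/O thread.

```java
// Illustrative null-guard, in the spirit of guard-from-null.patch;
// not the actual Kafka producer code.
class PartitionsForGuard {
    static void checkTopic(String topic) {
        if (topic == null) {
            // Fail fast on the caller's thread instead of letting the
            // metadata request reach the I/O thread and die with an
            // uncaught NullPointerException there.
            throw new IllegalArgumentException("topic cannot be null");
        }
    }
}
```

The design point is that an IllegalArgumentException thrown from partitionsFor itself is actionable by the caller, whereas an NPE in the background Sender thread is only visible in the logs.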
Re: Metrics package discussion
Otis,

The jira for moving the broker to the new metrics is KAFKA-1930. We didn't try to do the conversion in 0.8.2 because (1) the new metrics are missing reporters for popular systems like Graphite and Ganglia, and (2) the histogram support in the new metrics is a bit different and we were not sure if it's good enough for our usage. We will need an answer to both before we can migrate to the new metrics, so the migration may not happen in 0.8.3.

One of the reasons that we want to move to the new metrics is that as we reuse more and more code from the Java client, we will be pulling in metrics in the new format. In order to keep the metrics consistent, it's probably better to just bite the bullet and migrate all Coda Hale metrics to the new ones.

Thanks,

Jun

On Tue, Apr 21, 2015 at 9:29 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

I'm very late to this thread. I'm with Gwen about metrics being the public API (but often not treated as such, sadly). I don't know the details of the internal issues around how metrics are implemented but, for selfish reasons, would hate to see MBeans change - we spent weeks contributing more than a dozen iterations of patches for changing the old Kafka 0.8.1.x metrics to what they are now in 0.8.2. I wish somebody had mentioned these (known?) issues then - since metrics were so drastically changed then, we could have done it right immediately. Also, when you change MBean names and structure you force everyone to rewrite their MBean parsers (not your problem, but still something to be aware of). If metrics are going to be changing, would it be possible to enumerate the changes somewhere?

Finally, I tried finding a JIRA issue for changing metrics, so I can watch it, but couldn't find it here: https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20fixVersion%20%3D%200.8.3%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
Am I looking in the wrong place? Is there an issue for the changes discussed in this thread? Is the decision to do it in 0.8.3 final?

Thanks,
Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Tue, Mar 31, 2015 at 12:43 PM, Steven Wu stevenz...@gmail.com wrote:

"My main concern is that if we don't do the migration in 0.8.3, we will be left with some metrics in YM format and some others in KM format (as we start sharing client code on the broker). This is probably a worse situation to be in."

+1. I am not sure how our servo adaptor will work if there are two formats for metrics, unless there is an easy way to check the format (YM/KM).

On Tue, Mar 31, 2015 at 9:40 AM, Jun Rao j...@confluent.io wrote:

(2) The metrics are clearly part of the client API and we are not changing that (at least for the new client). Arguably, the metrics are also part of the broker-side API. However, since they affect fewer parties (mostly just the Kafka admins), it may be easier to make those changes. My main concern is that if we don't do the migration in 0.8.3, we will be left with some metrics in YM format and some others in KM format (as we start sharing client code on the broker). This is probably a worse situation to be in.

Thanks,

Jun

On Tue, Mar 31, 2015 at 9:26 AM, Gwen Shapira gshap...@cloudera.com wrote:

(2) I believe we agreed that our metrics are a public API. I believe we also agree we don't break APIs in minor releases. So, it seems obvious to me that we can't make breaking changes to metrics in minor releases. I'm not convinced "we did it in the past" is a good reason to do it again. Is there a strong reason to do it in a 0.8.3 time-frame?

On Tue, Mar 31, 2015 at 7:59 AM, Jun Rao j...@confluent.io wrote:

(2) Not sure why we can't do this in 0.8.3. We changed the metrics names in 0.8.2 already. Given that we need to share code between the client and the core, and we need to keep the metrics consistent on the broker, it seems that we have no choice but to migrate to KM. If so, it seems that the sooner we do this, the better. It is important to give people an easy path for migration. However, it may not be easy to keep the mbean names exactly the same. For example, YM has hardcoded attributes (e.g. 1-min-rate, 5-min-rate, 15-min-rate, etc. for rates) that are not available in KM. One benefit of this migration is that one can get the metrics in the client and the broker in the same way.

Thanks,

Jun

On Mon, Mar 30, 2015 at 9:26 PM, Gwen Shapira gshap...@cloudera.com wrote:

(1) It will be interesting to see what others use for monitoring integration, to see what is already covered with existing JMX
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Rajini,

I am exploring this part right now: supporting PLAINTEXT and SSL as protocols, and Kerberos auth as authentication on top of plaintext or ssl (if users want to do encryption over an auth mechanism). This is mainly influenced by a SASL/GSS-API performance issue when I enable encryption. I'll update the KIP once I finalize this on my side.

Thanks,
Harsha

On April 24, 2015 at 1:39:14 AM, Rajini Sivaram (rajinisiva...@googlemail.com) wrote:

Have there been any discussions around separating out authentication and encryption protocols for Kafka endpoints to enable different combinations? In our deployment environment, we would like to use TLS for encryption, but we don't necessarily want to use certificate-based authentication of clients. With the current design, if we want to use an authentication mechanism like SASL/Plain, it looks like we need to define a new security protocol in Kafka which combines SASL/Plain authentication with TLS encryption.

In KIP-12, it looks like the protocols defined are PLAINTEXT (no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos) and SSL (SSL auth/no client auth, SSL encryption). While not all combinations of authentication and encryption protocols are likely to be useful, the ability to combine different mechanisms without modifying Kafka to create combined protocols would make it easier to grow the support for new protocols. I wanted to check if this has already been discussed in the past.

Thank you,

Rajini

On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram rajinisiva...@googlemail.com wrote:

Harsha,

Thank you for the quick response. (Sorry, I had missed sending this reply to the dev-list earlier.)

1. I am not sure what the new server-side code is going to look like after refactoring under KAFKA-1928, but I was assuming that there would be only one Channel implementation shared by both clients and the server. So the ability to run delegated tasks on a different thread would be useful in any case. Even with the server, I imagine the Processor thread is shared by multiple connections with thread affinity for connections, so it might be better not to run potentially long-running delegated tasks on that thread.

2. You may be right that Kafka doesn't need to support renegotiation. The use case I was thinking of was slightly different from the one you described. Periodic renegotiation is sometimes used to refresh encryption keys, especially with ciphers that are weak. Kafka may not have a requirement to support this at the moment.

3. Graceful close needs close-handshake messages to be sent/received to shut down the SSL engine, and this requires managing selection interest based on SSL engine close state. It would be good if the base channel/selector class didn't need to be aware of this.

4. Yes, I agree that the choice is between bringing some selection-related code into the channel or some channel-related code into the selector. We found the code neater with the former when the three cases above were implemented. But it is possible that you can handle it differently with the latter, so I am happy to wait until your patch is ready.

Regards,

Rajini

On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani ka...@harsha.io wrote:

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.

This makes sense. I can change the code to not do it on the network thread. Right now we are doing the handshake as part of the processor (it shouldn't be in the acceptor) and we have multiple processor threads. Do we still see this as an issue if it happens on the same thread as the processor?

--
Harsha
Sent with Airmail

On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (harsh...@fastmail.fm) wrote:

Hi Rajini,
Thanks for the details. I did go through your code. There was a discussion before about not having selector-related code in the channel or extending the selector itself.

1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block (http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel.
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Thank you, Harsha. Yes, that makes sense. Shall take a look when the KIP is finalized. Rajini On Fri, Apr 24, 2015 at 2:34 PM, Sriharsha Chintalapani ka...@harsha.io wrote: Rajini, I am exploring this part right now. To support PLAINTEXT and SSL as protocols and Kerberos auth as authentication on top of plaintext or ssl (if users want to do encryption over an auth mechanism). This is mainly influenced by SASL or GSS-API performance issue when I enable encryption. I’ll update the KIP once I finalize this on my side . Thanks, Harsha On April 24, 2015 at 1:39:14 AM, Rajini Sivaram ( rajinisiva...@googlemail.com) wrote: Have there been any discussions around separating out authentication and encryption protocols for Kafka endpoints to enable different combinations? In our deployment environment, we would like to use TLS for encryption, but we don't necessarily want to use certificate-based authentication of clients. With the current design, if we want to use an authentication mechanism like SASL/plain, it looks like we need to define a new security protocol in Kafka which combines SASL/Plain authentication with TLS encryption. In KIP-12, it looks like the protocols defined are PLAINTEXT (no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos) and SSL(SSL auth/no client auth, SSL encryption). While not all combinations of authentication and encryption protocols are likely to be useful, the ability to combine different mechanisms without modifying Kafka to create combined protocols would make it easier to grow the support for new protocols. I wanted to check if this has already been discussed in the past. Thank you, Rajini On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram rajinisiva...@googlemail.com wrote: Harsha, Thank you for the quick response. (Sorry had missed sending this reply to the dev-list earlier).. 1. I am not sure what the new server-side code is going to look like after refactoring under KAFKA-1928. 
But I was assuming that there would be only one Channel implementation that would be shared by both clients and server. So the ability to run delegated tasks on a different thread would be useful in any case. Even with the server, I imagine the Processor thread is shared by multiple connections with thread affinity for connections, so it might be better not to run potentially long running delegated tasks on that thread. 2. You may be right that Kafka doesn't need to support renegotiation. The usecase I was thinking of was slightly different from the one you described. Periodic renegotiation is used sometimes to refresh encryption keys especially with ciphers that are weak. Kafka may not have a requirement to support this at the moment. 3. Graceful close needs close handshake messages to be be sent/received to shutdown the SSL engine and this requires managing selection interest based on SSL engine close state. It will be good if the base channel/selector class didn't need to be aware of this. 4. Yes, I agree that the choice is between bringing some selection-related code into the channel or some channel related code into selector. We found the code neater with the former when the three cases above were implemented. But it is possible that you can handle it differently with the latter, so I am happy to wait until your patch is ready. Regards, Rajini On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani ka...@harsha.io wrote: 1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block ( http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in SSLChannel without any changes to common code if selection keys are managed within the Channel. This makes sense I can change code to not do it on the network thread. 
Right now we are doing the handshake as part of the processor (it shouldn’t be in the acceptor) and we have multiple processor threads. Do we still see this as an issue if it happens on the same thread as the processor? -- Harsha Sent with Airmail On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani ( harsh...@fastmail.fm) wrote: Hi Rajini, Thanks for the details. I did go through your code. There was a discussion before about not having selector-related code in the channel or extending the selector itself. 1. *Support for running potentially long-running delegated tasks outside the network thread*: It is recommended that delegated tasks indicated by a handshake status of NEED_TASK are run on a separate thread since they may block ( http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html). It is easier to encapsulate this in
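The pattern discussed above — draining NEED_TASK work off the network thread — can be sketched with the standard `SSLEngine` API. The class and method names below are illustrative, not Kafka code; only `SSLEngine.getDelegatedTask()` and the executor hand-off are the point:

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DelegatedTaskRunner {

    // Run every pending delegated task from the engine on the calling thread,
    // returning how many were run. SSLEngine.getDelegatedTask() hands back
    // tasks one at a time and returns null when none remain.
    static int runDelegatedTasks(SSLEngine engine) {
        int count = 0;
        Runnable task;
        while ((task = engine.getDelegatedTask()) != null) {
            task.run();
            count++;
        }
        return count;
    }

    // Off-load the same work to an executor so a (hypothetical) network
    // thread never blocks on potentially long-running handshake tasks.
    static Future<Integer> runDelegatedTasksAsync(SSLEngine engine, ExecutorService executor) {
        return executor.submit(() -> runDelegatedTasks(engine));
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(true);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // No handshake data has been processed yet, so no tasks are pending
        // and the future completes immediately.
        int ran = runDelegatedTasksAsync(engine, pool).get();
        System.out.println("delegated tasks run: " + ran);
        pool.shutdown();
    }
}
```

In a real channel implementation the caller would re-check `engine.getHandshakeStatus()` after the future completes and resume `wrap`/`unwrap` from there.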
Re: [VOTE] KIP-11- Authorization design for kafka security
I see Groups as something we can add incrementally in the current model. The acls take principalType: name, so groups can be represented as group: groupName. We are not managing group memberships anywhere in Kafka and I don’t see the need to do so. So for a topic1, using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map an authenticated user to groups (this is how Hadoop and Storm work). The plugin could be mapping a user to linux/ldap/active directory groups, but that is again up to the implementation. What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don’t necessarily see why this would be an absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect, the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in the public API, but I thought it is part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP?
On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier by knowing exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about acls being duplicated, and the simplest solution would be to modify the acls class so it holds a set of principals instead of a single principal, i.e. user_a, user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType, which is: Deny acls should be evaluated before allow acls. To give you an example, suppose we have the following acls: acl1 - user1 is allowed to READ from all hosts. acl2 - host1 is allowed to READ regardless of who is the user. acl3 - host2 is allowed to READ regardless of who is the user. acl4 - user1 is denied to READ from host1. As stated in the KIP we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have acls for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2, the action will be allowed and it does not matter if we match acl3 or acl1, so I don’t think the evaluation order matters here. “Will people actually use hosts with users?” I really don’t know, but given ACLs are part of our public APIs I thought it is better to try and cover more use cases. If others think this extra complexity is not worth the value it's adding, please raise your concerns so we can discuss whether it should be removed from the acl structure. Note that even in the absence of hosts from ACLs, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = “host”; this is easy to add and can be an incremental improvement. They will however lose the ability to restrict access to users just from a set of hosts. We agreed to offer a CLI to overcome the JSON acl config https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI).
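The deny-before-allow rule worked through above can be sketched in a few lines. This is a hypothetical, simplified ACL model for illustration only (the real KIP-11 Authorizer interface also carries resource and operation, and the class names here are not Kafka's):

```java
import java.util.Arrays;
import java.util.List;

public class AclEvaluator {

    static final String WILDCARD = "*";

    static class Acl {
        final String principal;   // e.g. "user1", or "*" for any user
        final String host;        // e.g. "host1", or "*" for any host
        final boolean deny;       // true = DENY acl, false = ALLOW acl

        Acl(String principal, String host, boolean deny) {
            this.principal = principal;
            this.host = host;
            this.deny = deny;
        }

        boolean matches(String p, String h) {
            return (principal.equals(WILDCARD) || principal.equals(p))
                && (host.equals(WILDCARD) || host.equals(h));
        }
    }

    // DENY acls are evaluated before ALLOW acls, as the KIP specifies, so a
    // matching deny wins even when wildcard allows also match.
    static boolean authorize(List<Acl> acls, String principal, String host) {
        for (Acl a : acls) {
            if (a.deny && a.matches(principal, host)) return false;
        }
        for (Acl a : acls) {
            if (!a.deny && a.matches(principal, host)) return true;
        }
        return false; // default deny when nothing matches
    }

    public static void main(String[] args) {
        List<Acl> acls = Arrays.asList(
            new Acl("user1", "*", false),    // acl1: user1 allowed from all hosts
            new Acl("*", "host1", false),    // acl2: host1 allowed for any user
            new Acl("*", "host2", false),    // acl3: host2 allowed for any user
            new Acl("user1", "host1", true)  // acl4: user1 denied from host1
        );
        System.out.println(authorize(acls, "user1", "host1")); // deny wins
        System.out.println(authorize(acls, "user1", "host2")); // allowed
    }
}
```

With this ordering, the relative order of acl1/acl3 is indeed irrelevant for user1 on host2, matching the argument in the thread.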
Re: [VOTE] KIP-11- Authorization design for kafka security
We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. [...]
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry, for the confusion. I'm not sure my last email is clear enough either... Consumers will have a Principal which may belong to a group. But consumer configuration also have a group.id, which controls how partitions are shared between consumers and how offsets are committed. I'm talking about those Groups. On Fri, Apr 24, 2015 at 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) [...]
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks for your comments Gari. My responses are inline. Thanks Parth On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote: Sorry - fat fingered send ... Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the client IP as Principals). I think the user -> group mapping can be done at the Authorization implementation layer. In any case, as you pointed out, the Session is part of another JIRA and I think a PR is out https://reviews.apache.org/r/27204/diff/ and we should discuss it on that PR. 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {"version":1, "acls":[ { "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"], "principals":["alice","kafka-devs"] ...
or {"version":1, "acls":[ { "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-devs"] But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals. The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward. All the principals that you listed above can be supported with the current design. Acls take a KafkaPrincipal as input, which is a combination of type and principal name, and the authorizer implementations are free to create any extension of this which covers group: groupName, host: HostName, kerberos: kerberosUserName and any other types that may come up. I am not sure how encryption key storage is relevant to the Authorizer, so it will be great if you can elaborate. 3) Are you sure that you want authorize to take a session object? If we use the model in one above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs. I think it is better to take a session which can just be a wrapper on top of Subject + host for now. This allows for extension, which in my opinion is more future-proof. 4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well? In the default implementation I don't plan to do this. It is easy to add later if we want to, but I am not sure why this would ever be expensive when acls are cached; the number of acls on a single topic should be very small, so iterating over them with simple string comparison should not really be expensive. Thanks Parth On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com wrote: Not sure if my newbie vote will count, but I think you are getting pretty close here.
[...]
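Parth's point that acls take a KafkaPrincipal combining a principalType and a name (e.g. group:kafka-devs, host:host1) can be illustrated with a minimal parser. The class name and parsing rule below are hypothetical, not Kafka's actual classes:

```java
public class PrincipalParser {

    // Minimal stand-in for a principal of the form "type:name", e.g.
    // "user:alice" or "group:kafka-devs".
    static class TypedPrincipal {
        final String principalType;
        final String name;

        TypedPrincipal(String principalType, String name) {
            this.principalType = principalType;
            this.name = name;
        }

        // Parse "type:name"; split on the first ':' only, so names that
        // themselves contain ':' survive intact.
        static TypedPrincipal fromString(String s) {
            int i = s.indexOf(':');
            if (i < 0) {
                throw new IllegalArgumentException("expected type:name, got " + s);
            }
            return new TypedPrincipal(s.substring(0, i), s.substring(i + 1));
        }
    }

    public static void main(String[] args) {
        TypedPrincipal p = TypedPrincipal.fromString("group:kafka-devs");
        // An authorizer implementation can dispatch on principalType to
        // support users, groups, hosts, etc. without interface changes.
        System.out.println(p.principalType + " / " + p.name);
    }
}
```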
[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511899#comment-14511899 ] Onur Karaman commented on KAFKA-2149: - Okay that sounds fair. I hadn't seen the upgrade docs you provided before. I had only read KIP-2, which included the line bq. The idea is that in 0.8.3, we will default wire.protocol.version to 0.8.2 Can you update the KIP? fix default InterBrokerProtocolVersion -- Key: KAFKA-2149 URL: https://issues.apache.org/jira/browse/KAFKA-2149 Project: Kafka Issue Type: Bug Reporter: Onur Karaman Assignee: Onur Karaman Attachments: KAFKA-2149.patch Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) when upgrading brokers to include KAFKA-1809. This specifically happened when an older broker receives an UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set its inter.broker.protocol.version to 0.8.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
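For context, the workaround the report describes amounts to pinning the inter-broker protocol in each broker's server.properties during a rolling upgrade. A sketch of the relevant line (the exact accepted value syntax depends on the release, so treat this as illustrative):

```properties
# Pin the inter-broker wire protocol while brokers run mixed versions;
# bump it only once every broker is on the new release.
inter.broker.protocol.version=0.8.2
```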
Re: Review Request 27204: Patch for KAFKA-1683
On April 24, 2015, 7:07 p.m., Gari Singh wrote: 1) I think that Session should take a Subject rather than just a single Principal. The reason for this is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the client IP as Principals). This is also more in line with JAAS and would fit better with authentication modules. 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). I am not sure how the Subject is valid here. The client holds its own Subject and the server holds its own Subject. Once SASL auth is done you get the client's authorization ID by calling saslServer.getAuthorizationID(), which will give you a String of the client's principal. Why would we associate a Subject rather than just a principal? - Sriharsha --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27204/#review81522 --- On Oct. 26, 2014, 5:37 a.m., Gwen Shapira wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27204/ --- (Updated Oct. 26, 2014, 5:37 a.m.) Review request for kafka.
Bugs: KAFKA-1683
https://issues.apache.org/jira/browse/KAFKA-1683
Repository: kafka
Description
---
added test for Session
Diffs
---
core/src/main/scala/kafka/network/RequestChannel.scala 4560d8fb7dbfe723085665e6fd611c295e07b69b
core/src/main/scala/kafka/network/SocketServer.scala cee76b323e5f3e4c783749ac9e78e1ef02897e3b
core/src/test/scala/unit/kafka/network/SocketServerTest.scala 5f4d85254c384dcc27a5a84f0836ea225d3a901a
Diff: https://reviews.apache.org/r/27204/diff/
Testing
---
Thanks,
Gwen Shapira
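The JAAS Subject that Gari proposes in the review above can carry several Principals at once. A minimal sketch using the standard javax.security.auth API; the two principal classes here are hypothetical, not the Kafka classes under discussion:

```java
import javax.security.auth.Subject;
import java.security.Principal;

public class SubjectSketch {

    // Hypothetical principal types; a single Subject can carry both.
    static class UserPrincipal implements Principal {
        private final String name;
        UserPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    static class GroupPrincipal implements Principal {
        private final String name;
        GroupPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    // Build a Subject holding both a user and a group principal, as an
    // authentication module might after login.
    static Subject buildSubject() {
        Subject subject = new Subject();
        subject.getPrincipals().add(new UserPrincipal("alice"));
        subject.getPrincipals().add(new GroupPrincipal("kafka-devs"));
        return subject;
    }

    public static void main(String[] args) {
        Subject s = buildSubject();
        // An authorizer could select principals by type on the same subject.
        System.out.println("group principals: " + s.getPrincipals(GroupPrincipal.class).size());
    }
}
```

Sriharsha's counterpoint in the thread is that with SASL the server only ever sees the client's authorization ID string, so a full Subject may be unnecessary on the server side.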
Re: [KIP-DISCUSSION] KIP-13 Quotas
To turn it off/on we can just add a clear config (quota.enforcement.enabled) or similar. Thanks, Joel On Fri, Apr 24, 2015 at 06:27:22PM -0400, Gari Singh wrote: If we can't disable it, then can we use the tried and true method of using -1 to indicate that no throttling should take place? On Tue, Apr 21, 2015 at 4:38 PM, Joel Koshy jjkosh...@gmail.com wrote: In either approach I'm not sure we considered being able to turn it off completely. IOW, no it is not a plugin, if that's what you are asking. We can set very high defaults by default and in the absence of any overrides it would effectively be off. The quota enforcement is actually already part of the metrics package. The new code (that exercises it) will be added to wherever the metrics are being measured. Thanks, Joel On Tue, Apr 21, 2015 at 03:04:07PM -0400, Tong Li wrote: Joel, Nice write up. Couple of questions, not sure if they have been answered. Since we will have a call later today, I would like to ask here as well so that we can talk about them if not responded to in the email discussion. 1. Where will the new code be plugged in, that is, where is the plugin point, KafkaApi? 2. Can this quota control be disabled/enabled without affecting anything else? From the design wiki page, it seems to me that each request will at least pay the penalty of checking quota enablement. Thanks. Tong Li OpenStack Kafka Community Development Building 501/B205 liton...@us.ibm.com From: Joel Koshy jjkosh...@gmail.com To: dev@kafka.apache.org Date: 04/21/2015 01:22 PM Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas Given the caveats, it may be worth doing further investigation on the alternate approach, which is to use a dedicated DelayQueue for requests that violate quota, and compare pros/cons. So the approach is the following: all request handling occurs normally (i.e., unchanged from what we do today), i.e., purgatories will be unchanged. After handling a request and before sending the response, check if the request has violated a quota.
If so, then enqueue the response into a DelayQueue. All responses can share the same DelayQueue. Send those responses out after the delay has been met. There are some benefits to doing this: - We will eventually want to quota other requests as well. The above seems to be a clean staged approach that should work uniformly for all requests. i.e., parse request -> handle request normally -> check quota -> hold in delay queue if quota violated -> respond. All requests can share the same DelayQueue. (In contrast, with the current proposal we could end up with a bunch of purgatories, or a combination of purgatories and delay queues.) - Since this approach does not need any fundamental modifications to the current request handling, it addresses the caveats that Adi noted (which is holding producer requests/fetch requests longer than strictly necessary if quota is violated, since the proposal was to not watch on keys in that case). Likewise it addresses the caveat that Guozhang noted (we may return no error if the request is held long enough due to quota violation and satisfy a producer request that may have in fact exceeded the ack timeout), although it is probably reasonable to hide this case from the user. - By avoiding the caveats it also avoids the suggested work-around to the caveats, which is effectively to add a min-hold-time to the purgatory. Although this is not a lot of code, I think it adds a quota-driven feature to the purgatory which is already non-trivial and should ideally remain unassociated with quota enforcement. For this to work well we need to be sure that we don't hold a lot of data in the DelayQueue - and therein lies a quirk to this approach. Producer responses (and most other responses) are very small so there is no issue. Fetch responses are fine as well - since we read off a FileMessageSet in response (zero-copy). This will remain true even when we support SSL since encryption occurs at the session layer (not the application layer).
Topic metadata response can be a problem though. For this we ideally want to build the topic metadata response only when we are ready to respond. So for metadata-style responses which could contain large response objects we may want to put the quota check and delay queue _before_ handling the request. So the design in this approach would need an amendment: provide a choice of where to put a request in the delay queue: either before handling or after handling (before response). So for:
small request, large response: delay queue before handling
large request, small response: delay queue after handling, before response
small request, small response: either is fine
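The hold-the-response mechanism Joel describes maps directly onto java.util.concurrent's DelayQueue. A minimal sketch — the class names are illustrative, not Kafka code:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class ThrottleSketch {

    // A response held back until its quota-imposed delay expires.
    static class ThrottledResponse implements Delayed {
        final String payload;
        final long sendAtNanos;

        ThrottledResponse(String payload, long delayMs) {
            this.payload = payload;
            this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
        queue.add(new ThrottledResponse("fetch-response-1", 50));

        // poll() returns null while the delay has not yet expired...
        System.out.println("immediately available: " + queue.poll());

        // ...and take() blocks until it has, at which point a sender thread
        // would hand the response to the network layer.
        ThrottledResponse ready = queue.take();
        System.out.println("sent after delay: " + ready.payload);
    }
}
```

A dedicated sender thread looping on `take()` gives the "all responses can share the same DelayQueue" property from the proposal.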
Re: [KIP-DISCUSSION] KIP-13 Quotas
Hey Jun/Joel, Yeah we will definitely want to quota non-produce/consume requests. Especially offset commit and any other requests the consumer can trigger could easily get invoked in a tight loop by accident. We haven't talked about this a ton, but presumably the mechanism for all these would just be a general requests/sec limit that covers all requests? -Jay On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao j...@confluent.io wrote: Joel, What you suggested makes sense. Not sure if there is a strong need to throttle TMR though since it should be infrequent. Thanks, Jun On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy jjkosh...@gmail.com wrote: Given the caveats, it may be worth doing further investigation on the alternate approach which is to use a dedicated DelayQueue for requests that violate quota and compare pros/cons. So the approach is the following: all request handling occurs normally (i.e., unchanged from what we do today). i.e., purgatories will be unchanged. After handling a request and before sending the response, check if the request has violated a quota. If so, then enqueue the response into a DelayQueue. All responses can share the same DelayQueue. Send those responses out after the delay has been met. There are some benefits to doing this: - We will eventually want to quota other requests as well. The above seems to be a clean staged approach that should work uniformly for all requests. i.e., parse request - handle request normally - check quota - hold in delay queue if quota violated - respond . All requests can share the same DelayQueue. (In contrast with the current proposal we could end up with a bunch of purgatories, or a combination of purgatories and delay queues.) 
[...]
So for:
- small request, large response: delay queue before handling
- large request, small response: delay queue after handling, before response
- small request, small response: either is fine
- large request, large response: we really cannot do anything here but we don't really have this scenario yet
So the design would look like this:
- parse request
- before handling the request, check if quota violated; if so compute two delay numbers: a before-handling delay and a before-response delay
- if before-handling delay > 0, insert into before-handling delay queue
- handle the request
- if before-response delay > 0, insert into before-response delay queue
- respond
Just throwing this out there for discussion. Thanks, Joel On Thu, Apr 16, 2015 at 02:56:23PM -0700, Jun Rao wrote: The quota check for the fetch request is a bit different from the produce request. I assume that for the fetch request, we will first get an estimated fetch response size to do the quota check. There are two things to think about. First, when we actually send the response, we probably don't want to record the metric again since it will double count. Second, the bytes that the fetch response actually sends could be more than the estimate. This means that the
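Joel's proposal of parking quota-violating responses in one shared queue maps naturally onto `java.util.concurrent.DelayQueue`. The sketch below is purely illustrative (the `ThrottledResponse` class and its string payload are made up, not Kafka code): a response that violated its quota is enqueued with its throttle delay, and `take()` only releases it once the delay has elapsed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: quota-violating responses parked in one shared DelayQueue
// and released only after their throttle delay has elapsed.
public class ThrottledResponseQueue {

    static class ThrottledResponse implements Delayed {
        final String response;    // stands in for the real response object
        final long releaseNanos;  // absolute time at which we may respond

        ThrottledResponse(String response, long delayMs) {
            this.response = response;
            this.releaseNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(releaseNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    // Park two responses with different penalties, then release them in delay order.
    public static List<String> demo() throws InterruptedException {
        DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
        queue.put(new ThrottledResponse("fetch-response", 50));
        queue.put(new ThrottledResponse("produce-response", 10));
        List<String> order = new ArrayList<>();
        order.add(queue.take().response);  // blocks until the 10 ms delay expires
        order.add(queue.take().response);
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

Note that because all response types implement the same `Delayed` contract, produce, fetch, and metadata responses really can share a single queue, which is the uniformity argument made in the thread.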
[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511937#comment-14511937 ] Gwen Shapira commented on KAFKA-2149: - Oops, the kip actually contained both options and we never removed the one we didn't implement. Sorry about it. You probably noticed that the configuration name is wrong there too. KIP-2 now has the correct version of the upgrade instructions, but I still recommend using the docs for reference. fix default InterBrokerProtocolVersion -- Key: KAFKA-2149 URL: https://issues.apache.org/jira/browse/KAFKA-2149 Project: Kafka Issue Type: Bug Reporter: Onur Karaman Assignee: Onur Karaman Attachments: KAFKA-2149.patch Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) with upgrading brokers to include KAFKA-1809. This specifically happened when an older broker receives a UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set their inter.broker.protocol.version to 0.8.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
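For reference, the work-around discussed in this ticket is to pin the wire protocol explicitly in each broker's config during a rolling upgrade; the property name comes from the ticket text itself, and the surrounding lines are an illustrative sketch (check the upgrade docs linked elsewhere in this digest for the authoritative steps).

```properties
# server.properties on each broker while the rolling upgrade is in progress:
# keep speaking the old wire protocol until every broker runs the new code.
inter.broker.protocol.version=0.8.2
```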
[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Onur Karaman updated KAFKA-2149: Resolution: Not A Problem Status: Resolved (was: Patch Available) fix default InterBrokerProtocolVersion -- Key: KAFKA-2149 URL: https://issues.apache.org/jira/browse/KAFKA-2149 Project: Kafka Issue Type: Bug Reporter: Onur Karaman Assignee: Onur Karaman Attachments: KAFKA-2149.patch Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) with upgrading brokers to include KAFKA-1809. This specifically happened when an older broker receives a UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set their inter.broker.protocol.version to 0.8.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512134#comment-14512134 ] Onur Karaman commented on KAFKA-1809: - As a follow-up: there was inconsistent documentation between KIP-2 and 0.8.3 documentation. It has already been resolved. Upgrade steps are specified here: http://kafka.apache.org/083/documentation.html#upgrade Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Fix For: 0.8.3 Attachments: KAFKA-1809-DOCs.V1.patch, KAFKA-1809.patch, KAFKA-1809.v1.patch, KAFKA-1809.v2.patch, KAFKA-1809.v3.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch, KAFKA-1809_2015-01-06_11:46:22.patch, KAFKA-1809_2015-01-13_18:16:21.patch, KAFKA-1809_2015-01-28_10:26:22.patch, KAFKA-1809_2015-02-02_11:55:16.patch, KAFKA-1809_2015-02-03_10:45:31.patch, KAFKA-1809_2015-02-03_10:46:42.patch, KAFKA-1809_2015-02-03_10:52:36.patch, KAFKA-1809_2015-03-16_09:02:18.patch, KAFKA-1809_2015-03-16_09:40:49.patch, KAFKA-1809_2015-03-27_15:04:10.patch, KAFKA-1809_2015-04-02_15:12:30.patch, KAFKA-1809_2015-04-02_19:03:58.patch, KAFKA-1809_2015-04-04_22:00:13.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. 
The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [KIP-DISCUSSION] KIP-13 Quotas
If we can't disable it, then can we use the tried and true method of using -1 to indicate that no throttling should take place? On Tue, Apr 21, 2015 at 4:38 PM, Joel Koshy jjkosh...@gmail.com wrote: In either approach I'm not sure we considered being able to turn it off completely. IOW, no it is not a plugin if that's what you are asking. We can set very high defaults by default and in the absence of any overrides it would effectively be off. The quota enforcement is actually already part of the metrics package. The new code (that exercises it) will be added to wherever the metrics are being measured. Thanks, Joel On Tue, Apr 21, 2015 at 03:04:07PM -0400, Tong Li wrote: Joel, Nice write up. Couple of questions, not sure if they have been answered. Since we will have a call later today, I would like to ask here as well so that we can talk about if not responded in email discussion. 1. Where the new code will be plugged in, that is, where is the plugin point, KafkaApi? 2. Can this quota control be disabled/enabled without affect anything else? From the design wiki page, it seems to me that each request will at least pay a penalty of checking quota enablement. Thanks. Tong Li OpenStack Kafka Community Development Building 501/B205 liton...@us.ibm.com From: Joel Koshy jjkosh...@gmail.com To: dev@kafka.apache.org Date: 04/21/2015 01:22 PM Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas Given the caveats, it may be worth doing further investigation on the alternate approach which is to use a dedicated DelayQueue for requests that violate quota and compare pros/cons. So the approach is the following: all request handling occurs normally (i.e., unchanged from what we do today). i.e., purgatories will be unchanged. After handling a request and before sending the response, check if the request has violated a quota. If so, then enqueue the response into a DelayQueue. All responses can share the same DelayQueue. Send those responses out after the delay has been met. 
There are some benefits to doing this: - We will eventually want to quota other requests as well. The above seems to be a clean staged approach that should work uniformly for all requests. i.e., parse request - handle request normally - check quota - hold in delay queue if quota violated - respond . All requests can share the same DelayQueue. (In contrast with the current proposal we could end up with a bunch of purgatories, or a combination of purgatories and delay queues.) - Since this approach does not need any fundamental modifications to the current request handling, it addresses the caveats that Adi noted (which is holding producer requests/fetch requests longer than strictly necessary if quota is violated since the proposal was to not watch on keys in that case). Likewise it addresses the caveat that Guozhang noted (we may return no error if the request is held long enough due to quota violation and satisfy a producer request that may have in fact exceeded the ack timeout) although it is probably reasonable to hide this case from the user. - By avoiding the caveats it also avoids the suggested work-around to the caveats which is effectively to add a min-hold-time to the purgatory. Although this is not a lot of code, I think it adds a quota-driven feature to the purgatory which is already non-trivial and should ideally remain unassociated with quota enforcement. For this to work well we need to be sure that we don't hold a lot of data in the DelayQueue - and therein lies a quirk to this approach. Producer responses (and most other responses) are very small so there is no issue. Fetch responses are fine as well - since we read off a FileMessageSet in response (zero-copy). This will remain true even when we support SSL since encryption occurs at the session layer (not the application layer). Topic metadata response can be a problem though. For this we ideally want to build the topic metadata response only when we are ready to respond. 
So for metadata-style responses which could contain large response objects we may want to put the quota check and delay queue _before_ handling the request. So the design in this approach would need an amendment: provide a choice of where to put a request in the delay queue: either before handling or after handling (before response). So for: small request, large response: delay queue before handling large request, small response: delay queue after handling, before response small request, small response: either is fine large request, large response: we really cannot do anything here but we don't really have this scenario yet So the design would look like this: - parse request - before handling request check if quota violated; if so compute two delay numbers: -
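Earlier in this thread the question of turning quotas off entirely came up, with -1 suggested as the "tried and true" sentinel for unlimited. A minimal sketch of that guard is below; `QuotaCheck`, its field names, and the default are all hypothetical, not Kafka's implementation — the point is only that a disabled quota costs one branch per check.

```java
// Hypothetical quota guard: a limit of -1 (or any negative value) disables throttling.
public class QuotaCheck {
    public static final long UNLIMITED = -1L;

    private final long bytesPerSecLimit;

    public QuotaCheck(long bytesPerSecLimit) {
        this.bytesPerSecLimit = bytesPerSecLimit;
    }

    // Returns true when the observed per-client rate exceeds the configured limit.
    public boolean violated(double observedBytesPerSec) {
        if (bytesPerSecLimit < 0) {
            return false;  // quota disabled: never throttle; only this branch is paid per request
        }
        return observedBytesPerSec > bytesPerSecLimit;
    }

    public static void main(String[] args) {
        QuotaCheck off = new QuotaCheck(UNLIMITED);
        QuotaCheck on = new QuotaCheck(1_000_000);
        System.out.println(off.violated(5_000_000));  // disabled, never violated
        System.out.println(on.violated(5_000_000));   // over the 1 MB/s limit
    }
}
```

Setting a very high default, as Joel suggests, achieves the same effect without a sentinel; the branch above just makes "off" explicit.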
[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511943#comment-14511943 ] Onur Karaman commented on KAFKA-2149: - Cool thanks! fix default InterBrokerProtocolVersion -- Key: KAFKA-2149 URL: https://issues.apache.org/jira/browse/KAFKA-2149 Project: Kafka Issue Type: Bug Reporter: Onur Karaman Assignee: Onur Karaman Attachments: KAFKA-2149.patch Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) with upgrading brokers to include KAFKA-1809. This specifically happened when an older broker receives a UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set their inter.broker.protocol.version to 0.8.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module
[ https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512040#comment-14512040 ] Ashish K Singh commented on KAFKA-2132: --- Btw, looking more into it, I guess we are combining two issues here. 1. KafkaLog4jAppender is really a producer and does not belong in core. For a user to be able to use KafkaLog4jAppender, he/she will have to pull in the entire Kafka core, which is definitely not required. This is what this JIRA is about. 2. Kafka, being a library, should not depend on log4j. I think the solution for (1) is to move the KafkaLog4jAppender to clients. For (2), we might have to look into ways to completely get rid of log4j in Kafka core. Move Log4J appender to clients module - Key: KAFKA-2132 URL: https://issues.apache.org/jira/browse/KAFKA-2132 Project: Kafka Issue Type: Improvement Reporter: Gwen Shapira Assignee: Ashish K Singh Log4j appender is just a producer. Since we have a new producer in the clients module, there is no need to keep the Log4J appender in core and force people to package all of Kafka with their apps. Let's move the Log4jAppender to clients module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
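As context for why the appender pulls in all of core today: KafkaLog4jAppender is wired up purely through log4j configuration, roughly as below. This is an illustrative sketch — the property names follow the appender's style from this era but should be checked against the actual class, and the broker address and topic are made up.

```properties
# Illustrative log4j.properties sketch (property names approximate; verify against the appender).
log4j.rootLogger=INFO, KAFKA
log4j.appender.KAFKA=kafka.producer.KafkaLog4jAppender
log4j.appender.KAFKA.BrokerList=localhost:9092
log4j.appender.KAFKA.Topic=app-logs
log4j.appender.KAFKA.layout=org.apache.log4j.PatternLayout
log4j.appender.KAFKA.layout.ConversionPattern=%d %-5p %c - %m%n
```

Since the appender's only real dependency is a producer, moving it next to the new producer in the clients module removes the core dependency without changing this configuration surface.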
Re: [KIP-DISCUSSION] KIP-13 Quotas
I think Jay meant a catch-all request/sec limit on all requests per-client. That makes sense. On Fri, Apr 24, 2015 at 11:02:29PM +, Aditya Auradkar wrote: I think Joel's suggestion is quite good. It's still possible to throttle other types of requests using purgatory but we will need a separate purgatory and DelayedOperation variants of different request types or perhaps add a ThrottledOperation type. It also addresses a couple of special case situations wrt delay time and replication timeouts. Jay, if we have a general mechanism of delaying requests then it should be possible to throttle any type of request as long as we have metrics on a per-client basis. For offset commit requests, we would simply need a request rate metric per-client and a good default quota. Thanks, Aditya From: Jay Kreps [jay.kr...@gmail.com] Sent: Friday, April 24, 2015 3:20 PM To: dev@kafka.apache.org Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas Hey Jun/Joel, Yeah we will definitely want to quota non-produce/consume requests. Especially offset commit and any other requests the consumer can trigger could easily get invoked in a tight loop by accident. We haven't talked about this a ton, but presumably the mechanism for all these would just be a general requests/sec limit that covers all requests? -Jay On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao j...@confluent.io wrote: Joel, What you suggested makes sense. Not sure if there is a strong need to throttle TMR though since it should be infrequent. Thanks, Jun On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy jjkosh...@gmail.com wrote: Given the caveats, it may be worth doing further investigation on the alternate approach which is to use a dedicated DelayQueue for requests that violate quota and compare pros/cons. So the approach is the following: all request handling occurs normally (i.e., unchanged from what we do today). i.e., purgatories will be unchanged. 
After handling a request and before sending the response, check if the request has violated a quota. If so, then enqueue the response into a DelayQueue. All responses can share the same DelayQueue. Send those responses out after the delay has been met. There are some benefits to doing this: - We will eventually want to quota other requests as well. The above seems to be a clean staged approach that should work uniformly for all requests. i.e., parse request - handle request normally - check quota - hold in delay queue if quota violated - respond . All requests can share the same DelayQueue. (In contrast with the current proposal we could end up with a bunch of purgatories, or a combination of purgatories and delay queues.) - Since this approach does not need any fundamental modifications to the current request handling, it addresses the caveats that Adi noted (which is holding producer requests/fetch requests longer than strictly necessary if quota is violated since the proposal was to not watch on keys in that case). Likewise it addresses the caveat that Guozhang noted (we may return no error if the request is held long enough due to quota violation and satisfy a producer request that may have in fact exceeded the ack timeout) although it is probably reasonable to hide this case from the user. - By avoiding the caveats it also avoids the suggested work-around to the caveats which is effectively to add a min-hold-time to the purgatory. Although this is not a lot of code, I think it adds a quota-driven feature to the purgatory which is already non-trivial and should ideally remain unassociated with quota enforcement. For this to work well we need to be sure that we don't hold a lot of data in the DelayQueue - and therein lies a quirk to this approach. Producer responses (and most other responses) are very small so there is no issue. Fetch responses are fine as well - since we read off a FileMessageSet in response (zero-copy). 
This will remain true even when we support SSL since encryption occurs at the session layer (not the application layer). Topic metadata response can be a problem though. For this we ideally want to build the topic metadata response only when we are ready to respond. So for metadata-style responses which could contain large response objects we may want to put the quota check and delay queue _before_ handling the request. So the design in this approach would need an amendment: provide a choice of where to put a request in the delay queue: either before handling or after handling (before response). So for: small request, large response: delay queue before handling large request, small response: delay queue after handling, before response small request, small response: either is fine large request,
[jira] [Created] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth
Evan Huus created KAFKA-2147: Summary: Unbalanced replication can cause extreme purgatory growth Key: KAFKA-2147 URL: https://issues.apache.org/jira/browse/KAFKA-2147 Project: Kafka Issue Type: Bug Components: purgatory, replication Affects Versions: 0.8.2.1 Reporter: Evan Huus Assignee: Joel Koshy Apologies in advance, this is going to be a bit of complex description, mainly because we've seen this issue several different ways and we're still tying them together in terms of root cause and analysis. It is worth noting now that we have all our producers set up to send RequiredAcks==-1, and that this includes all our MirrorMakers. I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully that will incidentally fix this issue, or at least render it moot. h4. Symptoms Fetch request purgatory on a broker or brokers grows rapidly and steadily at a rate of roughly 1-5K requests per second. Heap memory used also grows to keep pace. When 4-5 million requests have accumulated in purgatory, the purgatory is drained, causing a substantial latency spike. The node will tend to drop leadership, replicate, and recover. h5. Case 1 - MirrorMaker We first noticed this case when enabling mirrormaker. We had one primary cluster already, with many producers and consumers. We created a second, identical cluster and enabled replication from the original to the new cluster on some topics using mirrormaker. This caused all six nodes in the new cluster to exhibit the symptom in lockstep - their purgatories would all grow together, and get drained within about 20 seconds of each other. The cluster-wide latency spikes at this time caused several problems for us. Turning MM on and off turned the problem on and off very precisely. When we stopped MM, the purgatories would all drop to normal levels immediately, and would start climbing again when we restarted it. 
Note that this is the *fetch* purgatories on the brokers that MM was *producing* to, which indicates fairly strongly that this is a replication issue, not a MM issue. This particular cluster and MM setup was abandoned for other reasons before we could make much progress debugging. h5. Case 2 - Broker 6 The second time we saw this issue was on the newest broker (broker 6) in the original cluster. For a long time we were running with five nodes, and eventually added a sixth to handle the increased load. At first, we moved only a handful of higher-volume partitions to this broker. Later, we created a group of new topics (totalling around 100 partitions) for testing purposes that were spread automatically across all six nodes. These topics saw occasional traffic, but were generally unused. At this point broker 6 had leadership for about an equal number of high-volume and unused partitions, about 15-20 of each. Around this time (we don't have detailed enough data to prove real correlation unfortunately), the issue started appearing on this broker as well, but not on any of the other brokers in the cluster. h4. Debugging The first thing we tried was to reduce the `fetch.purgatory.purge.interval.requests` from the default of 1000 to a much lower value of 200. This had no noticeable effect at all. We then enabled debug logging on broker06 and started looking through that. I can attach complete log samples if necessary, but the thing that stood out for us was a substantial number of the following lines: {noformat} [2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory (kafka.server.KafkaApis) {noformat} The volume of these lines seemed to match (approximately) the fetch purgatory growth on that broker. 
At this point we developed a hypothesis (detailed below) which guided our subsequent debugging tests: - Setting a daemon up to produce regular random data to all of the topics led by kafka06 (specifically the ones which otherwise would receive no data) substantially alleviated the problem. - Doing an additional rebalance of the cluster in order to move a number of other topics with regular data to kafka06 appears to have solved the problem completely. h4. Hypothesis Current versions (0.8.2.1 and earlier) have issues with the replica fetcher not backing off correctly (KAFKA-1461, KAFKA-2082 and others). I believe that in a very specific situation, the replica fetcher thread of one broker can spam another broker with requests that fill up its purgatory and do not get properly flushed. My best guess is that the necessary conditions are: - broker A leads some partitions which receive regular traffic, and some partitions which do not - broker B replicates some of each type of partition from broker A - some producers are producing with RequiredAcks=-1 (wait for all ISR) - broker B
[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning
[ https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511277#comment-14511277 ] Reece Markowsky commented on KAFKA-2148: Thanks Jay! version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao Labels: api, producer version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1) the call present in Producer version 0.8.1 http://kafka.apache.org/081/api.html public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2 Producer version 0.8.2 http://kafka.apache.org/082/javadoc/index.html send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Hi Gari, I apologize for not being clear in the KIP and for not starting a discussion thread earlier. My initial proposal (currently in the KIP) was to provide PLAINTEXT, SSL and KERBEROS as individual protocol implementations. As you mentioned at the end: “In terms of message level integrity and confidentiality (not to be confused with transport level security like TLS), SASL also provides for this (assuming the mechanism supports it). The SASL library supports this via the props parameter in the createSaslClient/Server methods. So it is easily possible to support Kerberos with integrity (MIC) or confidentiality (encryption) over TCP and without either over TLS.“ My intention was to use SASL to do auth and also provide encryption over a plaintext channel. But after speaking to many who implemented SASL this way for HDFS, HBase and other projects, their suggestion was not to use context.wrap and context.unwrap, which do the encryption for SASL, because they cause performance degradation. Currently I am working on SASL authentication as an option over TCP or TLS. I’ll update the KIP soon once I’ve got interfaces in place. Sorry about the confusion on this as I am testing out multiple options and trying to decide on the right one. Thanks, Harsha On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com) wrote: Sorry for jumping in late, but I have been trying to follow this chain as well as the updates to the KIP. I don't mean to seem critical and I may be misunderstanding the proposed implementation, but there seems to be some confusion around terminology (at least from my perspective) and I am not sure I actually understand what is going to be implemented and where the plugin point(s) will be. The KIP does not really mention SASL interfaces in any detail. The way I read the KIP it seems as if it is more about providing a Kerberos mechanism via GSSAPI than it is about providing pluggable SASL support.
Perhaps it is the naming convention (GSS is used where I would have thought SASL would have been used). Maybe I am missing something? SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI are not the same thing. Also, SSL/TLS is independent of both SASL and GSSAPI, although you can use either SASL or GSSAPI over TLS. I would expect something more along the lines of having a SASLChannel and SASL providers (along with pluggable Authentication providers which enumerate which SASL mechanisms they support). I have only ever attempted to really implement SASL support once, but I have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP use SASL. This is my understanding of how SASL is typically implemented: 1) Client decides whether or not to use TLS or plain TCP (of course this depends on what the server provides). My current understanding is that Kafka will support three types of server sockets: - current socket for backwards compatibility (i.e. no TLS and no SASL) - TLS socket - SASL socket I would also have thought that SASL mechanisms would be supported on the TLS socket as well, but that does not seem to be the case (or at least it is not clear either way). I know the decision was made to have separate TLS and SASL sockets, but I think that we need to support SASL over TLS as well. You can do this over a single socket if you use the startTLS metaphor. 2) There is typically some type of application protocol specific handshake. This is usually used to negotiate whether or not to use SASL and/or negotiate which SASL mechanisms are supported by the server. This is not strictly required, although the SASL spec does mention that the client should be able to get a list of SASL mechanisms supported by the server. For example, SMTP does this with the client sending an EHLO and the server sending an AUTH. Personally I like the AMQP model (which by the way might also help with backwards compatibility using a single socket).
For AMQP, the initial frame is basically AMQP.%d0.1.0.0 (AMQP, TCP, AMQP protocol 1.0.0) AMQP.%d3.1.0.0 (AMQP, SASL) I think you get the idea. So we could do something similar for Kafka KAFKA.[protocol type].[protocol version major].[protocol version minor].[protocol version revision] So for example, we could have protocol types of 0 - open 1 - SASL and do this over either a TCP or TLS socket. Of course, if you stick with having a dedicated SASL socket, you could just start out with the option of the client sending something like AUTH as its first message (with the option of appending the initial SASL payload as well) 3) After the protocol handshake, there is an application specific wrapper for carrying SASL frames for the challenges and responses. If the mechanism selected is Kerberos, it is at this point that SASL uses the GSSAPI for the exchange (of course wrapped in the app specific SASL frames). If you are using
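The AMQP-style version header Gari describes could be framed in just a few bytes. The layout below (a magic string plus four one-byte fields for type, major, minor, revision) is purely an illustrative sketch of the idea, not a proposed Kafka wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of an AMQP-style protocol header for Kafka:
// "KAFKA" magic + protocol type (0 = open, 1 = SASL) + major.minor.revision.
public class ProtocolHeader {
    static final byte[] MAGIC = "KAFKA".getBytes(StandardCharsets.US_ASCII);

    public static byte[] encode(int type, int major, int minor, int revision) {
        ByteBuffer buf = ByteBuffer.allocate(MAGIC.length + 4);
        buf.put(MAGIC);
        buf.put((byte) type).put((byte) major).put((byte) minor).put((byte) revision);
        return buf.array();
    }

    // Returns the protocol type, or -1 if the magic bytes don't match.
    public static int decodeType(byte[] header) {
        if (header.length < MAGIC.length + 4) return -1;
        for (int i = 0; i < MAGIC.length; i++) {
            if (header[i] != MAGIC[i]) return -1;
        }
        return header[MAGIC.length];
    }

    public static void main(String[] args) {
        byte[] saslHello = encode(1, 1, 0, 0);  // client requests SASL, version 1.0.0
        System.out.println(decodeType(saslHello));
    }
}
```

Because the magic-mismatch case is detectable, a server could also use this header to keep backwards compatibility on a single socket: anything that doesn't start with the magic is treated as the legacy protocol.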
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Sorry for jumping in late, but I have been trying to follow this chain as well as the updates to the KIP. I don't mean to seem critical and I may be misunderstanding the proposed implementation, but there seems to be some confusion around terminology (at least from my perspective) and I am not sure I actually understand what is going to be implemented and where the plugin point(s) will be. The KIP does not really mention SASL interfaces in any detail. The way I read the KIP it seems as if it is more about providing a Kerberos mechanism via GSSAPI than it is about providing pluggable SASL support. Perhaps it is the naming convention (GSS is used where I would have thought SASL would have been used). Maybe I am missing something? SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI are not the same thing. Also, SSL/TLS is independent of both SASL and GSSAPI, although you can use either SASL or GSSAPI over TLS. I would expect something more along the lines of having a SASLChannel and SASL providers (along with pluggable Authentication providers which enumerate which SASL mechanisms they support). I have only ever attempted to really implement SASL support once, but I have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP use SASL. This is my understanding of how SASL is typically implemented: 1) Client decides whether or not to use TLS or plain TCP (of course this depends on what the server provides). My current understanding is that Kafka will support three types of server sockets: - current socket for backwards compatibility (i.e. no TLS and no SASL) - TLS socket - SASL socket I would also have thought that SASL mechanisms would be supported on the TLS socket as well, but that does not seem to be the case (or at least it is not clear either way). I know the decision was made to have separate TLS and SASL sockets, but I think that we need to support SASL over TLS as well.
You can do this over a single socket if you use the startTLS metaphor. 2) There is typically some type of application protocol specific handshake. This is usually used to negotiate whether or not to use SASL and/or negotiate which SASL mechanisms are supported by the server. This is not strictly required, although the SASL spec does mention that the client should be able to get a list of SASL mechanisms supported by the server. For example, SMTP does this with the client sending an EHLO and the server sending an AUTH. Personally I like the AMQP model (which by the way might also help with backwards compatibility using a single socket). For AMQP, the initial frame is basically AMQP.%d0.1.0.0 (AMQP, TCP, AMQP protocol 1.0.0) AMQP.%d3.1.0.0 (AMQP, SASL) I think you get the idea. So we could do something similar for Kafka KAFKA.[protocol type].[protocol version major].[protocol version minor].[protocol version revision] So for example, we could have protocol types of 0 - open 1 - SASL and do this over either a TCP or TLS socket. Of course, if you stick with having a dedicated SASL socket, you could just start out with the option of the client sending something like AUTH as its first message (with the option of appending the initial SASL payload as well) 3) After the protocol handshake, there is an application specific wrapper for carrying SASL frames for the challenges and responses. If the mechanism selected is Kerberos, it is at this point that SASL uses the GSSAPI for the exchange (of course wrapped in the app specific SASL frames). If you are using PLAIN, there is a defined format to be used (RFC4616). Java of course provides support for various mechanisms in the default SASL client and server mechanisms. For example, the client supports PLAIN, but we would need to implement a PlainSaslServer (which we could also tie to a username/password based authentication provider as well).
In terms of message level integrity and confidentiality (not to be confused with transport level security like TLS), SASL also provides for this (assuming the mechanism supports it). The SASL library supports this via the props parameter in the createSaslClient/Server methods. So it is easily possible to support Kerberos with integrity (MIC) or confidentiality (encryption) over TCP and without either over TLS. Hopefully this makes sense and perhaps this is how things are proceeding, but it was not clear to me that this is what is actually being implemented. Sorry for the long note. -- Gari On Fri, Apr 24, 2015 at 9:34 AM, Sriharsha Chintalapani ka...@harsha.io wrote: Rajini, I am exploring this part right now. To support PLAINTEXT and SSL as protocols and Kerberos auth as authentication on top of plaintext or ssl (if users want to do encryption over an auth mechanism). This is mainly influenced by SASL or GSS-API performance issue when I enable encryption. I’ll update the KIP once I finalize this on my side . Thanks, Harsha On April 24, 2015 at
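The `props` mechanism Gari refers to boils down to passing `Sasl.QOP` when creating the client: `auth` means authentication only, while (for mechanisms that support it, such as GSSAPI) `auth-int` adds integrity and `auth-conf` adds confidentiality. The sketch below is runnable against the JDK's built-in PLAIN client with made-up credentials and a placeholder protocol/server name; PLAIN itself only supports `auth`, so this demonstrates the plumbing, not Kerberos encryption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class SaslQopSketch {

    // Builds a SASL client, passing the desired quality-of-protection via props.
    // For GSSAPI, "auth-int" would add integrity (MIC) and "auth-conf" encryption.
    public static SaslClient client(String mech, String qop, String user, String password)
            throws SaslException {
        Map<String, Object> props = new HashMap<>();
        props.put(Sasl.QOP, qop);
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(user);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(password.toCharArray());
                }
            }
        };
        // "kafka" / "localhost" are placeholder protocol and server names for this sketch.
        return Sasl.createSaslClient(new String[]{mech}, null, "kafka", "localhost", props, handler);
    }

    public static void main(String[] args) throws SaslException {
        SaslClient plain = client("PLAIN", "auth", "alice", "secret");
        // PLAIN sends its whole response up front (RFC 4616: [authzid] NUL authcid NUL passwd).
        byte[] initial = plain.evaluateChallenge(new byte[0]);
        System.out.println(new String(initial, StandardCharsets.UTF_8).replace('\0', '|'));
    }
}
```

A server-side counterpart would use `Sasl.createSaslServer` with the same `Sasl.QOP` props, which is where a custom PlainSaslServer tied to a username/password provider would plug in.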
[jira] [Created] (KAFKA-2148) version 0.8.2 breaks semantic versioning
Reece Markowsky created KAFKA-2148: -- Summary: version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1). The call present in Producer version 0.8.1 (http://kafka.apache.org/081/api.html) public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2. Producer version 0.8.2 (http://kafka.apache.org/082/javadoc/index.html) has send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-2148) version 0.8.2 breaks semantic versioning
[ https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jay Kreps resolved KAFKA-2148. -- Resolution: Not A Problem Hey Reece, this is not the same client but rather a new client. It is not intended to be API-compatible. The Scala client with the API you describe still exists and will continue to exist for some time. This API actually gives all the benefits the Scala API had (plus some additional ones, like giving you back the offset and error info even for async writes). version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao Labels: api, producer version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1). The call present in Producer version 0.8.1 (http://kafka.apache.org/081/api.html) public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2. Producer version 0.8.2 (http://kafka.apache.org/082/javadoc/index.html) has send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
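Jay's point — that the per-record send() returning a Future subsumes the old list-based API — can be made concrete with a small adapter. The types below are simplified stand-ins for the Kafka classes, kept self-contained so the shape of the migration is visible:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Stand-ins for the real producer types (ProducerRecord / RecordMetadata).
record Msg(String topic, String key, String value) {}
record Meta(long offset) {}

// Shape of the 0.8.2-style per-record, async send().
interface PerRecordSender {
    Future<Meta> send(Msg record);
}

public class BatchAdapter {
    // Old-style "send a whole list" recovered on top of per-record send():
    // fire off every record (the real client batches them internally),
    // then wait on all the futures -- which also yields per-record offsets,
    // something the old list API never returned.
    static List<Meta> sendAll(PerRecordSender producer, List<Msg> records) throws Exception {
        List<Future<Meta>> futures = new ArrayList<>();
        for (Msg r : records)
            futures.add(producer.send(r));  // non-blocking
        List<Meta> results = new ArrayList<>();
        for (Future<Meta> f : futures)
            results.add(f.get());           // block for acks + offsets
        return results;
    }

    public static void main(String[] args) throws Exception {
        long[] nextOffset = {0};
        PerRecordSender fake = r -> CompletableFuture.completedFuture(new Meta(nextOffset[0]++));
        List<Meta> out = sendAll(fake, List.of(
                new Msg("t", "k1", "v1"), new Msg("t", "k2", "v2")));
        System.out.println(out.size()); // 2
    }
}
```

So batching code written against the 0.8.1 API can usually be ported with a loop like sendAll, rather than by holding onto the old Scala client.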
[jira] [Comment Edited] (KAFKA-2148) version 0.8.2 breaks semantic versioning
[ https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511277#comment-14511277 ] Reece Markowsky edited comment on KAFKA-2148 at 4/24/15 4:13 PM: - Thanks for the quick reply. Confused, but will see if I can see what you mean. We just upgraded to 0.8.2 and my batching code is broken now. was (Author: reecemarkowsky): Thanks Jay! version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao Labels: api, producer version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1). The call present in Producer version 0.8.1 (http://kafka.apache.org/081/api.html) public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2. Producer version 0.8.2 (http://kafka.apache.org/082/javadoc/index.html) has send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1334) Add failure detection capability to the coordinator / consumer
[ https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Onur Karaman updated KAFKA-1334: Attachment: KAFKA-1334_2015-04-24_22:46:15.patch Add failure detection capability to the coordinator / consumer -- Key: KAFKA-1334 URL: https://issues.apache.org/jira/browse/KAFKA-1334 Project: Kafka Issue Type: Sub-task Components: consumer Affects Versions: 0.9.0 Reporter: Neha Narkhede Assignee: Onur Karaman Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch, KAFKA-1334_2015-04-13_11:55:06.patch, KAFKA-1334_2015-04-13_11:58:53.patch, KAFKA-1334_2015-04-18_10:16:23.patch, KAFKA-1334_2015-04-18_12:16:39.patch, KAFKA-1334_2015-04-24_22:46:15.patch 1) Add coordinator discovery and failure detection to the consumer. 2) Add failure detection capability to the coordinator when group management is used. This will not include any rebalancing logic, just the logic to detect consumer failures using session.timeout.ms. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1334) Add failure detection capability to the coordinator / consumer
[ https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512277#comment-14512277 ] Onur Karaman commented on KAFKA-1334: - Updated reviewboard https://reviews.apache.org/r/33088/diff/ against branch origin/trunk Add failure detection capability to the coordinator / consumer -- Key: KAFKA-1334 URL: https://issues.apache.org/jira/browse/KAFKA-1334 Project: Kafka Issue Type: Sub-task Components: consumer Affects Versions: 0.9.0 Reporter: Neha Narkhede Assignee: Onur Karaman Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch, KAFKA-1334_2015-04-13_11:55:06.patch, KAFKA-1334_2015-04-13_11:58:53.patch, KAFKA-1334_2015-04-18_10:16:23.patch, KAFKA-1334_2015-04-18_12:16:39.patch, KAFKA-1334_2015-04-24_22:46:15.patch 1) Add coordinator discovery and failure detection to the consumer. 2) Add failure detection capability to the coordinator when group management is used. This will not include any rebalancing logic, just the logic to detect consumer failures using session.timeout.ms. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
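The session.timeout.ms-based failure detection described in the ticket amounts to tracking the last heartbeat per consumer. The toy registry below illustrates the idea only — the real coordinator schedules a DelayedHeartbeat per consumer rather than polling like this, and the class name is made up:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy failure detector: a consumer is considered alive iff a heartbeat
// arrived within the last sessionTimeoutMs milliseconds.
public class SessionFailureDetector {
    private final long sessionTimeoutMs;
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    public SessionFailureDetector(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    public void onHeartbeat(String consumerId, long nowMs) {
        lastHeartbeat.put(consumerId, nowMs);
    }

    public boolean isAlive(String consumerId, long nowMs) {
        Long last = lastHeartbeat.get(consumerId);
        return last != null && nowMs - last <= sessionTimeoutMs;
    }

    public static void main(String[] args) {
        SessionFailureDetector d = new SessionFailureDetector(30_000);
        d.onHeartbeat("consumer-1", 0);
        System.out.println(d.isAlive("consumer-1", 10_000)); // true
        System.out.println(d.isAlive("consumer-1", 40_000)); // false: session expired
    }
}
```

When a session expires, the coordinator would mark the consumer failed and (in later work, per the ticket) trigger a rebalance of its group.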
Re: Review Request 33088: add heartbeat to coordinator
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33088/ --- (Updated April 25, 2015, 5:46 a.m.) Review request for kafka. Bugs: KAFKA-1334 https://issues.apache.org/jira/browse/KAFKA-1334 Repository: kafka Description --- add heartbeat to coordinator todo: - see how it performs under real load - add error code handling on the consumer side - implement the partition assignors Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java e55ab11df4db0b0084f841a74cbcf819caf780d5 clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 36aa412404ff1458c7bef0feecaaa8bc45bed9c7 core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 456b602245e111880e1b8b361319cabff38ee0e9 core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 2f5797064d4131ecfc9d2750d9345a9fa3972a9a core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala df60cbc35d09937b4e9c737c67229889c69d8698 core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 8defa2e41c92f1ebe255177679d275c70dae5b3e core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION core/src/main/scala/kafka/coordinator/GroupRegistry.scala 94ef5829b3a616c90018af1db7627bfe42e259e5 core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 821e26e97eaa97b5f4520474fff0fedbf406c82a core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION core/src/main/scala/kafka/server/DelayedOperationKey.scala b673e43b0ba401b2e22f27aef550e3ab0ef4323c core/src/main/scala/kafka/server/KafkaApis.scala b4004aa3a1456d337199aa1245fb0ae61f6add46 core/src/main/scala/kafka/server/KafkaServer.scala c63f4ba9d622817ea8636d4e6135fba917ce085a core/src/main/scala/kafka/server/OffsetManager.scala 18680ce100f10035175cc0263ba7787ab0f6a17a core/src/test/scala/unit/kafka/coordinator/GroupTest.scala PRE-CREATION Diff: 
https://reviews.apache.org/r/33088/diff/ Testing --- Thanks, Onur Karaman
Re: Review Request 33088: add heartbeat to coordinator
On April 22, 2015, 2:33 a.m., Guozhang Wang wrote: core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala, line 123 https://reviews.apache.org/r/33088/diff/1/?file=923568#file923568line123 How about handleConsumerJoinGroup? I agree with one of Jay's earlier comments from another rb that joinGroup and heartbeat is cleaner. - Onur --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33088/#review79762 --- On April 25, 2015, 5:46 a.m., Onur Karaman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33088/ --- (Updated April 25, 2015, 5:46 a.m.) Review request for kafka. Bugs: KAFKA-1334 https://issues.apache.org/jira/browse/KAFKA-1334 Repository: kafka Description --- add heartbeat to coordinator todo: - see how it performs under real load - add error code handling on the consumer side - implement the partition assignors Diffs - clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java e55ab11df4db0b0084f841a74cbcf819caf780d5 clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 36aa412404ff1458c7bef0feecaaa8bc45bed9c7 core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 456b602245e111880e1b8b361319cabff38ee0e9 core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 2f5797064d4131ecfc9d2750d9345a9fa3972a9a core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala df60cbc35d09937b4e9c737c67229889c69d8698 core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 8defa2e41c92f1ebe255177679d275c70dae5b3e core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION core/src/main/scala/kafka/coordinator/GroupRegistry.scala 94ef5829b3a616c90018af1db7627bfe42e259e5 core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 821e26e97eaa97b5f4520474fff0fedbf406c82a 
core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION core/src/main/scala/kafka/server/DelayedOperationKey.scala b673e43b0ba401b2e22f27aef550e3ab0ef4323c core/src/main/scala/kafka/server/KafkaApis.scala b4004aa3a1456d337199aa1245fb0ae61f6add46 core/src/main/scala/kafka/server/KafkaServer.scala c63f4ba9d622817ea8636d4e6135fba917ce085a core/src/main/scala/kafka/server/OffsetManager.scala 18680ce100f10035175cc0263ba7787ab0f6a17a core/src/test/scala/unit/kafka/coordinator/GroupTest.scala PRE-CREATION Diff: https://reviews.apache.org/r/33088/diff/ Testing --- Thanks, Onur Karaman
Re: Review Request 33125: Add comment to timing fix
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33125/#review81581 --- clients/src/test/java/org/apache/kafka/clients/MetadataTest.java https://reviews.apache.org/r/33125/#comment132016 Do we need this sleep? - Guozhang Wang On April 13, 2015, 7:15 p.m., Rajini Sivaram wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33125/ --- (Updated April 13, 2015, 7:15 p.m.) Review request for kafka. Bugs: KAFKA-2089 https://issues.apache.org/jira/browse/KAFKA-2089 Repository: kafka Description --- Patch for KAFKA-2089: Fix timing issue in MetadataTest Diffs - clients/src/test/java/org/apache/kafka/clients/MetadataTest.java 928087d29deb80655ca83726c1ebc45d76468c1f Diff: https://reviews.apache.org/r/33125/diff/ Testing --- Thanks, Rajini Sivaram
Re: [VOTE] KIP-11- Authorization design for kafka security
+1 (non-binding) Two nitpicks for the wiki:
* Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so it holds a set of principals instead of a single principal, i.e. user_a,user_b has READ,WRITE,DESCRIBE permissions on Topic1 from Host1, Host2, Host3.

I think the evaluation order only matters for the permissionType: DENY acls should be evaluated before ALLOW acls. To give you an example, suppose we have the following acls:
acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.
acl4 - user1 is denied READ from host1.
As stated in the KIP we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have allow acls with wildcards (acl1, acl2). If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters there.

"Will people actually use hosts with users?" I really don't know, but given ACLs are part of our public APIs I thought it better to try to cover more use cases. If others think this extra complexity is not worth the value it's adding, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even in the absence of hosts in ACLs, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host" - easy to add, and it can be an incremental improvement. They will, however, lose the ability to restrict access to users only from a set of hosts. We agreed to offer a CLI to overcome the JSON acl config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON, but that probably has something to do with me being a developer :-). Thanks Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so trying to keep up here; sorry if this has been covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host-level access to a given user? My understanding is that the ACL will be (omitting some fields):
user_a, host1, host2, host3
user_b, host1, host2, host3
So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts at the same level as principal in the hierarchy? This way you could just blanket the allowed/denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the perms would be evaluated if there was more than one match on a principal? Is the thought that there wouldn't usually be much overlap on hosts? I can imagine a scenario where I want to offline/online access to a particular host or set of hosts, and if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example?

I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just * the hosts for a given user and create a separate global list as I mentioned above. The only other system I know of that ties users with hosts for access is MySQL, and I don't love that model. Companies usually standardize on group authorization anyway; are we complicating that issue with the inclusion of hosts attached to users? Additionally, I worry about the debt of big JSON configs in the first place; most non-developers find them non-intuitive already, so anything to ease this would be beneficial. Thanks Jeff

On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry I missed your last questions. I am +0 on adding a --host option for --list; we could add it for symmetry. Again, if this is only a CLI change it can be added later; if you mean adding this to the authorizer interface then we should make a decision now. Given a choice I would like to keep only one option, which is resource-based get (removing even the get based on principal). I see those (getAcl for principal or host) as a special filtering case which can
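Parth's acl1–acl4 walkthrough can be made concrete. The sketch below shows only the deny-before-allow ordering being discussed; the types are made up for illustration and are not the Authorizer interface defined in the KIP:

```java
import java.util.List;

public class DenyFirstEval {
    enum Permission { ALLOW, DENY }
    static final String ANY = "*";

    record Acl(Permission permission, String user, String host) {
        boolean matches(String u, String h) {
            return (user.equals(ANY) || user.equals(u))
                && (host.equals(ANY) || host.equals(h));
        }
    }

    // DENY acls are consulted first; a matching DENY wins even when
    // wildcard ALLOW acls (like acl1 and acl2 below) also match.
    static boolean authorize(List<Acl> acls, String user, String host) {
        for (Acl a : acls)
            if (a.permission() == Permission.DENY && a.matches(user, host))
                return false;
        for (Acl a : acls)
            if (a.permission() == Permission.ALLOW && a.matches(user, host))
                return true;
        return false; // default deny
    }

    public static void main(String[] args) {
        List<Acl> acls = List.of(
                new Acl(Permission.ALLOW, "user1", ANY),      // acl1
                new Acl(Permission.ALLOW, ANY, "host1"),      // acl2
                new Acl(Permission.ALLOW, ANY, "host2"),      // acl3
                new Acl(Permission.DENY,  "user1", "host1")); // acl4
        System.out.println(authorize(acls, "user1", "host1")); // false: deny wins
        System.out.println(authorize(acls, "user1", "host2")); // true
    }
}
```

Note that, as the email argues, the relative order of the ALLOW acls never changes the outcome — only the DENY-before-ALLOW split matters.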
[jira] [Commented] (KAFKA-2105) NullPointerException in client on MetadataRequest
[ https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511317#comment-14511317 ] Gwen Shapira commented on KAFKA-2105: - I'm having trouble applying this patch format. Can you generate one using git diff? Better yet, try our patch review tool: https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review NullPointerException in client on MetadataRequest - Key: KAFKA-2105 URL: https://issues.apache.org/jira/browse/KAFKA-2105 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.8.2.1 Reporter: Roger Hoover Priority: Minor Attachments: guard-from-null.patch With the new producer, if you accidentally pass null to KafkaProducer.partitionsFor(null), it will cause the IO thread to throw NPE. Uncaught error in kafka producer I/O thread: java.lang.NullPointerException at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174) at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176) at org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55) at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81) at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218) at org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35) at org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29) at org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
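The fix the attached patch presumably makes can be illustrated generically: validate arguments on the calling thread, so the NPE surfaces in partitionsFor() itself — where the caller can catch it — rather than in the background I/O thread. The class and helper names below are stand-ins, not the actual KafkaProducer code:

```java
import java.util.List;

public class TopicGuard {
    // Fail fast on the calling thread instead of letting a null topic
    // reach the background sender, where the resulting NPE is
    // uncatchable by the caller and kills the I/O loop.
    static List<Integer> partitionsFor(String topic) {
        if (topic == null)
            throw new NullPointerException("topic must not be null");
        return fetchPartitions(topic);
    }

    private static List<Integer> fetchPartitions(String topic) {
        return List.of(0, 1, 2); // placeholder for the real metadata fetch
    }

    public static void main(String[] args) {
        System.out.println(partitionsFor("events").size()); // 3
        try {
            partitionsFor(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same pattern applies to the other public producer entry points that feed work to the sender thread.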
Re: [VOTE] KIP-11- Authorization design for kafka security
You are right, moved it to the default implementation section. Thanks Parth

On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in the public API, but I thought it is part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth

On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so it holds a set of principals instead of a single principal, i.e. user_a,user_b has READ,WRITE,DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: DENY acls should be evaluated before ALLOW acls. To give you an example, suppose we have the following acls:
acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.
acl4 - user1 is denied READ from host1.
As stated in the KIP we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have allow acls with wildcards (acl1, acl2). If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters there. "Will people actually use hosts with users?" I really don't know, but given ACLs are part of our public APIs I thought it better to try to cover more use cases. If others think this extra complexity is not worth the value it's adding, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even in the absence of hosts in ACLs, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host" - easy to add, and it can be an incremental improvement. They will, however, lose the ability to restrict access to users only from a set of hosts. We agreed to offer a CLI to overcome the JSON acl config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON, but that probably has something to do with me being a developer :-). Thanks Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so trying to keep up here; sorry if this has been covered before. First, great job on the KIP proposal and work so far. Are we sure that we want to tie host-level access to a given user? My understanding is that the ACL will be (omitting some fields):
user_a, host1, host2, host3
user_b, host1, host2, host3
So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts at the same level as principal in the hierarchy? This way you could just blanket the allowed/denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the perms would be evaluated if there was more than one match on a principal? Is the thought that there wouldn't usually be much overlap on hosts? I can imagine a scenario where I want to offline/online access to a particular host or set of hosts, and if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example? I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just * the hosts for a given user and create a separate global list as I mentioned above. The only other system I know of that ties users with hosts for access is MySQL, and I don't love that model. Companies usually standardize on group authorization anyway; are we
[jira] [Updated] (KAFKA-1936) Track offset commit requests separately from produce requests
[ https://issues.apache.org/jira/browse/KAFKA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Auradkar updated KAFKA-1936: --- Assignee: Dong Lin (was: Aditya Auradkar) Track offset commit requests separately from produce requests - Key: KAFKA-1936 URL: https://issues.apache.org/jira/browse/KAFKA-1936 Project: Kafka Issue Type: Sub-task Reporter: Aditya Auradkar Assignee: Dong Lin In ReplicaManager, failed and total produce requests are updated from appendToLocalLog. Since offset commit requests also follow the same path, they are counted along with produce requests. Add a metric and count them separately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest
[ https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daneel Yaitskov updated KAFKA-2105: --- Attachment: (was: guard-from-null.patch) NullPointerException in client on MetadataRequest - Key: KAFKA-2105 URL: https://issues.apache.org/jira/browse/KAFKA-2105 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.8.2.1 Reporter: Roger Hoover Priority: Minor With the new producer, if you accidentally pass null to KafkaProducer.partitionsFor(null), it will cause the IO thread to throw NPE. Uncaught error in kafka producer I/O thread: java.lang.NullPointerException at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174) at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176) at org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55) at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81) at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218) at org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35) at org.apache.kafka.common.requests.RequestSend.init(RequestSend.java:29) at org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1940) Initial checkout and build failing
[ https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daneel Yaitskov updated KAFKA-1940: --- Attachment: (was: zinc-upgrade.patch) Initial checkout and build failing -- Key: KAFKA-1940 URL: https://issues.apache.org/jira/browse/KAFKA-1940 Project: Kafka Issue Type: Bug Components: build Affects Versions: 0.8.1.2 Environment: Groovy: 1.8.6 Ant: Apache Ant(TM) version 1.9.2 compiled on July 8 2013 Ivy: 2.2.0 JVM: 1.8.0_25 (Oracle Corporation 25.25-b02) OS: Windows 7 6.1 amd64 Reporter: Martin Lemanski Labels: build When performing `gradle wrapper` and `gradlew build` as a new developer, I get an exception: {code} C:\development\git\kafka>gradlew build --stacktrace ... FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':core:compileScala'. com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesafe/zinc/Setup; {code} Details: https://gist.github.com/mlem/ddff83cc8a25b040c157 Current Commit: {code} C:\development\git\kafka>git rev-parse --verify HEAD 71602de0bbf7727f498a812033027f6cbfe34eb8 {code} I am evaluating Kafka for my company and wanted to run some tests with it, but couldn't due to this error. I know Gradle can be tricky and it's not easy to set everything up correctly, but this kind of bug turns potential committers/users off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] KIP-11- Authorization design for kafka security
Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things:

1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason is that a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the client IP as Principals).

2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal, KafkaUserPrincipal, KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal), etc. This is important as eventually (hopefully sooner rather than later) we will support multiple types of authentication, each of which may want to populate the Subject with one or more Principals and perhaps even credentials (these could be used in the future to hold encryption keys, or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension. This also allows the same subject to have access to some resources based on username and some based on group. Given that we would then have different types of Principals, I would modify the ACL to look like: {version:1, {acls:[ { principal_types:[KafkaUserPrincipal,KafkaGroupPrincipal], principals:[alice,kafka-devs

3) The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward.

On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth

On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about ACLs being duplicated, and the simplest solution would be to modify the ACL class so it holds a set of principals instead of a single principal, i.e. user_a,user_b has READ,WRITE,DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: DENY acls should be evaluated before ALLOW acls. To give you an example, suppose we have the following acls:
acl1 - user1 is allowed to READ from all hosts.
acl2 - host1 is allowed to READ regardless of who the user is.
acl3 - host2 is allowed to READ regardless of who the user is.
acl4 - user1 is denied READ from host1.
As stated in the KIP we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have allow acls with wildcards (acl1, acl2). If user1 tries to READ from host2, the action will be allowed, and it does not matter whether we match acl3 or acl1, so I don't think the evaluation order matters there. "Will people actually use hosts with users?" I really don't know, but given ACLs are part of our public APIs I thought it better to try to cover more use cases. If others think this extra complexity is not worth the value it's adding, please raise your concerns so we can discuss whether it should be removed from the ACL structure. Note that even in the absence of hosts in ACLs, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = "host" - easy to add, and it can be an incremental improvement. They will, however, lose the ability to restrict access to users only from a set of hosts. We agreed to offer a CLI to overcome the JSON acl config: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON, but that probably has something to do with me being a developer :-). Thanks Parth

On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so trying to keep up here; sorry if this has been covered before. First, great job on the KIP proposal and work so far. Are we sure that we want to tie host-level access to a given user? My understanding is that the ACL will be (omitting some fields):
user_a, host1, host2, host3
user_b, host1, host2, host3
So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts at the same level as principal in the hierarchy? This way you could just blanket the allowed/denied hosts and only have to
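Point 1) in Gari's note — a Session carrying a Subject rather than a single Principal — can be sketched with the standard javax.security.auth.Subject. The KafkaUserPrincipal and KafkaGroupPrincipal types are hypothetical, exactly as in the email:

```java
import java.security.Principal;
import javax.security.auth.Subject;

public class MultiPrincipalDemo {
    // Hypothetical principal types from the proposal; each authentication
    // module could add its own kind to the same Subject.
    record KafkaUserPrincipal(String name) implements Principal {
        public String getName() { return name; }
    }

    record KafkaGroupPrincipal(String name) implements Principal {
        public String getName() { return name; }
    }

    public static void main(String[] args) {
        // One authenticated connection, several identities on one Subject.
        Subject subject = new Subject();
        subject.getPrincipals().add(new KafkaUserPrincipal("alice"));
        subject.getPrincipals().add(new KafkaGroupPrincipal("kafka-devs"));

        // An authorizer can then match ACLs against whichever principal
        // types it understands (user, group, client address, ...).
        System.out.println(subject.getPrincipals(KafkaGroupPrincipal.class).size()); // 1
        System.out.println(subject.getPrincipals().size()); // 2
    }
}
```

This is what makes the proposed principal_types field in the ACL JSON useful: an entry can target user principals, group principals, or both, against the same Subject.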
Re: [VOTE] KIP-11- Authorization design for kafka security
Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so I'm trying to keep up here; sorry if this has been covered before. First, great job on the KIP proposal and work so far. Are we sure that we want to tie host-level access to a given user? My understanding is that the ACL will be (omitting some fields) user_a, host1, host2, host3 user_b, host1, host2, host3 so there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts at the same level as principal in the hierarchy? This way you could just blanket the allowed/denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the order in which the permissions would be evaluated if there was more than one match on a principal? Is the thought that there wouldn't usually be much overlap on hosts? I guess I can imagine a scenario where I want to offline/online access to a particular host or set of hosts, and if there was overlap, I'm doing a bunch of alter commands for just a single host. Maybe this is too contrived an example? I agree that having this level of granularity gives flexibility, but I wonder if people will actually use it and not just wildcard the hosts for a given user and create a separate global list as I mentioned above. The only other system I know of that ties users with hosts for access is MySQL, and I don't love that model. Companies usually standardize on group authorization anyway; are we complicating that issue with the inclusion of hosts attached to users? Additionally, I worry about the debt of big JSON configs in the first place; most non-developers find them non-intuitive already, so anything to ease this would be beneficial. 
Thanks Jeff On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Sorry, I missed your last questions. I am +0 on adding a --host option for --list; we could add it for symmetry. Again, if this is only a CLI change it can be added later; if you mean adding this to the authorizer interface, then we should make a decision now. Given a choice, I would actually like to keep only one option, which is the resource-based get (removing even the get based on principal). I see those (getAcls for a principal or host) as special filtering cases which can easily be achieved by a third-party tool by listing all topics, calling getAcls for each topic, and applying filtering logic on the result. I really don't see the need to make those first-class citizens of the authorizer interface: these kinds of queries will be issued outside of the broker JVM, so they will not benefit from caching, and because the storage will be indexed on resource, both these options even as first-class APIs would just scan all topic ACLs and apply filtering logic.
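Earlier in the thread Parth suggests letting one ACL hold a set of principals, which would remove the duplication Jeff describes. A hypothetical JSON shape for such a consolidated entry (illustrative only, not the KIP's actual format) might look like:

```json
{
  "acls": [
    {
      "principals": ["user:user_a", "user:user_b"],
      "hosts": ["host1", "host2", "host3"],
      "operations": ["READ", "WRITE", "DESCRIBE"],
      "permissionType": "ALLOW",
      "resource": "topic:Topic1"
    }
  ]
}
```

With a set of principals, the host list is stated once per rule instead of once per user, which is exactly the redundancy Jeff's user_a/user_b example highlights.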
Re: Should 0.8.3 consumers correctly function with 0.8.2 brokers?
Thanks for the responses. Ewen is correct that I am referring to the *new* consumer (org.apache.kafka.clients.consumer.KafkaConsumer). I am extending the consumer to allow my applications more control over committed offsets. I really want to get away from ZooKeeper (hence using Kafka-based offset storage), and rebalancing is something I haven't really needed to tackle in an automated/seamless way. Either way, I'll hold off going further down this road until there is more interest. @Gwen I set up a single consumer without partition.assignment.strategy or rebalance.callback.class. I was unable to subscribe to just a topic ("Unknown api code 11" on the broker), but I could subscribe to a TopicPartition. This makes sense, as I would need to handle rebalancing outside the consumer. Things functioned as expected (well, I have an additional minor fix to the code from KAFKA-2121), and the only exceptions on the broker were due to closing consumers (which I have become accustomed to). My tests are specific to my extended version of the consumer, but they basically do a little writing and reading with different serde classes and application-controlled commits (similar to onSuccess and onFailure after each record, but with tolerance for out-of-order acknowledgements). If you are interested, here is the patch of the hack against trunk. On Thu, Apr 23, 2015 at 10:27 PM, Ewen Cheslack-Postava e...@confluent.io wrote: @Neha I think you're mixing up the 0.8.1/0.8.2 updates and the 0.8.2/0.8.3 ones being discussed here? I think the original question was about using the *new* consumer (clients consumer) with 0.8.2. Gwen's right: it will use features not even implemented in the broker in trunk yet, let alone in 0.8.2. I don't think the enable.commit.downgrade type option, or supporting the old protocol with the new consumer at all, makes much sense. 
You'd end up with some weird hybrid of the simple and high-level consumers -- you could use offset storage, but you'd have to manage rebalancing yourself since none of the coordinator support would be there. On Thu, Apr 23, 2015 at 9:22 PM, Neha Narkhede n...@confluent.io wrote: My understanding is that ideally the 0.8.3 consumer should work with an 0.8.2 broker if the offset commit config was set to zookeeper. The only thing that would not work is offset commits to Kafka, which makes sense since the 0.8.2 broker does not support Kafka-based offset management. If we broke all kinds of offset commits, then it seems like a regression, no? On Thu, Apr 23, 2015 at 7:26 PM, Gwen Shapira gshap...@cloudera.com wrote: I didn't think the 0.8.3 consumer would ever be able to talk to an 0.8.2 broker... there are some essential pieces that are missing in 0.8.2 (Coordinator, Heartbeat, etc.). Maybe I'm missing something. It would be nice if this worked :) Mind sharing what / how you tested? Were there no errors in the broker logs after your fix? On Thu, Apr 23, 2015 at 5:37 PM, Sean Lydon lydon.s...@gmail.com wrote: Currently the clients consumer (trunk) sends offset commit requests of version 2. The 0.8.2 brokers fail to handle this particular request with a: java.lang.AssertionError: assertion failed: Version 2 is invalid for OffsetCommitRequest. Valid versions are 0 or 1. I was able to make this work via a forceful downgrade of this particular request, but I would like some feedback on whether an enable.commit.downgrade configuration would be a tolerable way to allow 0.8.3 consumers to interact with 0.8.2 brokers. I'm also interested in whether this is even a goal worth pursuing. Thanks, Sean -- Thanks, Neha -- Thanks, Ewen From 31a14a1749cb164bdde0f59951e4d3aae8ce80a1 Mon Sep 17 00:00:00 2001 From: Sean Lydon sly...@apixio.com Date: Fri, 24 Apr 2015 09:29:41 -0700 Subject: [PATCH] Hardcoded changes to downgrade offset_commit to version 1. 
---
 .../java/org/apache/kafka/clients/KafkaClient.java | 10 -
 .../org/apache/kafka/clients/NetworkClient.java    | 12 ++
 .../kafka/clients/consumer/KafkaConsumer.java      |  2 +-
 .../clients/consumer/internals/Coordinator.java    | 46 --
 .../java/org/apache/kafka/clients/MockClient.java  |  5 +++
 5 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java b/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
index 1311f85..e608ca8 100644
--- a/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
+++ b/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
@@ -126,9 +126,17 @@ public interface KafkaClient extends Closeable {
      */
     public RequestHeader nextRequestHeader(ApiKeys key);

+    /*
+     * Generate a request header for the next request
+     *
+     * @param key The API key of the request
+     * @param version The API key's version of the request
+     */
+    public RequestHeader nextRequestHeader(ApiKeys key, short
Re: [VOTE] KIP-11- Authorization design for kafka security
Sample ACL JSON and Zookeeper are in the public API, but I thought they are part of the DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen
Re: [VOTE] KIP-11- Authorization design for kafka security
+1 (non-binding) -- Harsha On April 24, 2015 at 9:59:09 AM, Parth Brahmbhatt (pbrahmbh...@hortonworks.com) wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper are in the public API, but I thought they are part of the DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP?
[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest
[ https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daneel Yaitskov updated KAFKA-2105: --- Attachment: guard-from-null.patch Sorry, the first time I created the patch with the --color option. In my opinion, patch files are a legacy technique; a GitHub reference for a pull request looks much better, but I didn't find any ticket resolved that way. NullPointerException in client on MetadataRequest - Key: KAFKA-2105 URL: https://issues.apache.org/jira/browse/KAFKA-2105 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.8.2.1 Reporter: Roger Hoover Priority: Minor Attachments: guard-from-null.patch With the new producer, if you accidentally pass null to KafkaProducer.partitionsFor(null), it will cause the I/O thread to throw an NPE. Uncaught error in kafka producer I/O thread: java.lang.NullPointerException at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174) at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176) at org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55) at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81) at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218) at org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35) at org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29) at org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
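The attached guard-from-null.patch is not reproduced here, but the general shape of such a fix is to validate the argument before the request is handed to the I/O thread, so the error surfaces in the caller's thread with a clear message. A hedged sketch with stand-in types (not the actual patch, and not Kafka's real API):

```java
// Sketch: reject a null topic up front instead of letting it reach the
// producer I/O thread, where it would surface as an opaque NPE.
// PartitionsForGuard and its return type are illustrative stand-ins.
import java.util.Collections;
import java.util.List;

public class PartitionsForGuard {
    public static List<String> partitionsFor(String topic) {
        if (topic == null)
            throw new IllegalArgumentException("topic cannot be null");
        return Collections.emptyList(); // stand-in for the real metadata fetch
    }

    public static void main(String[] args) {
        try {
            partitionsFor(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether the guard throws IllegalArgumentException or NullPointerException is a design choice; the point is that the check runs in the caller's thread rather than the background sender.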
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
I have yet to update the KIP with my latest proposal, so give me a few days to update it. I am looking at supporting KERBEROS for the first release and going to use JAAS login modules to provide authentication. "And will we provide a default SASL PLAIN mechanism on the server side?" Yes. I'll update the KIP and send out an email for further discussion, as that will make it easier. Thanks, Harsha On April 24, 2015 at 9:30:04 AM, Gari Singh (gari.r.si...@gmail.com) wrote: Great. Sounds good. I'll re-read the KIP ASAP. Is there another KIP around authentication providers, or is that being tracked here as well? For example, the SASL PLAIN mechanism carries a username and password, but currently I don't know where that would be authenticated. I notice that AuthUtils has the ability to read a JAAS config, but the KIP only has entries relevant to Kerberos. Is the idea to use JAAS LoginModules to provide pluggable authentication - so we could use some of the JDK-provided LoginModules or create our own (e.g. use a local password file, LDAP, etc.)? And will we provide a default SASL PLAIN mechanism on the server side, or would we implement custom SASL provider modules? Also - I am happy to take a look at / test any code as you move along. Also happy to help with SASL providers and/or authentication/login modules. Thanks, Gari On Fri, Apr 24, 2015 at 12:05 PM, Sriharsha Chintalapani harsh...@fastmail.fm wrote: Hi Gari, I apologize for not being clear in the KIP and for not starting the discussion thread earlier. My initial proposal (currently in the KIP) is to provide PLAINTEXT, SSL and KERBEROS as individual protocol implementations. As you mentioned at the end: “In terms of message level integrity and confidentiality (not to be confused with transport level security like TLS), SASL also provides for this (assuming the mechanism supports it). The SASL library supports this via the props parameter in the createSaslClient/Server methods. 
So it is easily possible to support Kerberos with integrity (MIC) or confidentiality (encryption) over TCP, and without either over TLS.” My intention was to use SASL to do auth and also provide encryption over the plaintext channel. But after speaking to many who implemented SASL this way for HDFS, HBase, and other projects, their suggestion was not to use context.wrap and context.unwrap, which perform the SASL encryption, because they cause performance degradation. Currently I am working on SASL authentication as an option over TCP or TLS. I'll update the KIP soon once I've got the interfaces in place. Sorry about the confusion on this; I am testing out multiple options and trying to decide on the right one. Thanks, Harsha On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com) wrote: Sorry for jumping in late, but I have been trying to follow this chain as well as the updates to the KIP. I don't mean to seem critical, and I may be misunderstanding the proposed implementation, but there seems to be some confusion around terminology (at least from my perspective), and I am not sure I actually understand what is going to be implemented and where the plugin point(s) will be. The KIP does not really mention SASL interfaces in any detail. The way I read the KIP, it seems as if it is more about providing a Kerberos mechanism via GSSAPI than about providing pluggable SASL support. Perhaps it is the naming convention (GSS is used where I would have thought SASL would have been used). Maybe I am missing something? SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI are not the same thing. Also, SSL/TLS is independent of both SASL and GSSAPI, although you can use either SASL or GSSAPI over TLS. I would expect something more along the lines of having a SASLChannel and SASL providers (along with pluggable authentication providers which enumerate which SASL mechanisms they support). 
I have only ever attempted to really implement SASL support once, but I have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP use SASL. This is my understanding of how SASL is typically implemented: 1) The client decides whether to use TLS or plain TCP (of course this depends on what the server provides). My current understanding is that Kafka will support three types of server sockets: the current socket for backwards compatibility (i.e. no TLS and no SASL), a TLS socket, and a SASL socket. I would have thought that SASL mechanisms would be supported on the TLS socket as well, but that does not seem to be the case (or at least it is not clear either way). I know the decision was made to have separate TLS and SASL sockets, but I think we need to support SASL over TLS as well. You can do this over a single socket if you use the startTLS metaphor. 2) There is typically some type of application-protocol-specific handshake. This is usually used to negotiate whether or not to use SASL and/or which SASL mechanisms are supported by the server. This is
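The JAAS login-module approach discussed in this thread could, for a Kerberos-enabled broker, look roughly like the following JAAS configuration fragment. The section name, keytab path, and principal are illustrative assumptions, not the KIP's final format:

```
KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka.service.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};
```

Krb5LoginModule is the JDK-provided login module for Kerberos; swapping in a different LoginModule (local password file, LDAP, etc.) is what makes the JAAS approach pluggable.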
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen
[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning
[ https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511304#comment-14511304 ] Jay Kreps commented on KAFKA-2148: -- Basically kafka.javaapi.producer.Producer still exists and works exactly the same as before. We added a new API, org.apache.kafka.clients.producer.KafkaProducer, which is meant to be an eventual replacement and has a lot of advantages. But for the next few releases the old client remains and works exactly as before. version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao Labels: api, producer version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1) the call present in Producer version 0.8.1 http://kafka.apache.org/081/api.html public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2 Producer version 0.8.2 http://kafka.apache.org/082/javadoc/index.html send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
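Jay's point is that the old and new producers coexist, so the old list-based send semantics can be recovered with a thin adapter over a single-record send. A minimal sketch of that idea (RecordSender and KeyedMessage here are illustrative stand-ins, not Kafka's actual types):

```java
// Sketch: recovering 0.8.1-style send(List<KeyedMessage>) semantics on top of
// a 0.8.2-style single-record send. RecordSender stands in for the new
// producer's send; it is not Kafka's real interface.
import java.util.List;

public class BatchSendAdapter {
    interface RecordSender { void send(String key, String value); }

    record KeyedMessage(String key, String value) { }

    /** Send each message of the list individually, preserving order. */
    static void sendAll(RecordSender sender, List<KeyedMessage> messages) {
        for (KeyedMessage m : messages)
            sender.send(m.key(), m.value());
    }

    public static void main(String[] args) {
        StringBuilder sent = new StringBuilder();
        sendAll((k, v) -> sent.append(k).append('=').append(v).append(';'),
                List.of(new KeyedMessage("a", "1"), new KeyedMessage("b", "2")));
        System.out.println(sent); // a=1;b=2;
    }
}
```

With the real KafkaProducer, each send returns a Future and accepts an optional Callback, so a batch wrapper can also collect the futures to preserve the old fire-and-forget or blocking behavior.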
[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning
[ https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511308#comment-14511308 ] Reece Markowsky commented on KAFKA-2148: now I see it. It's kafka/clients/src/main/java/org/apache/kafka/clients/producer/ vs kafka/core/src/main/scala/kafka/producer/ thx! version 0.8.2 breaks semantic versioning Key: KAFKA-2148 URL: https://issues.apache.org/jira/browse/KAFKA-2148 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2.0 Reporter: Reece Markowsky Assignee: Jun Rao Labels: api, producer version 0.8.2 of the Producer API drops support for sending a list of KeyedMessage (present in 0.8.1) the call present in Producer version 0.8.1 http://kafka.apache.org/081/api.html public void send(List<KeyedMessage<K,V>> messages); is not present (breaking semantic versioning) in 0.8.2 Producer version 0.8.2 http://kafka.apache.org/082/javadoc/index.html send(ProducerRecord<K,V> record, Callback callback) or send(ProducerRecord<K,V> record) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
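The practical upshot of the thread above is that the old batch call `send(List<KeyedMessage<K,V>>)` has no one-for-one equivalent on the new client: records are handed to `KafkaProducer.send` one at a time, and the new producer batches internally. A minimal sketch of how a batch-style helper can be layered on a per-record send; the `Record` and `Sender` types here are hypothetical stand-ins for `ProducerRecord` and `KafkaProducer`, used only so the example compiles without the kafka-clients jar:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSendShim {
    // Hypothetical stand-ins for ProducerRecord and KafkaProducer.send,
    // so this sketch is self-contained.
    record Record(String topic, String key, String value) {}

    interface Sender {
        void send(Record record); // models KafkaProducer.send(ProducerRecord)
    }

    // The 0.8.1-style batch call recovered by looping; the new producer
    // batches internally, so per-record send remains efficient.
    static void sendAll(Sender producer, List<Record> records) {
        for (Record r : records) {
            producer.send(r);
        }
    }

    public static void main(String[] args) {
        List<Record> sent = new ArrayList<>();
        sendAll(sent::add, Arrays.asList(
                new Record("t", "k1", "v1"),
                new Record("t", "k2", "v2")));
        System.out.println(sent.size()); // prints 2
    }
}
```

With the real client the loop body would be `producer.send(new ProducerRecord<>(topic, key, value))`, optionally passing a `Callback` per record.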
Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation
Great. Sounds good. I'll re-read the KIP ASAP. Is there another KIP around authentication providers, or is that being tracked here as well? For example, the SASL PLAIN mechanism carries a username and password, but currently I don't know where that would be authenticated. I notice that AuthUtils has the ability to read a JAAS config, but the KIP only has entries relevant to Kerberos. Is the idea to use JAAS LoginModules to provide pluggable authentication - so we could use some of the JDK provided LoginModules or create our own (e.g. use a local password file, LDAP, etc)? And will we provide a default SASL PLAIN mechanism on the server side or would we implement custom SASL provider modules? Also - I am happy to take a look / test any code as you move along. Also happy to help with SASL providers and/or authentication/login modules. Thanks, Gari On Fri, Apr 24, 2015 at 12:05 PM, Sriharsha Chintalapani harsh...@fastmail.fm wrote: Hi Gari, I apologize for not being clear in the KIP and for not starting the discussion thread earlier. My initial proposal (currently in the KIP) was to provide PLAINTEXT, SSL and KERBEROS as individual protocol implementations. As you mentioned at the end “In terms of message level integrity and confidentiality (not to be confused with transport level security like TLS), SASL also provides for this (assuming the mechanism supports it). The SASL library supports this via the props parameter in the createSaslClient/Server methods. So it is easily possible to support Kerberos with integrity (MIC) or confidentiality (encryption) over TCP and without either over TLS.” My intention was to use SASL to do auth and also provide encryption over a plaintext channel. But after speaking to many who implemented SASL this way for HDFS, HBase and other projects, their suggestion was not to use context.wrap and context.unwrap, since the encryption they perform for SASL causes performance degradation. Currently I am working on SASL authentication as an option over TCP or TLS.
I’ll update the KIP soon once I’ve got the interfaces in place. Sorry about the confusion on this as I am testing out multiple options and trying to decide on the right one. Thanks, Harsha On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com) wrote: Sorry for jumping in late, but I have been trying to follow this chain as well as the updates to the KIP. I don't mean to seem critical and I may be misunderstanding the proposed implementation, but there seems to be some confusion around terminology (at least from my perspective) and I am not sure I actually understand what is going to be implemented and where the plugin point(s) will be. The KIP does not really mention SASL interfaces in any detail. The way I read the KIP it seems as if it is more about providing a Kerberos mechanism via GSSAPI than it is about providing pluggable SASL support. Perhaps it is the naming convention (GSS is used where I would have thought SASL would have been used). Maybe I am missing something? SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI are not the same thing. Also, SSL/TLS is independent of both SASL and GSSAPI, although you can use either SASL or GSSAPI over TLS. I would expect something more along the lines of having a SASLChannel and SASL providers (along with pluggable Authentication providers which enumerate which SASL mechanisms they support). I have only ever attempted to really implement SASL support once, but I have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP use SASL. This is my understanding of how SASL is typically implemented: 1) Client decides whether or not to use TLS or plain TCP (of course this depends on what the server provides). My current understanding is that Kafka will support three types of server sockets: - current socket for backwards compatibility (i.e.
no TLS and no SASL) - TLS socket - SASL socket I would also have thought that SASL mechanisms would be supported on the TLS socket as well, but that does not seem to be the case (or at least it is not clear either way). I know the decision was made to have separate TLS and SASL sockets, but I think that we need to support SASL over TLS as well. You can do this over a single socket if you use the startTLS metaphor. 2) There is typically some type of application protocol specific handshake. This is usually used to negotiate whether or not to use SASL and/or negotiate which SASL mechanisms are supported by the server. This is not strictly required, although the SASL spec does mention that the client should be able to get a list of SASL mechanisms supported by the server. For example, SMTP does this with the client sending a EHLO and the server sending an AUTH. Personally I like the AMQP model (which by the way might also help with backwards compatibility using a single socket). For AMQP, the initial frame is basically AMQP.%d0.1.0.0
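Gari's earlier point that the `props` parameter of `createSaslClient/Server` controls message integrity and confidentiality can be seen directly in the JDK SASL API: the `Sasl.QOP` property requests auth-only, integrity (`auth-int`), or confidentiality (`auth-conf`), independent of any TLS layer. A small sketch using the JDK-bundled DIGEST-MD5 client, chosen here only because it supports QOP negotiation without a Kerberos setup; the service name "kafka" and the host are placeholders, not actual Kafka configuration:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;

public class SaslQopDemo {
    public static void main(String[] args) throws Exception {
        Map<String, Object> props = new HashMap<>();
        // QOP preference order: confidentiality, then integrity, then auth only.
        props.put(Sasl.QOP, "auth-conf,auth-int,auth");

        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName("alice");
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword("secret".toCharArray());
                }
            }
        };

        // DIGEST-MD5 ships with the JDK and negotiates QOP;
        // "kafka" / "broker1.example.com" are placeholder service name and host.
        SaslClient client = Sasl.createSaslClient(
                new String[]{"DIGEST-MD5"}, null, "kafka",
                "broker1.example.com", props, handler);

        System.out.println(client.getMechanismName());
    }
}
```

The negotiated QOP is only known after the challenge/response exchange, via `client.getNegotiatedProperty(Sasl.QOP)`; the wrap/unwrap overhead mentioned in Harsha's reply applies when anything other than plain `auth` is negotiated.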
Re: Should 0.8.3 consumers correctly function with 0.8.2 brokers?
Yes, I was clearly confused :-) On Fri, Apr 24, 2015 at 9:37 AM, Sean Lydon lydon.s...@gmail.com wrote: Thanks for the responses. Ewen is correct that I am referring to the *new* consumer (org.apache.kafka.clients.consumer.KafkaConsumer). I am extending the consumer to allow my applications more control over committed offsets. I really want to get away from zookeeper (so using the offset storage), and re-balancing is something I haven't really needed to tackle in an automated/seamless way. Either way, I'll hold off going further down this road until there is more interest. @Gwen I set up a single consumer without partition.assignment.strategy or rebalance.callback.class. I was unable to subscribe to just a topic (Unknown api code 11 on broker), but I could subscribe to a topicpartition. This makes sense as I would need to handle re-balance outside the consumer. Things functioned as expected (well I have an additional minor fix to code from KAFKA-2121), and the only exceptions on broker were due to closing consumers (which I have become accustomed to). My tests are specific to my extended version of the consumer, but they basically do a little writing and reading with different serde classes with application controlled commits (similar to onSuccess and onFailure after each record, but with tolerance for out of order acknowledgements). If you are interested, here is the patch of the hack against trunk. On Thu, Apr 23, 2015 at 10:27 PM, Ewen Cheslack-Postava e...@confluent.io wrote: @Neha I think you're mixing up the 0.8.1/0.8.2 updates and the 0.8.2/0.8.3 that's being discussed here? I think the original question was about using the *new* consumer (clients consumer) with 0.8.2. Gwen's right, it will use features not even implemented in the broker in trunk yet, let alone the 0.8.2. I don't think the enable.commit.downgrade type option, or supporting the old protocol with the new consumer at all, makes much sense. 
You'd end up with some weird hybrid of simple and high-level consumers -- you could use offset storage, but you'd have to manage rebalancing yourself since none of the coordinator support would be there. On Thu, Apr 23, 2015 at 9:22 PM, Neha Narkhede n...@confluent.io wrote: My understanding is that ideally the 0.8.3 consumer should work with an 0.8.2 broker if the offset commit config was set to zookeeper. The only thing that might not work is offset commit to Kafka, which makes sense since the 0.8.2 broker does not support Kafka based offset management. If we broke all kinds of offset commits, then it seems like a regression, no? On Thu, Apr 23, 2015 at 7:26 PM, Gwen Shapira gshap...@cloudera.com wrote: I didn't think the 0.8.3 consumer would ever be able to talk to an 0.8.2 broker... there are some essential pieces that are missing in 0.8.2 (Coordinator, Heartbeat, etc). Maybe I'm missing something. It will be nice if this will work :) Mind sharing what / how you tested? Were there no errors in broker logs after your fix? On Thu, Apr 23, 2015 at 5:37 PM, Sean Lydon lydon.s...@gmail.com wrote: Currently the clients consumer (trunk) sends offset commit requests of version 2. The 0.8.2 brokers fail to handle this particular request with a: java.lang.AssertionError: assertion failed: Version 2 is invalid for OffsetCommitRequest. Valid versions are 0 or 1. I was able to make this work via a forceful downgrade of this particular request, but I would like some feedback on whether an enable.commit.downgrade configuration would be a tolerable method to allow 0.8.3 consumers to interact with 0.8.2 brokers. I'm also interested in whether this is even a goal worth pursuing. Thanks, Sean -- Thanks, Neha -- Thanks, Ewen -- Thanks, Neha
[jira] [Updated] (KAFKA-1940) Initial checkout and build failing
[ https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daneel Yaitskov updated KAFKA-1940: --- Attachment: zinc-upgrade.patch Reattach the patch file without colored diff. Initial checkout and build failing -- Key: KAFKA-1940 URL: https://issues.apache.org/jira/browse/KAFKA-1940 Project: Kafka Issue Type: Bug Components: build Affects Versions: 0.8.1.2 Environment: Groovy: 1.8.6 Ant: Apache Ant(TM) version 1.9.2 compiled on July 8 2013 Ivy: 2.2.0 JVM: 1.8.0_25 (Oracle Corporation 25.25-b02) OS: Windows 7 6.1 amd64 Reporter: Martin Lemanski Labels: build Attachments: zinc-upgrade.patch when performing `gradle wrapper` and `gradlew build` as a new developer, I get an exception: {code} C:\development\git\kafka> gradlew build --stacktrace ... FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':core:compileScala'. com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesafe/zinc/Setup; {code} Details: https://gist.github.com/mlem/ddff83cc8a25b040c157 Current Commit: {code} C:\development\git\kafka> git rev-parse --verify HEAD 71602de0bbf7727f498a812033027f6cbfe34eb8 {code} I am evaluating kafka for my company and wanted to run some tests with it, but couldn't due to this error. I know gradle can be tricky and it's not easy to set everything up correctly, but these kinds of bugs turn potential committers/users off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1936) Track offset commit requests separately from produce requests
[ https://issues.apache.org/jira/browse/KAFKA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511392#comment-14511392 ] Dong Lin commented on KAFKA-1936: - Thanks! I will work on this ticket. On Fri, Apr 24, 2015 at 10:08 AM, Aditya Auradkar (JIRA) j...@apache.org Track offset commit requests separately from produce requests - Key: KAFKA-1936 URL: https://issues.apache.org/jira/browse/KAFKA-1936 Project: Kafka Issue Type: Sub-task Reporter: Aditya Auradkar Assignee: Dong Lin In ReplicaManager, failed and total produce requests are updated from appendToLocalLog. Since offset commit requests also follow the same path, they are counted along with produce requests. Add a metric and count them separately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry - fat fingered send ... Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is because a Subject can have multiple Principals (for example both a username and a group, or perhaps someone would want to use both the username and the clientIP as Principals) 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal KafkaUserPrincipal KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc This is important as eventually (hopefully sooner than later), we will support multiple types of authentication which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {"version":1, "acls":[ { "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"], "principals":["alice","kafka-devs"] ... or {"version":1, "acls":[ { "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-devs"] But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward. 3) Are you sure that you want authorize to take a session object?
If we use the model in 1) above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs. 4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well? On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh gari.r.si...@gmail.com wrote: Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is because a Subject can have multiple Principals (for example both a username and a group or perhaps someone would want to use both the username and the clientIP as Principals) 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal KafkaUserPrincipal KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc This is important as eventually (hopefully sooner than later), we will support multiple types of authentication which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {"version":1, "acls":[ { "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"], "principals":["alice","kafka-devs" 3) The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward.
On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about acls being duplicated and the simplest solution would be to modify acls class so they hold a set of principals instead of single principal. i.e user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from Host1, Host2, Host3. I think the
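Gari's suggestion that Session carry a Subject rather than a single Principal maps directly onto the JDK's `javax.security.auth.Subject`, which holds a set of `java.security.Principal`s. A sketch of that idea; the `KafkaUserPrincipal` and `KafkaGroupPrincipal` classes below are hypothetical, following the names proposed in the mail, and are not part of Kafka or the JDK:

```java
import java.security.Principal;
import javax.security.auth.Subject;

public class SubjectSketch {
    // Hypothetical principal classes following the names in the proposal.
    static class KafkaUserPrincipal implements Principal {
        private final String name;
        KafkaUserPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }
    static class KafkaGroupPrincipal implements Principal {
        private final String name;
        KafkaGroupPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    public static void main(String[] args) {
        // One authenticated client can carry several principals at once.
        Subject subject = new Subject();
        subject.getPrincipals().add(new KafkaUserPrincipal("alice"));
        subject.getPrincipals().add(new KafkaGroupPrincipal("kafka-devs"));

        // An authorizer could then match ACL entries by principal type:
        boolean inDevs = subject.getPrincipals(KafkaGroupPrincipal.class)
                .stream().anyMatch(p -> "kafka-devs".equals(p.getName()));
        System.out.println(inDevs);
    }
}
```

`Subject.getPrincipals(Class)` gives the type-filtered view Gari describes, so user-based and group-based ACL entries can be evaluated against the same session.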
Re: [VOTE] KIP-11- Authorization design for kafka security
Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in the public API, but I thought it is part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation. I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are the DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which.
Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about acls being duplicated and the simplest solution would be to modify the acls class so it holds a set of principals instead of a single principal, i.e. user_a, user_b have READ, WRITE, DESCRIBE permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType: Deny acls should be evaluated before allow acls. To give you an example, suppose we have the following acls: acl1 - user1 is allowed to READ from all hosts. acl2 - host1 is allowed to READ regardless of who is the user. acl3 - host2 is allowed to READ regardless of who is the user. acl4 - user1 is denied to READ from host1. As stated in the KIP we first evaluate DENY, so if user1 tries to access from host1 he will be denied (acl4), even though both user1 and host1 have allow acls with wildcards (acl1, acl2). If user1 tried to READ from host2, the action will be allowed, and it does not matter if we match acl3 or acl1, so I don't think the evaluation order matters here. “Will people actually use hosts with users?” I really don't know, but given ACLs are part of our public APIs I thought it is better to try and cover more use cases. If others think this extra complexity is not worth the value it's adding, please raise your concerns so we can discuss whether it should be removed from the acl structure. Note that even in the absence of hosts from ACLs, users will still be able to whitelist/blacklist hosts as long as we start supporting principalType = “host”; this is easy to add and can be an incremental improvement. They will however lose the ability to restrict access to users just from a set of hosts.
We agreed to offer a CLI to overcome the JSON acl config https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like JSON but that probably has something to do with me being a developer :-). Thanks Parth On 4/22/15, 11:38 AM, Jeff Holoman jholo...@cloudera.com wrote: Parth, This is a long thread, so trying to keep up here, sorry if this has been covered before. First, great job on the KIP proposal and work so far. Are we sure that we want to tie host level access to a given user? My understanding is that the ACL will be (omitting some fields) user_a, host1, host2, host3 user_b, host1, host2, host3 So there would potentially be a lot of redundancy in the configs. Does it make sense to have hosts be at the same level as principal in the hierarchy? This way you could just blanket the allowed / denied hosts and only have to worry about the users. So if you follow this, then we can wildcard the user so we can have a separate list of just host-based access. What's the
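Parth's acl1..acl4 walkthrough in this thread (deny evaluated before allow, wildcards for user or host) can be sketched as a tiny evaluator. Everything below is illustrative only; the real KIP-11 Authorizer interface and ACL classes differ:

```java
import java.util.List;

public class AclEvalSketch {
    enum Type { ALLOW, DENY }

    // "*" acts as a wildcard for user or host, as in the KIP examples.
    record Acl(Type type, String user, String host, String op) {
        boolean matches(String u, String h, String o) {
            return (user.equals("*") || user.equals(u))
                && (host.equals("*") || host.equals(h))
                && op.equals(o);
        }
    }

    // Deny acls are evaluated before allow acls; default is deny.
    static boolean authorize(List<Acl> acls, String u, String h, String o) {
        for (Acl a : acls)
            if (a.type() == Type.DENY && a.matches(u, h, o)) return false;
        for (Acl a : acls)
            if (a.type() == Type.ALLOW && a.matches(u, h, o)) return true;
        return false;
    }

    public static void main(String[] args) {
        List<Acl> acls = List.of(
            new Acl(Type.ALLOW, "user1", "*", "READ"),      // acl1
            new Acl(Type.ALLOW, "*", "host1", "READ"),      // acl2
            new Acl(Type.ALLOW, "*", "host2", "READ"),      // acl3
            new Acl(Type.DENY,  "user1", "host1", "READ")); // acl4

        // acl4 wins over acl1/acl2 because DENY is checked first:
        System.out.println(authorize(acls, "user1", "host1", "READ")); // false
        // host2 access matches acl1 or acl3; order between allows is moot:
        System.out.println(authorize(acls, "user1", "host2", "READ")); // true
    }
}
```

This also illustrates why the allow-vs-allow ordering question in the mail does not matter: any single allow match suffices once no deny matched.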
Re: [VOTE] KIP-11- Authorization design for kafka security
Sorry Gwen, completely misunderstood the question :-). * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? Yes in the current proposal. I did not see an API to create a group, but if you have READ permission on a Topic and WRITE permission on that Group you are free to join and consume. * Will the CLI tool be used to manage group membership too? Yes, and I think that means I need to add --group. Updating the KIP. Thanks for pointing this out. * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? I have considered any auto delete and auto create as out of scope for the first release. So right now I was going with preserving the acls. Do you see any issues with this? Auto deleting would mean the authorizer will now have to get into implementation details of kafka, which I was trying to avoid. Thanks Parth On 4/24/15, 11:33 AM, Gwen Shapira gshap...@cloudera.com wrote: We are not talking about the same Groups :) I meant, Groups of consumers (which KIP-11 lists as a separate resource in the Privilege table) On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: I see Groups as something we can add incrementally in the current model. The acls take principalType: name, so groups can be represented as group: groupName. We are not managing group memberships anywhere in kafka and I don't see the need to do so. So for a topic1, using the CLI an admin can add an acl to grant access to group:kafka-test-users. The authorizer implementation can have a plugin to map an authenticated user to groups (this is how hadoop and storm work). The plugin could be mapping users to linux/ldap/active directory groups, but that is again up to the implementation.
What we are offering is an interface that is extensible so these features can be added incrementally. I can add support for this in the first release but don't necessarily see why this would be an absolute necessity. Thanks Parth On 4/24/15, 11:00 AM, Gwen Shapira gshap...@cloudera.com wrote: Thanks. One more thing I'm missing in the KIP is details on the Group resource (I think we discussed this and it was just not fully updated): * Does everyone have the privilege to create a new Group and use it to consume from Topics he's already privileged on? * Will the CLI tool be used to manage group membership too? * Groups are kind of ephemeral, right? If all consumers in the group disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we treat the new group as a completely new resource? Can we create ACLs before the group exists, in anticipation of it getting created? It's all small details, but it will be difficult to implement KIP-11 without knowing the answers :) Gwen On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: You are right, moved it to the default implementation section. Thanks Parth On 4/24/15, 9:52 AM, Gwen Shapira gshap...@cloudera.com wrote: Sample ACL JSON and Zookeeper is in the public API, but I thought it is part of DefaultAuthorizer (since Sentry and Argus won't be using Zookeeper). Am I wrong? Or is it the KIP? On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for clarifying Gwen, KIP updated. I tried to make the distinction by creating a section for all public APIs https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses Let me know if you think there is a better way to reflect this. Thanks Parth On 4/24/15, 9:37 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-binding) Two nitpicks for the wiki: * Heartbeat is probably a READ and not a CLUSTER operation.
I'm pretty sure new consumers need it to be part of a consumer group. * Can you clearly separate which parts are the API (common to every Authorizer) and which parts are DefaultAuthorizer implementation? It will make reviews and Authorizer implementations a bit easier to know exactly which is which. Gwen On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi, I would like to open KIP-11 for voting. Thanks Parth On 4/22/15, 1:56 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Hi Jeff, Thanks a lot for the review. I think you have a valid point about acls being duplicated and the simplest solution would be to modify acls class so they hold a set of principals instead of single principal. i.e user_a,user_b has READ,WRITE,DESCRIBE Permissions on Topic1 from Host1, Host2, Host3. I think the evaluation order only matters for the permissionType which is Deny acls should be evaluated before allow acls. To give you an
[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module
[ https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511517#comment-14511517 ] Ashish K Singh commented on KAFKA-2132: --- [~jkreps], [~charmalloc] thanks for the inputs here. We all agree that we should not have log4j as a dependency. As [~jkreps] pointed out, moving Log4jAppender to the admin tools package will again bring us back to the original problem. I am more inclined towards having it in a separate package. However, I will wait for [~charmalloc] to reply with his thoughts before submitting a patch. Move Log4J appender to clients module - Key: KAFKA-2132 URL: https://issues.apache.org/jira/browse/KAFKA-2132 Project: Kafka Issue Type: Improvement Reporter: Gwen Shapira Assignee: Ashish K Singh Log4j appender is just a producer. Since we have a new producer in the clients module, there is no need to keep the Log4j appender in core and force people to package all of Kafka with their apps. Let's move the Log4jAppender to the clients module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27204: Patch for KAFKA-1683
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27204/#review81522 --- 1) I think that Session should take a Subject rather than just a single Principal. The reason for this is because a Subject can have multiple Principals (for example both a username and a group or perhaps someone would want to use both the username and the clientIP as Principals) This is also more in line with JAAS as well and would fit better with authentication modules 2) We would then also have multiple concrete Principals, e.g. KafkaPrincipal KafkaUserPrincipal KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc This is important as eventually (hopefully sooner than later), we will support multiple types of authentication which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). - Gari Singh On Oct. 26, 2014, 5:37 a.m., Gwen Shapira wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27204/ --- (Updated Oct. 26, 2014, 5:37 a.m.) Review request for kafka. Bugs: KAFKA-1683 https://issues.apache.org/jira/browse/KAFKA-1683 Repository: kafka Description --- added test for Session Diffs - core/src/main/scala/kafka/network/RequestChannel.scala 4560d8fb7dbfe723085665e6fd611c295e07b69b core/src/main/scala/kafka/network/SocketServer.scala cee76b323e5f3e4c783749ac9e78e1ef02897e3b core/src/test/scala/unit/kafka/network/SocketServerTest.scala 5f4d85254c384dcc27a5a84f0836ea225d3a901a Diff: https://reviews.apache.org/r/27204/diff/ Testing --- Thanks, Gwen Shapira
Re: [VOTE] KIP-11- Authorization design for kafka security
I will move the comments about subject versus principal wrt session to the PR above. The comments around keys, etc. are more appropriate there. If I tie this together with my comments in the thread on SASL / Kerberos, what I am having a hard time figuring out is the pluggable framework for both authentication and authorization versus the implementation of specific authentication and authorization providers. As for caching decisions, it just seems silly to authorize the same operation over and over again (e.g. publishing to the same topic), but perhaps if the ACLs are small enough this will be ok. On Fri, Apr 24, 2015 at 2:18 PM, Parth Brahmbhatt pbrahmbh...@hortonworks.com wrote: Thanks for your comments Gari. My responses are inline. Thanks Parth On 4/24/15, 10:36 AM, Gari Singh gari.r.si...@gmail.com wrote: Sorry - fat fingered send ... Not sure if my newbie vote will count, but I think you are getting pretty close here. Couple of things: 1) I know the Session object is from a different JIRA, but I think that Session should take a Subject rather than just a single Principal. The reason for this is because a Subject can have multiple Principals (for example both a username and a group or perhaps someone would want to use both the username and the clientIP as Principals) I think the user - group mapping can be done at the Authorization implementation layer. In any case, as you pointed out, the session is part of another jira and I think a PR is out https://reviews.apache.org/r/27204/diff/ and we should discuss it on that PR. 2) We would then also have multiple concrete Principals, e.g.
KafkaPrincipal KafkaUserPrincipal KafkaGroupPrincipal (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal) etc This is important as eventually (hopefully sooner than later), we will support multiple types of authentication which may each want to populate the Subject with one or more Principals and perhaps even credentials (this could be used in the future to hold encryption keys or perhaps the raw info prior to authentication). So in this way, if we have different authentication modules, we can add different types of Principals by extension This also allows the same subject to have access to some resources based on username and some based on group. Given that with this we would have different types of Principals, I would then modify the ACL to look like: {"version":1, "acls":[ { "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"], "principals":["alice","kafka-devs"] ... or {"version":1, "acls":[ { "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-devs"] But in either case this allows for easy identification of the type of principal and makes it easy to plug in multiple kinds of principals The advantage of all of this is that it now provides more flexibility for custom modules for both authentication and authorization moving forward. All the principals that you listed above can be supported with the current design. Acls take a KafkaPrincipal as input, which is a combination of type and principal name, and the authorizer implementations are free to create any extension of this which covers group: groupName, host: HostName, kerberos: kerberosUserName and any other types that may come up. I am not sure how encryption key storage is relevant to the Authorizer, so it would be great if you can elaborate. 3) Are you sure that you want authorize to take a session object? If we use the model in 1) above, we could just populate the Subject with a KafkaClientAddressPrincipal and then have access to that when evaluating the ACLs.
I think it is better to take a session, which can just be a wrapper on top of Subject + host for now. This allows for extension, which in my opinion is more future-proof.

4) What about actually caching authorization decisions? I know ACLs will be cached, but the actual authorize decision can be expensive as well?

In the default implementation I don't plan to do this. It is easy to add later if we want to, but I am not sure why this would ever be expensive when ACLs are cached: the number of ACLs on a single topic should be very small, and iterating over them with simple string comparison should not really be expensive.

Thanks
Parth
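As a concrete illustration of the "type + principal name" KafkaPrincipal design Parth describes, here is a hedged sketch. The `Principal` class and `allowed` helper are hypothetical stand-ins, not the actual Kafka interfaces; it only shows why group- and host-based entries need new principal *types*, not interface changes.

```java
import java.util.List;
import java.util.Objects;

public class PrincipalSketch {
    // Illustrative stand-in for a type+name principal; NOT the Kafka class.
    static final class Principal {
        final String type;   // e.g. "user", "group", "host", "kerberos"
        final String name;   // e.g. "alice", "kafka-devs"
        Principal(String type, String name) { this.type = type; this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Principal
                && ((Principal) o).type.equals(type)
                && ((Principal) o).name.equals(name);
        }
        @Override public int hashCode() { return Objects.hash(type, name); }
        @Override public String toString() { return type + ":" + name; }
    }

    // An ACL entry lists the principals it grants access to. Supporting a
    // new kind of principal only means minting a new type string.
    static boolean allowed(List<Principal> acl, Principal requester) {
        return acl.contains(requester);
    }

    public static void main(String[] args) {
        List<Principal> acl = List.of(
            new Principal("user", "alice"),
            new Principal("group", "kafka-devs"));
        System.out.println(allowed(acl, new Principal("user", "alice")));  // true
        System.out.println(allowed(acl, new Principal("user", "bob")));    // false
    }
}
```

A real authorizer would also resolve the requester's groups before the check; that mapping can live in the authorizer implementation, as the thread suggests.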
[jira] [Issue Comment Deleted] (KAFKA-2106) Partition balance tool between borkers
[ https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenshangan updated KAFKA-2106: --- Comment: was deleted (was: Topics in the cluster can be divided into two categories: 1. nPartitions < nBrokersBeforeExpand 2. nPartitions >= nBrokersBeforeExpand. When adding new brokers into the cluster: in case 1, partitions should be reassigned and spread over as many brokers as possible; in case 2, calculate nPartitions on each broker and sort them, and move partitions from larger brokers to smaller ones. Finally, nPartitions will be almost evenly spread over all brokers.) Partition balance tool between borkers -- Key: KAFKA-2106 URL: https://issues.apache.org/jira/browse/KAFKA-2106 Project: Kafka Issue Type: New Feature Components: admin Affects Versions: 0.8.3 Reporter: chenshangan The default partition assignment algorithm can work well in a static Kafka cluster (the number of brokers seldom changes). In production, however, the number of brokers keeps increasing with business data. When new brokers are added to the cluster, it's better to provide a tool that can help move existing data to the new brokers. Currently, users need to choose topics or partitions manually and use the Reassign Partitions Tool (kafka-reassign-partitions.sh) to achieve the goal. That is a time-consuming task when there are a lot of topics in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2106) Partition balance tool between borkers
[ https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510614#comment-14510614 ] chenshangan commented on KAFKA-2106: When adding new brokers to the cluster, the existing replica assignment should be recomputed to get a better distribution. The basic thought is to reassign each topic across the new cluster while starting at its original broker starting index; the process is like rehashing the existing data.
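The "rehash" idea in the comment — round-robin the topic's partitions over the expanded broker list, starting at the topic's original starting index — could be sketched as follows. This is a minimal, leader-only illustration under assumed names; it is not the attached KAFKA-2106 patch, which must also handle replicas.

```java
import java.util.ArrayList;
import java.util.List;

public class RehashSketch {
    // Hypothetical round-robin assignment of partition leaders over an
    // expanded broker list, keeping the topic's original start index so
    // the reassignment behaves like rehashing the existing layout.
    static List<Integer> assign(int nPartitions, List<Integer> brokers, int startIndex) {
        List<Integer> leaders = new ArrayList<>();
        for (int p = 0; p < nPartitions; p++) {
            leaders.add(brokers.get((startIndex + p) % brokers.size()));
        }
        return leaders;
    }

    public static void main(String[] args) {
        // 6 partitions that originally lived on brokers [0,1,2], reassigned
        // over the expanded cluster [0..5]: each broker now holds one leader.
        System.out.println(assign(6, List.of(0, 1, 2, 3, 4, 5), 0));
        // -> [0, 1, 2, 3, 4, 5]
    }
}
```

Keeping the start index means a topic whose partitions began on broker 1 still begins on broker 1 after expansion, so only the overflow moves to the new brokers.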
[jira] [Updated] (KAFKA-2106) Partition balance tool between borkers
[ https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenshangan updated KAFKA-2106: --- Reviewer: Jun Rao Status: Patch Available (was: Open)
[jira] [Updated] (KAFKA-2106) Partition balance tool between borkers
[ https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenshangan updated KAFKA-2106: --- Attachment: KAFKA-2106.patch Implement basic rehash assignment algorithm
[jira] [Created] (KAFKA-2149) fix default InterBrokerProtocolVersion
Onur Karaman created KAFKA-2149: --- Summary: fix default InterBrokerProtocolVersion Key: KAFKA-2149 URL: https://issues.apache.org/jira/browse/KAFKA-2149 Project: Kafka Issue Type: Bug Reporter: Onur Karaman Assignee: Onur Karaman Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) when upgrading brokers to include KAFKA-1809. This specifically happened when an older broker received an UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set its inter.broker.protocol.version to 0.8.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
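The work-around implied by the report — explicitly pinning the wire protocol before every broker is upgraded — is a one-line broker setting. A server.properties fragment (0.8.2 is the version named in the issue):

```properties
# Pin inter-broker communication to the 0.8.2 wire protocol during a
# rolling upgrade, so patched brokers don't send newer-format requests
# (e.g. UpdateMetadataRequest) that un-upgraded brokers cannot parse.
inter.broker.protocol.version=0.8.2
```

Once all brokers are running the new code, the setting can be bumped (or removed) and the cluster rolled again.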
Review Request 33532: fix default InterBrokerProtocolVersion
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33532/ --- Review request for kafka. Bugs: KAFKA-2149 https://issues.apache.org/jira/browse/KAFKA-2149 Repository: kafka Description --- Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in KIP-2. We hit wire-protocol problems (BufferUnderflowException) with upgrading brokers to include KAFKA-1809. This specifically happened when an older broker receives a UpdateMetadataRequest from a controller with the patch and the controller didn't explicitly set their inter.broker.protocol.version to 0.8.2 Diffs - core/src/main/scala/kafka/server/KafkaConfig.scala cfbbd2be550947dd2b3c8c2cab981fa08fb6d859 core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 2428dbd7197a58cf4cad42ef82b385dab3a2b15e Diff: https://reviews.apache.org/r/33532/diff/ Testing --- Thanks, Onur Karaman
Re: Review Request 33532: fix default InterBrokerProtocolVersion
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33532/ --- (Updated April 24, 2015, 8:23 p.m.) Review request for kafka. Bugs: KAFKA-2149 https://issues.apache.org/jira/browse/KAFKA-2149 Repository: kafka Testing (updated) --- Before this patch: brought up a controller with KAFKA-1809 and then an older broker. Broker gets BufferUnderflowException from UpdateMetadataRequest. After this patch: brought up a controller with this patch and then an older broker. Broker no longer gets BufferUnderflowException. Thanks, Onur Karaman
[jira] [Updated] (KAFKA-2138) KafkaProducer does not honor the retry backoff time.
[ https://issues.apache.org/jira/browse/KAFKA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-2138: -- Resolution: Fixed Status: Resolved (was: Patch Available) Discussed offline with [~becket_qin] - KAFKA-2142 has been filed to do further improvements, including fixing a pre-existing bug where we may prematurely send (before the batch-full or linger-time thresholds). KafkaProducer does not honor the retry backoff time. Key: KAFKA-2138 URL: https://issues.apache.org/jira/browse/KAFKA-2138 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Priority: Critical Attachments: KAFKA-2138.patch, KAFKA-2138_2015-04-22_17:19:33.patch In KafkaProducer, we only check batch.lastAttemptMs in ready(), but we do not check it in drain(). The problem is that if we have two partitions on the same node, and partition 1 should back off while partition 2 should not, partition 1's backoff time is currently ignored. We should check lastAttemptMs in drain() as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
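The bug above amounts to a missing per-batch time check when batches are pulled for a node. A minimal sketch of the fix's shape — the `Batch` fields and `drain` helper are invented for illustration and are not the actual producer internals:

```java
import java.util.ArrayList;
import java.util.List;

public class DrainSketch {
    static final long RETRY_BACKOFF_MS = 100;

    // Illustrative stand-in for a producer record batch.
    static final class Batch {
        final String partition;
        final long lastAttemptMs;
        final int attempts;
        Batch(String partition, long lastAttemptMs, int attempts) {
            this.partition = partition;
            this.lastAttemptMs = lastAttemptMs;
            this.attempts = attempts;
        }
    }

    // Only drain batches whose backoff has elapsed. Without this check, a
    // backing-off partition gets drained early whenever another partition
    // on the same node is ready - the behavior the JIRA describes.
    static List<Batch> drain(List<Batch> node, long nowMs) {
        List<Batch> ready = new ArrayList<>();
        for (Batch b : node) {
            boolean backingOff = b.attempts > 0
                && b.lastAttemptMs + RETRY_BACKOFF_MS > nowMs;
            if (!backingOff) ready.add(b);
        }
        return ready;
    }

    public static void main(String[] args) {
        List<Batch> node = new ArrayList<>();
        node.add(new Batch("p1", 950, 1)); // retried at t=950, still backing off at t=1000
        node.add(new Batch("p2", 0, 0));   // never attempted
        System.out.println(drain(node, 1000).size()); // prints 1: only p2 drains
    }
}
```

The real producer performs the equivalent check in ready(); the fix applies the same predicate again at drain time.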
[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Onur Karaman updated KAFKA-2149: Status: Patch Available (was: Open)
[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511657#comment-14511657 ] Onur Karaman commented on KAFKA-2149: - Created reviewboard https://reviews.apache.org/r/33532/diff/ against branch origin/trunk
[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Onur Karaman updated KAFKA-2149: Attachment: KAFKA-2149.patch
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511660#comment-14511660 ] Onur Karaman commented on KAFKA-1809: - I hit problems upgrading to this patch because of the default InterBrokerProtocolVersion. Details and a patch are here: https://issues.apache.org/jira/browse/KAFKA-2149 Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Fix For: 0.8.3 Attachments: KAFKA-1809-DOCs.V1.patch, KAFKA-1809.patch, KAFKA-1809.v1.patch, KAFKA-1809.v2.patch, KAFKA-1809.v3.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch, KAFKA-1809_2015-01-06_11:46:22.patch, KAFKA-1809_2015-01-13_18:16:21.patch, KAFKA-1809_2015-01-28_10:26:22.patch, KAFKA-1809_2015-02-02_11:55:16.patch, KAFKA-1809_2015-02-03_10:45:31.patch, KAFKA-1809_2015-02-03_10:46:42.patch, KAFKA-1809_2015-02-03_10:52:36.patch, KAFKA-1809_2015-03-16_09:02:18.patch, KAFKA-1809_2015-03-16_09:40:49.patch, KAFKA-1809_2015-03-27_15:04:10.patch, KAFKA-1809_2015-04-02_15:12:30.patch, KAFKA-1809_2015-04-02_19:03:58.patch, KAFKA-1809_2015-04-04_22:00:13.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as host+port pair, and this definition exists throughout the code-base, therefore some refactoring is needed to support multiple ports for a single broker. 
The detailed design is here: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Build failed in Jenkins: Kafka-trunk #472
See https://builds.apache.org/job/Kafka-trunk/472/changes Changes: [jjkoshy] KAFKA-2138; Fix producer to honor retry backoff; reviewed by Joel Koshy and Guozhang Wang -- [...truncated 2156 lines...] kafka.log.OffsetIndexTest lookupExtremeCases PASSED kafka.log.OffsetIndexTest appendTooMany PASSED kafka.log.OffsetIndexTest appendOutOfOrder PASSED kafka.log.OffsetIndexTest testReopen PASSED kafka.log.OffsetMapTest testBasicValidation PASSED kafka.log.OffsetMapTest testClear PASSED kafka.log.LogManagerTest testCreateLog PASSED kafka.log.LogManagerTest testGetNonExistentLog PASSED kafka.log.LogManagerTest testCleanupExpiredSegments PASSED kafka.log.LogManagerTest testCleanupSegmentsToMaintainSize PASSED kafka.log.LogManagerTest testTimeBasedFlush PASSED kafka.log.LogManagerTest testLeastLoadedAssignment PASSED kafka.log.LogManagerTest testTwoLogManagersUsingSameDirFails PASSED kafka.log.LogManagerTest testCheckpointRecoveryPoints PASSED kafka.log.LogManagerTest testRecoveryDirectoryMappingWithTrailingSlash PASSED kafka.log.LogManagerTest testRecoveryDirectoryMappingWithRelativeDirectory PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[0] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[1] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[2] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[3] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[4] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[5] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[6] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[7] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[8] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[9] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[10] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[11] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[12] PASSED 
kafka.log.BrokerCompressionTest testBrokerSideCompression[13] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[14] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[15] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[16] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[17] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[18] PASSED kafka.log.BrokerCompressionTest testBrokerSideCompression[19] PASSED kafka.log.LogTest testTimeBasedLogRoll PASSED kafka.log.LogTest testTimeBasedLogRollJitter PASSED kafka.log.LogTest testSizeBasedLogRoll PASSED kafka.log.LogTest testLoadEmptyLog PASSED kafka.log.LogTest testAppendAndReadWithSequentialOffsets PASSED kafka.log.LogTest testAppendAndReadWithNonSequentialOffsets PASSED kafka.log.LogTest testReadAtLogGap PASSED kafka.log.LogTest testReadOutOfRange PASSED kafka.log.LogTest testLogRolls PASSED kafka.log.LogTest testCompressedMessages PASSED kafka.log.LogTest testThatGarbageCollectingSegmentsDoesntChangeOffset PASSED kafka.log.LogTest testMessageSetSizeCheck PASSED kafka.log.LogTest testCompactedTopicConstraints PASSED kafka.log.LogTest testMessageSizeCheck PASSED kafka.log.LogTest testLogRecoversToCorrectOffset PASSED kafka.log.LogTest testIndexRebuild PASSED kafka.log.LogTest testTruncateTo PASSED kafka.log.LogTest testIndexResizingAtTruncation PASSED kafka.log.LogTest testBogusIndexSegmentsAreRemoved PASSED kafka.log.LogTest testReopenThenTruncate PASSED kafka.log.LogTest testAsyncDelete PASSED kafka.log.LogTest testOpenDeletesObsoleteFiles PASSED kafka.log.LogTest testAppendMessageWithNullPayload PASSED kafka.log.LogTest testCorruptLog PASSED kafka.log.LogTest testCleanShutdownFile PASSED kafka.log.LogTest testParseTopicPartitionName PASSED kafka.log.LogTest testParseTopicPartitionNameForEmptyName PASSED kafka.log.LogTest testParseTopicPartitionNameForNull PASSED kafka.log.LogTest testParseTopicPartitionNameForMissingSeparator PASSED 
kafka.log.LogTest testParseTopicPartitionNameForMissingTopic PASSED kafka.log.LogTest testParseTopicPartitionNameForMissingPartition PASSED kafka.log.LogCleanerIntegrationTest cleanerTest PASSED kafka.log.CleanerTest testCleanSegments PASSED kafka.log.CleanerTest testCleaningWithDeletes PASSED kafka.log.CleanerTest testCleaningWithUnkeyedMessages PASSED kafka.log.CleanerTest testCleanSegmentsWithAbort PASSED kafka.log.CleanerTest testSegmentGrouping PASSED kafka.log.CleanerTest testSegmentGroupingWithSparseOffsets PASSED kafka.log.CleanerTest testBuildOffsetMap PASSED kafka.javaapi.consumer.ZookeeperConsumerConnectorTest testBasic PASSED kafka.javaapi.message.ByteBufferMessageSetTest testWrittenEqualsRead PASSED kafka.javaapi.message.ByteBufferMessageSetTest testIteratorIsConsistent PASSED
Re: [KIP-DISCUSSION] KIP-13 Quotas
Joel, What you suggested makes sense. Not sure if there is a strong need to throttle TMR though since it should be infrequent. Thanks, Jun On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy jjkosh...@gmail.com wrote: Given the caveats, it may be worth doing further investigation on the alternate approach which is to use a dedicated DelayQueue for requests that violate quota and compare pros/cons. So the approach is the following: all request handling occurs normally (i.e., unchanged from what we do today). i.e., purgatories will be unchanged. After handling a request and before sending the response, check if the request has violated a quota. If so, then enqueue the response into a DelayQueue. All responses can share the same DelayQueue. Send those responses out after the delay has been met. There are some benefits to doing this: - We will eventually want to quota other requests as well. The above seems to be a clean staged approach that should work uniformly for all requests. i.e., parse request - handle request normally - check quota - hold in delay queue if quota violated - respond . All requests can share the same DelayQueue. (In contrast with the current proposal we could end up with a bunch of purgatories, or a combination of purgatories and delay queues.) - Since this approach does not need any fundamental modifications to the current request handling, it addresses the caveats that Adi noted (which is holding producer requests/fetch requests longer than strictly necessary if quota is violated since the proposal was to not watch on keys in that case). Likewise it addresses the caveat that Guozhang noted (we may return no error if the request is held long enough due to quota violation and satisfy a producer request that may have in fact exceeded the ack timeout) although it is probably reasonable to hide this case from the user. 
- By avoiding the caveats it also avoids the suggested work-around to the caveats, which is effectively to add a min-hold-time to the purgatory. Although this is not a lot of code, I think it adds a quota-driven feature to the purgatory, which is already non-trivial and should ideally remain unassociated with quota enforcement. For this to work well we need to be sure that we don't hold a lot of data in the DelayQueue - and therein lies a quirk to this approach. Producer responses (and most other responses) are very small, so there is no issue. Fetch responses are fine as well, since we read off a FileMessageSet in the response (zero-copy). This will remain true even when we support SSL, since encryption occurs at the session layer (not the application layer). The topic metadata response can be a problem though. For this we ideally want to build the topic metadata response only when we are ready to respond. So for metadata-style responses which could contain large response objects, we may want to put the quota check and delay queue _before_ handling the request. So the design in this approach would need an amendment: provide a choice of where to put a request in the delay queue, either before handling or after handling (before response). So for:
- small request, large response: delay queue before handling
- large request, small response: delay queue after handling, before response
- small request, small response: either is fine
- large request, large response: we really cannot do anything here, but we don't really have this scenario yet

So the design would look like this:
- parse request
- before handling the request, check if the quota is violated; if so, compute two delay numbers: a before-handling delay and a before-response delay
- if before-handling delay > 0, insert into the before-handling delay queue
- handle the request
- if before-response delay > 0, insert into the before-response delay queue
- respond

Just throwing this out there for discussion.
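The hold-then-respond mechanism described above maps naturally onto `java.util.concurrent.DelayQueue`. The sketch below is illustrative only — `ThrottledResponse` and its fields are invented names, not Kafka code — and shows the core behavior: a response enqueued with a quota delay is not released until the delay elapses.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: holds a ready-to-send response until its quota
// delay has elapsed. All responses, regardless of request type, can
// share one queue - the uniformity the proposal highlights.
class ThrottledResponse implements Delayed {
    final String response;     // stand-in for the serialized response
    final long sendTimeNanos;  // earliest time we may send it

    ThrottledResponse(String response, long delayMs) {
        this.response = response;
        this.sendTimeNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
    }

    @Override public long getDelay(TimeUnit unit) {
        return unit.convert(sendTimeNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

public class QuotaDelayDemo {
    public static void main(String[] args) throws InterruptedException {
        DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
        // This response violated its quota: hold it for 50 ms.
        queue.put(new ThrottledResponse("produce-response-1", 50));
        // take() blocks until the delay expires, then releases the response.
        ThrottledResponse r = queue.take();
        System.out.println("sent " + r.response); // prints sent produce-response-1
    }
}
```

A broker applying this design would run a dedicated sender thread looping on `take()` and writing each released response to its socket; the before-handling variant for metadata-style requests would enqueue the parsed request the same way, before any response is built.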
Thanks, Joel On Thu, Apr 16, 2015 at 02:56:23PM -0700, Jun Rao wrote: The quota check for the fetch request is a bit different from the produce request. I assume that for the fetch request, we will first get an estimated fetch response size to do the quota check. There are two things to think about. First, when we actually send the response, we probably don't want to record the metric again since it will double count. Second, the bytes that the fetch response actually sends could be more than the estimate. This means that the metric may not be 100% accurate. We may be able to limit the fetch size of each partition to what's in the original estimate. For the produce request, I was thinking that another way to do this is to first figure out the quota_timeout. Then wait in Purgatory for quota_timeout with no key. If the request is not satisfied in quota_timeout and (request_timeout > quota_timeout), wait in Purgatory for (request_timeout - quota_timeout) with the original keys. Thanks, Jun On Tue, Apr 14, 2015 at 5:01
[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion
[ https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511790#comment-14511790 ] Gwen Shapira commented on KAFKA-2149: - Actually, defaulting to the latest version for the inter-broker protocol was a design decision, not a bug. The idea is that if the default were 0.8.2, we would have the following problems: 1. New installations would not have new features out-of-the-box; they'd need to change configuration. Making life easier for experienced admins upgrading vs. new users installing doesn't sound right. 2. We'd need to keep track of the default with every release. We do have the upgrade process in the docs: https://kafka.apache.org/083/documentation.html#upgrade