Re: Review Request 33088: add heartbeat to coordinator

2015-04-24 Thread Onur Karaman


> On April 22, 2015, 2:33 a.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala, line 32
> > 
> >
> > Move kafka imports above scala / external lib imports.

Sorry, I posted the rb and then realized I hadn't moved 
org.apache.kafka.common.protocol.Errors and 
org.apache.kafka.common.requests.JoinGroupRequest up in ConsumerCoordinator.
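
For reference, a minimal sketch of the ordering being asked for (the imports 
besides Errors and JoinGroupRequest are placeholders, not the file's actual 
import list):

import kafka.server.KafkaConfig
import kafka.utils.Logging

import org.apache.kafka.common.protocol.Errors
import org.apache.kafka.common.requests.JoinGroupRequest

import scala.collection.mutable

// kafka.* first, then external libs (org.apache.kafka.* comes from the
// clients jar), then scala.* -- i.e., kafka imports above scala / external
// lib imports.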


- Onur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33088/#review79762
---


On April 25, 2015, 5:46 a.m., Onur Karaman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33088/
> ---
> 
> (Updated April 25, 2015, 5:46 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1334
> https://issues.apache.org/jira/browse/KAFKA-1334
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> add heartbeat to coordinator
> 
> todo:
> - see how it performs under real load
> - add error code handling on the consumer side
> - implement the partition assignors
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  e55ab11df4db0b0084f841a74cbcf819caf780d5 
>   clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
> 36aa412404ff1458c7bef0feecaaa8bc45bed9c7 
>   core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
> 456b602245e111880e1b8b361319cabff38ee0e9 
>   core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 
> 2f5797064d4131ecfc9d2750d9345a9fa3972a9a 
>   core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 
> 6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 
>   core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala 
> df60cbc35d09937b4e9c737c67229889c69d8698 
>   core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 
> 8defa2e41c92f1ebe255177679d275c70dae5b3e 
>   core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION 
>   core/src/main/scala/kafka/coordinator/GroupRegistry.scala 
> 94ef5829b3a616c90018af1db7627bfe42e259e5 
>   core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 
> 821e26e97eaa97b5f4520474fff0fedbf406c82a 
>   core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/DelayedOperationKey.scala 
> b673e43b0ba401b2e22f27aef550e3ab0ef4323c 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> b4004aa3a1456d337199aa1245fb0ae61f6add46 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> c63f4ba9d622817ea8636d4e6135fba917ce085a 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 18680ce100f10035175cc0263ba7787ab0f6a17a 
>   core/src/test/scala/unit/kafka/coordinator/GroupTest.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33088/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Onur Karaman
> 
>



Re: Review Request 33088: add heartbeat to coordinator

2015-04-24 Thread Onur Karaman


> On April 22, 2015, 2:33 a.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala, line 123
> > 
> >
> > How about handleConsumerJoinGroup?

I agree with one of Jay's earlier comments from another rb that joinGroup and 
heartbeat are cleaner.


- Onur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33088/#review79762
---


On April 25, 2015, 5:46 a.m., Onur Karaman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33088/
> ---
> 
> (Updated April 25, 2015, 5:46 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1334
> https://issues.apache.org/jira/browse/KAFKA-1334
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> add heartbeat to coordinator
> 
> todo:
> - see how it performs under real load
> - add error code handling on the consumer side
> - implement the partition assignors
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  e55ab11df4db0b0084f841a74cbcf819caf780d5 
>   clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
> 36aa412404ff1458c7bef0feecaaa8bc45bed9c7 
>   core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
> 456b602245e111880e1b8b361319cabff38ee0e9 
>   core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 
> 2f5797064d4131ecfc9d2750d9345a9fa3972a9a 
>   core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 
> 6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 
>   core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala 
> df60cbc35d09937b4e9c737c67229889c69d8698 
>   core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 
> 8defa2e41c92f1ebe255177679d275c70dae5b3e 
>   core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION 
>   core/src/main/scala/kafka/coordinator/GroupRegistry.scala 
> 94ef5829b3a616c90018af1db7627bfe42e259e5 
>   core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 
> 821e26e97eaa97b5f4520474fff0fedbf406c82a 
>   core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/DelayedOperationKey.scala 
> b673e43b0ba401b2e22f27aef550e3ab0ef4323c 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> b4004aa3a1456d337199aa1245fb0ae61f6add46 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> c63f4ba9d622817ea8636d4e6135fba917ce085a 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 18680ce100f10035175cc0263ba7787ab0f6a17a 
>   core/src/test/scala/unit/kafka/coordinator/GroupTest.scala PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33088/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Onur Karaman
> 
>



[jira] [Commented] (KAFKA-1334) Add failure detection capability to the coordinator / consumer

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512277#comment-14512277
 ] 

Onur Karaman commented on KAFKA-1334:
-

Updated reviewboard https://reviews.apache.org/r/33088/diff/
 against branch origin/trunk

> Add failure detection capability to the coordinator / consumer
> --
>
> Key: KAFKA-1334
> URL: https://issues.apache.org/jira/browse/KAFKA-1334
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Onur Karaman
> Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch, 
> KAFKA-1334_2015-04-13_11:55:06.patch, KAFKA-1334_2015-04-13_11:58:53.patch, 
> KAFKA-1334_2015-04-18_10:16:23.patch, KAFKA-1334_2015-04-18_12:16:39.patch, 
> KAFKA-1334_2015-04-24_22:46:15.patch
>
>
> 1) Add coordinator discovery and failure detection to the consumer.
> 2) Add failure detection capability to the coordinator when group management 
> is used.
> This will not include any rebalancing logic, just the logic to detect 
> consumer failures using session.timeout.ms. 
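
For intuition, a minimal sketch of session.timeout.ms-based failure detection 
(illustrative only; not the patch's DelayedHeartbeat/HeartbeatBucket machinery):

// The coordinator records the last heartbeat per consumer and treats any
// consumer silent for longer than the session timeout as failed.
class HeartbeatTracker(sessionTimeoutMs: Long) {
  private val lastHeartbeat = scala.collection.mutable.Map.empty[String, Long]

  def onHeartbeat(consumerId: String, nowMs: Long): Unit =
    lastHeartbeat(consumerId) = nowMs

  def failedConsumers(nowMs: Long): Iterable[String] =
    lastHeartbeat.collect { case (id, t) if nowMs - t > sessionTimeoutMs => id }
}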



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33088: add heartbeat to coordinator

2015-04-24 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33088/
---

(Updated April 25, 2015, 5:46 a.m.)


Review request for kafka.


Bugs: KAFKA-1334
https://issues.apache.org/jira/browse/KAFKA-1334


Repository: kafka


Description
---

add heartbeat to coordinator

todo:
- see how it performs under real load
- add error code handling on the consumer side
- implement the partition assignors


Diffs (updated)
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 e55ab11df4db0b0084f841a74cbcf819caf780d5 
  clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
36aa412404ff1458c7bef0feecaaa8bc45bed9c7 
  core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
456b602245e111880e1b8b361319cabff38ee0e9 
  core/src/main/scala/kafka/coordinator/ConsumerRegistry.scala 
2f5797064d4131ecfc9d2750d9345a9fa3972a9a 
  core/src/main/scala/kafka/coordinator/DelayedHeartbeat.scala 
6a6bc7bc4ceb648b67332e789c2c33de88e4cd86 
  core/src/main/scala/kafka/coordinator/DelayedJoinGroup.scala 
df60cbc35d09937b4e9c737c67229889c69d8698 
  core/src/main/scala/kafka/coordinator/DelayedRebalance.scala 
8defa2e41c92f1ebe255177679d275c70dae5b3e 
  core/src/main/scala/kafka/coordinator/Group.scala PRE-CREATION 
  core/src/main/scala/kafka/coordinator/GroupRegistry.scala 
94ef5829b3a616c90018af1db7627bfe42e259e5 
  core/src/main/scala/kafka/coordinator/HeartbeatBucket.scala 
821e26e97eaa97b5f4520474fff0fedbf406c82a 
  core/src/main/scala/kafka/coordinator/PartitionAssignor.scala PRE-CREATION 
  core/src/main/scala/kafka/server/DelayedOperationKey.scala 
b673e43b0ba401b2e22f27aef550e3ab0ef4323c 
  core/src/main/scala/kafka/server/KafkaApis.scala 
b4004aa3a1456d337199aa1245fb0ae61f6add46 
  core/src/main/scala/kafka/server/KafkaServer.scala 
c63f4ba9d622817ea8636d4e6135fba917ce085a 
  core/src/main/scala/kafka/server/OffsetManager.scala 
18680ce100f10035175cc0263ba7787ab0f6a17a 
  core/src/test/scala/unit/kafka/coordinator/GroupTest.scala PRE-CREATION 

Diff: https://reviews.apache.org/r/33088/diff/


Testing
---


Thanks,

Onur Karaman



[jira] [Updated] (KAFKA-1334) Add failure detection capability to the coordinator / consumer

2015-04-24 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-1334:

Attachment: KAFKA-1334_2015-04-24_22:46:15.patch

> Add failure detection capability to the coordinator / consumer
> --
>
> Key: KAFKA-1334
> URL: https://issues.apache.org/jira/browse/KAFKA-1334
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.9.0
>Reporter: Neha Narkhede
>Assignee: Onur Karaman
> Attachments: KAFKA-1334.patch, KAFKA-1334_2015-04-11_22:47:27.patch, 
> KAFKA-1334_2015-04-13_11:55:06.patch, KAFKA-1334_2015-04-13_11:58:53.patch, 
> KAFKA-1334_2015-04-18_10:16:23.patch, KAFKA-1334_2015-04-18_12:16:39.patch, 
> KAFKA-1334_2015-04-24_22:46:15.patch
>
>
> 1) Add coordinator discovery and failure detection to the consumer.
> 2) Add failure detection capability to the coordinator when group management 
> is used.
> This will not include any rebalancing logic, just the logic to detect 
> consumer failures using session.timeout.ms. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33125: Add comment to timing fix

2015-04-24 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33125/#review81581
---



clients/src/test/java/org/apache/kafka/clients/MetadataTest.java


Do we need this sleep?


- Guozhang Wang


On April 13, 2015, 7:15 p.m., Rajini Sivaram wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33125/
> ---
> 
> (Updated April 13, 2015, 7:15 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2089
> https://issues.apache.org/jira/browse/KAFKA-2089
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Patch for KAFKA-2089: Fix timing issue in MetadataTest
> 
> 
> Diffs
> -
> 
>   clients/src/test/java/org/apache/kafka/clients/MetadataTest.java 
> 928087d29deb80655ca83726c1ebc45d76468c1f 
> 
> Diff: https://reviews.apache.org/r/33125/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajini Sivaram
> 
>



[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512134#comment-14512134
 ] 

Onur Karaman commented on KAFKA-1809:
-

As a follow-up: the KIP-2 wiki and the 0.8.3 documentation were inconsistent. 
This has already been resolved.

Upgrade steps are specified here:
http://kafka.apache.org/083/documentation.html#upgrade

> Refactor brokers to allow listening on multiple ports and IPs 
> --
>
> Key: KAFKA-1809
> URL: https://issues.apache.org/jira/browse/KAFKA-1809
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1809-DOCs.V1.patch, KAFKA-1809.patch, 
> KAFKA-1809.v1.patch, KAFKA-1809.v2.patch, KAFKA-1809.v3.patch, 
> KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, 
> KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, 
> KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, 
> KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, 
> KAFKA-1809_2015-01-05_21:40:14.patch, KAFKA-1809_2015-01-06_11:46:22.patch, 
> KAFKA-1809_2015-01-13_18:16:21.patch, KAFKA-1809_2015-01-28_10:26:22.patch, 
> KAFKA-1809_2015-02-02_11:55:16.patch, KAFKA-1809_2015-02-03_10:45:31.patch, 
> KAFKA-1809_2015-02-03_10:46:42.patch, KAFKA-1809_2015-02-03_10:52:36.patch, 
> KAFKA-1809_2015-03-16_09:02:18.patch, KAFKA-1809_2015-03-16_09:40:49.patch, 
> KAFKA-1809_2015-03-27_15:04:10.patch, KAFKA-1809_2015-04-02_15:12:30.patch, 
> KAFKA-1809_2015-04-02_19:03:58.patch, KAFKA-1809_2015-04-04_22:00:13.patch
>
>
> The goal is to eventually support different security mechanisms on different 
> ports. 
> Currently brokers are defined as host+port pair, and this definition exists 
> throughout the code-base, therefore some refactoring is needed to support 
> multiple ports for a single broker.
> The detailed design is here: 
> https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Joel Koshy
To turn it off/on, we can just add a clear config
(quota.enforcement.enabled) or similar.

Thanks,

Joel

On Fri, Apr 24, 2015 at 06:27:22PM -0400, Gari Singh wrote:
> If we can't disable it, then can we use the tried and true method of using
> "-1" to indicate that no throttling should take place?
> 
> On Tue, Apr 21, 2015 at 4:38 PM, Joel Koshy  wrote:
> 
> > In either approach I'm not sure we considered being able to turn it
> > off completely. IOW, no it is not a "plugin" if that's what you are
> > asking. We can set very high defaults, and in the absence of
> > any overrides it would effectively be off. The quota enforcement is
> > actually already part of the metrics package. The new code (that
> > exercises it) will be added to wherever the metrics are being
> > measured.
> >
> > Thanks,
> >
> > Joel
> >
> > On Tue, Apr 21, 2015 at 03:04:07PM -0400, Tong Li wrote:
> > >
> > > Joel,
> > >   Nice write up. A couple of questions; I'm not sure if they have been
> > > answered. Since we will have a call later today, I would like to ask here
> > > as well so that we can talk about them if they are not addressed in the
> > > email discussion.
> > >
> > >   1. Where the new code will be plugged in, that is, where is the
> > > plugin point, KafkaApi?
> > >   2. Can this quota control be disabled/enabled without affecting anything
> > > else? From the design wiki page, it seems to me that each request will at
> > > least pay a penalty of checking quota enablement.
> > >
> > > Thanks.
> > >
> > > Tong Li
> > > OpenStack & Kafka Community Development
> > > Building 501/B205
> > > liton...@us.ibm.com
> > >
> > >
> > >
> > > From: Joel Koshy 
> > > To:   dev@kafka.apache.org
> > > Date: 04/21/2015 01:22 PM
> > > Subject:  Re: [KIP-DISCUSSION] KIP-13 Quotas
> > >
> > >
> > >
> > > Given the caveats, it may be worth doing further investigation on the
> > > alternate approach which is to use a dedicated DelayQueue for requests
> > > that violate quota and compare pros/cons.
> > >
> > > So the approach is the following: all request handling occurs normally
> > > (i.e., unchanged from what we do today). i.e., purgatories will be
> > > unchanged.  After handling a request and before sending the response,
> > > check if the request has violated a quota. If so, then enqueue the
> > > response into a DelayQueue. All responses can share the same
> > > DelayQueue. Send those responses out after the delay has been met.
> > >
> > > There are some benefits to doing this:
> > >
> > > - We will eventually want to quota other requests as well. The above
> > >   seems to be a clean staged approach that should work uniformly for
> > >   all requests. i.e., parse request -> handle request normally ->
> > >   check quota -> hold in delay queue if quota violated -> respond .
> > >   All requests can share the same DelayQueue. (In contrast with the
> > >   current proposal we could end up with a bunch of purgatories, or a
> > >   combination of purgatories and delay queues.)
> > > - Since this approach does not need any fundamental modifications to
> > >   the current request handling, it addresses the caveats that Adi
> > >   noted (which is holding producer requests/fetch requests longer than
> > >   strictly necessary if quota is violated since the proposal was to
> > >   not watch on keys in that case). Likewise it addresses the caveat
> > >   that Guozhang noted (we may return no error if the request is held
> > >   long enough due to quota violation and satisfy a producer request
> > >   that may have in fact exceeded the ack timeout) although it is
> > >   probably reasonable to hide this case from the user.
> > > - By avoiding the caveats it also avoids the suggested work-around to
> > >   the caveats which is effectively to add a min-hold-time to the
> > >   purgatory. Although this is not a lot of code, I think it adds a
> > >   quota-driven feature to the purgatory which is already non-trivial
> > >   and should ideally remain unassociated with quota enforcement.
> > >
> > > For this to work well we need to be sure that we don't hold a lot of
> > > data in the DelayQueue - and therein lies a quirk to this approach.
> > > Producer responses (and most other responses) are very small so there
> > > is no issue. Fetch responses are fine as well - since we read off a
> > > FileMessageSet in response (zero-copy). This will remain true even
> > > when we support SSL since encryption occurs at the session layer (not
> > > the application layer).
> > >
> > > Topic metadata response can be a problem though. For this we ideally
> > > want to build the topic metadata response only when we are ready to
> > > respond. So for metadata-style responses which could contain large
> > > response objects we may want to put the quota check and delay queue
> > > _before_ handling the request. So the design in this approach would
> > > need an amendment: provide a choice of where to put a request in the
> > > delay queue: either before handling or afte

Re: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Joel Koshy
I think Jay meant a catch-all request/sec limit on all requests
per-client. That makes sense.
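
As a rough illustration of such a catch-all limit, a per-client counter over 
fixed one-second windows (the names are made up for the sketch; this is not 
the metrics package's quota implementation):

class RequestRateQuota(limitPerSec: Int) {
  // clientId -> (current one-second window, request count in that window)
  private val counts = scala.collection.mutable.Map.empty[String, (Long, Int)]

  def violated(clientId: String, nowMs: Long): Boolean = synchronized {
    val window = nowMs / 1000
    val (w, n) = counts.getOrElse(clientId, (window, 0))
    val updated = if (w == window) (window, n + 1) else (window, 1)
    counts(clientId) = updated
    updated._2 > limitPerSec
  }
}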

On Fri, Apr 24, 2015 at 11:02:29PM +, Aditya Auradkar wrote:
> I think Joel's suggestion is quite good. It's still possible to throttle 
> other types of requests using the purgatory, but we would need a separate 
> purgatory and DelayedOperation variants for different request types, or 
> perhaps add a ThrottledOperation type. It also addresses a couple of 
> special-case situations wrt delay time and replication timeouts. 
> 
> Jay, if we have a general mechanism of delaying requests then it should be 
> possible to throttle any type of request as long as we have metrics on a 
> per-client basis. For offset commit requests, we would simply need a request 
> rate metric per-client and a good default quota.
> 
> Thanks,
> Aditya
> 
> 
> From: Jay Kreps [jay.kr...@gmail.com]
> Sent: Friday, April 24, 2015 3:20 PM
> To: dev@kafka.apache.org
> Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas
> 
> Hey Jun/Joel,
> 
> Yeah we will definitely want to quota non-produce/consume requests.
> Especially offset commit and any other requests the consumer can trigger
> could easily get invoked in a tight loop by accident. We haven't talked
> about this a ton, but presumably the mechanism for all these would just be
> a general requests/sec limit that covers all requests?
> 
> -Jay
> 
> 
> On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao  wrote:
> 
> > Joel,
> >
> > What you suggested makes sense. Not sure if there is a strong need to
> > throttle TMR though since it should be infrequent.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy  wrote:
> >
> > > Given the caveats, it may be worth doing further investigation on the
> > > alternate approach which is to use a dedicated DelayQueue for requests
> > > that violate quota and compare pros/cons.
> > >
> > > So the approach is the following: all request handling occurs normally
> > > (i.e., unchanged from what we do today). i.e., purgatories will be
> > > unchanged.  After handling a request and before sending the response,
> > > check if the request has violated a quota. If so, then enqueue the
> > > response into a DelayQueue. All responses can share the same
> > > DelayQueue. Send those responses out after the delay has been met.
> > >
> > > There are some benefits to doing this:
> > >
> > > - We will eventually want to quota other requests as well. The above
> > >   seems to be a clean staged approach that should work uniformly for
> > >   all requests. i.e., parse request -> handle request normally ->
> > >   check quota -> hold in delay queue if quota violated -> respond .
> > >   All requests can share the same DelayQueue. (In contrast with the
> > >   current proposal we could end up with a bunch of purgatories, or a
> > >   combination of purgatories and delay queues.)
> > > - Since this approach does not need any fundamental modifications to
> > >   the current request handling, it addresses the caveats that Adi
> > >   noted (which is holding producer requests/fetch requests longer than
> > >   strictly necessary if quota is violated since the proposal was to
> > >   not watch on keys in that case). Likewise it addresses the caveat
> > >   that Guozhang noted (we may return no error if the request is held
> > >   long enough due to quota violation and satisfy a producer request
> > >   that may have in fact exceeded the ack timeout) although it is
> > >   probably reasonable to hide this case from the user.
> > > - By avoiding the caveats it also avoids the suggested work-around to
> > >   the caveats which is effectively to add a min-hold-time to the
> > >   purgatory. Although this is not a lot of code, I think it adds a
> > >   quota-driven feature to the purgatory which is already non-trivial
> > >   and should ideally remain unassociated with quota enforcement.
> > >
> > > For this to work well we need to be sure that we don't hold a lot of
> > > data in the DelayQueue - and therein lies a quirk to this approach.
> > > Producer responses (and most other responses) are very small so there
> > > is no issue. Fetch responses are fine as well - since we read off a
> > > FileMessageSet in response (zero-copy). This will remain true even
> > > when we support SSL since encryption occurs at the session layer (not
> > > the application layer).
> > >
> > > Topic metadata response can be a problem though. For this we ideally
> > > want to build the topic metadata response only when we are ready to
> > > respond. So for metadata-style responses which could contain large
> > > response objects we may want to put the quota check and delay queue
> > > _before_ handling the request. So the design in this approach would
> > > need an amendment: provide a choice of where to put a request in the
> > > delay queue: either before handling or after handling (before
> > > response). So for:
> > >
> > > small request, large response

[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module

2015-04-24 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512040#comment-14512040
 ] 

Ashish K Singh commented on KAFKA-2132:
---

Btw, looking more into it, I guess we are combining two issues here.

1. KafkaLog4jAppender is really a producer and does not belong in core. For a 
user to be able to use KafkaLog4jAppender, he/she has to pull in the entire 
Kafka core, which is definitely not required. This is what this JIRA is about.
2. Kafka, being a library, should not depend on log4j.

I think the solution for (1) is to move the KafkaLog4jAppender to clients. For 
(2), we might have to look into ways to completely get rid of log4j in Kafka 
core.
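
To see why the appender only needs the clients jar, here is a minimal sketch 
of a log4j 1.x appender wrapping the new producer (illustrative; not the 
actual KafkaLog4jAppender, and config wiring is elided):

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.log4j.AppenderSkeleton
import org.apache.log4j.spi.LoggingEvent

class SketchLog4jAppender(topic: String,
                          producer: KafkaProducer[Array[Byte], Array[Byte]])
    extends AppenderSkeleton {
  // Each log event becomes one produce request; nothing from core is needed.
  override def append(event: LoggingEvent): Unit =
    producer.send(new ProducerRecord[Array[Byte], Array[Byte]](
      topic, layout.format(event).getBytes("UTF-8")))

  override def close(): Unit = producer.close()
  override def requiresLayout(): Boolean = true
}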

> Move Log4J appender to clients module
> -
>
> Key: KAFKA-2132
> URL: https://issues.apache.org/jira/browse/KAFKA-2132
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gwen Shapira
>Assignee: Ashish K Singh
>
> Log4j appender is just a producer.
> Since we have a new producer in the clients module, there is no need to keep 
> the Log4J appender in "core" and force people to package all of Kafka with 
> their apps.
> Let's move the Log4jAppender to the clients module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


RE: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Aditya Auradkar
I think Joel's suggestion is quite good. It's still possible to throttle other 
types of requests using the purgatory, but we would need a separate purgatory 
and DelayedOperation variants for different request types, or perhaps add a 
ThrottledOperation type. It also addresses a couple of special-case situations 
wrt delay time and replication timeouts. 

Jay, if we have a general mechanism of delaying requests then it should be 
possible to throttle any type of request as long as we have metrics on a 
per-client basis. For offset commit requests, we would simply need a request 
rate metric per-client and a good default quota.

Thanks,
Aditya


From: Jay Kreps [jay.kr...@gmail.com]
Sent: Friday, April 24, 2015 3:20 PM
To: dev@kafka.apache.org
Subject: Re: [KIP-DISCUSSION] KIP-13 Quotas

Hey Jun/Joel,

Yeah we will definitely want to quota non-produce/consume requests.
Especially offset commit and any other requests the consumer can trigger
could easily get invoked in a tight loop by accident. We haven't talked
about this a ton, but presumably the mechanism for all these would just be
a general requests/sec limit that covers all requests?

-Jay


On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao  wrote:

> Joel,
>
> What you suggested makes sense. Not sure if there is a strong need to
> throttle TMR though since it should be infrequent.
>
> Thanks,
>
> Jun
>
> On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy  wrote:
>
> > Given the caveats, it may be worth doing further investigation on the
> > alternate approach which is to use a dedicated DelayQueue for requests
> > that violate quota and compare pros/cons.
> >
> > So the approach is the following: all request handling occurs normally
> > (i.e., unchanged from what we do today). i.e., purgatories will be
> > unchanged.  After handling a request and before sending the response,
> > check if the request has violated a quota. If so, then enqueue the
> > response into a DelayQueue. All responses can share the same
> > DelayQueue. Send those responses out after the delay has been met.
> >
> > There are some benefits to doing this:
> >
> > - We will eventually want to quota other requests as well. The above
> >   seems to be a clean staged approach that should work uniformly for
> >   all requests. i.e., parse request -> handle request normally ->
> >   check quota -> hold in delay queue if quota violated -> respond .
> >   All requests can share the same DelayQueue. (In contrast with the
> >   current proposal we could end up with a bunch of purgatories, or a
> >   combination of purgatories and delay queues.)
> > - Since this approach does not need any fundamental modifications to
> >   the current request handling, it addresses the caveats that Adi
> >   noted (which is holding producer requests/fetch requests longer than
> >   strictly necessary if quota is violated since the proposal was to
> >   not watch on keys in that case). Likewise it addresses the caveat
> >   that Guozhang noted (we may return no error if the request is held
> >   long enough due to quota violation and satisfy a producer request
> >   that may have in fact exceeded the ack timeout) although it is
> >   probably reasonable to hide this case from the user.
> > - By avoiding the caveats it also avoids the suggested work-around to
> >   the caveats which is effectively to add a min-hold-time to the
> >   purgatory. Although this is not a lot of code, I think it adds a
> >   quota-driven feature to the purgatory which is already non-trivial
> >   and should ideally remain unassociated with quota enforcement.
> >
> > For this to work well we need to be sure that we don't hold a lot of
> > data in the DelayQueue - and therein lies a quirk to this approach.
> > Producer responses (and most other responses) are very small so there
> > is no issue. Fetch responses are fine as well - since we read off a
> > FileMessageSet in response (zero-copy). This will remain true even
> > when we support SSL since encryption occurs at the session layer (not
> > the application layer).
> >
> > Topic metadata response can be a problem though. For this we ideally
> > want to build the topic metadata response only when we are ready to
> > respond. So for metadata-style responses which could contain large
> > response objects we may want to put the quota check and delay queue
> > _before_ handling the request. So the design in this approach would
> > need an amendment: provide a choice of where to put a request in the
> > delay queue: either before handling or after handling (before
> > response). So for:
> >
> > small request, large response: delay queue before handling
> > large request, small response: delay queue after handling, before
> response
> > small request, small response: either is fine
> > large request, large response: we really cannot do anything here but we
> > don't really have this scenario yet
> >
> > So the design would look like this:
> >
> > - parse request
> > - bef

[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-2149:

Resolution: Not A Problem
Status: Resolved  (was: Patch Available)

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511943#comment-14511943
 ] 

Onur Karaman commented on KAFKA-2149:
-

Cool thanks!

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511937#comment-14511937
 ] 

Gwen Shapira commented on KAFKA-2149:
-

Oops, the KIP actually contained both options and we never removed the one we 
didn't implement. Sorry about that. You probably noticed that the configuration 
name is wrong there too.

KIP-2 now has the correct version of the upgrade instructions, but I still 
recommend using the docs for reference. 

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Gari Singh
If we can't disable it, then can we use the tried and true method of using
"-1" to indicate that no throttling should take place?

On Tue, Apr 21, 2015 at 4:38 PM, Joel Koshy  wrote:

> In either approach I'm not sure we considered being able to turn it
> off completely. IOW, no it is not a "plugin" if that's what you are
> asking. We can set very high defaults, and in the absence of
> any overrides it would effectively be off. The quota enforcement is
> actually already part of the metrics package. The new code (that
> exercises it) will be added to wherever the metrics are being
> measured.
>
> Thanks,
>
> Joel
>
> On Tue, Apr 21, 2015 at 03:04:07PM -0400, Tong Li wrote:
> >
> > Joel,
> >   Nice write up. A couple of questions; I'm not sure if they have been
> > answered. Since we will have a call later today, I would like to ask here
> > as well so that we can talk about them if they are not addressed in the
> > email discussion.
> >
> >   1. Where the new code will be plugged in, that is, where is the
> > plugin point, KafkaApi?
> >   2. Can this quota control be disabled/enabled without affecting anything
> > else? From the design wiki page, it seems to me that each request will at
> > least pay a penalty of checking quota enablement.
> >
> > Thanks.
> >
> > Tong Li
> > OpenStack & Kafka Community Development
> > Building 501/B205
> > liton...@us.ibm.com
> >
> >
> >
> > From: Joel Koshy 
> > To:   dev@kafka.apache.org
> > Date: 04/21/2015 01:22 PM
> > Subject:  Re: [KIP-DISCUSSION] KIP-13 Quotas
> >
> >
> >
> > Given the caveats, it may be worth doing further investigation on the
> > alternate approach which is to use a dedicated DelayQueue for requests
> > that violate quota and compare pros/cons.
> >
> > So the approach is the following: all request handling occurs normally
> > (i.e., unchanged from what we do today). i.e., purgatories will be
> > unchanged.  After handling a request and before sending the response,
> > check if the request has violated a quota. If so, then enqueue the
> > response into a DelayQueue. All responses can share the same
> > DelayQueue. Send those responses out after the delay has been met.
> >
> > There are some benefits to doing this:
> >
> > - We will eventually want to quota other requests as well. The above
> >   seems to be a clean staged approach that should work uniformly for
> >   all requests. i.e., parse request -> handle request normally ->
> >   check quota -> hold in delay queue if quota violated -> respond .
> >   All requests can share the same DelayQueue. (In contrast with the
> >   current proposal we could end up with a bunch of purgatories, or a
> >   combination of purgatories and delay queues.)
> > - Since this approach does not need any fundamental modifications to
> >   the current request handling, it addresses the caveats that Adi
> >   noted (which is holding producer requests/fetch requests longer than
> >   strictly necessary if quota is violated since the proposal was to
> >   not watch on keys in that case). Likewise it addresses the caveat
> >   that Guozhang noted (we may return no error if the request is held
> >   long enough due to quota violation and satisfy a producer request
> >   that may have in fact exceeded the ack timeout) although it is
> >   probably reasonable to hide this case from the user.
> > - By avoiding the caveats it also avoids the suggested work-around to
> >   the caveats which is effectively to add a min-hold-time to the
> >   purgatory. Although this is not a lot of code, I think it adds a
> >   quota-driven feature to the purgatory which is already non-trivial
> >   and should ideally remain unassociated with quota enforcement.
> >
> > For this to work well we need to be sure that we don't hold a lot of
> > data in the DelayQueue - and therein lies a quirk to this approach.
> > Producer responses (and most other responses) are very small so there
> > is no issue. Fetch responses are fine as well - since we read off a
> > FileMessageSet in response (zero-copy). This will remain true even
> > when we support SSL since encryption occurs at the session layer (not
> > the application layer).
> >
> > Topic metadata response can be a problem though. For this we ideally
> > want to build the topic metadata response only when we are ready to
> > respond. So for metadata-style responses which could contain large
> > response objects we may want to put the quota check and delay queue
> > _before_ handling the request. So the design in this approach would
> > need an amendment: provide a choice of where to put a request in the
> > delay queue: either before handling or after handling (before
> > response). So for:
> >
> > small request, large response: delay queue before handling
> > large request, small response: delay queue after handling, before
> response
> > small request, small response: either is fine
> > large request, large response: we really cannot do anything here but we
> > don't really have this scenario yet
> >

Re: Review Request 27204: Patch for KAFKA-1683

2015-04-24 Thread Sriharsha Chintalapani


> On April 24, 2015, 7:07 p.m., Gari Singh wrote:
> > 1) I think that Session should take a Subject rather than just a single 
> > Principal. The reason is that a Subject can have multiple 
> > Principals (for example both a username and a group or perhaps someone 
> > would want to use both the username and the clientIP as Principals)
> > 
> > This is also more in line with JAAS as well and would fit better with 
> > authentication modules
> > 
> > 2)  We would then also have multiple concrete Principals, e.g.
> > 
> > KafkaPrincipal
> > KafkaUserPrincipal
> > KafkaGroupPrincipal
> > (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
> > etc
> > 
> > This is important as eventually (hopefully sooner rather than later), we will 
> > support multiple types of authentication which may each want to populate 
> > the Subject with one or more Principals and perhaps even credentials (this 
> > could be used in the future to hold encryption keys or perhaps the raw info 
> > prior to authentication).

I am not sure how the Subject is valid here. The client holds its own Subject 
and the server holds its own Subject. Once SASL auth is done, you get the 
client's authorization ID by calling saslServer.getAuthorizationID(), which 
gives you a String of the client's principal. Why would we associate a Subject 
rather than just a principal?
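
For readers following the thread, a sketch of the two shapes being debated 
(KafkaUserPrincipal/KafkaGroupPrincipal are Gari's examples; the Session types 
below are illustrative, not the actual RequestChannel.Session):

import java.security.Principal
import javax.security.auth.Subject

case class KafkaUserPrincipal(getName: String) extends Principal
case class KafkaGroupPrincipal(getName: String) extends Principal

// The current patch's shape: one principal per session.
case class PrincipalSession(principal: Principal)

// Gari's suggestion: a JAAS Subject that an authentication module can
// populate with several principals (user + group + client address) and,
// later, credentials.
case class SubjectSession(subject: Subject)

object SessionDemo extends App {
  val subject = new Subject()
  subject.getPrincipals.add(KafkaUserPrincipal("alice"))
  subject.getPrincipals.add(KafkaGroupPrincipal("admins"))
  println(SubjectSession(subject).subject.getPrincipals)
}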


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27204/#review81522
---


On Oct. 26, 2014, 5:37 a.m., Gwen Shapira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27204/
> ---
> 
> (Updated Oct. 26, 2014, 5:37 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1683
> https://issues.apache.org/jira/browse/KAFKA-1683
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> added test for Session
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/network/RequestChannel.scala 
> 4560d8fb7dbfe723085665e6fd611c295e07b69b 
>   core/src/main/scala/kafka/network/SocketServer.scala 
> cee76b323e5f3e4c783749ac9e78e1ef02897e3b 
>   core/src/test/scala/unit/kafka/network/SocketServerTest.scala 
> 5f4d85254c384dcc27a5a84f0836ea225d3a901a 
> 
> Diff: https://reviews.apache.org/r/27204/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Gwen Shapira
> 
>



Re: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Jay Kreps
Hey Jun/Joel,

Yeah we will definitely want to quota non-produce/consume requests.
Especially offset commit and any other requests the consumer can trigger
could easily get invoked in a tight loop by accident. We haven't talked
about this a ton, but presumably the mechanism for all these would just be
a general requests/sec limit that covers all requests?

-Jay


On Fri, Apr 24, 2015 at 2:18 PM, Jun Rao  wrote:

> Joel,
>
> What you suggested makes sense. Not sure if there is a strong need to
> throttle TMR though since it should be infrequent.
>
> Thanks,
>
> Jun
>
> On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy  wrote:
>
> > Given the caveats, it may be worth doing further investigation on the
> > alternate approach which is to use a dedicated DelayQueue for requests
> > that violate quota and compare pros/cons.
> >
> > So the approach is the following: all request handling occurs normally
> > (i.e., unchanged from what we do today). i.e., purgatories will be
> > unchanged.  After handling a request and before sending the response,
> > check if the request has violated a quota. If so, then enqueue the
> > response into a DelayQueue. All responses can share the same
> > DelayQueue. Send those responses out after the delay has been met.
> >
> > There are some benefits to doing this:
> >
> > - We will eventually want to quota other requests as well. The above
> >   seems to be a clean staged approach that should work uniformly for
> >   all requests. i.e., parse request -> handle request normally ->
> >   check quota -> hold in delay queue if quota violated -> respond .
> >   All requests can share the same DelayQueue. (In contrast with the
> >   current proposal we could end up with a bunch of purgatories, or a
> >   combination of purgatories and delay queues.)
> > - Since this approach does not need any fundamental modifications to
> >   the current request handling, it addresses the caveats that Adi
> >   noted (which is holding producer requests/fetch requests longer than
> >   strictly necessary if quota is violated since the proposal was to
> >   not watch on keys in that case). Likewise it addresses the caveat
> >   that Guozhang noted (we may return no error if the request is held
> >   long enough due to quota violation and satisfy a producer request
> >   that may have in fact exceeded the ack timeout) although it is
> >   probably reasonable to hide this case from the user.
> > - By avoiding the caveats it also avoids the suggested work-around to
> >   the caveats which is effectively to add a min-hold-time to the
> >   purgatory. Although this is not a lot of code, I think it adds a
> >   quota-driven feature to the purgatory which is already non-trivial
> >   and should ideally remain unassociated with quota enforcement.
> >
> > For this to work well we need to be sure that we don't hold a lot of
> > data in the DelayQueue - and therein lies a quirk to this approach.
> > Producer responses (and most other responses) are very small so there
> > is no issue. Fetch responses are fine as well - since we read off a
> > FileMessageSet in response (zero-copy). This will remain true even
> > when we support SSL since encryption occurs at the session layer (not
> > the application layer).
> >
> > Topic metadata response can be a problem though. For this we ideally
> > want to build the topic metadata response only when we are ready to
> > respond. So for metadata-style responses which could contain large
> > response objects we may want to put the quota check and delay queue
> > _before_ handling the request. So the design in this approach would
> > need an amendment: provide a choice of where to put a request in the
> > delay queue: either before handling or after handling (before
> > response). So for:
> >
> > small request, large response: delay queue before handling
> > large request, small response: delay queue after handling, before
> response
> > small request, small response: either is fine
> > large request, large resopnse: we really cannot do anything here but we
> > don't really have this scenario yet
> >
> > So the design would look like this:
> >
> > - parse request
> > - before handling request check if quota violated; if so compute two
> delay
> > numbers:
> >   - before handling delay
> >   - before response delay
> > - if before-handling delay > 0 insert into before-handling delay queue
> > - handle the request
> > - if before-response delay > 0 insert into before-response delay queue
> > - respond
> >
> > Just throwing this out there for discussion.
> >
> > Thanks,
> >
> > Joel
> >
> > On Thu, Apr 16, 2015 at 02:56:23PM -0700, Jun Rao wrote:
> > > The quota check for the fetch request is a bit different from the
> produce
> > > request. I assume that for the fetch request, we will first get an
> > > estimated fetch response size to do the quota check. There are two
> things
> > > to think about. First, when we actually send the response, we probably
> > > don't want to record the metri

[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511899#comment-14511899
 ] 

Onur Karaman commented on KAFKA-2149:
-

Okay, that sounds fair. I hadn't seen the upgrade docs you provided before. I 
had only read KIP-2, which included the line:

bq. The idea is that in 0.8.3, we will default wire.protocol.version to 0.8.2

Can you update the KIP?

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511790#comment-14511790
 ] 

Gwen Shapira commented on KAFKA-2149:
-

Actually, defaulting to the latest version for the inter-broker protocol was a 
design decision, not a bug.

The idea is that if the default is 0.8.2, we have the following problems:
1. New installations will not have new features out of the box; they'll need to 
change configuration. Making life easier for experienced admins upgrading vs. 
new users installing doesn't sound right.
2. We'll need to keep track of the default with every release

We do have the upgrade process in the docs: 
https://kafka.apache.org/083/documentation.html#upgrade



> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [KIP-DISCUSSION] KIP-13 Quotas

2015-04-24 Thread Jun Rao
Joel,

What you suggested makes sense. Not sure if there is a strong need to
throttle TMR though since it should be infrequent.

Thanks,

Jun

On Tue, Apr 21, 2015 at 12:21 PM, Joel Koshy  wrote:

> Given the caveats, it may be worth doing further investigation on the
> alternate approach which is to use a dedicated DelayQueue for requests
> that violate quota and compare pros/cons.
>
> So the approach is the following: all request handling occurs normally
> (i.e., unchanged from what we do today). i.e., purgatories will be
> unchanged.  After handling a request and before sending the response,
> check if the request has violated a quota. If so, then enqueue the
> response into a DelayQueue. All responses can share the same
> DelayQueue. Send those responses out after the delay has been met.
>
> There are some benefits to doing this:
>
> - We will eventually want to quota other requests as well. The above
>   seems to be a clean staged approach that should work uniformly for
>   all requests. i.e., parse request -> handle request normally ->
>   check quota -> hold in delay queue if quota violated -> respond .
>   All requests can share the same DelayQueue. (In contrast with the
>   current proposal we could end up with a bunch of purgatories, or a
>   combination of purgatories and delay queues.)
> - Since this approach does not need any fundamental modifications to
>   the current request handling, it addresses the caveats that Adi
>   noted (which is holding producer requests/fetch requests longer than
>   strictly necessary if quota is violated since the proposal was to
>   not watch on keys in that case). Likewise it addresses the caveat
>   that Guozhang noted (we may return no error if the request is held
>   long enough due to quota violation and satisfy a producer request
>   that may have in fact exceeded the ack timeout) although it is
>   probably reasonable to hide this case from the user.
> - By avoiding the caveats it also avoids the suggested work-around to
>   the caveats which is effectively to add a min-hold-time to the
>   purgatory. Although this is not a lot of code, I think it adds a
>   quota-driven feature to the purgatory which is already non-trivial
>   and should ideally remain unassociated with quota enforcement.
>
> For this to work well we need to be sure that we don't hold a lot of
> data in the DelayQueue - and therein lies a quirk to this approach.
> Producer responses (and most other responses) are very small so there
> is no issue. Fetch responses are fine as well - since we read off a
> FileMessageSet in response (zero-copy). This will remain true even
> when we support SSL since encryption occurs at the session layer (not
> the application layer).
>
> Topic metadata response can be a problem though. For this we ideally
> want to build the topic metadata response only when we are ready to
> respond. So for metadata-style responses which could contain large
> response objects we may want to put the quota check and delay queue
> _before_ handling the request. So the design in this approach would
> need an amendment: provide a choice of where to put a request in the
> delay queue: either before handling or after handling (before
> response). So for:
>
> small request, large response: delay queue before handling
> large request, small response: delay queue after handling, before response
> small request, small response: either is fine
> large request, large response: we really cannot do anything here but we
> don't really have this scenario yet
>
> So the design would look like this:
>
> - parse request
> - before handling request check if quota violated; if so compute two delay
> numbers:
>   - before handling delay
>   - before response delay
> - if before-handling delay > 0 insert into before-handling delay queue
> - handle the request
> - if before-response delay > 0 insert into before-response delay queue
> - respond
>
> Just throwing this out there for discussion.
>
> Thanks,
>
> Joel
>
> On Thu, Apr 16, 2015 at 02:56:23PM -0700, Jun Rao wrote:
> > The quota check for the fetch request is a bit different from the produce
> > request. I assume that for the fetch request, we will first get an
> > estimated fetch response size to do the quota check. There are two things
> > to think about. First, when we actually send the response, we probably
> > don't want to record the metric again since it will double count. Second,
> > the bytes that the fetch response actually sends could be more than the
> > estimate. This means that the metric may not be 100% accurate. We may be
> > able to limit the fetch size of each partition to what's in the original
> > estimate.
> >
> > For the produce request, I was thinking that another way to do this is to
> > first figure out the quota_timeout. Then wait in Purgatory for
> > quota_timeout with no key. If the request is not satisfied in
> quota_timeout
> > and (request_timeout > quota_timeout), wait in Purgatory for
> > (request_timeout 
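
For readers following the thread, a minimal sketch of the shared-DelayQueue 
staging Joel describes (DelayedResponse, ThrottledResponder and sendResponse 
are illustrative assumptions, not Kafka classes):

import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

case class DelayedResponse(response: String, dueNanos: Long) extends Delayed {
  override def getDelay(unit: TimeUnit): Long =
    unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS)
  override def compareTo(other: Delayed): Int =
    java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS),
                           other.getDelay(TimeUnit.NANOSECONDS))
}

object ThrottledResponder {
  private val queue = new DelayQueue[DelayedResponse]()

  // After handling a request: respond immediately, or park the response
  // until the quota-computed delay elapses. All request types share one queue.
  def respond(response: String, delayMs: Long): Unit =
    if (delayMs <= 0) sendResponse(response)
    else queue.put(DelayedResponse(response, System.nanoTime() + delayMs * 1000000L))

  // A single sender thread drains responses whose delay has expired.
  def senderLoop(): Unit = while (true) sendResponse(queue.take().response)

  private def sendResponse(response: String): Unit = println(s"sending $response")
}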

Build failed in Jenkins: Kafka-trunk #472

2015-04-24 Thread Apache Jenkins Server
See 

Changes:

[jjkoshy] KAFKA-2138; Fix producer to honor retry backoff; reviewed by Joel 
Koshy and Guozhang Wang

--
[...truncated 2156 lines...]

[...test output truncated: the kafka.log.* suites (OffsetIndexTest, 
OffsetMapTest, LogManagerTest, BrokerCompressionTest, LogTest, 
LogCleanerIntegrationTest, CleanerTest) and the kafka.javaapi.* suites all 
PASSED before the log was cut off...]

Re: Review Request 33532: fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33532/
---

(Updated April 24, 2015, 8:23 p.m.)


Review request for kafka.


Bugs: KAFKA-2149
https://issues.apache.org/jira/browse/KAFKA-2149


Repository: kafka


Description
---

Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
KIP-2.

We hit wire-protocol problems (BufferUnderflowException) when upgrading brokers 
to include KAFKA-1809. This specifically happened when an older broker received 
an UpdateMetadataRequest from a controller with the patch and the controller 
didn't explicitly set its inter.broker.protocol.version to 0.8.2.
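For readers following along, the shape of the fix is a one-line default change in KafkaConfig.scala. The snippet below is an editorial sketch only; the identifier names and the commented-out "before" line are illustrative, not the actual diff.

{code}
object Defaults {
  // Before (problematic): brokers defaulted to the newest wire format, which
  // an un-upgraded broker cannot parse during a rolling upgrade.
  // val InterBrokerProtocolVersion = ApiVersion.latestVersion.toString
  // After: default to 0.8.2 per KIP-2, so operators opt in to the new format
  // only once the whole cluster runs the new code.
  val InterBrokerProtocolVersion = "0.8.2"
}
{code}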


Diffs
-

  core/src/main/scala/kafka/server/KafkaConfig.scala 
cfbbd2be550947dd2b3c8c2cab981fa08fb6d859 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
2428dbd7197a58cf4cad42ef82b385dab3a2b15e 

Diff: https://reviews.apache.org/r/33532/diff/


Testing (updated)
---

Before this patch: brought up a controller with KAFKA-1809 and then an older 
broker. Broker gets BufferUnderflowException from UpdateMetadataRequest.

After this patch: brought up a controller with this patch and then an older 
broker. Broker no longer gets BufferUnderflowException.


Thanks,

Onur Karaman



[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511660#comment-14511660
 ] 

Onur Karaman commented on KAFKA-1809:
-

I hit problems upgrading to this patch because of the default 
InterBrokerProtocolVersion. Details and a patch are here: 
https://issues.apache.org/jira/browse/KAFKA-2149

> Refactor brokers to allow listening on multiple ports and IPs 
> --
>
> Key: KAFKA-1809
> URL: https://issues.apache.org/jira/browse/KAFKA-1809
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1809-DOCs.V1.patch, KAFKA-1809.patch, 
> KAFKA-1809.v1.patch, KAFKA-1809.v2.patch, KAFKA-1809.v3.patch, 
> KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, 
> KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, 
> KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, 
> KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, 
> KAFKA-1809_2015-01-05_21:40:14.patch, KAFKA-1809_2015-01-06_11:46:22.patch, 
> KAFKA-1809_2015-01-13_18:16:21.patch, KAFKA-1809_2015-01-28_10:26:22.patch, 
> KAFKA-1809_2015-02-02_11:55:16.patch, KAFKA-1809_2015-02-03_10:45:31.patch, 
> KAFKA-1809_2015-02-03_10:46:42.patch, KAFKA-1809_2015-02-03_10:52:36.patch, 
> KAFKA-1809_2015-03-16_09:02:18.patch, KAFKA-1809_2015-03-16_09:40:49.patch, 
> KAFKA-1809_2015-03-27_15:04:10.patch, KAFKA-1809_2015-04-02_15:12:30.patch, 
> KAFKA-1809_2015-04-02_19:03:58.patch, KAFKA-1809_2015-04-04_22:00:13.patch
>
>
> The goal is to eventually support different security mechanisms on different 
> ports. 
> Currently brokers are defined as host+port pair, and this definition exists 
> throughout the code-base, therefore some refactoring is needed to support 
> multiple ports for a single broker.
> The detailed design is here: 
> https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers





[jira] [Commented] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511657#comment-14511657
 ] 

Onur Karaman commented on KAFKA-2149:
-

Created reviewboard https://reviews.apache.org/r/33532/diff/
 against branch origin/trunk

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.





[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-2149:

Attachment: KAFKA-2149.patch

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.





[jira] [Updated] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-2149:

Status: Patch Available  (was: Open)

> fix default InterBrokerProtocolVersion
> --
>
> Key: KAFKA-2149
> URL: https://issues.apache.org/jira/browse/KAFKA-2149
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2149.patch
>
>
> Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
> KIP-2.
> We hit wire-protocol problems (BufferUnderflowException) when upgrading 
> brokers to include KAFKA-1809. This specifically happened when an older 
> broker received an UpdateMetadataRequest from a controller with the patch and 
> the controller didn't explicitly set its inter.broker.protocol.version to 
> 0.8.2.





[jira] [Created] (KAFKA-2149) fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-2149:
---

 Summary: fix default InterBrokerProtocolVersion
 Key: KAFKA-2149
 URL: https://issues.apache.org/jira/browse/KAFKA-2149
 Project: Kafka
  Issue Type: Bug
Reporter: Onur Karaman
Assignee: Onur Karaman


Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
KIP-2.

We hit wire-protocol problems (BufferUnderflowException) when upgrading brokers 
to include KAFKA-1809. This specifically happened when an older broker received 
an UpdateMetadataRequest from a controller with the patch and the controller 
didn't explicitly set its inter.broker.protocol.version to 0.8.2.





Review Request 33532: fix default InterBrokerProtocolVersion

2015-04-24 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33532/
---

Review request for kafka.


Bugs: KAFKA-2149
https://issues.apache.org/jira/browse/KAFKA-2149


Repository: kafka


Description
---

Fix the default InterBrokerProtocolVersion to be KAFKA_082 as specified in 
KIP-2.

We hit wire-protocol problems (BufferUnderflowException) when upgrading brokers 
to include KAFKA-1809. This specifically happened when an older broker received 
an UpdateMetadataRequest from a controller with the patch and the controller 
didn't explicitly set its inter.broker.protocol.version to 0.8.2.


Diffs
-

  core/src/main/scala/kafka/server/KafkaConfig.scala 
cfbbd2be550947dd2b3c8c2cab981fa08fb6d859 
  core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 
2428dbd7197a58cf4cad42ef82b385dab3a2b15e 

Diff: https://reviews.apache.org/r/33532/diff/


Testing
---


Thanks,

Onur Karaman



[jira] [Updated] (KAFKA-2138) KafkaProducer does not honor the retry backoff time.

2015-04-24 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-2138:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Discussed offline with [~becket_qin] - KAFKA-2142 has been filed to do further 
improvements including fixing a pre-existing bug where we may prematurely send 
(before batch-full or linger time thresholds).

> KafkaProducer does not honor the retry backoff time.
> 
>
> Key: KAFKA-2138
> URL: https://issues.apache.org/jira/browse/KAFKA-2138
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>Priority: Critical
> Attachments: KAFKA-2138.patch, KAFKA-2138_2015-04-22_17:19:33.patch
>
>
> In KafkaProducer, we only check batch.lastAttemptMs in ready(), but we are 
> not checking it in drain().
> The problem is that if we have two partitions on the same node, where 
> partition 1 should back off while partition 2 should not, partition 1's 
> backoff time is currently ignored.
> We should check lastAttemptMs in drain() as well.
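For illustration, a Scala sketch of the drain()-side check the ticket asks for. RecordBatch here is a simplified stand-in for the producer's internal batch type, not the real class.

{code}
case class RecordBatch(lastAttemptMs: Long, attempts: Int)

object DrainBackoffSketch {
  // A batch that has already failed must not be drained until its retry
  // backoff has expired.
  def backingOff(b: RecordBatch, retryBackoffMs: Long, nowMs: Long): Boolean =
    b.attempts > 0 && b.lastAttemptMs + retryBackoffMs > nowMs

  // drain() should skip such batches, mirroring the existing check in ready(),
  // so one partition's backoff is honored even when another partition on the
  // same node is sendable.
  def drainable(batches: Seq[RecordBatch], retryBackoffMs: Long, nowMs: Long): Seq[RecordBatch] =
    batches.filterNot(b => backingOff(b, retryBackoffMs, nowMs))
}
{code}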





Re: Review Request 27204: Patch for KAFKA-1683

2015-04-24 Thread Gari Singh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27204/#review81522
---


1) I think that Session should take a Subject rather than just a single 
Principal.  The reason for this is because a Subject can have multiple 
Principals (for example both a username and a group or perhaps someone would 
want to use both the username and the clientIP as Principals)

This is also more in line with JAAS and would fit better with 
authentication modules.

2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important as eventually (hopefully sooner than later), we will support 
multiple types of authentication which may each want to populate the Subject 
with one or more Principals and perhaps even credentials (this could be used in 
the future to hold encryption keys or perhaps the raw info prior to 
authentication).
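For concreteness, an editorial Scala sketch of the Subject-based Session described above; KafkaUserPrincipal and KafkaGroupPrincipal are illustrative names, not classes that exist in the patch.

{code}
import java.net.InetAddress
import java.security.Principal
import javax.security.auth.Subject

case class KafkaUserPrincipal(name: String) extends Principal {
  override def getName: String = name
}
case class KafkaGroupPrincipal(name: String) extends Principal {
  override def getName: String = name
}

// A Subject can carry several Principals at once (user + group + client IP),
// so an authorizer can match on whichever principal types it understands.
case class Session(subject: Subject, clientAddress: InetAddress)
{code}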

- Gari Singh


On Oct. 26, 2014, 5:37 a.m., Gwen Shapira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27204/
> ---
> 
> (Updated Oct. 26, 2014, 5:37 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1683
> https://issues.apache.org/jira/browse/KAFKA-1683
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> added test for Session
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/network/RequestChannel.scala 
> 4560d8fb7dbfe723085665e6fd611c295e07b69b 
>   core/src/main/scala/kafka/network/SocketServer.scala 
> cee76b323e5f3e4c783749ac9e78e1ef02897e3b 
>   core/src/test/scala/unit/kafka/network/SocketServerTest.scala 
> 5f4d85254c384dcc27a5a84f0836ea225d3a901a 
> 
> Diff: https://reviews.apache.org/r/27204/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Gwen Shapira
> 
>



Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
I will move the comments about subject versus principal wrt session to the
PR above.  The comments around keys, etc are more appropriate there.

If I tie this together with my comments in the thread on SASL / Kerberos,
what I am having a hard time figuring out is the pluggable framework for
both authentication and authorization versus the implementation of specific
authentication and authorization providers.

As for caching decisions, it just seems silly to authorize on the same
operation over and over again (e.g. publishing to the same topic), but
perhaps if the ACLs are small enough this will be ok.



On Fri, Apr 24, 2015 at 2:18 PM, Parth Brahmbhatt <
pbrahmbh...@hortonworks.com> wrote:

> Thanks for your comments Gari. My responses are inline.
>
> Thanks
> Parth
>
> On 4/24/15, 10:36 AM, "Gari Singh"  wrote:
>
> >Sorry - fat fingered send ...
> >
> >
> >Not sure if my "newbie" vote will count, but I think you are getting
> >pretty
> >close here.
> >
> >Couple of things:
> >
> >1) I know the Session object is from a different JIRA, but I think that
> >Session should take a Subject rather than just a single Principal.  The
> >reason for this is because a Subject can have multiple Principals (for
> >example both a username and a group or perhaps someone would want to use
> >both the username and the clientIP as Principals)
>
> I think the user -> group mapping can be done at Authorization
> implementation layer. In any case as you pointed out the session is part
> of another jira and I think a PR is out
> https://reviews.apache.org/r/27204/diff/ and we should discuss it on that
> PR.
>
> >
> >2)  We would then also have multiple concrete Principals, e.g.
> >
> >KafkaPrincipal
> >KafkaUserPrincipal
> >KafkaGroupPrincipal
> >(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
> >etc
> >
> >This is important as eventually (hopefully sooner than later), we will
> >support multiple types of authentication which may each want to populate
> >the Subject with one or more Principals and perhaps even credentials (this
> >could be used in the future to hold encryption keys or perhaps the raw
> >info
> >prior to authentication).
> >
> >So in this way, if we have different authentication modules, we can add
> >different types of Principals by extension
> >
> >This also allows the same subject to have access to some resources based
> >on
> >username and some based on group.
> >
> >Given that with this we would have different types of Principals, I would
> >then modify the ACL to look like:
> >
> >{"version":1,
> >  {"acls":[
> >{
> >  "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"],
> >  "principals":["alice","kafka-devs"]
> >  ...
> >
> >or
> >
> >{"version":1,
> >  {"acls":[
> >{
> >  "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-
> >devs"]
> >
> >
> >But in either case this allows for easy identification of the type of
> >principal and makes it easy to plug in multiple kinds of principals
> >
> >The advantage of all of this is that it now provides more flexibility for
> >custom modules for both authentication and authorization moving forward.
>
> All the principals that you listed above can be supported with the current
> design. Acls take a KafkaPrincipal as input, which is a combination of type
> and principal name, and the authorizer implementations are free to create
> any extension of this which covers group: groupName, host: HostName,
> kerberos: kerberosUserName and any other types that may come up. I am not
> sure how encryption key storage is relevant to the Authorizer, so it would be
> great if you can elaborate.
>
> >
> >3) Are you sure that you want "authorize" to take a "session" object?  If
> >we use the model in one above, we could just populate the Subject with a
> >KafkaClientAddressPrincipal and then have access to that when evaluating the
> >ACLs.
>
> I think it is better to take a session which can just be a wrapper on top
> of Subject + host for now. This allows for extension, which in my opinion
> is more "future requirement" proof.
>
> >
> >4) What about actually caching authorization decisions?  I know ACLs will
> >be cached, but the actual authorize decision can be expensive as well?
>
> In the default implementation I don’t plan to do this. Easy to add later if
> we want to, but I am not sure why this would ever be expensive when acls
> are cached, the number of acls on a single topic should be very small, and
> iterating over them with simple string comparison should not really be
> expensive.
>
> Thanks
> Parth
>
> >
> >On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh 
> >wrote:
> >
> >> Not sure if my "newbie" vote will count, but I think you are getting
> >> pretty close here.
> >>
> >> Couple of things:
> >>
> >> 1) I know the Session object is from a different JIRA, but I think that
> >> Session should take a Subject rather than just a single Principal.  The
> >> reason for this is because a Su

[jira] [Commented] (KAFKA-2132) Move Log4J appender to clients module

2015-04-24 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511517#comment-14511517
 ] 

Ashish K Singh commented on KAFKA-2132:
---

[~jkreps], [~charmalloc] thanks for the inputs here. We all agree that we 
should not have log4j as a dependency. As [~jkreps] pointed out, moving 
Log4jAppender to the admin tools package will again bring us back to the original 
problem. I am more inclined towards having it in a separate package. However, I 
will wait for [~charmalloc] to reply with his thoughts before submitting a 
patch.

> Move Log4J appender to clients module
> -
>
> Key: KAFKA-2132
> URL: https://issues.apache.org/jira/browse/KAFKA-2132
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gwen Shapira
>Assignee: Ashish K Singh
>
> Log4j appender is just a producer.
> Since we have a new producer in the clients module, no need to keep Log4J 
> appender in "core" and force people to package all of Kafka with their apps.
> Let's move the Log4jAppender to the clients module.





Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Sorry Gwen, completely misunderstood the question :-).

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
Yes in the current proposal. I did not see an API to create a group, but if you
have READ permission on a TOPIC and WRITE permission on that Group you
are free to join and consume.
 

* Will the CLI tool be used to manage group membership too?
Yes and I think that means I need to add --group. Updating the KIP. Thanks
for pointing this out.

* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?
I have considered any auto delete and auto create as out of scope for the
first release, so right now I was going with preserving the acls. Do you
see any issues with this? Auto deleting would mean the authorizer will now
have to get into implementation details of Kafka, which I was trying to
avoid.

Thanks
Parth

On 4/24/15, 11:33 AM, "Gwen Shapira"  wrote:

>We are not talking about same Groups :)
>
>I meant, Groups of consumers (which KIP-11 lists as a separate
>resource in the Privilege table)
>
>On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
> wrote:
>> I see Groups as something we can add incrementally in the current model.
>> The acls take principalType: name so groups can be represented as group:
>> groupName. We are not managing group memberships anywhere in kafka and I
>> don’t see the need to do so.
>>
>> So for a topic1 using the CLI an admin can add an acl to grant access to
>> group:kafka-test-users.
>>
>> The authorizer implementation can have a plugin to map the authenticated user
>> to groups (this is how Hadoop and Storm work). The plugin could be
>> mapping the user to linux/ldap/active directory groups, but that is again up to
>> the implementation.
>>
>> What we are offering is an interface that is extensible so these
>>features
>> can be added incrementally. I can add support for this in the first
>> release but don’t necessarily see why this would be an absolute necessity.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 11:00 AM, "Gwen Shapira"  wrote:
>>
>>>Thanks.
>>>
>>>One more thing I'm missing in the KIP is details on the Group resource
>>>(I think we discussed this and it was just not fully updated):
>>>
>>>* Does everyone have the privilege to create a new Group and use it to
>>>consume from Topics he's already privileged on?
>>>* Will the CLI tool be used to manage group membership too?
>>>* Groups are kind of ephemeral, right? If all consumers in the group
>>>disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
>>>treat the new group as completely new resource? Can we create ACLs
>>>before the group exists, in anticipation of it getting created?
>>>
>>>It's all small details, but it will be difficult to implement KIP-11
>>>without knowing the answers :)
>>>
>>>Gwen
>>>
>>>
>>>On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
>>> wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:

>Sample ACL JSON and Zookeeper is in public API, but I thought it is
>part of DefaultAuthorizer (Since Sentry and Argus won't be using
>Zookeeper).
>Am I wrong? Or is it the KIP?
>
>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
> wrote:
>> Thanks for clarifying Gwen, KIP updated.
>>
>> I tried to make the distinction by creating a section for all public
>>APIs
>>
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizat
>>io
>>n+
>>In
>> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>>
>> Let me know if you think there is a better way to reflect this.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>>
>>>+1 (non-binding)
>>>
>>>Two nitpicks for the wiki:
>>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>>>sure new consumers need it to be part of a consumer group.
>>>* Can you clearly separate which parts are the API (common to every
>>>Authorizer) and which parts are DefaultAuthorizer implementation? It
>>>will make reviews and Authorizer implementations a bit easier to
>>>know
>>>exactly which is which.
>>>
>>>Gwen
>>>
>>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>>> wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, "Parth Brahmbhatt"

 wrote:

>Hi Jeff,
>
>Thanks a lot for the review. I think you have a valid point about
>acls
>being duplicated and the simplest solution would be 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Sorry for the confusion. I'm not sure my last email is clear enough either...

Consumers will have a Principal which may belong to a group.
But the consumer configuration also has a group.id, which controls how
partitions are shared between consumers and how offsets are committed.
I'm talking about those Groups.


On Fri, Apr 24, 2015 at 11:33 AM, Gwen Shapira  wrote:
> We are not talking about same Groups :)
>
> I meant, Groups of consumers (which KIP-11 lists as a separate
> resource in the Privilege table)
>
> On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
>  wrote:
>> I see Groups as something we can add incrementally in the current model.
>> The acls take principalType: name so groups can be represented as group:
>> groupName. We are not managing group memberships anywhere in kafka and I
>> don’t see the need to do so.
>>
>> So for a topic1 using the CLI an admin can add an acl to grant access to
>> group:kafka-test-users.
>>
>> The authorizer implementation can have a plugin to map authenticated user
>> to groups (this is how Hadoop and Storm work). The plugin could be
>> mapping the user to linux/ldap/active directory groups, but that is again up to
>> the implementation.
>>
>> What we are offering is an interface that is extensible so these features
>> can be added incrementally. I can add support for this in the first
>> release but don’t necessarily see why this would be an absolute necessity.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 11:00 AM, "Gwen Shapira"  wrote:
>>
>>>Thanks.
>>>
>>>One more thing I'm missing in the KIP is details on the Group resource
>>>(I think we discussed this and it was just not fully updated):
>>>
>>>* Does everyone have the privilege to create a new Group and use it to
>>>consume from Topics he's already privileged on?
>>>* Will the CLI tool be used to manage group membership too?
>>>* Groups are kind of ephemeral, right? If all consumers in the group
>>>disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
>>>treat the new group as completely new resource? Can we create ACLs
>>>before the group exists, in anticipation of it getting created?
>>>
>>>It's all small details, but it will be difficult to implement KIP-11
>>>without knowing the answers :)
>>>
>>>Gwen
>>>
>>>
>>>On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
>>> wrote:
 You are right, moved it to the default implementation section.

 Thanks
 Parth

 On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:

>Sample ACL JSON and Zookeeper is in public API, but I thought it is
>part of DefaultAuthorizer (Since Sentry and Argus won't be using
>Zookeeper).
>Am I wrong? Or is it the KIP?
>
>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
> wrote:
>> Thanks for clarifying Gwen, KIP updated.
>>
>> I tried to make the distinction by creating a section for all public
>>APIs
>>
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizatio
>>n+
>>In
>> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>>
>> Let me know if you think there is a better way to reflect this.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>>
>>>+1 (non-binding)
>>>
>>>Two nitpicks for the wiki:
>>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>>>sure new consumers need it to be part of a consumer group.
>>>* Can you clearly separate which parts are the API (common to every
>>>Authorizer) and which parts are DefaultAuthorizer implementation? It
>>>will make reviews and Authorizer implementations a bit easier to know
>>>exactly which is which.
>>>
>>>Gwen
>>>
>>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>>> wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, "Parth Brahmbhatt"

 wrote:

>Hi Jeff,
>
>Thanks a lot for the review. I think you have a valid point about
>acls
>being duplicated and the simplest solution would be to modify acls
>class
>so they hold a set of principals instead of a single principal, i.e.
>
><principals> has <permissionType> Permissions on <resource> from <hosts>.
>
>I think the evaluation order only matters for the permissionType
>which
>is
>Deny acls should be evaluated before allow acls. To give you an
>example
>suppose we have following acls
>
>acl1 -> user1 is allowed to READ from all hosts.
>acl2 -> host1 is allowed to READ regardless of who is the user.
>acl3 -> host2 is allowed to READ regardless of who is the user.
>
>acl4 -> user1 is denied to READ from host1.
>
>As stated in the KIP we first evaluate DENY so if user1 tries to
>access
>from host1 he 

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
We are not talking about same Groups :)

I meant, Groups of consumers (which KIP-11 lists as a separate
resource in the Privilege table)

On Fri, Apr 24, 2015 at 11:31 AM, Parth Brahmbhatt
 wrote:
> I see Groups as something we can add incrementally in the current model.
> The acls take principalType: name so groups can be represented as group:
> groupName. We are not managing group memberships anywhere in kafka and I
> don’t see the need to do so.
>
> So for a topic1 using the CLI an admin can add an acl to grant access to
> group:kafka-test-users.
>
> The authorizer implementation can have a plugin to map authenticated user
> to groups (this is how Hadoop and Storm work). The plugin could be
> mapping the user to linux/ldap/active directory groups, but that is again up to
> the implementation.
>
> What we are offering is an interface that is extensible so these features
> can be added incrementally. I can add support for this in the first
> release but don’t necessarily see why this would be an absolute necessity.
>
> Thanks
> Parth
>
> On 4/24/15, 11:00 AM, "Gwen Shapira"  wrote:
>
>>Thanks.
>>
>>One more thing I'm missing in the KIP is details on the Group resource
>>(I think we discussed this and it was just not fully updated):
>>
>>* Does everyone have the privilege to create a new Group and use it to
>>consume from Topics he's already privileged on?
>>* Will the CLI tool be used to manage group membership too?
>>* Groups are kind of ephemeral, right? If all consumers in the group
>>disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
>>treat the new group as completely new resource? Can we create ACLs
>>before the group exists, in anticipation of it getting created?
>>
>>It's all small details, but it will be difficult to implement KIP-11
>>without knowing the answers :)
>>
>>Gwen
>>
>>
>>On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
>> wrote:
>>> You are right, moved it to the default implementation section.
>>>
>>> Thanks
>>> Parth
>>>
>>> On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:
>>>
Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
 wrote:
> Thanks for clarifying Gwen, KIP updated.
>
> I tried to make the distinction by creating a section for all public
>APIs
>
>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizatio
>n+
>In
> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>
> Let me know if you think there is a better way to reflect this.
>
> Thanks
> Parth
>
> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>
>>+1 (non-binding)
>>
>>Two nitpicks for the wiki:
>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>>sure new consumers need it to be part of a consumer group.
>>* Can you clearly separate which parts are the API (common to every
>>Authorizer) and which parts are DefaultAuthorizer implementation? It
>>will make reviews and Authorizer implementations a bit easier to know
>>exactly which is which.
>>
>>Gwen
>>
>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>> wrote:
>>> Hi,
>>>
>>> I would like to open KIP-11 for voting.
>>>
>>> Thanks
>>> Parth
>>>
>>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt"
>>>
>>> wrote:
>>>
Hi Jeff,

Thanks a lot for the review. I think you have a valid point about
acls
being duplicated and the simplest solution would be to modify acls
class
so they hold a set of principals instead of a single principal, i.e.

<principals> has <permissionType> Permissions on <resource> from <hosts>.

I think the evaluation order only matters for the permissionType
which
is
Deny acls should be evaluated before allow acls. To give you an
example
suppose we have following acls

acl1 -> user1 is allowed to READ from all hosts.
acl2 -> host1 is allowed to READ regardless of who is the user.
acl3 -> host2 is allowed to READ regardless of who is the user.

acl4 -> user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to
access
from host1 he will be denied(acl4), even though both user1 and host1
has
acl’s for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2 , the action will be allowed and
it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.

“Will people actually use hosts with users?” I really don’t know but
given
ACLs are part of our Public APIs I thought it is better to try and
>

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
I see Groups as something we can add incrementally in the current model.
The acls take principalType: name so groups can be represented as group:
groupName. We are not managing group memberships anywhere in kafka and I
don’t see the need to do so.

So for a topic1 using the CLI an admin can add an acl to grant access to
group:kafka-test-users.

The authorizer implementation can have a plugin to map the authenticated user
to groups (this is how Hadoop and Storm work). The plugin could be
mapping the user to linux/ldap/active directory groups, but that is again up to
the implementation.
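As an editorial sketch of such a plugin (GroupMapper is an illustrative name, not part of the proposed interface):

{code}
trait GroupMapper {
  def groupsFor(user: String): Set[String]
}

// A trivial static mapping; a real plugin might consult LDAP, Active
// Directory, or Unix groups instead.
class StaticGroupMapper(mapping: Map[String, Set[String]]) extends GroupMapper {
  override def groupsFor(user: String): Set[String] =
    mapping.getOrElse(user, Set.empty)
}
{code}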

What we are offering is an interface that is extensible so these features
can be added incrementally. I can add support for this in the first
release but don’t necessarily see why this would be an absolute necessity.

Thanks
Parth

On 4/24/15, 11:00 AM, "Gwen Shapira"  wrote:

>Thanks.
>
>One more thing I'm missing in the KIP is details on the Group resource
>(I think we discussed this and it was just not fully updated):
>
>* Does everyone have the privilege to create a new Group and use it to
>consume from Topics he's already privileged on?
>* Will the CLI tool be used to manage group membership too?
>* Groups are kind of ephemeral, right? If all consumers in the group
>disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
>treat the new group as completely new resource? Can we create ACLs
>before the group exists, in anticipation of it getting created?
>
>It's all small details, but it will be difficult to implement KIP-11
>without knowing the answers :)
>
>Gwen
>
>
>On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
> wrote:
>> You are right, moved it to the default implementation section.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:
>>
>>>Sample ACL JSON and Zookeeper is in public API, but I thought it is
>>>part of DefaultAuthorizer (Since Sentry and Argus won't be using
>>>Zookeeper).
>>>Am I wrong? Or is it the KIP?
>>>
>>>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
>>> wrote:
 Thanks for clarifying Gwen, KIP updated.

 I tried to make the distinction by creating a section for all public
APIs

https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizatio
n+
In
 terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

 Let me know if you think there is a better way to reflect this.

 Thanks
 Parth

 On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:

>+1 (non-binding)
>
>Two nitpicks for the wiki:
>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>sure new consumers need it to be part of a consumer group.
>* Can you clearly separate which parts are the API (common to every
>Authorizer) and which parts are DefaultAuthorizer implementation? It
>will make reviews and Authorizer implementations a bit easier to know
>exactly which is which.
>
>Gwen
>
>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
> wrote:
>> Hi,
>>
>> I would like to open KIP-11 for voting.
>>
>> Thanks
>> Parth
>>
>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt"
>>
>> wrote:
>>
>>>Hi Jeff,
>>>
>>>Thanks a lot for the review. I think you have a valid point about
>>>acls
>>>being duplicated and the simplest solution would be to modify acls
>>>class
>>>so they hold a set of principals instead of a single principal, i.e.
>>>
>>><principals> has <permissionType> Permissions on <resource> from <hosts>.
>>>
>>>I think the evaluation order only matters for the permissionType
>>>which
>>>is
>>>Deny acls should be evaluated before allow acls. To give you an
>>>example
>>>suppose we have following acls
>>>
>>>acl1 -> user1 is allowed to READ from all hosts.
>>>acl2 -> host1 is allowed to READ regardless of who is the user.
>>>acl3 -> host2 is allowed to READ regardless of who is the user.
>>>
>>>acl4 -> user1 is denied to READ from host1.
>>>
>>>As stated in the KIP we first evaluate DENY so if user1 tries to
>>>access
>>>from host1 he will be denied(acl4), even though both user1 and host1
>>>has
>>>acl’s for allow with wildcards (acl1, acl2).
>>>If user1 tried to READ from host2 , the action will be allowed and
>>>it
>>>does
>>>not matter if we match acl3 or acl1 so I don’t think the evaluation
>>>order
>>>matters here.
>>>
>>>“Will people actually use hosts with users?” I really don’t know but
>>>given
>>>ACLs are part of our Public APIs I thought it is better to try and
>>>cover
>>>more use cases. If others think this extra complexity is not worth
>>>the
>>>value its adding please raise your concerns so we can discuss if it
>>>should
>>>be removed from the acl structure. Note that even in absence of
>>>hosts
>>>from
>>>ACL users will still be able to whitelist/blacklist host as lon

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Thanks for your comments Gari. My responses are inline.

Thanks
Parth

On 4/24/15, 10:36 AM, "Gari Singh"  wrote:

>Sorry - fat fingered send ...
>
>
>Not sure if my "newbie" vote will count, but I think you are getting
>pretty
>close here.
>
>Couple of things:
>
>1) I know the Session object is from a different JIRA, but I think that
>Session should take a Subject rather than just a single Principal.  The
>reason for this is because a Subject can have multiple Principals (for
>example both a username and a group or perhaps someone would want to use
>both the username and the clientIP as Principals)

I think the user -> group mapping can be done at Authorization
implementation layer. In any case as you pointed out the session is part
of another jira and I think a PR is out
https://reviews.apache.org/r/27204/diff/ and we should discuss it on that
PR.

>
>2)  We would then also have multiple concrete Principals, e.g.
>
>KafkaPrincipal
>KafkaUserPrincipal
>KafkaGroupPrincipal
>(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
>etc
>
>This is important as eventually (hopefully sooner than later), we will
>support multiple types of authentication which may each want to populate
>the Subject with one or more Principals and perhaps even credentials (this
>could be used in the future to hold encryption keys or perhaps the raw
>info
>prior to authentication).
>
>So in this way, if we have different authentication modules, we can add
>different types of Principals by extension
>
>This also allows the same subject to have access to some resources based
>on
>username and some based on group.
>
>Given that with this we would have different types of Principals, I would
>then modify the ACL to look like:
>
>{"version":1,
>  {"acls":[
>{
>  "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"],
>  "principals":["alice","kafka-devs"]
>  ...
>
>or
>
>{"version":1,
>  {"acls":[
>{
>  "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-
>devs"]
>
>
>But in either case this allows for easy identification of the type of
>principal and makes it easy to plug in multiple kinds of principals
>
>The advantage of all of this is that it now provides more flexibility for
>custom modules for both authentication and authorization moving forward.

All the principals that you listed above can be supported with the current
design. Acls take a KafkaPrincipal as input, which is a combination of type
and principal name, and the authorizer implementations are free to create
any extension of this which covers group: groupName, host: HostName,
kerberos: kerberosUserName and any other types that may come up. I am not
sure how encryption key storage is relevant to the Authorizer, so it would be
great if you can elaborate.
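As an editorial sketch of the type + name principal described here (the real KafkaPrincipal in the KIP may differ in field names and parsing rules):

{code}
object PrincipalSketch {
  case class KafkaPrincipal(principalType: String, name: String) {
    override def toString: String = s"$principalType:$name"
  }

  // Extensions are just different type tags:
  val user  = KafkaPrincipal("user", "alice")
  val group = KafkaPrincipal("group", "kafka-devs")
  val host  = KafkaPrincipal("host", "host1.example.com")
}
{code}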

>
>3) Are you sure that you want "authorize" to take a "session" object?  If
>we use the model in one above, we could just populate the Subject with a
>KafkaClientAddressPrincipal and then have access to that when evaluating the
>ACLs.

I think it is better to take a session which can just be a wrapper on top
of Subject + host for now. This allows for extension, which in my opinion
is more "future requirement" proof.

>
>4) What about actually caching authorization decisions?  I know ACLs will
>be cached, but the actual authorize decision can be expensive as well?

In the default implementation I don’t plan to do this. Easy to add later if
we want to, but I am not sure why this would ever be expensive when acls
are cached, the number of acls on a single topic should be very small, and
iterating over them with simple string comparison should not really be
expensive.

Thanks
Parth

>
>On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh 
>wrote:
>
>> Not sure if my "newbie" vote will count, but I think you are getting
>> pretty close here.
>>
>> Couple of things:
>>
>> 1) I know the Session object is from a different JIRA, but I think that
>> Session should take a Subject rather than just a single Principal.  The
>> reason for this is because a Subject can have multiple Principals (for
>> example both a username and a group or perhaps someone would want to use
>> both the username and the clientIP as Principals)
>>
>> 2)  We would then also have multiple concrete Principals, e.g.
>>
>> KafkaPrincipal
>> KafkaUserPrincipal
>> KafkaGroupPrincipal
>> (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
>> etc
>>
>> This is important as eventually (hopefully sooner than later), we will
>> support multiple types of authentication which may each want to populate
>> the Subject with one or more Principals and perhaps even credentials
>>(this
>> could be used in the future to hold encryption keys or perhaps the raw
>>info
>> prior to authentication).
>>
>> So in this way, if we have different authentication modules, we can add
>> different types of Principals by extension
>>
>> This also allows the same subject to have access to some

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Thanks.

One more thing I'm missing in the KIP is details on the Group resource
(I think we discussed this and it was just not fully updated):

* Does everyone have the privilege to create a new Group and use it to
consume from Topics he's already privileged on?
* Will the CLI tool be used to manage group membership too?
* Groups are kind of ephemeral, right? If all consumers in the group
disconnect the group is gone, AFAIK. Do we preserve the ACLs? Or do we
treat the new group as completely new resource? Can we create ACLs
before the group exists, in anticipation of it getting created?

It's all small details, but it will be difficult to implement KIP-11
without knowing the answers :)

Gwen


On Fri, Apr 24, 2015 at 9:58 AM, Parth Brahmbhatt
 wrote:
> You are right, moved it to the default implementation section.
>
> Thanks
> Parth
>
> On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:
>
>>Sample ACL JSON and Zookeeper is in public API, but I thought it is
>>part of DefaultAuthorizer (Since Sentry and Argus won't be using
>>Zookeeper).
>>Am I wrong? Or is it the KIP?
>>
>>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
>> wrote:
>>> Thanks for clarifying Gwen, KIP updated.
>>>
>>> I tried to make the distinction by creating a section for all public
>>>APIs
>>>
>>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+
>>>In
>>> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>>>
>>> Let me know if you think there is a better way to reflect this.
>>>
>>> Thanks
>>> Parth
>>>
>>> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>>>
+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
 wrote:
> Hi,
>
> I would like to open KIP-11 for voting.
>
> Thanks
> Parth
>
> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
> wrote:
>
>>Hi Jeff,
>>
>>Thanks a lot for the review. I think you have a valid point about acls
>>being duplicated and the simplest solution would be to modify acls
>>class
>>so they hold a set of principals instead of a single principal, i.e.
>>
>><principals> has <permissionType> Permissions on <resource> from <hosts>.
>>
>>I think the evaluation order only matters for the permissionType which
>>is
>>Deny acls should be evaluated before allow acls. To give you an
>>example
>>suppose we have following acls
>>
>>acl1 -> user1 is allowed to READ from all hosts.
>>acl2 -> host1 is allowed to READ regardless of who is the user.
>>acl3 -> host2 is allowed to READ regardless of who is the user.
>>
>>acl4 -> user1 is denied to READ from host1.
>>
>>As stated in the KIP we first evaluate DENY so if user1 tries to
>>access
>>from host1 he will be denied(acl4), even though both user1 and host1
>>has
>>acl’s for allow with wildcards (acl1, acl2).
>>If user1 tried to READ from host2 , the action will be allowed and it
>>does
>>not matter if we match acl3 or acl1 so I don’t think the evaluation
>>order
>>matters here.
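For illustration, an editorial Scala sketch of the deny-before-allow evaluation described in the quoted example above; Acl and its fields are simplified stand-ins, not the KIP-11 API.

{code}
object AclEvalSketch {
  sealed trait PermissionType
  case object Allow extends PermissionType
  case object Deny  extends PermissionType

  case class Acl(principal: String, host: String, operation: String,
                 permissionType: PermissionType)

  private def matches(a: Acl, principal: String, host: String, op: String): Boolean =
    (a.principal == principal || a.principal == "*") &&
    (a.host == host || a.host == "*") &&
    a.operation == op

  // Any matching DENY wins over any matching ALLOW, which is why the order
  // among the ALLOW acls themselves never matters.
  def authorize(acls: Seq[Acl], principal: String, host: String, op: String): Boolean = {
    val matched = acls.filter(a => matches(a, principal, host, op))
    !matched.exists(_.permissionType == Deny) && matched.exists(_.permissionType == Allow)
  }
}
{code}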
>>
>>“Will people actually use hosts with users?” I really don’t know but
>>given
>>ACLs are part of our Public APIs I thought it is better to try and
>>cover
>>more use cases. If others think this extra complexity is not worth the
>>value its adding please raise your concerns so we can discuss if it
>>should
>>be removed from the acl structure. Note that even in absence of hosts
>>from
>>ACL users will still be able to whitelist/blacklist host as long as we
>>start supporting principalType = “host”, easy to add and can be an
>>incremental improvement. They will however loose the ability to
>>restrict
>>access to users just from a set of hosts.
>>
>>We agreed to offer a CLI to overcome the JSON acl config
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati
>>on
>>+I
>>n
>>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
>>Jsons but that probably has something to do with me being a developer
>>:-).
>>
>>Thanks
>>Parth
>>
>>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
>>
>>>Parth,
>>>
>>>This is a long thread, so trying to keep up here, sorry if this has
>>>been
>>>covered before. First, great job on the KIP proposal and work so far.
>>>
>>>Are we sure that we want to tie host level access to a given user? My
>>>understanding is that the ACL will be (omitting some fields)
>>>
>>>user_a, host1, host2, host3
>

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
Sorry - fat fingered send ...


Not sure if my "newbie" vote will count, but I think you are getting pretty
close here.

Couple of things:

1) I know the Session object is from a different JIRA, but I think that
Session should take a Subject rather than just a single Principal.  The
reason for this is because a Subject can have multiple Principals (for
example both a username and a group or perhaps someone would want to use
both the username and the clientIP as Principals)

2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important as eventually (hopefully sooner than later), we will
support multiple types of authentication which may each want to populate
the Subject with one or more Principals and perhaps even credentials (this
could be used in the future to hold encryption keys or perhaps the raw info
prior to authentication).

So in this way, if we have different authentication modules, we can add
different types of Principals by extension

This also allows the same subject to have access to some resources based on
username and some based on group.

Given that with this we would have different types of Principals, I would
then modify the ACL to look like:

{"version":1,
  {"acls":[
{
  "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"],
  "principals":["alice","kafka-devs"]
  ...

or

{"version":1,
  {"acls":[
{
  "principals":["KafkaUserPrincipal:alice","KafkaGroupPrincipal:kafka-
devs"]


But in either case this allows for easy identification of the type of
principal and makes it easy to plug in multiple kinds of principals

The advantage of all of this is that it now provides more flexibility for
custom modules for both authentication and authorization moving forward.

3) Are you sure that you want "authorize" to take a "session" object?  If
we use the model in one above, we could just populate the Subject with a
KafkaClientAddressPrincipal and then have access to that when evaluating the
ACLs.

4) What about actually caching authorization decisions?  I know ACLs will
be cached, but the actual authorize decision can be expensive as well?

On Fri, Apr 24, 2015 at 1:27 PM, Gari Singh  wrote:

> Not sure if my "newbie" vote will count, but I think you are getting
> pretty close here.
>
> Couple of things:
>
> 1) I know the Session object is from a different JIRA, but I think that
> Session should take a Subject rather than just a single Principal.  The
> reason for this is because a Subject can have multiple Principals (for
> example both a username and a group or perhaps someone would want to use
> both the username and the clientIP as Principals)
>
> 2)  We would then also have multiple concrete Principals, e.g.
>
> KafkaPrincipal
> KafkaUserPrincipal
> KafkaGroupPrincipal
> (perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
> etc
>
> This is important as eventually (hopefully sooner than later), we will
> support multiple types of authentication which may each want to populate
> the Subject with one or more Principals and perhaps even credentials (this
> could be used in the future to hold encryption keys or perhaps the raw info
> prior to authentication).
>
> So in this way, if we have different authentication modules, we can add
> different types of Principals by extension
>
> This also allows the same subject to have access to some resources based
> on username and some based on group.
>
> Given that with this we would have different types of Principals, I would
> then modify the ACL to look like:
>
> {"version":1,
>   {"acls":[
> {
>   "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"],
>   "principals":["alice","kafka-devs"
>
>
>
>
>
> 3) The advantage of all of this is that it now provides more flexibility
> for custom modules for both authentication and authorization moving forward.
>
>
>
> On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira 
> wrote:
>
>> +1 (non-binding)
>>
>> Two nitpicks for the wiki:
>> * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>> sure new consumers need it to be part of a consumer group.
>> * Can you clearly separate which parts are the API (common to every
>> Authorizer) and which parts are DefaultAuthorizer implementation? It
>> will make reviews and Authorizer implementations a bit easier to know
>> exactly which is which.
>>
>> Gwen
>>
>> On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>>  wrote:
>> > Hi,
>> >
>> > I would like to open KIP-11 for voting.
>> >
>> > Thanks
>> > Parth
>> >
>> > On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
>> > wrote:
>> >
>> >>Hi Jeff,
>> >>
>> >>Thanks a lot for the review. I think you have a valid point about acls
>> >>being duplicated and the simplest solution would be to modify acls class
>> >>so they hold a set of principals instead of a single principal, i.e.
>> >>
>> >><principals> has <permissionType> Permission

[jira] [Commented] (KAFKA-1936) Track offset commit requests separately from produce requests

2015-04-24 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511392#comment-14511392
 ] 

Dong Lin commented on KAFKA-1936:
-

Thanks! I will work on this ticket.

On Fri, Apr 24, 2015 at 10:08 AM, Aditya Auradkar (JIRA) 



> Track offset commit requests separately from produce requests
> -
>
> Key: KAFKA-1936
> URL: https://issues.apache.org/jira/browse/KAFKA-1936
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Aditya Auradkar
>Assignee: Dong Lin
>
> In ReplicaManager, failed and total produce requests are updated from 
> appendToLocalLog. Since offset commit requests also follow the same path, 
> they are counted along with produce requests. Add a metric and count them 
> separately.
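For illustration, an editorial Scala sketch of counting the two request types separately; Meter and the rate names are trivial stand-ins for the actual metrics types.

{code}
import java.util.concurrent.atomic.AtomicLong

object AppendMetricsSketch {
  class Meter {
    private val count = new AtomicLong()
    def mark(): Unit = { count.incrementAndGet(); () }
  }

  val produceRequestRate     = new Meter
  val failedProduceRate      = new Meter
  val offsetCommitRate       = new Meter
  val failedOffsetCommitRate = new Meter

  // appendToLocalLog would pick the meter pair based on the request type
  // instead of charging offset commits to the produce-request meters.
  def recordAppend(isOffsetCommit: Boolean, failed: Boolean): Unit =
    if (isOffsetCommit) {
      offsetCommitRate.mark()
      if (failed) failedOffsetCommitRate.mark()
    } else {
      produceRequestRate.mark()
      if (failed) failedProduceRate.mark()
    }
}
{code}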





[jira] [Updated] (KAFKA-1940) Initial checkout and build failing

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-1940:
---
Attachment: zinc-upgrade.patch

Reattach the patch file without colored diff.

> Initial checkout and build failing
> --
>
> Key: KAFKA-1940
> URL: https://issues.apache.org/jira/browse/KAFKA-1940
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.1.2
> Environment: Groovy:   1.8.6
> Ant:  Apache Ant(TM) version 1.9.2 compiled on July 8 2013
> Ivy:  2.2.0
> JVM:  1.8.0_25 (Oracle Corporation 25.25-b02)
> OS:   Windows 7 6.1 amd64
>Reporter: Martin Lemanski
>  Labels: build
> Attachments: zinc-upgrade.patch
>
>
> when performing `gradle wrapper` and `gradlew build` as a "new" developer, I 
> get an exception: 
> {code}
> C:\development\git\kafka>gradlew build --stacktrace
> <...>
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':core:compileScala'.
> > com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesafe/zinc/Setup;
> {code}
> Details: https://gist.github.com/mlem/ddff83cc8a25b040c157
> Current Commit:
> {code}
> C:\development\git\kafka>git rev-parse --verify HEAD
> 71602de0bbf7727f498a812033027f6cbfe34eb8
> {code}
> I am evaluating kafka for my company and wanted to run some tests with it, 
> but couldn't due to this error. I know gradle can be tricky and it's not easy 
> to set up everything correctly, but this kind of bug turns possible 
> committers/users off.





[jira] [Updated] (KAFKA-1940) Initial checkout and build failing

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-1940:
---
Attachment: (was: zinc-upgrade.patch)

> Initial checkout and build failing
> --
>
> Key: KAFKA-1940
> URL: https://issues.apache.org/jira/browse/KAFKA-1940
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.1.2
> Environment: Groovy:   1.8.6
> Ant:  Apache Ant(TM) version 1.9.2 compiled on July 8 2013
> Ivy:  2.2.0
> JVM:  1.8.0_25 (Oracle Corporation 25.25-b02)
> OS:   Windows 7 6.1 amd64
>Reporter: Martin Lemanski
>  Labels: build
>
> when performing `gradle wrapper` and `gradlew build` as a "new" developer, I 
> get an exception: 
> {code}
> C:\development\git\kafka>gradlew build --stacktrace
> <...>
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':core:compileScala'.
> > com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesaf
> e/zinc/Setup;
> {code}
> Details: https://gist.github.com/mlem/ddff83cc8a25b040c157
> Current Commit:
> {code}
> C:\development\git\kafka>git rev-parse --verify HEAD
> 71602de0bbf7727f498a812033027f6cbfe34eb8
> {code}
> I am evaluating kafka for my company and wanted to run some tests with it, 
> but couldn't due to this error. I know gradle can be tricky and it's not easy 
> to set everything up correctly, but this kind of bug turns possible 
> committers/users off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gari Singh
Not sure if my "newbie" vote will count, but I think you are getting pretty
close here.

Couple of things:

1) I know the Session object is from a different JIRA, but I think that
Session should take a Subject rather than just a single Principal.  The
reason for this is that a Subject can have multiple Principals (for
example both a username and a group, or perhaps someone would want to use
both the username and the clientIP as Principals).

2)  We would then also have multiple concrete Principals, e.g.

KafkaPrincipal
KafkaUserPrincipal
KafkaGroupPrincipal
(perhaps even KafkaKerberosPrincipal and KafkaClientAddressPrincipal)
etc

This is important as eventually (hopefully sooner rather than later), we will
support multiple types of authentication which may each want to populate
the Subject with one or more Principals and perhaps even credentials (this
could be used in the future to hold encryption keys or perhaps the raw info
prior to authentication).

So in this way, if we have different authentication modules, we can add
different types of Principals by extension

This also allows the same subject to have access to some resources based on
username and some based on group.
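
A minimal sketch of that idea with the standard javax.security.auth.Subject
API (the two principal classes are hypothetical, just following the naming
above):

import java.security.Principal;
import javax.security.auth.Subject;

// Hypothetical principal types following the naming proposed above;
// these are not existing Kafka classes.
final class KafkaUserPrincipal implements Principal {
    private final String name;
    KafkaUserPrincipal(String name) { this.name = name; }
    public String getName() { return name; }
}

final class KafkaGroupPrincipal implements Principal {
    private final String name;
    KafkaGroupPrincipal(String name) { this.name = name; }
    public String getName() { return name; }
}

public class SubjectExample {
    public static void main(String[] args) {
        // One authenticated Subject carrying both a user and a group principal.
        Subject subject = new Subject();
        subject.getPrincipals().add(new KafkaUserPrincipal("alice"));
        subject.getPrincipals().add(new KafkaGroupPrincipal("kafka-devs"));

        // An authorizer could then match ACLs against any of the principals.
        for (Principal p : subject.getPrincipals())
            System.out.println(p.getClass().getSimpleName() + ": " + p.getName());
    }
}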

Given that with this we would have different types of Principals, I would
then modify the ACL to look like:

{"version":1,
  {"acls":[
{
  "principal_types":["KafkaUserPrincipal","KafkaGroupPrincipal"],
  "principals":["alice","kafka-devs"





3) The advantage of all of this is that it now provides more flexibility
for custom modules for both authentication and authorization moving forward.



On Fri, Apr 24, 2015 at 12:37 PM, Gwen Shapira 
wrote:

> +1 (non-binding)
>
> Two nitpicks for the wiki:
> * Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
> sure new consumers need it to be part of a consumer group.
> * Can you clearly separate which parts are the API (common to every
> Authorizer) and which parts are DefaultAuthorizer implementation? It
> will make reviews and Authorizer implementations a bit easier to know
> exactly which is which.
>
> Gwen
>
> On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>  wrote:
> > Hi,
> >
> > I would like to open KIP-11 for voting.
> >
> > Thanks
> > Parth
> >
> > On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
> > wrote:
> >
> >>Hi Jeff,
> >>
> >>Thanks a lot for the review. I think you have a valid point about acls
> >>being duplicated and the simplest solution would be to modify acls class
> >>so they hold a set of principals instead of a single principal, i.e.
> >>
> >><principals> has <operations> Permissions on <resource> from
> >><hosts>.
> >>
> >>I think the evaluation order only matters for the permissionType, which is:
> >>Deny acls should be evaluated before allow acls. To give you an example
> >>suppose we have following acls
> >>
> >>acl1 -> user1 is allowed to READ from all hosts.
> >>acl2 -> host1 is allowed to READ regardless of who is the user.
> >>acl3 -> host2 is allowed to READ regardless of who is the user.
> >>
> >>acl4 -> user1 is denied to READ from host1.
> >>
> >>As stated in the KIP we first evaluate DENY so if user1 tries to access
> >>from host1 he will be denied (acl4), even though both user1 and host1 have
> >>ACLs for allow with wildcards (acl1, acl2).
> >>If user1 tried to READ from host2, the action will be allowed and it
> does
> >>not matter if we match acl3 or acl1 so I don’t think the evaluation order
> >>matters here.
> >>
> >>“Will people actually use hosts with users?” I really don’t know but
> given
> >>ACLs are part of our Public APIs I thought it is better to try and cover
> >>more use cases. If others think this extra complexity is not worth the
> >>value it's adding please raise your concerns so we can discuss if it
> should
> >>be removed from the acl structure. Note that even in absence of hosts
> from
> >>ACL users will still be able to whitelist/blacklist host as long as we
> >>start supporting principalType = “host”, easy to add and can be an
> >>incremental improvement. They will however lose the ability to restrict
> >>access to users just from a set of hosts.
> >>
> >>We agreed to offer a CLI to overcome the JSON acl config
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+I
> >>n
> >>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
> >>Jsons but that probably has something to do with me being a developer
> :-).
> >>
> >>Thanks
> >>Parth
> >>
> >>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
> >>
> >>>Parth,
> >>>
> >>>This is a long thread, so trying to keep up here, sorry if this has been
> >>>covered before. First, great job on the KIP proposal and work so far.
> >>>
> >>>Are we sure that we want to tie host level access to a given user? My
> >>>understanding is that the ACL will be (omitting some fields)
> >>>
> >>>user_a, host1, host2, host3
> >>>user_b, host1, host2, host3
> >>>
> >>>So there would potentially be a lot of redundancy in the configs. Does
> it
> >>>make sense to have hosts be at the same level as pr

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Sriharsha Chintalapani
+1 (non-binding)

-- 
Harsha


On April 24, 2015 at 9:59:09 AM, Parth Brahmbhatt (pbrahmbh...@hortonworks.com) 
wrote:

You are right, moved it to the default implementation section.  

Thanks  
Parth  

On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:  

>Sample ACL JSON and Zookeeper is in public API, but I thought it is  
>part of DefaultAuthorizer (Since Sentry and Argus won't be using  
>Zookeeper).  
>Am I wrong? Or is it the KIP?  
>  
>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt  
> wrote:  
>> Thanks for clarifying Gwen, KIP updated.  
>>  
>> I tried to make the distinction by creating a section for all public  
>>APIs  
>>  
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+  
>>In  
>> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses  
>>  
>> Let me know if you think there is a better way to reflect this.  
>>  
>> Thanks  
>> Parth  
>>  
>> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:  
>>  
>>>+1 (non-binding)  
>>>  
>>>Two nitpicks for the wiki:  
>>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty  
>>>sure new consumers need it to be part of a consumer group.  
>>>* Can you clearly separate which parts are the API (common to every  
>>>Authorizer) and which parts are DefaultAuthorizer implementation? It  
>>>will make reviews and Authorizer implementations a bit easier to know  
>>>exactly which is which.  
>>>  
>>>Gwen  
>>>  
>>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt  
>>> wrote:  
 Hi,  
  
 I would like to open KIP-11 for voting.  
  
 Thanks  
 Parth  
  
 On 4/22/15, 1:56 PM, "Parth Brahmbhatt"   
 wrote:  
  
>Hi Jeff,  
>  
>Thanks a lot for the review. I think you have a valid point about acls  
>being duplicated and the simplest solution would be to modify acls  
>class  
>so they hold a set of principals instead of a single principal, i.e.
>
><principals> has <operations> Permissions on <resource> from
><hosts>.
>
>I think the evaluation order only matters for the permissionType, which
>is:
>Deny acls should be evaluated before allow acls. To give you an
>example  
>suppose we have following acls  
>  
>acl1 -> user1 is allowed to READ from all hosts.  
>acl2 -> host1 is allowed to READ regardless of who is the user.  
>acl3 -> host2 is allowed to READ regardless of who is the user.  
>  
>acl4 -> user1 is denied to READ from host1.  
>  
>As stated in the KIP we first evaluate DENY so if user1 tries to  
>access  
>from host1 he will be denied (acl4), even though both user1 and host1
>have
>ACLs for allow with wildcards (acl1, acl2).
>If user1 tried to READ from host2, the action will be allowed and it
>does  
>not matter if we match acl3 or acl1 so I don’t think the evaluation  
>order  
>matters here.  
>  
>“Will people actually use hosts with users?” I really don’t know but  
>given  
>ACLs are part of our Public APIs I thought it is better to try and
>cover
>more use cases. If others think this extra complexity is not worth the
>value it's adding please raise your concerns so we can discuss if it
>should
>be removed from the acl structure. Note that even in absence of hosts
>from
>ACL users will still be able to whitelist/blacklist host as long as we
>start supporting principalType = “host”, easy to add and can be an
>incremental improvement. They will however lose the ability to
>restrict  
>access to users just from a set of hosts.  
>  
>We agreed to offer a CLI to overcome the JSON acl config  
>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati  
>on  
>+I  
>n  
>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like  
>Jsons but that probably has something to do with me being a developer  
>:-).  
>  
>Thanks  
>Parth  
>  
>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:  
>  
>>Parth,  
>>  
>>This is a long thread, so trying to keep up here, sorry if this has  
>>been  
>>covered before. First, great job on the KIP proposal and work so far.  
>>  
>>Are we sure that we want to tie host level access to a given user? My  
>>understanding is that the ACL will be (omitting some fields)  
>>  
>>user_a, host1, host2, host3  
>>user_b, host1, host2, host3  
>>  
>>So there would potentially be a lot of redundancy in the configs.  
>>Does  
>>it  
>>make sense to have hosts be at the same level as principal in the  
>>hierarchy? This way you could just blanket the allowed / denied hosts  
>>and  
>>only have to worry about the users. So if you follow this, then  
>>  
>>we can wildcard the user so we can have a separate list of just  
>>host-based  
>>access. What's the order that the perms 

[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-2105:
---
Attachment: guard-from-null.patch

Sorry, the first time I made the patch with the --color option.
In my view patch files are a legacy technique; a GitHub reference to a pull 
request looks much better, but I didn't find any ticket resolved in 
such a way. 

> NullPointerException in client on MetadataRequest
> -
>
> Key: KAFKA-2105
> URL: https://issues.apache.org/jira/browse/KAFKA-2105
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1
>Reporter: Roger Hoover
>Priority: Minor
> Attachments: guard-from-null.patch
>
>
> With the new producer, if you accidentally pass null to 
> KafkaProducer.partitionsFor(null), it will cause the IO thread to throw NPE.
> Uncaught error in kafka producer I/O thread: 
> java.lang.NullPointerException
>   at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174)
>   at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176)
>   at 
> org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55)
>   at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81)
>   at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218)
>   at 
> org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35)
>   at 
> org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29)
>   at 
> org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369)
>   at 
> org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>   at java.lang.Thread.run(Thread.java:745)
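
The idea behind the attached guard-from-null.patch (sketched here, not the 
literal patch) is to fail fast on the caller's thread with a clear message 
instead of letting the null reach the producer I/O thread:

{code}
import java.util.Objects;

// Minimal illustration of the fail-fast idea: validate the topic before it
// is handed to the sender, so the NPE carries a useful message instead of
// killing the I/O thread. Not the actual patch contents.
public class NullTopicGuard {
    static void checkTopic(String topic) {
        Objects.requireNonNull(topic, "topic cannot be null");
    }

    public static void main(String[] args) {
        checkTopic("my-topic");      // fine
        try {
            checkTopic(null);        // fails fast with a clear message
        } catch (NullPointerException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
{code}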



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-2105:
---
Attachment: (was: guard-from-null.patch)

> NullPointerException in client on MetadataRequest
> -
>
> Key: KAFKA-2105
> URL: https://issues.apache.org/jira/browse/KAFKA-2105
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1
>Reporter: Roger Hoover
>Priority: Minor
>
> With the new producer, if you accidentally pass null to 
> KafkaProducer.partitionsFor(null), it will cause the IO thread to throw NPE.
> Uncaught error in kafka producer I/O thread: 
> java.lang.NullPointerException
>   at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174)
>   at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176)
>   at 
> org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55)
>   at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81)
>   at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218)
>   at 
> org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35)
>   at 
> org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29)
>   at 
> org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369)
>   at 
> org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Should 0.8.3 consumers correctly function with 0.8.2 brokers?

2015-04-24 Thread Neha Narkhede
Yes, I was clearly confused :-)

On Fri, Apr 24, 2015 at 9:37 AM, Sean Lydon  wrote:

> Thanks for the responses. Ewen is correct that I am referring to the
> *new* consumer (org.apache.kafka.clients.consumer.KafkaConsumer).
>
> I am extending the consumer to allow my applications more control over
> committed offsets.  I really want to get away from zookeeper (so using
> the offset storage), and re-balancing is something I haven't really
> needed to tackle in an automated/seamless way.  Either way, I'll hold
> off going further down this road until there is more interest.
>
> @Gwen
> I set up a single consumer without partition.assignment.strategy or
> rebalance.callback.class.  I was unable to subscribe to just a topic
> ("Unknown api code 11" on broker), but I could subscribe to a
> topicpartition.  This makes sense as I would need to handle re-balance
> outside the consumer.  Things functioned as expected (well, I have an
> additional minor fix to code from KAFKA-2121), and the only exceptions
> on broker were due to closing consumers (which I have become
> accustomed to).  My tests are specific to my extended version of the
> consumer, but they basically do a little writing and reading with
> different serde classes with application controlled commits (similar
> to onSuccess and onFailure after each record, but with tolerance for
> out of order acknowledgements).
>
> If you are interested, here is the patch of the hack against trunk.
>
> On Thu, Apr 23, 2015 at 10:27 PM, Ewen Cheslack-Postava
>  wrote:
> > @Neha I think you're mixing up the 0.8.1/0.8.2 updates and the
> 0.8.2/0.8.3
> > that's being discussed here?
> >
> > I think the original question was about using the *new* consumer
> ("clients
> > consumer") with 0.8.2. Gwen's right, it will use features not even
> > implemented in the broker in trunk yet, let alone the 0.8.2.
> >
> > I don't think the "enable.commit.downgrade" type option, or supporting
> the
> > old protocol with the new consumer at all, makes much sense. You'd end up
> > with some weird hybrid of simple and high-level consumers -- you could
> use
> > offset storage, but you'd have to manage rebalancing yourself since none
> of
> > the coordinator support would be there.
> >
> >
> > On Thu, Apr 23, 2015 at 9:22 PM, Neha Narkhede 
> wrote:
> >
> >> My understanding is that ideally the 0.8.3 consumer should work with an
> >> 0.8.2 broker if the offset commit config was set to "zookeeper".
> >>
> >> The only thing that might not work is offset commit to Kafka, which
> makes
> >> sense since the 0.8.2 broker does not support Kafka based offset
> >> management.
> >>
> >> If we broke all kinds of offset commits, then it seems like a
> regression,
> >> no?
> >>
> >> On Thu, Apr 23, 2015 at 7:26 PM, Gwen Shapira 
> >> wrote:
> >>
> >> > I didn't think 0.8.3 consumer will ever be able to talk to 0.8.2
> >> > broker... there are some essential pieces that are missing in 0.8.2
> >> > (Coordinator, Heartbeat, etc).
> >> > Maybe I'm missing something. It will be nice if this will work :)
> >> >
> >> > Mind sharing what / how you tested? Were there no errors in broker
> >> > logs after your fix?
> >> >
> >> > On Thu, Apr 23, 2015 at 5:37 PM, Sean Lydon 
> >> wrote:
> >> > > Currently the clients consumer (trunk) sends offset commit requests
> of
> >> > > version 2.  The 0.8.2 brokers fail to handle this particular request
> >> > > with a:
> >> > >
> >> > > java.lang.AssertionError: assertion failed: Version 2 is invalid for
> >> > > OffsetCommitRequest. Valid versions are 0 or 1.
> >> > >
> >> > > I was able to make this work via a forceful downgrade of this
> >> > > particular request, but I would like some feedback on whether a
> >> > > "enable.commit.downgrade" configuration would be a tolerable method
> to
> >> > > allow 0.8.3 consumers to interact with 0.8.2 brokers.  I'm also
> >> > > interested in this even being a goal worth pursuing.
> >> > >
> >> > > Thanks,
> >> > > Sean
> >> >
> >>
> >>
> >>
> >> --
> >> Thanks,
> >> Neha
> >>
> >
> >
> >
> > --
> > Thanks,
> > Ewen
>



-- 
Thanks,
Neha


[jira] [Updated] (KAFKA-1936) Track offset commit requests separately from produce requests

2015-04-24 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-1936:
---
Assignee: Dong Lin  (was: Aditya Auradkar)

> Track offset commit requests separately from produce requests
> -
>
> Key: KAFKA-1936
> URL: https://issues.apache.org/jira/browse/KAFKA-1936
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Aditya Auradkar
>Assignee: Dong Lin
>
> In ReplicaManager, failed and total produce requests are updated from 
> appendToLocalLog. Since offset commit requests also follow the same path, 
> they are counted along with produce requests. Add a metric and count them 
> separately.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
You are right, moved it to the default implementation section.

Thanks
Parth

On 4/24/15, 9:52 AM, "Gwen Shapira"  wrote:

>Sample ACL JSON and Zookeeper is in public API, but I thought it is
>part of DefaultAuthorizer (Since Sentry and Argus won't be using
>Zookeeper).
>Am I wrong? Or is it the KIP?
>
>On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
> wrote:
>> Thanks for clarifying Gwen, KIP updated.
>>
>> I tried to make the distinction by creating a section for all public
>>APIs
>> 
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+
>>In
>> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>>
>> Let me know if you think there is a better way to reflect this.
>>
>> Thanks
>> Parth
>>
>> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>>
>>>+1 (non-binding)
>>>
>>>Two nitpicks for the wiki:
>>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>>>sure new consumers need it to be part of a consumer group.
>>>* Can you clearly separate which parts are the API (common to every
>>>Authorizer) and which parts are DefaultAuthorizer implementation? It
>>>will make reviews and Authorizer implementations a bit easier to know
>>>exactly which is which.
>>>
>>>Gwen
>>>
>>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>>> wrote:
 Hi,

 I would like to open KIP-11 for voting.

 Thanks
 Parth

 On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
 wrote:

>Hi Jeff,
>
>Thanks a lot for the review. I think you have a valid point about acls
>being duplicated and the simplest solution would be to modify acls
>class
>so they hold a set of principals instead of a single principal, i.e.
>
><principals> has <operations> Permissions on <resource> from
><hosts>.
>
>I think the evaluation order only matters for the permissionType, which
>is:
>Deny acls should be evaluated before allow acls. To give you an
>example
>suppose we have following acls
>
>acl1 -> user1 is allowed to READ from all hosts.
>acl2 -> host1 is allowed to READ regardless of who is the user.
>acl3 -> host2 is allowed to READ regardless of who is the user.
>
>acl4 -> user1 is denied to READ from host1.
>
>As stated in the KIP we first evaluate DENY so if user1 tries to
>access
>from host1 he will be denied (acl4), even though both user1 and host1
>have
>ACLs for allow with wildcards (acl1, acl2).
>If user1 tried to READ from host2, the action will be allowed and it
>does
>not matter if we match acl3 or acl1 so I don’t think the evaluation
>order
>matters here.
>
>“Will people actually use hosts with users?” I really don’t know but
>given
>ACLs are part of our Public APIs I thought it is better to try and
>cover
>more use cases. If others think this extra complexity is not worth the
>value it's adding please raise your concerns so we can discuss if it
>should
>be removed from the acl structure. Note that even in absence of hosts
>from
>ACL users will still be able to whitelist/blacklist host as long as we
>start supporting principalType = “host”, easy to add and can be an
>incremental improvement. They will however lose the ability to
>restrict
>access to users just from a set of hosts.
>
>We agreed to offer a CLI to overcome the JSON acl config
>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorizati
>on
>+I
>n
>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
>Jsons but that probably has something to do with me being a developer
>:-).
>
>Thanks
>Parth
>
>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
>
>>Parth,
>>
>>This is a long thread, so trying to keep up here, sorry if this has
>>been
>>covered before. First, great job on the KIP proposal and work so far.
>>
>>Are we sure that we want to tie host level access to a given user? My
>>understanding is that the ACL will be (omitting some fields)
>>
>>user_a, host1, host2, host3
>>user_b, host1, host2, host3
>>
>>So there would potentially be a lot of redundancy in the configs.
>>Does
>>it
>>make sense to have hosts be at the same level as principal in the
>>hierarchy? This way you could just blanket the allowed / denied hosts
>>and
>>only have to worry about the users. So if you follow this, then
>>
>>we can wildcard the user so we can have a separate list of just
>>host-based
>>access. What's the order that the perms would be evaluated if a there
>>was
>>more than one match on a principal ?
>>
>>Is the thought that there wouldn't usually be much overlap on hosts?
>>I
>>guess I can imagine a scenario where I want to offline/online access
>>to a
>>particular hosts or set of hosts and if there was overlap, I'm doing
>>a
>>bunch of alter commands for just a single

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
Sample ACL JSON and Zookeeper is in public API, but I thought it is
part of DefaultAuthorizer (Since Sentry and Argus won't be using
Zookeeper).
Am I wrong? Or is it the KIP?

On Fri, Apr 24, 2015 at 9:49 AM, Parth Brahmbhatt
 wrote:
> Thanks for clarifying Gwen, KIP updated.
>
> I tried to make the distinction by creating a section for all public APIs
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In
> terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses
>
> Let me know if you think there is a better way to reflect this.
>
> Thanks
> Parth
>
> On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:
>
>>+1 (non-binding)
>>
>>Two nitpicks for the wiki:
>>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>>sure new consumers need it to be part of a consumer group.
>>* Can you clearly separate which parts are the API (common to every
>>Authorizer) and which parts are DefaultAuthorizer implementation? It
>>will make reviews and Authorizer implementations a bit easier to know
>>exactly which is which.
>>
>>Gwen
>>
>>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
>> wrote:
>>> Hi,
>>>
>>> I would like to open KIP-11 for voting.
>>>
>>> Thanks
>>> Parth
>>>
>>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
>>> wrote:
>>>
Hi Jeff,

Thanks a lot for the review. I think you have a valid point about acls
being duplicated and the simplest solution would be to modify acls class
so they hold a set of principals instead of a single principal, i.e.

<principals> has <operations> Permissions on <resource> from
<hosts>.

I think the evaluation order only matters for the permissionType, which
is:
Deny acls should be evaluated before allow acls. To give you an example
suppose we have following acls

acl1 -> user1 is allowed to READ from all hosts.
acl2 -> host1 is allowed to READ regardless of who is the user.
acl3 -> host2 is allowed to READ regardless of who is the user.

acl4 -> user1 is denied to READ from host1.

As stated in the KIP we first evaluate DENY so if user1 tries to access
from host1 he will be denied (acl4), even though both user1 and host1 have
ACLs for allow with wildcards (acl1, acl2).
If user1 tried to READ from host2, the action will be allowed and it
does
not matter if we match acl3 or acl1 so I don’t think the evaluation
order
matters here.
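
A tiny sketch of that deny-before-allow evaluation (the Acl class and field
names are illustrative, not the KIP's actual interfaces):

import java.util.Arrays;
import java.util.List;

// Illustrative only: mirrors the deny-before-allow rule described above.
class Acl {
    final String principal, host, operation;
    final boolean deny;
    Acl(String principal, String host, String operation, boolean deny) {
        this.principal = principal; this.host = host;
        this.operation = operation; this.deny = deny;
    }
    boolean matches(String p, String h, String op) {
        return ("*".equals(principal) || principal.equals(p))
            && ("*".equals(host) || host.equals(h))
            && operation.equals(op);
    }
}

public class DenyFirstExample {
    static boolean authorize(List<Acl> acls, String p, String h, String op) {
        for (Acl a : acls)                       // 1. any matching DENY wins
            if (a.deny && a.matches(p, h, op)) return false;
        for (Acl a : acls)                       // 2. otherwise any matching ALLOW
            if (!a.deny && a.matches(p, h, op)) return true;
        return false;                            // 3. default deny
    }

    public static void main(String[] args) {
        List<Acl> acls = Arrays.asList(
            new Acl("user1", "*", "READ", false),    // acl1
            new Acl("*", "host1", "READ", false),    // acl2
            new Acl("*", "host2", "READ", false),    // acl3
            new Acl("user1", "host1", "READ", true)  // acl4
        );
        System.out.println(authorize(acls, "user1", "host1", "READ")); // false: acl4
        System.out.println(authorize(acls, "user1", "host2", "READ")); // true: acl1/acl3
    }
}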

“Will people actually use hosts with users?” I really don’t know but
given
ACLs are part of our Public APIs I thought it is better to try and
cover
more use cases. If others think this extra complexity is not worth the
value it's adding please raise your concerns so we can discuss if it
should
be removed from the acl structure. Note that even in absence of hosts
from
ACL users will still be able to whitelist/blacklist host as long as we
start supporting principalType = “host”, easy to add and can be an
incremental improvement. They will however lose the ability to restrict
access to users just from a set of hosts.

We agreed to offer a CLI to overcome the JSON acl config
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization
+I
n
terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
Jsons but that probably has something to do with me being a developer
:-).

Thanks
Parth

On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:

>Parth,
>
>This is a long thread, so trying to keep up here, sorry if this has
>been
>covered before. First, great job on the KIP proposal and work so far.
>
>Are we sure that we want to tie host level access to a given user? My
>understanding is that the ACL will be (omitting some fields)
>
>user_a, host1, host2, host3
>user_b, host1, host2, host3
>
>So there would potentially be a lot of redundancy in the configs. Does
>it
>make sense to have hosts be at the same level as principal in the
>hierarchy? This way you could just blanket the allowed / denied hosts
>and
>only have to worry about the users. So if you follow this, then
>
>we can wildcard the user so we can have a separate list of just
>host-based
>access. What's the order that the perms would be evaluated if a there
>was
>more than one match on a principal ?
>
>Is the thought that there wouldn't usually be much overlap on hosts? I
>guess I can imagine a scenario where I want to offline/online access
>to a
>particular hosts or set of hosts and if there was overlap, I'm doing a
>bunch of alter commands for just a single host. Maybe this is too
>contrived
>an example?
>
>I agree that having this level of granularity gives flexibility but I
>wonder if people will actually use it and not just * the hosts for a
>given
>user and create separate "global" list as i mentioned above?
>
>The only o

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Thanks for clarifying Gwen, KIP updated.

I tried to make the distinction by creating a section for all public APIs
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+In
terface#KIP-11-AuthorizationInterface-PublicInterfacesandclasses

Let me know if you think there is a better way to reflect this.

Thanks
Parth

On 4/24/15, 9:37 AM, "Gwen Shapira"  wrote:

>+1 (non-binding)
>
>Two nitpicks for the wiki:
>* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
>sure new consumers need it to be part of a consumer group.
>* Can you clearly separate which parts are the API (common to every
>Authorizer) and which parts are DefaultAuthorizer implementation? It
>will make reviews and Authorizer implementations a bit easier to know
>exactly which is which.
>
>Gwen
>
>On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
> wrote:
>> Hi,
>>
>> I would like to open KIP-11 for voting.
>>
>> Thanks
>> Parth
>>
>> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
>> wrote:
>>
>>>Hi Jeff,
>>>
>>>Thanks a lot for the review. I think you have a valid point about acls
>>>being duplicated and the simplest solution would be to modify acls class
>>>so they hold a set of principals instead of a single principal, i.e.
>>>
>>><principals> has <operations> Permissions on <resource> from
>>><hosts>.
>>>
>>>I think the evaluation order only matters for the permissionType, which
>>>is:
>>>Deny acls should be evaluated before allow acls. To give you an example
>>>suppose we have following acls
>>>
>>>acl1 -> user1 is allowed to READ from all hosts.
>>>acl2 -> host1 is allowed to READ regardless of who is the user.
>>>acl3 -> host2 is allowed to READ regardless of who is the user.
>>>
>>>acl4 -> user1 is denied to READ from host1.
>>>
>>>As stated in the KIP we first evaluate DENY so if user1 tries to access
>>>from host1 he will be denied (acl4), even though both user1 and host1 have
>>>ACLs for allow with wildcards (acl1, acl2).
>>>If user1 tried to READ from host2, the action will be allowed and it
>>>does
>>>not matter if we match acl3 or acl1 so I don’t think the evaluation
>>>order
>>>matters here.
>>>
>>>“Will people actually use hosts with users?” I really don’t know but
>>>given
>>>ACLs are part of our Public APIs I thought it is better to try and
>>>cover
>>>more use cases. If others think this extra complexity is not worth the
>>>value it's adding please raise your concerns so we can discuss if it
>>>should
>>>be removed from the acl structure. Note that even in absence of hosts
>>>from
>>>ACL users will still be able to whitelist/blacklist host as long as we
>>>start supporting principalType = “host”, easy to add and can be an
>>>incremental improvement. They will however lose the ability to restrict
>>>access to users just from a set of hosts.
>>>
>>>We agreed to offer a CLI to overcome the JSON acl config
>>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization
>>>+I
>>>n
>>>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
>>>Jsons but that probably has something to do with me being a developer
>>>:-).
>>>
>>>Thanks
>>>Parth
>>>
>>>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
>>>
Parth,

This is a long thread, so trying to keep up here, sorry if this has
been
covered before. First, great job on the KIP proposal and work so far.

Are we sure that we want to tie host level access to a given user? My
understanding is that the ACL will be (omitting some fields)

user_a, host1, host2, host3
user_b, host1, host2, host3

So there would potentially be a lot of redundancy in the configs. Does
it
make sense to have hosts be at the same level as principal in the
hierarchy? This way you could just blanket the allowed / denied hosts
and
only have to worry about the users. So if you follow this, then

we can wildcard the user so we can have a separate list of just
host-based
access. What's the order that the perms would be evaluated if a there
was
more than one match on a principal ?

Is the thought that there wouldn't usually be much overlap on hosts? I
guess I can imagine a scenario where I want to offline/online access
to a
particular hosts or set of hosts and if there was overlap, I'm doing a
bunch of alter commands for just a single host. Maybe this is too
contrived
an example?

I agree that having this level of granularity gives flexibility but I
wonder if people will actually use it and not just * the hosts for a
given
user and create separate "global" list as i mentioned above?

The only other system I know of that ties users with hosts for access
is
MySql and I don't love that model. Companies usually standardize on
group
authorization anyway, are we complicating that issue with the inclusion
of
hosts attached to users? Additionally I worry about the debt of big
JSON
configs in the first place, most non-developers find them

[jira] [Commented] (KAFKA-2105) NullPointerException in client on MetadataRequest

2015-04-24 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511317#comment-14511317
 ] 

Gwen Shapira commented on KAFKA-2105:
-

I'm having trouble applying this patch format.

Can you generate one using "git diff"? Better yet, try our patch review tool: 
https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review

> NullPointerException in client on MetadataRequest
> -
>
> Key: KAFKA-2105
> URL: https://issues.apache.org/jira/browse/KAFKA-2105
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1
>Reporter: Roger Hoover
>Priority: Minor
> Attachments: guard-from-null.patch
>
>
> With the new producer, if you accidentally pass null to 
> KafkaProducer.partitionsFor(null), it will cause the IO thread to throw NPE.
> Uncaught error in kafka producer I/O thread: 
> java.lang.NullPointerException
>   at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174)
>   at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176)
>   at 
> org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55)
>   at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81)
>   at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218)
>   at 
> org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35)
>   at 
> org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29)
>   at 
> org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369)
>   at 
> org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Sriharsha Chintalapani
I have yet to update the KIP with my latest proposal, so give me a few days to 
update it. 
I am looking at supporting KERBEROS for the first release and going to use JAAS 
Login Modules to provide authentication.
> And will we provide a default SASL PLAIN mechanism on the server side?
Yes. I’ll update the KIP and send out an email for further discussion, as that 
will make things easier.

Thanks,
Harsha


On April 24, 2015 at 9:30:04 AM, Gari Singh (gari.r.si...@gmail.com) wrote:

Great.  Sounds good.  I'll re-read the KIP ASAP.

Is there another KIP around authentication providers or is that being tracked 
here as well?  For example, the SASL PLAIN mechanism carries a username and 
password but currently I don't know where that would be authenticated.  I 
notice that AuthUtils has the ability to read a JAAS config, but the KIP only has 
entries relevant to Kerberos.  Is the idea to use JAAS LoginModules to provide 
pluggable authentication  - so we could use some of the JDK provided 
LoginModules or create our own (e.g. use a local password file, LDAP, etc)?  
And will we provide a default SASL PLAIN mechanism on the server side or would 
we implement custom SASL provider modules?

Also - I am happy to take a look / test any code as you move along.  Also happy 
to help with SASL providers and/or authentication/login modules

Thanks,

Gari

On Fri, Apr 24, 2015 at 12:05 PM, Sriharsha Chintalapani  
wrote:
Hi Gari,
       I apologize for not being clear in the KIP and for not starting the discussion thread earlier. 
My initial proposal (currently in the KIP) is to provide PLAINTEXT, SSL and KERBEROS 
as individual protocol implementations. 
As you mentioned at the end
“In terms of message level integrity and confidentiality (not to be confused 
with transport level security like TLS), SASL also provides for this 
(assuming the mechanism supports it). The SASL library supports this via 
the "props" parameter in the "createSaslClient/Server" methods. So it is 
easily possible to support Kerberos with integrity (MIC) or confidentiality 
(encryption) over TCP and without either over TLS. “

My intention was to use SASL to do auth and also provide encryption over a plain 
text channel. But after speaking to many who implemented SASL this way for HDFS, 
HBASE and other projects, their suggestion was not to use
context.wrap and context.unwrap, which do the encryption for SASL and cause 
performance degradation. 
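
For reference, a minimal sketch of how QOP is negotiated through the standard
javax.security.sasl API -- "auth" keeps the wrap/unwrap calls off the data
path, while "auth-int"/"auth-conf" enable them (mechanism, service and host
names below are illustrative):

import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;

public class SaslQopExample {
    public static void main(String[] args) throws Exception {
        Map<String, String> props = new HashMap<>();
        // Quality of protection: "auth" = authentication only,
        // "auth-int" adds integrity (MIC), "auth-conf" adds encryption.
        props.put(Sasl.QOP, "auth");

        SaslClient client = Sasl.createSaslClient(
            new String[]{"PLAIN"},        // mechanisms to try (illustrative)
            null,                         // authorization id
            "kafka",                      // protocol / service name (illustrative)
            "broker1.example.com",        // server name (illustrative)
            props,
            callbacks -> {                // supplies PLAIN's username/password
                for (Callback cb : callbacks) {
                    if (cb instanceof NameCallback)
                        ((NameCallback) cb).setName("alice");
                    else if (cb instanceof PasswordCallback)
                        ((PasswordCallback) cb).setPassword("secret".toCharArray());
                }
            });
        if (client == null)
            throw new IllegalStateException("no SASL client for requested mechanisms");
        byte[] initial = client.hasInitialResponse()
            ? client.evaluateChallenge(new byte[0]) : new byte[0];
        System.out.println(client.getMechanismName() + " initial response: "
            + initial.length + " bytes");
    }
}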

Currently I am working on SASL authentication as an option over TCP or TLS. 
I’ll update the KIP soon once I’ve got interfaces in place. Sorry about the 
confusion on this as I am testing out multiple options and trying to decide 
on the right one.

Thanks,
Harsha


On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com) wrote:

Sorry for jumping in late, but I have been trying to follow this chain as
well as the updates to the KIP. I don't mean to seem critical and I may be
misunderstanding the proposed implementation, but there seems to be some
confusion around terminology (at least from my perspective) and I am not
sure I actually understand what is going to be implemented and where the
plugin point(s) will be.

The KIP does not really mention SASL interfaces in any detail. The way I
read the KIP it seems as if it is more about providing a Kerberos mechanism
via GSSAPI than it is about providing pluggable SASL support. Perhaps it
is the naming convention ("GSS" is used where I would have thought SASL
would have been used).

Maybe I am missing something?

SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI
are not the same thing. Also, SSL/TLS is independent of both SASL and
GSSAPI although you can use either SASL or GSSAPI over TLS.

I would expect something more along the lines of having a SASLChannel and
SASL providers (along with pluggable Authentication providers which
enumerate which SASL mechanisms they support).

I have only ever attempted to really implement SASL support once, but I
have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP
use SASL.

This is my understanding of how SASL is typically implemented:

1) Client decides whether or not to use TLS or plain TCP (of course this
depends on what the server provides).

My current understanding is that Kafka will support three types of server
sockets:

- current socket for backwards compatibility (i.e. no TLS and no SASL)
- TLS socket
- SASL socket

I would also have thought that SASL mechanism would be supported on the TLS
socket as well but that does not seem to be the case (or at least it is not
clear either way). I know the decision was made to have separate TLS and
SASL sockets, but I think that we need to support SASL over TLS as well.
You can do this over a single socket if you use the "startTLS" metaphor.

2) There is typically some type of application protocol specific handshake

This is usually used to negotiate whether or not to use SASL and/or
negotiate which SASL mechanisms are supported by the server. This is not
strictly r

Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Gwen Shapira
+1 (non-binding)

Two nitpicks for the wiki:
* Heartbeat is probably a READ and not CLUSTER operation. I'm pretty
sure new consumers need it to be part of a consumer group.
* Can you clearly separate which parts are the API (common to every
Authorizer) and which parts are DefaultAuthorizer implementation? It
will make reviews and Authorizer implementations a bit easier to know
exactly which is which.

Gwen

On Fri, Apr 24, 2015 at 9:28 AM, Parth Brahmbhatt
 wrote:
> Hi,
>
> I would like to open KIP-11 for voting.
>
> Thanks
> Parth
>
> On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
> wrote:
>
>>Hi Jeff,
>>
>>Thanks a lot for the review. I think you have a valid point about acls
>>being duplicated and the simplest solution would be to modify acls class
>>so they hold a set of principals instead of a single principal, i.e.
>>
>><principals> has <operations> Permissions on <resource> from
>><hosts>.
>>
>>I think the evaluation order only matters for the permissionType, which is:
>>Deny acls should be evaluated before allow acls. To give you an example
>>suppose we have following acls
>>
>>acl1 -> user1 is allowed to READ from all hosts.
>>acl2 -> host1 is allowed to READ regardless of who is the user.
>>acl3 -> host2 is allowed to READ regardless of who is the user.
>>
>>acl4 -> user1 is denied to READ from host1.
>>
>>As stated in the KIP we first evaluate DENY so if user1 tries to access
>>from host1 he will be denied (acl4), even though both user1 and host1 have
>>ACLs for allow with wildcards (acl1, acl2).
>>If user1 tried to READ from host2, the action will be allowed and it does
>>not matter if we match acl3 or acl1 so I don’t think the evaluation order
>>matters here.
>>
>>“Will people actually use hosts with users?” I really don’t know but given
>>ACLs are part of our Public APIs I thought it is better to try and cover
>>more use cases. If others think this extra complexity is not worth the
>>value it's adding please raise your concerns so we can discuss if it should
>>be removed from the acl structure. Note that even in absence of hosts from
>>ACL users will still be able to whitelist/blacklist host as long as we
>>start supporting principalType = “host”, easy to add and can be an
>>incremental improvement. They will however lose the ability to restrict
>>access to users just from a set of hosts.
>>
>>We agreed to offer a CLI to overcome the JSON acl config
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+I
>>n
>>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
>>Jsons but that probably has something to do with me being a developer :-).
>>
>>Thanks
>>Parth
>>
>>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
>>
>>>Parth,
>>>
>>>This is a long thread, so trying to keep up here, sorry if this has been
>>>covered before. First, great job on the KIP proposal and work so far.
>>>
>>>Are we sure that we want to tie host level access to a given user? My
>>>understanding is that the ACL will be (omitting some fields)
>>>
>>>user_a, host1, host2, host3
>>>user_b, host1, host2, host3
>>>
>>>So there would potentially be a lot of redundancy in the configs. Does it
>>>make sense to have hosts be at the same level as principal in the
>>>hierarchy? This way you could just blanket the allowed / denied hosts and
>>>only have to worry about the users. So if you follow this, then
>>>
>>>we can wildcard the user so we can have a separate list of just
>>>host-based
>>>access. What's the order that the perms would be evaluated if a there was
>>>more than one match on a principal ?
>>>
>>>Is the thought that there wouldn't usually be much overlap on hosts? I
>>>guess I can imagine a scenario where I want to offline/online access to a
>>>particular hosts or set of hosts and if there was overlap, I'm doing a
>>>bunch of alter commands for just a single host. Maybe this is too
>>>contrived
>>>an example?
>>>
>>>I agree that having this level of granularity gives flexibility but I
>>>wonder if people will actually use it and not just * the hosts for a
>>>given
>>>user and create separate "global" list as i mentioned above?
>>>
>>>The only other system I know of that ties users with hosts for access is
>>>MySql and I don't love that model. Companies usually standardize on group
>>>authorization anyway, are we complicating that issue with the inclusion
>>>of
>>>hosts attached to users? Additionally I worry about the debt of big JSON
>>>configs in the first place, most non-developers find them non-intuitive
>>>already, so anything to ease this I think would be beneficial.
>>>
>>>
>>>Thanks
>>>
>>>Jeff
>>>
>>>On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt <
>>>pbrahmbh...@hortonworks.com> wrote:
>>>
 Sorry I missed your last questions. I am +0 on adding ―host option for
 ―list, we could add it for symmetry. Again if this is only a CLI change
it
 can be added later if you mean adding this in authorizer interface then
we
 should make a decision now.

 Given a choice I would like to actually keep only on

Re: Should 0.8.3 consumers correctly function with 0.8.2 brokers?

2015-04-24 Thread Sean Lydon
Thanks for the responses. Ewen is correct that I am referring to the
*new* consumer (org.apache.kafka.clients.consumer.KafkaConsumer).

I am extending the consumer to allow my applications more control over
committed offsets.  I really want to get away from zookeeper (so using
the offset storage), and re-balancing is something I haven't really
needed to tackle in an automated/seamless way.  Either way, I'll hold
off going further down this road until there is more interest.

@Gwen
I set up a single consumer without partition.assignment.strategy or
rebalance.callback.class.  I was unable to subscribe to just a topic
("Unknown api code 11" on broker), but I could subscribe to a
topicpartition.  This makes sense as I would need to handle re-balance
outside the consumer.  Things functioned as expected (well, I have an
additional minor fix to code from KAFKA-2121), and the only exceptions
on broker were due to closing consumers (which I have become
accustomed to).  My tests are specific to my extended version of the
consumer, but they basically do a little writing and reading with
different serde classes with application controlled commits (similar
to onSuccess and onFailure after each record, but with tolerance for
out of order acknowledgements).
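
A minimal sketch of the out-of-order-acknowledgement idea (illustrative
logic, not the attached patch): records can be acked in any order, but only
the highest contiguous offset is committed.

import java.util.TreeSet;

public class AckTracker {
    private final TreeSet<Long> acked = new TreeSet<>();
    private long committed = -1; // highest contiguous offset seen so far

    synchronized void ack(long offset) { acked.add(offset); }

    // Advances past any contiguous run of acks and returns the offset
    // that is now safe to commit.
    synchronized long committableOffset() {
        while (acked.contains(committed + 1)) {
            committed++;
            acked.remove(committed);
        }
        return committed;
    }

    public static void main(String[] args) {
        AckTracker t = new AckTracker();
        t.ack(0); t.ack(2);                        // 1 not acked yet
        System.out.println(t.committableOffset()); // 0
        t.ack(1);
        System.out.println(t.committableOffset()); // 2
    }
}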

If you are interested, here is the patch of the hack against trunk.

On Thu, Apr 23, 2015 at 10:27 PM, Ewen Cheslack-Postava
 wrote:
> @Neha I think you're mixing up the 0.8.1/0.8.2 updates and the 0.8.2/0.8.3
> that's being discussed here?
>
> I think the original question was about using the *new* consumer ("clients
> consumer") with 0.8.2. Gwen's right, it will use features not even
> implemented in the broker in trunk yet, let alone the 0.8.2.
>
> I don't think the "enable.commit.downgrade" type option, or supporting the
> old protocol with the new consumer at all, makes much sense. You'd end up
> with some weird hybrid of simple and high-level consumers -- you could use
> offset storage, but you'd have to manage rebalancing yourself since none of
> the coordinator support would be there.
>
>
> On Thu, Apr 23, 2015 at 9:22 PM, Neha Narkhede  wrote:
>
>> My understanding is that ideally the 0.8.3 consumer should work with an
>> 0.8.2 broker if the offset commit config was set to "zookeeper".
>>
>> The only thing that might not work is offset commit to Kafka, which makes
>> sense since the 0.8.2 broker does not support Kafka based offset
>> management.
>>
>> If we broke all kinds of offset commits, then it seems like a regression,
>> no?
>>
>> On Thu, Apr 23, 2015 at 7:26 PM, Gwen Shapira 
>> wrote:
>>
>> > I didn't think 0.8.3 consumer will ever be able to talk to 0.8.2
>> > broker... there are some essential pieces that are missing in 0.8.2
>> > (Coordinator, Heartbeat, etc).
>> > Maybe I'm missing something. It will be nice if this will work :)
>> >
>> > Mind sharing what / how you tested? Were there no errors in broker
>> > logs after your fix?
>> >
>> > On Thu, Apr 23, 2015 at 5:37 PM, Sean Lydon 
>> wrote:
>> > > Currently the clients consumer (trunk) sends offset commit requests of
>> > > version 2.  The 0.8.2 brokers fail to handle this particular request
>> > > with a:
>> > >
>> > > java.lang.AssertionError: assertion failed: Version 2 is invalid for
>> > > OffsetCommitRequest. Valid versions are 0 or 1.
>> > >
>> > > I was able to make this work via a forceful downgrade of this
>> > > particular request, but I would like some feedback on whether a
>> > > "enable.commit.downgrade" configuration would be a tolerable method to
>> > > allow 0.8.3 consumers to interact with 0.8.2 brokers.  I'm also
>> > > interested in this even being a goal worth pursuing.
>> > >
>> > > Thanks,
>> > > Sean
>> >
>>
>>
>>
>> --
>> Thanks,
>> Neha
>>
>
>
>
> --
> Thanks,
> Ewen
From 31a14a1749cb164bdde0f59951e4d3aae8ce80a1 Mon Sep 17 00:00:00 2001
From: Sean Lydon 
Date: Fri, 24 Apr 2015 09:29:41 -0700
Subject: [PATCH] Hardcoded changes to downgrade offset_commit to version 1.

---
 .../java/org/apache/kafka/clients/KafkaClient.java | 10 -
 .../org/apache/kafka/clients/NetworkClient.java| 12 ++
 .../kafka/clients/consumer/KafkaConsumer.java  |  2 +-
 .../clients/consumer/internals/Coordinator.java| 46 --
 .../java/org/apache/kafka/clients/MockClient.java  |  5 +++
 5 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java b/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
index 1311f85..e608ca8 100644
--- a/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
+++ b/clients/src/main/java/org/apache/kafka/clients/KafkaClient.java
@@ -126,9 +126,17 @@ public interface KafkaClient extends Closeable {
  */
 public RequestHeader nextRequestHeader(ApiKeys key);
 
+/*
+ * Generate a request header for the next request
+ *
+ * @param key The API key of the request
+ * @param version The API key's version of the request
+ *
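
The gist of the hack is an overload of nextRequestHeader that pins an
explicit API version, so the Coordinator can emit OffsetCommitRequest v1 for
0.8.2 brokers instead of the v2 that trunk sends by default. A sketch of that
overload, illustrative rather than the full patch (the clientId and
correlation field names follow NetworkClient's existing code):

public RequestHeader nextRequestHeader(ApiKeys key, short version) {
    return new RequestHeader(key.id, version, clientId, correlation++);
}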

Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Gari Singh
Great.  Sounds good.  I'll re-read the KIP ASAP.

Is there another KIP around authentication providers or is that being
tracked here as well?  For example, the SASL PLAIN mechanism carries a
username and password but currently I don't know where that would be
authenticated.  I notice that AuthUtils has the ability to read a JAAS config,
but the KIP only has entries relevant to Kerberos.  Is the idea to use JAAS
LoginModules to provide pluggable authentication  - so we could use some of
the JDK provided LoginModules or create our own (e.g. use a local password
file, LDAP, etc)?  And will we provide a default SASL PLAIN mechanism on
the server side or would we implement custom SASL provider modules?

Also - I am happy to take a look / test any code as you move along.  Also
happy to help with SASL providers and/or authentication/login modules

Thanks,

Gari

On Fri, Apr 24, 2015 at 12:05 PM, Sriharsha Chintalapani <
harsh...@fastmail.fm> wrote:

> Hi Gari,
>I apologize for not being clear in the KIP and for not starting the
> discussion thread earlier.
> My initial proposal (currently in the KIP) is to provide PLAINTEXT, SSL and
> KERBEROS as individual protocol implementations.
> As you mentioned at the end
> “In terms of message level integrity and confidentiality (not to be
> confused
> with transport level security like TLS), SASL also provides for this
> (assuming the mechanism supports it). The SASL library supports this via
> the "props" parameter in the "createSaslClient/Server" methods. So it is
> easily possible to support Kerberos with integrity (MIC) or
> confidentiality
> (encryption) over TCP and without either over TLS. “
>
> My intention was to use SASL to do auth and also provide encryption over a
> plain text channel. But after speaking to many who implemented SASL this
> way for HDFS, HBASE and other projects, their suggestion was not to
> use
> context.wrap and context.unwrap, which do the encryption for SASL and
> cause performance degradation.
>
> Currently I am working on SASL authentication as an option over TCP or
> TLS. I’ll update the KIP soon once I’ve got interfaces in place. Sorry
> about the confusion on this as I am testing out multiple options and trying
> to decide on the right one.
>
> Thanks,
> Harsha
>
>
> On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com)
> wrote:
>
> Sorry for jumping in late, but I have been trying to follow this chain as
> well as the updates to the KIP. I don't mean to seem critical and I may be
> misunderstanding the proposed implementation, but there seems to be some
> confusion around terminology (at least from my perspective) and I am not
> sure I actually understand what is going to be implemented and where the
> plugin point(s) will be.
>
> The KIP does not really mention SASL interfaces in any detail. The way I
> read the KIP it seems as if it is more about providing a Kerberos
> mechanism
> via GSSAPI than it is about providing pluggable SASL support. Perhaps it
> is the naming convention ("GSS" is used where I would have thought SASL
> would have been used).
>
> Maybe I am missing something?
>
> SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI
> are not the same thing. Also, SSL/TLS is independent of both SASL and
> GSSAPI although you can use either SASL or GSSAPI over TLS.
>
> I would expect something more along the lines of having a SASLChannel and
> SASL providers (along with pluggable Authentication providers which
> enumerate which SASL mechanisms they support).
>
> I have only ever attempted to really implement SASL support once, but I
> have played with the SASL APIs and am familiar with how LDAP, SMTP and
> AMQP
> use SASL.
>
> This is my understanding of how SASL is typically implemented:
>
> 1) Client decides whether or not to use TLS or plain TCP (of course this
> depends on what the server provides).
>
> My current understanding is that Kafka will support three types of server
> sockets:
>
> - current socket for backwards compatibility (i.e. no TLS and no SASL)
> - TLS socket
> - SASL socket
>
> I would also have thought that SASL mechanism would be supported on the
> TLS
> socket as well but that does not seem to be the case (or at least it is
> not
> clear either way). I know the decision was made to have separate TLS and
> SASL sockets, but I think that we need to support SASL over TLS as well.
> You can do this over a single socket if you use the "startTLS" metaphor.
>
> 2) There is typically some type of application protocol specific handshake
>
> This is usually used to negotiate whether or not to use SASL and/or
> negotiate which SASL mechanisms are supported by the server. This is not
> strictly required, although the SASL spec does mention that the client
> should be able to get a list of SASL mechanisms supported by the server.
>
> For example, SMTP does this with the client sending a EHLO and the server
> sending an AUTH.
>
> Personally I like the AMQP model (which by the way might also help with
> back

[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Reece Markowsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511308#comment-14511308
 ] 

Reece Markowsky commented on KAFKA-2148:


Now I see it. It's 
kafka/clients/src/main/java/org/apache/kafka/clients/producer/ vs 
kafka/core/src/main/scala/kafka/producer/. 

Thanks!


> version 0.8.2 breaks semantic versioning
> 
>
> Key: KAFKA-2148
> URL: https://issues.apache.org/jira/browse/KAFKA-2148
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Reece Markowsky
>Assignee: Jun Rao
>  Labels: api, producer
>
> version 0.8.2 of the Producer API drops support for sending a list of 
> KeyedMessage (present in 0.8.1)
> the call present in Producer version 0.8.1
> http://kafka.apache.org/081/api.html
>   public void send(List<KeyedMessage<K,V>> messages);
> is not present (breaking semantic versioning) in 0.8.2
> Producer version 0.8.2
> http://kafka.apache.org/082/javadoc/index.html
> send(ProducerRecord<K,V> record, Callback callback) or
> send(ProducerRecord<K,V> record) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-11- Authorization design for kafka security

2015-04-24 Thread Parth Brahmbhatt
Hi,

I would like to open KIP-11 for voting.

Thanks
Parth

On 4/22/15, 1:56 PM, "Parth Brahmbhatt" 
wrote:

>Hi Jeff,
>
>Thanks a lot for the review. I think you have a valid point about acls
>being duplicated and the simplest solution would be to modify acls class
>so they hold a set of principals instead of single principal. i.e
>
> <set of principals> has <Permissions> on <Resource> from <Hosts>.
>
>I think the evaluation order only matters for the permissionType which is
>Deny acls should be evaluated before allow acls. To give you an example
>suppose we have following acls
>
>acl1 -> user1 is allowed to READ from all hosts.
>acl2 -> host1 is allowed to READ regardless of who is the user.
>acl3 -> host2 is allowed to READ regardless of who is the user.
>
>acl4 -> user1 is denied to READ from host1.
>
>As stated in the KIP we first evaluate DENY, so if user1 tries to access
>from host1 he will be denied (acl4), even though both user1 and host1 have
>allow ACLs with wildcards (acl1, acl2).
>If user1 tries to READ from host2, the action will be allowed and it does
>not matter whether we match acl3 or acl1, so I don’t think the evaluation
>order matters here.
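
A minimal sketch of that deny-before-allow order; Acl and its fields here are
illustrative stand-ins, not the KIP-11 classes:

import java.util.List;

final class Acl {
    final boolean deny;   // true = deny ACL, false = allow ACL
    final String user;    // "*" matches any user
    final String host;    // "*" matches any host

    Acl(boolean deny, String user, String host) {
        this.deny = deny; this.user = user; this.host = host;
    }

    boolean matches(String u, String h) {
        return (user.equals("*") || user.equals(u))
            && (host.equals("*") || host.equals(h));
    }
}

final class AclEvaluator {
    // Deny ACLs are checked first: a match short-circuits to "denied" even
    // when wildcard allow ACLs (acl1/acl2 in the example above) also match.
    static boolean authorize(String user, String host, List<Acl> acls) {
        for (Acl acl : acls)
            if (acl.deny && acl.matches(user, host)) return false;
        for (Acl acl : acls)
            if (!acl.deny && acl.matches(user, host)) return true;
        return false; // nothing matched: default deny
    }
}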
>
>“Will people actually use hosts with users?” I really don’t know, but given
>that ACLs are part of our public APIs I thought it better to try and cover
>more use cases. If others think this extra complexity is not worth the
>value it is adding, please raise your concerns so we can discuss whether it
>should be removed from the ACL structure. Note that even in the absence of
>hosts from ACLs, users will still be able to whitelist/blacklist hosts as
>long as we start supporting principalType = “host”, which is easy to add
>and can be an incremental improvement. They will, however, lose the ability
>to restrict access to users just from a set of hosts.
>
>We agreed to offer a CLI to overcome the JSON acl config
>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+I
>n
>terface#KIP-11-AuthorizationInterface-AclManagement(CLI). I still like
>Jsons but that probably has something to do with me being a developer :-).
>
>Thanks
>Parth
>
>On 4/22/15, 11:38 AM, "Jeff Holoman"  wrote:
>
>>Parth,
>>
>>This is a long thread, so trying to keep up here, sorry if this has been
>>covered before. First, great job on the KIP proposal and work so far.
>>
>>Are we sure that we want to tie host level access to a given user? My
>>understanding is that the ACL will be (omitting some fields)
>>
>>user_a, host1, host2, host3
>>user_b, host1, host2, host3
>>
>>So there would potentially be a lot of redundancy in the configs. Does it
>>make sense to have hosts be at the same level as principal in the
>>hierarchy? This way you could just blanket the allowed / denied hosts and
>>only have to worry about the users. So if you follow this, then
>>
>>we can wildcard the user so we can have a separate list of just
>>host-based
>>access. What's the order in which the perms would be evaluated if there was
>>more than one match on a principal?
>>
>>Is the thought that there wouldn't usually be much overlap on hosts? I
>>guess I can imagine a scenario where I want to offline/online access to a
>>particular hosts or set of hosts and if there was overlap, I'm doing a
>>bunch of alter commands for just a single host. Maybe this is too
>>contrived
>>an example?
>>
>>I agree that having this level of granularity gives flexibility but I
>>wonder if people will actually use it and not just * the hosts for a
>>given
>>user and create separate "global" list as i mentioned above?
>>
>>The only other system I know of that ties users with hosts for access is
>>MySql and I don't love that model. Companies usually standardize on group
>>authorization anyway, are we complicating that issue with the inclusion
>>of
>>hosts attached to users? Additionally I worry about the debt of big JSON
>>configs in the first place, most non-developers find them non-intuitive
>>already, so anything to ease this I think would be beneficial.
>>
>>
>>Thanks
>>
>>Jeff
>>
>>On Wed, Apr 22, 2015 at 2:22 PM, Parth Brahmbhatt <
>>pbrahmbh...@hortonworks.com> wrote:
>>
>>> Sorry I missed your last questions. I am +0 on adding --host option for
>>> --list, we could add it for symmetry. Again if this is only a CLI change
>>>it
>>> can be added later if you mean adding this in authorizer interface then
>>>we
>>> should make a decision now.
>>>
>>> Given a choice I would like to actually keep only one option which is
>>> resource based get (remove even the get based on principal). I see
>>>those
>>> (getAcl for principal or host) as special filtering case which can
>>>easily
>>> be achieved by a third party tool by doing "list all topics" and
>>>calling
>>> getAcls for each topic and applying filtering logic on that.  I really
>>> don’t see the need to make those first class citizens of the authorizer
>>> interface given these kind of queries will be issued outside of broker
>>>JVM
>>> so they will not benefit from the caching and because the storage will
>>>be
>>> indexed on r

[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511304#comment-14511304
 ] 

Jay Kreps commented on KAFKA-2148:
--

Basically kafka.javaapi.producer.Producer still exists and works exactly the 
same as before. We added a new api, 
org.apache.kafka.clients.producer.KafkaProducer which is meant to be an 
eventual replacement and has a lot of advantages. But for the next few releases 
the old client remains and works exactly as before.

> version 0.8.2 breaks semantic versioning
> 
>
> Key: KAFKA-2148
> URL: https://issues.apache.org/jira/browse/KAFKA-2148
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Reece Markowsky
>Assignee: Jun Rao
>  Labels: api, producer
>
> version 0.8.2 of the Producer API drops support for sending a list of 
> KeyedMessage (present in 0.8.1)
> the call present in Producer version 0.8.1
> http://kafka.apache.org/081/api.html
>   public void send(List<KeyedMessage<K,V>> messages);
> is not present (breaking semantic versioning) in 0.8.2
> Producer version 0.8.2
> http://kafka.apache.org/082/javadoc/index.html
> send(ProducerRecord<K,V> record, Callback callback) or
> send(ProducerRecord<K,V> record) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Reece Markowsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511277#comment-14511277
 ] 

Reece Markowsky edited comment on KAFKA-2148 at 4/24/15 4:13 PM:
-

Thanks for the quick reply. I'm confused, but I'll see if I can work out what you mean. 
We just upgraded to 0.8.2 and my batching code is broken now.
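
A minimal migration sketch for a batching call site, assuming String values and a
hypothetical topic name; the new producer batches internally, so the explicit list
goes away:

{noformat}
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BatchMigration {
    // Replaces a single old-style send(List<KeyedMessage<K,V>>): each record
    // is handed to the new producer, which groups them into per-partition
    // batches internally (tunable via batch.size / linger.ms).
    public static void sendBatch(List<String> values) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("linger.ms", "5"); // small delay so sends coalesce into batches
        KafkaProducer<String, String> producer =
            new KafkaProducer<String, String>(props);
        for (String value : values) {
            producer.send(new ProducerRecord<String, String>("my-topic", value),
                new Callback() {
                    public void onCompletion(RecordMetadata md, Exception e) {
                        if (e != null)
                            e.printStackTrace(); // per-record error info, even async
                    }
                });
        }
        producer.close(); // blocks until buffered records are sent
    }
}
{noformat}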




was (Author: reecemarkowsky):
Thanks Jay!

> version 0.8.2 breaks semantic versioning
> 
>
> Key: KAFKA-2148
> URL: https://issues.apache.org/jira/browse/KAFKA-2148
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Reece Markowsky
>Assignee: Jun Rao
>  Labels: api, producer
>
> version 0.8.2 of the Producer API drops support for sending a list of 
> KeyedMessage (present in 0.8.1)
> the call present in Producer version 0.8.1
> http://kafka.apache.org/081/api.html
>   public void send(List<KeyedMessage<K,V>> messages);
> is not present (breaking semantic versioning) in 0.8.2
> Producer version 0.8.2
> http://kafka.apache.org/082/javadoc/index.html
> send(ProducerRecord<K,V> record, Callback callback) or
> send(ProducerRecord<K,V> record) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Reece Markowsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511277#comment-14511277
 ] 

Reece Markowsky commented on KAFKA-2148:


Thanks Jay!

> version 0.8.2 breaks semantic versioning
> 
>
> Key: KAFKA-2148
> URL: https://issues.apache.org/jira/browse/KAFKA-2148
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Reece Markowsky
>Assignee: Jun Rao
>  Labels: api, producer
>
> version 0.8.2 of the Producer API drops support for sending a list of 
> KeyedMessage (present in 0.8.1)
> the call present in Producer version 0.8.1
> http://kafka.apache.org/081/api.html
>   public void send(List<KeyedMessage<K,V>> messages);
> is not present (breaking semantic versioning) in 0.8.2
> Producer version 0.8.2
> http://kafka.apache.org/082/javadoc/index.html
> send(ProducerRecord<K,V> record, Callback callback) or
> send(ProducerRecord<K,V> record) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Sriharsha Chintalapani
Hi Gari,
       I apologize for not being clear in the KIP and for not starting a discussion 
thread earlier. My initial proposal (currently in the KIP) is to provide PLAINTEXT, 
SSL and KERBEROS as individual protocol implementations. 
As you mentioned at the end
“In terms of message level integrity and confidentiality (not to be confused 
with transport level security like TLS), SASL also provides for this 
(assuming the mechanism supports it). The SASL library supports this via 
the "props" parameter in the "createSaslClient/Server" methods. So it is 
easily possible to support Kerberos with integrity (MIC) or confidentiality 
(encryption) over TCP and without either over TLS. “

My intention was to use SASL to do auth and also provide encryption over a 
plaintext channel. But after speaking to many who implemented SASL this way for 
HDFS, HBase, and other projects, their suggestion was not to use context.wrap 
and context.unwrap, which do the encryption for SASL, because they cause 
performance degradation. 

Currently I am working on SASL authentication as an option over TCP or TLS. 
I’ll update the KIP soon once I’ve got interfaces in place. Sorry about the 
confusion on this as I am testing out multiple options and trying to decide 
on the right one.

Thanks,
Harsha


On April 24, 2015 at 8:37:09 AM, Gari Singh (gari.r.si...@gmail.com) wrote:

Sorry for jumping in late, but I have been trying to follow this chain as  
well as the updates to the KIP. I don't mean to seem critical and I may be  
misunderstanding the proposed implementation, but there seems to be some  
confusion around terminology (at least from my perspective) and I am not  
sure I actually understand what is going to be implemented and where the  
plugin point(s) will be.  

The KIP does not really mention SASL interfaces in any detail. The way I  
read the KIP it seems as if it is more about providing a Kerberos mechanism  
via GSSAPI than it is about providing pluggable SASL support. Perhaps it  
is the naming convention ("GSS" is used where I would have thought SASL  
would have been used).  

Maybe I am missing something?  

SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI  
are not the same thing. Also, SSL/TLS is independent of both SASL and  
GSSAPI although you can use either SASL or GSSAPI over TLS.  

I would expect something more along the lines of having a SASLChannel and  
SASL providers (along with pluggable Authentication providers which  
enumerate which SASL mechanisms they support).  

I have only ever attempted to really implement SASL support once, but I  
have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP  
use SASL.  

This is my understanding of how SASL is typically implemented:  

1) Client decides whether or not to use TLS or plain TCP (of course this  
depends on what the server provides).  

My current understanding is that Kafka will support three types of server  
sockets:  

- current socket for backwards compatibility (i.e. no TLS and no SASL)  
- TLS socket  
- SASL socket  

I would also have thought that SASL mechanism would be supported on the TLS  
socket as well but that does not seem to be the case (or at least it is not  
clear either way). I know the decision was made to have separate TLS and  
SASL sockets, but I think that we need to support SASL over TLS as well.  
You can do this over a single socket if you use the "startTLS" metaphor.  

2) There is typically some type of application protocol specific handshake  

This is usually used to negotiate whether or not to use SASL and/or  
negotiate which SASL mechanisms are supported by the server. This is not  
strictly required, although the SASL spec does mention that the client  
should be able to get a list of SASL mechanisms supported by the server.  

For example, SMTP does this with the client sending an EHLO and the server  
sending an AUTH.  

Personally I like the AMQP model (which by the way might also help with  
backwards compatibility using a single socket). For AMQP, the initial  
frame is basically  

AMQP.%d0.1.0.0 (AMQP, TCP, AMQP protocol 1.0.0)  
AMQP.%d3.1.0.0 (AMQP, SASL)  

I think you get the idea. So we could do something similar for Kafka  

KAFKA.[protocol type].[protocol version major].[protocol version  
minor].[protocol version revision]  

So for example, we could have protocol types of  

0 - open  
1- SASL  

and do this over either a TCP or TLS socket.  

Of course, if you stick with having a dedicated SASL socket, you could just  
start out with the option of the client sending something like "AUTH" as  
its first message (with the option of appending the initial SASL payload as  
well)  

3) After the protocol handshake, there is an application specific wrapper  
for carrying SASL frames for the challenges and responses.  

If the mechanism selected is Kerberos, it is at this point that  
SASL uses the GSSAPI for the exchange (of course wrapped in the app  
specific SASL frames). If you are

[jira] [Resolved] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Jay Kreps (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps resolved KAFKA-2148.
--
Resolution: Not A Problem

Hey Reece, this is not the same client but rather a new client. It is not 
intended to be api compatible. The scala client with the api you describe still 
exists and will continue to exist for some time.

This api actually gives all the benefits the scala api had (plus some 
additional ones, like giving you back the offset and error info even for async 
writes).

> version 0.8.2 breaks semantic versioning
> 
>
> Key: KAFKA-2148
> URL: https://issues.apache.org/jira/browse/KAFKA-2148
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Reece Markowsky
>Assignee: Jun Rao
>  Labels: api, producer
>
> version 0.8.2 of the Producer API drops support for sending a list of 
> KeyedMessage (present in 0.8.1)
> the call present in Producer version 0.8.1
> http://kafka.apache.org/081/api.html
>   public void send(List<KeyedMessage<K,V>> messages);
> is not present (breaking semantic versioning) in 0.8.2
> Producer version 0.8.2
> http://kafka.apache.org/082/javadoc/index.html
> send(ProducerRecord<K,V> record, Callback callback) or
> send(ProducerRecord<K,V> record) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2148) version 0.8.2 breaks semantic versioning

2015-04-24 Thread Reece Markowsky (JIRA)
Reece Markowsky created KAFKA-2148:
--

 Summary: version 0.8.2 breaks semantic versioning
 Key: KAFKA-2148
 URL: https://issues.apache.org/jira/browse/KAFKA-2148
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8.2.0
Reporter: Reece Markowsky
Assignee: Jun Rao


version 0.8.2 of the Producer API drops support for sending a list of 
KeyedMessage (present in 0.8.1)

the call present in Producer version 0.8.1
http://kafka.apache.org/081/api.html
  public void send(List<KeyedMessage<K,V>> messages);

is not present (breaking semantic versioning) in 0.8.2

Producer version 0.8.2
http://kafka.apache.org/082/javadoc/index.html
send(ProducerRecord<K,V> record, Callback callback) or
send(ProducerRecord<K,V> record) 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Gari Singh
Sorry for jumping in late, but I have been trying to follow this chain as
well as the updates to the KIP.  I don't mean to seem critical and I may be
misunderstanding the proposed implementation, but there seems to be some
confusion around terminology (at least from my perspective) and I am not
sure I actually understand what is going to be implemented and where the
plugin point(s) will be.

The KIP does not really mention SASL interfaces in any detail.  The way I
read the KIP it seems as if it is more about providing a Kerberos mechanism
via GSSAPI than it is about providing pluggable SASL support.  Perhaps it
is the naming convention ("GSS" is used where I would have thought SASL
would have been used).

Maybe I am missing something?

SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI
are not the same thing.  Also, SSL/TLS is independent of both SASL and
GSSAPI although you can use either SASL or GSSAPI over TLS.

I would expect something more along the lines of having a SASLChannel and
SASL providers (along with pluggable Authentication providers which
enumerate which SASL mechanisms they support).

I have only ever attempted to really implement SASL support once, but I
have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP
use SASL.

This is my understanding of how SASL is typically implemented:

1) Client decides whether or not to use TLS or plain TCP  (of course this
depends on what the server provides).

My current understanding is that Kafka will support three types of server
sockets:

- current socket for backwards compatibility (i.e. no TLS and no SASL)
- TLS socket
- SASL socket

I would also have thought that SASL mechanism would be supported on the TLS
socket as well but that does not seem to be the case (or at least it is not
clear either way).  I know the decision was made to have separate TLS and
SASL sockets, but I think that we need to support SASL over TLS as well.
You can do this over a single socket if you use the "startTLS" metaphor.

2) There is typically some type of application protocol specific handshake

This is usually used to negotiate whether or not to use SASL and/or
negotiate which SASL mechanisms are supported by the server.  This is not
strictly required, although the SASL spec does mention that the client
should be able to get a list of SASL mechanisms supported by the server.

For example, SMTP does this with the client sending an EHLO and the server
sending an AUTH.

Personally I like the AMQP model (which by the way might also help with
backwards compatibility using a single socket).  For AMQP, the initial
frame is basically

AMQP.%d0.1.0.0  (AMQP, TCP, AMQP protocol 1.0.0)
AMQP.%d3.1.0.0 (AMQP, SASL)

I think you get the idea.  So we could do something similar for Kafka

KAFKA.[protocol type].[protocol version major].[protocol version
minor].[protocol version revision]

So for example, we could have protocol types of

0 - open
1- SASL

and do this over either a TCP or TLS socket.
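
A sketch of such a prefix on the wire; the magic string, the type codes and the
version bytes below are illustrative, not anything the KIP defines:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ProtocolHeader {
    static final byte OPEN = 0; // illustrative type codes
    static final byte SASL = 1;

    // The first frame a client sends: "KAFKA" + type + major/minor/revision.
    static ByteBuffer encode(byte protocolType) {
        ByteBuffer buf = ByteBuffer.allocate(9);
        buf.put("KAFKA".getBytes(StandardCharsets.US_ASCII));
        buf.put(protocolType).put((byte) 1).put((byte) 0).put((byte) 0);
        buf.flip();
        return buf;
    }

    // Server side: read the prefix and decide whether a SASL exchange must
    // complete before any Kafka requests are accepted.
    static boolean expectsSasl(ByteBuffer header) {
        byte[] magic = new byte[5]; // "KAFKA"
        header.get(magic);
        return header.get() == SASL;
    }
}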

Of course, if you stick with having a dedicated SASL socket, you could just
start out with the option of the client sending something like "AUTH" as
its first message (with the option of appending the initial SASL payload as
well)

3) After the protocol handshake, there is an application specific wrapper
for carrying SASL frames for the challenges and responses.

If the mechanism selected is Kerberos, it is at this point that
SASL uses the GSSAPI for the exchange (of course wrapped in the app
specific SASL frames).  If you are using PLAIN, there is a defined format
to be used (RFC4616).

Java of course provides support for various mechanisms in the default SASL
client and server mechanisms.  For example, the client supports PLAIN, but
we would need to implement a "PlainSaslServer"  (which we could also tie to
a username/password based authentication provider as well).

In terms of message level integrity and confidentiality (not to be confused
with transport level security like TLS), SASL also provides for this
(assuming the mechanism supports it).  The SASL library supports this via
the "props" parameter in the "createSaslClient/Server" methods.  So it is
easily possible to support Kerberos with integrity (MIC) or confidentiality
(encryption) over TCP and without either over TLS.
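
A minimal client-side sketch of that, assuming a hypothetical service name and
an externally supplied CallbackHandler; the Sasl.QOP preference list is what
selects integrity vs. confidentiality:

import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class SaslQopExample {
    static SaslClient newKerberosClient(String serverFqdn, CallbackHandler handler)
            throws SaslException {
        // Prefer confidentiality, fall back to integrity, then auth only.
        // Over TLS one would pass just "auth" and let TLS protect the bytes
        // instead of SASL wrap/unwrap.
        Map<String, String> props = new HashMap<String, String>();
        props.put(Sasl.QOP, "auth-conf,auth-int,auth");
        return Sasl.createSaslClient(
            new String[] {"GSSAPI"}, // Kerberos via the GSSAPI SASL mechanism
            null,                    // authorization id
            "kafka",                 // hypothetical service/protocol name
            serverFqdn,              // Kerberos wants the server's FQDN
            props,
            handler);
    }
}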


Hopefully this makes sense and perhaps this is how things are proceeding,
but it was not clear to me that this is what is actually being implemented.

Sorry for the long note.

-- Gari












On Fri, Apr 24, 2015 at 9:34 AM, Sriharsha Chintalapani 
wrote:

> Rajini,
> I am exploring this part right now. To support PLAINTEXT and SSL
> as protocols and Kerberos auth as authentication on top of plaintext or ssl
> (if users want to do encryption over an auth mechanism). This is mainly
> influenced by SASL or GSS-API performance issue when I enable encryption.
> I’ll update the KIP once I finalize this on my side .
> Thanks,
> Harsha
>
>
> On April 24, 201

[jira] [Created] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth

2015-04-24 Thread Evan Huus (JIRA)
Evan Huus created KAFKA-2147:


 Summary: Unbalanced replication can cause extreme purgatory growth
 Key: KAFKA-2147
 URL: https://issues.apache.org/jira/browse/KAFKA-2147
 Project: Kafka
  Issue Type: Bug
  Components: purgatory, replication
Affects Versions: 0.8.2.1
Reporter: Evan Huus
Assignee: Joel Koshy


Apologies in advance, this is going to be a bit of a complex description, mainly 
because we've seen this issue in several different ways and we're still tying them 
together in terms of root cause and analysis.

It is worth noting now that we have all our producers set up to send 
RequiredAcks==-1, and that this includes all our MirrorMakers.

I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully that 
will incidentally fix this issue, or at least render it moot.

h4. Symptoms

Fetch request purgatory on a broker or brokers grows rapidly and steadily at a 
rate of roughly 1-5K requests per second. Heap memory used also grows to keep 
pace. When 4-5 million requests have accumulated in purgatory, the purgatory is 
drained, causing a substantial latency spike. The node will tend to drop 
leadership, replicate, and recover.

h5. Case 1 - MirrorMaker

We first noticed this case when enabling mirrormaker. We had one primary 
cluster already, with many producers and consumers. We created a second, 
identical cluster and enabled replication from the original to the new cluster 
on some topics using mirrormaker. This caused all six nodes in the new cluster 
to exhibit the symptom in lockstep - their purgatories would all grow together, 
and get drained within about 20 seconds of each other. The cluster-wide latency 
spikes at this time caused several problems for us.

Turning MM on and off turned the problem on and off very precisely. When we 
stopped MM, the purgatories would all drop to normal levels immediately, and 
would start climbing again when we restarted it.

Note that this is the *fetch* purgatories on the brokers that MM was 
*producing* to, which indicates fairly strongly that this is a replication 
issue, not a MM issue.

This particular cluster and MM setup was abandoned for other reasons before we 
could make much progress debugging.

h5. Case 2 - Broker 6

The second time we saw this issue was on the newest broker (broker 6) in the 
original cluster. For a long time we were running with five nodes, and 
eventually added a sixth to handle the increased load. At first, we moved only 
a handful of higher-volume partitions to this broker. Later, we created a group 
of new topics (totalling around 100 partitions) for testing purposes that were 
spread automatically across all six nodes. These topics saw occasional traffic, 
but were generally unused. At this point broker 6 had leadership for about an 
equal number of high-volume and unused partitions, about 15-20 of each.

Around this time (we don't have detailed enough data to prove real correlation 
unfortunately), the issue started appearing on this broker as well, but not on 
any of the other brokers in the cluster.

h4. Debugging

The first thing we tried was to reduce the 
`fetch.purgatory.purge.interval.requests` from the default of 1000 to a much 
lower value of 200. This had no noticeable effect at all.

We then enabled debug logging on broker06 and started looking through that. I 
can attach complete log samples if necessary, but the thing that stood out for 
us was a substantial number of the following lines:

{noformat}
[2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with 
correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory 
(kafka.server.KafkaApis)
{noformat}

The volume of these lines seemed to match (approximately) the fetch purgatory 
growth on that broker.

At this point we developed a hypothesis (detailed below) which guided our 
subsequent debugging tests:
- Setting a daemon up to produce regular random data to all of the topics led 
by kafka06 (specifically the ones which otherwise would receive no data) 
substantially alleviated the problem.
- Doing an additional rebalance of the cluster in order to move a number of 
other topics with regular data to kafka06 appears to have solved the problem 
completely.
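
A minimal sketch of the keepalive daemon from the first debugging step above
(broker address, payload and cadence are assumptions, not taken from the incident):

{noformat}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeepaliveDaemon {
    // Pass the otherwise-idle topics led by the affected broker as args.
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka06:9092"); // hypothetical address
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
        KafkaProducer<byte[], byte[]> producer =
            new KafkaProducer<byte[], byte[]>(props);
        while (true) {
            for (String topic : args)
                producer.send(new ProducerRecord<byte[], byte[]>(topic, new byte[1]));
            Thread.sleep(10000L); // keep follower fetches from going empty
        }
    }
}
{noformat}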

h4. Hypothesis

Current versions (0.8.2.1 and earlier) have issues with the replica fetcher not 
backing off correctly (KAFKA-1461, KAFKA-2082 and others). I believe that in a 
very specific situation, the replica fetcher thread of one broker can spam 
another broker with requests that fill up its purgatory and do not get properly 
flushed. My best guess is that the necessary conditions are:

- broker A leads some partitions which receive regular traffic, and some 
partitions which do not
- broker B replicates some of each type of partition from broker A
- some producers are producing with RequiredAcks=-1 (wait for all ISR)
- broker B hap

Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Rajini Sivaram
Thank you, Harsha. Yes, that makes sense. Shall take a look when the KIP is
finalized.

Rajini

On Fri, Apr 24, 2015 at 2:34 PM, Sriharsha Chintalapani 
wrote:

>  Rajini,
> I am exploring this part right now. To support PLAINTEXT and SSL
> as protocols and Kerberos auth as authentication on top of plaintext or ssl
> (if users want to do encryption over an auth mechanism). This is mainly
> influenced by SASL or GSS-API performance issue when I enable encryption.
> I’ll update the KIP once I finalize this on my side .
>  Thanks,
> Harsha
>
>
> On April 24, 2015 at 1:39:14 AM, Rajini Sivaram (
> rajinisiva...@googlemail.com) wrote:
>
>  Have there been any discussions around separating out authentication and
> encryption protocols for Kafka endpoints to enable different combinations?
> In our deployment environment, we would like to use TLS for encryption,
> but
> we don't necessarily want to use certificate-based authentication of
> clients. With the current design, if we want to use an authentication
> mechanism like SASL/plain, it looks like we need to define a new security
> protocol in Kafka which combines SASL/Plain authentication with TLS
> encryption. In KIP-12, it looks like the protocols defined are PLAINTEXT
> (no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos)
> and SSL(SSL auth/no client auth, SSL encryption). While not all
> combinations of authentication and encryption protocols are likely to be
> useful, the ability to combine different mechanisms without modifying
> Kafka
> to create combined protocols would make it easier to grow the support for
> new protocols. I wanted to check if this has already been discussed in the
> past.
>
>
>
> Thank you,
>
> Rajini
>
>
>
> On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
> > Harsha,
> >
> > Thank you for the quick response. (Sorry had missed sending this reply
> to
> > the dev-list earlier)..
> >
> >
> > 1. I am not sure what the new server-side code is going to look like
> > after refactoring under KAFKA-1928. But I was assuming that there would
> be
> > only one Channel implementation that would be shared by both clients and
> > server. So the ability to run delegated tasks on a different thread
> would
> > be useful in any case. Even with the server, I imagine the Processor
> thread
> > is shared by multiple connections with thread affinity for connections,
> so
> > it might be better not to run potentially long running delegated tasks
> on
> > that thread.
> > 2. You may be right that Kafka doesn't need to support renegotiation.
> > The usecase I was thinking of was slightly different from the one you
> > described. Periodic renegotiation is used sometimes to refresh
> encryption
> > keys especially with ciphers that are weak. Kafka may not have a
> > requirement to support this at the moment.
> > 3. Graceful close needs close handshake messages to be
> > sent/received to shut down the SSL engine and this requires managing
> > selection interest based on SSL engine close state. It will be good if
> the
> > base channel/selector class didn't need to be aware of this.
> > 4. Yes, I agree that the choice is between bringing some
>  > selection-related code into the channel or some channel related code
> into
> > selector. We found the code neater with the former when the three cases
> > above were implemented. But it is possible that you can handle it
> > differently with the latter, so I am happy to wait until your patch is
> > ready.
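
For point 3 above, the close handshake boils down to flushing a close_notify
record before the socket goes away. A blocking-I/O sketch; a real channel would
drive this off selector interest ops:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import javax.net.ssl.SSLEngine;

public class GracefulClose {
    static void close(SSLEngine engine, SocketChannel channel) throws IOException {
        ByteBuffer empty = ByteBuffer.allocate(0);
        ByteBuffer netOut =
            ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
        engine.closeOutbound();
        while (!engine.isOutboundDone()) {
            engine.wrap(empty, netOut); // produces the close_notify record
            netOut.flip();
            while (netOut.hasRemaining())
                channel.write(netOut);
            netOut.clear();
        }
        channel.close();
    }
}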
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani 
>
> > wrote:
> >
> >> 1. *Support for running potentially long-running delegated tasks
> >> outside
> >> the network thread*: It is recommended that delegated tasks indicated
> by
> >> a handshake status of NEED_TASK are run on a separate thread since they
> >> may
> >> block (
> >> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
>
> >> It is easier to encapsulate this in SSLChannel without any changes to
> >> common code if selection keys are managed within the Channel.
> >>
> >>
> >> This makes sense I can change code to not do it on the network thread.
> >>
> >> Right now we are doing the handshake as part of the processor ( it
> >> shouldn’t be in acceptor) and we have multiple processors thread. Do we
> >> still see this as an issue if it happens on the same thread as
> processor? .
> >>
> >>
> >>
> >>
> >> --
> >> Harsha
> >> Sent with Airmail
> >>
> >> On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (
> >> harsh...@fastmail.fm) wrote:
> >>
> >> Hi Rajini,
> >> Thanks for the details. I did go through your code. There was a
> >> discussion before about not putting selector-related code into the
> channel
> >> or extending the selector itself.
> >>
> >> 1. *Support for running potentially long-running delegated tasks
> >> outside
> >> the network thread*: It is recommended that delegated tasks indicated
>

Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Sriharsha Chintalapani
Rajini,
        I am exploring this part right now. To support PLAINTEXT and SSL as 
protocols and Kerberos auth as authentication on top of plaintext or ssl (if 
users want to do encryption over an auth mechanism). This is mainly influenced 
by SASL or GSS-API performance issue when I enable encryption.  I’ll update the 
KIP once I finalize this on my side . 
Thanks,
Harsha


On April 24, 2015 at 1:39:14 AM, Rajini Sivaram (rajinisiva...@googlemail.com) 
wrote:

Have there been any discussions around separating out authentication and  
encryption protocols for Kafka endpoints to enable different combinations?  
In our deployment environment, we would like to use TLS for encryption, but  
we don't necessarily want to use certificate-based authentication of  
clients. With the current design, if we want to use an authentication  
mechanism like SASL/plain, it looks like we need to define a new security  
protocol in Kafka which combines SASL/Plain authentication with TLS  
encryption. In KIP-12, it looks like the protocols defined are PLAINTEXT  
(no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos)  
and SSL(SSL auth/no client auth, SSL encryption). While not all  
combinations of authentication and encryption protocols are likely to be  
useful, the ability to combine different mechanisms without modifying Kafka  
to create combined protocols would make it easier to grow the support for  
new protocols. I wanted to check if this has already been discussed in the  
past.  



Thank you,  

Rajini  



On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram <  
rajinisiva...@googlemail.com> wrote:  

> Harsha,  
>  
> Thank you for the quick response. (Sorry had missed sending this reply to  
> the dev-list earlier)..  
>  
>  
> 1. I am not sure what the new server-side code is going to look like  
> after refactoring under KAFKA-1928. But I was assuming that there would be  
> only one Channel implementation that would be shared by both clients and  
> server. So the ability to run delegated tasks on a different thread would  
> be useful in any case. Even with the server, I imagine the Processor thread  
> is shared by multiple connections with thread affinity for connections, so  
> it might be better not to run potentially long running delegated tasks on  
> that thread.  
> 2. You may be right that Kafka doesn't need to support renegotiation.  
> The usecase I was thinking of was slightly different from the one you  
> described. Periodic renegotiation is used sometimes to refresh encryption  
> keys especially with ciphers that are weak. Kafka may not have a  
> requirement to support this at the moment.  
> 3. Graceful close needs close handshake messages to be  
> sent/received to shut down the SSL engine and this requires managing  
> selection interest based on SSL engine close state. It will be good if the  
> base channel/selector class didn't need to be aware of this.  
> 4. Yes, I agree that the choice is between bringing some  
> selection-related code into the channel or some channel related code into  
> selector. We found the code neater with the former when the three cases  
> above were implemented. But it is possible that you can handle it  
> differently with the latter, so I am happy to wait until your patch is  
> ready.  
>  
> Regards,  
>  
> Rajini  
>  
>  
> On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani   
> wrote:  
>  
>> 1. *Support for running potentially long-running delegated tasks  
>> outside  
>> the network thread*: It is recommended that delegated tasks indicated by  
>> a handshake status of NEED_TASK are run on a separate thread since they  
>> may  
>> block (  
>> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).  
>> It is easier to encapsulate this in SSLChannel without any changes to  
>> common code if selection keys are managed within the Channel.  
>>  
>>  
>> This makes sense I can change code to not do it on the network thread.  
>>  
>> Right now we are doing the handshake as part of the processor ( it  
>> shouldn’t be in acceptor) and we have multiple processors thread. Do we  
>> still see this as an issue if it happens on the same thread as processor? .  
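
The usual shape of moving NEED_TASK work off the network thread, as a sketch;
the executor choice and the resume hook are assumptions:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

public class DelegatedTaskRunner {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // When the handshake reports NEED_TASK, drain the engine's delegated
    // tasks on a separate thread instead of blocking the processor thread.
    void runDelegatedTasks(final SSLEngine engine, final Runnable resumeHandshake) {
        if (engine.getHandshakeStatus() == HandshakeStatus.NEED_TASK) {
            executor.execute(new Runnable() {
                public void run() {
                    Runnable task;
                    while ((task = engine.getDelegatedTask()) != null)
                        task.run();            // potentially blocking work
                    resumeHandshake.run();     // re-enter the handshake loop
                }
            });
        }
    }
}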
>>  
>>  
>>  
>>  
>> --  
>> Harsha  
>> Sent with Airmail  
>>  
>> On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (  
>> harsh...@fastmail.fm) wrote:  
>>  
>> Hi Rajini,  
>> Thanks for the details. I did go through your code. There was a  
>> discussion before about not putting selector-related code into the channel  
>> or extending the selector itself.  
>>  
>> 1. *Support for running potentially long-running delegated tasks  
>> outside  
>> the network thread*: It is recommended that delegated tasks indicated by  
>> a handshake status of NEED_TASK are run on a separate thread since they  
>> may  
>> block (  
>> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).  
>> It is easier to encapsulate this in SSLCh

Re: Metrics package discussion

2015-04-24 Thread Jun Rao
Otis,

The jira for moving the broker to the new metrics is KAFKA-1930.

We didn't try to do the conversion in 0.8.2 because (1) the new metrics are
missing reporters for popular systems like Graphite and Ganglia; (2) the
histogram support in the new metrics is a bit different and we were not
sure if it's good enough for our usage. We will need to have an answer to
both before we can migrate to the new metrics. So, the migration may not
happen in 0.8.3.
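
For reference, the new metrics package does expose a reporter plug-in point, so
the gap is the missing implementations rather than the hook itself. A skeleton of
what a Graphite reporter could hang off (the class name and internals are ours):

import java.util.List;
import java.util.Map;
import org.apache.kafka.common.metrics.KafkaMetric;
import org.apache.kafka.common.metrics.MetricsReporter;

public class GraphiteReporterSketch implements MetricsReporter {
    @Override
    public void configure(Map<String, ?> configs) {
        // read Graphite host/port etc. from the client/broker config
    }

    @Override
    public void init(List<KafkaMetric> metrics) {
        // metrics that already exist when the reporter is attached
    }

    @Override
    public void metricChange(KafkaMetric metric) {
        // called as metrics are added/updated; a real reporter would register
        // metric.metricName() and periodically emit metric.value() to Graphite
    }

    @Override
    public void close() {
        // flush and disconnect
    }
}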

One of the reasons that we want to move to the new metrics is that as we
are reusing more and more code from the java client, we will be pulling in
metrics in the new format. In order to keep the metrics consistent, it's
probably better to just bite the bullet and migrate all Coda Hale metrics
to the new one.

Thanks,

Jun

On Tue, Apr 21, 2015 at 9:29 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> I'm veery late to this thread.  I'm with Gwen about metrics being the
> public API (but often not treated as such, sadly).  I don't know the
> details of internal issues around how metrics are implemented but, for
> selfish reasons, would hate to see MBeans change - we spent weeks
> contributing more than a dozen iterations of patches for changing the old
> Kafka 0.8.1.x metrics to what they are now in 0.8.2.  I wish somebody had
> mentioned these (known?) issues then - since metrics were so drastically
> changed then, we could have done it right immediately.  Also, when you
> change MBean names and structure you force everyone to rewrite their MBean
> parsers (not your problem, but still something to be aware of).
>
> If metrics are going to be changing, would it be possible to enumerate the
> changes somewhere?
>
> Finally, I tried finding a JIRA issue for changing metrics, so I can watch
> it, but couldn't find it here:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20fixVersion%20%3D%200.8.3%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
>
> Am I looking in the wrong place?
> Is there an issue for the changes discussed in this thread?
> Is the decision to do it in 0.8.3 final?
>
> Thanks,
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Mar 31, 2015 at 12:43 PM, Steven Wu  wrote:
>
> > > My main concern is that if we don't do the migration in 0.8.3, we will be
> > left
> > with some metrics in YM format and some others in KM format (as we start
> > sharing client code on the broker). This is probably a worse situation to
> > be in.
> >
> > +1. I am not sure how our servo adaptor will work if there are two
> formats
> > for metrics? unless there is an easy way to check the format (YM/KM).
> >
> >
> > On Tue, Mar 31, 2015 at 9:40 AM, Jun Rao  wrote:
> >
> > > (2) The metrics are clearly part of the client API and we are not
> > changing
> > > that (at least for the new client). Arguably, the metrics are also part
> > of
> > > the broker side API. However, since they affect fewer parties (mostly
> > just
> > > the Kafka admins), it may be easier to make those changes.
> > >
> > > My main concern is that if we don't do the migration in 0.8.3, we will be
> > left
> > > with some metrics in YM format and some others in KM format (as we
> start
> > > sharing client code on the broker). This is probably a worse situation
> to
> > > be in.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 31, 2015 at 9:26 AM, Gwen Shapira 
> > > wrote:
> > >
> > > > (2) I believe we agreed that our metrics are a public API. I believe
> > > > we also agree we don't break API in minor releases. So, it seems
> > > > obvious to me that we can't make breaking changes to metrics in minor
> > > > releases. I'm not convinced "we did it in the past" is a good reason
> > > > to do it again.
> > > >
> > > > Is there a strong reason to do it in a 0.8.3 time-frame?
> > > >
> > > > On Tue, Mar 31, 2015 at 7:59 AM, Jun Rao  wrote:
> > > > > (2) Not sure why we can't do this in 0.8.3. We changed the metrics
> > > names
> > > > in
> > > > > 0.8.2 already. Given that we need to share code btw the client and
> > the
> > > > > core, and we need to keep the metrics consistent on the broker, it
> > > seems
> > > > > that we have no choice but to migrate to KM. If so, it seems that
> the
> > > > > sooner that we do this, the better. It is important to give people
> an
> > > > easy
> > > > > path for migration. However, it may not be easy to keep the mbean
> > names
> > > > > exactly the same. For example, YM has hardcoded attributes (e.g.
> > > > > 1-min-rate, 5-min-rate, 15-min-rate, etc for rates) that are not
> > > > available
> > > > > in KM.
> > > > >
> > > > > One benefit out of this migration is that one can get the metrics
> in
> > > the
> > > > > client and the broker in the same way.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Mar 30, 2015 at 9:26 PM, Gwe

[jira] [Commented] (KAFKA-2146) adding partition did not find the correct startIndex

2015-04-24 Thread chenshangan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510903#comment-14510903
 ] 

chenshangan commented on KAFKA-2146:


I'd like to provide a patch soon.

> adding partition did not find the correct startIndex 
> -
>
> Key: KAFKA-2146
> URL: https://issues.apache.org/jira/browse/KAFKA-2146
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Affects Versions: 0.8.2.0
>Reporter: chenshangan
>Priority: Minor
> Fix For: 0.8.3
>
>
> TopicCommand provides a tool to add partitions for existing topics. It tries to 
> find the startIndex from existing partitions. There's a minor flaw in this 
> process: it tries to use the first partition fetched from zookeeper as the 
> start partition, and the first replica id in this partition as the 
> startIndex.
> For one thing, the first partition fetched from zookeeper is not necessarily 
> the start partition. As partition ids begin from zero, we should use the 
> partition with id zero as the start partition.
> For another, broker ids do not necessarily begin from 0, so the startIndex is 
> not necessarily the first replica id in the start partition. 
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2146) adding partition did not find the correct startIndex

2015-04-24 Thread chenshangan (JIRA)
chenshangan created KAFKA-2146:
--

 Summary: adding partition did not find the correct startIndex 
 Key: KAFKA-2146
 URL: https://issues.apache.org/jira/browse/KAFKA-2146
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.8.2.0
Reporter: chenshangan
Priority: Minor
 Fix For: 0.8.3


TopicCommand provides a tool to add partitions for existing topics. It tries to 
find the startIndex from existing partitions. There's a minor flaw in this 
process: it tries to use the first partition fetched from zookeeper as the start 
partition, and the first replica id in this partition as the startIndex.

For one thing, the first partition fetched from zookeeper is not necessarily 
the start partition. As partition ids begin from zero, we should use the 
partition with id zero as the start partition.

For another, broker ids do not necessarily begin from 0, so the startIndex is not 
necessarily the first replica id in the start partition. 
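
A sketch of the fix being described (this is not the attached patch): anchor on
partition 0 explicitly and map its first replica's broker id to a position in
the sorted broker list:

{noformat}
import java.util.List;
import java.util.Map;

public class StartIndex {
    // replicasByPartition: partition id -> replica broker ids, from zookeeper;
    // sortedBrokerIds: all broker ids in ascending order.
    static int startIndex(Map<Integer, List<Integer>> replicasByPartition,
                          List<Integer> sortedBrokerIds) {
        // partition 0 is the true start partition, whatever order zookeeper
        // happened to return the partitions in
        int firstReplica = replicasByPartition.get(0).get(0);
        // broker ids need not start at 0, so translate the id to a position
        return sortedBrokerIds.indexOf(firstReplica);
    }
}
{noformat}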

  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer

2015-04-24 Thread Gianmarco De Francisci Morales
Hi,


Here are the questions I think we should consider:
> 1. Do we need this at all given that we have the partition argument in
> ProducerRecord which gives full control? I think we do need it because this
> is a way to plug in a different partitioning strategy at run time and do it
> in a fairly transparent way.
>

Yes, we need it if we want to support different partitioning strategies
inside Kafka rather than requiring the user to code them externally.


> 3. Do we need to add the value? I suspect people will have uses for
> computing something off a few fields in the value to choose the partition.
> This would be useful in cases where the key was being used for log
> compaction purposes and did not contain the full information for computing
> the partition.
>

I am not entirely sure about this. I guess that most partitioners should
not use it.
I think it makes it easier to reason about the system if the partitioner
only works on the key.
However, if the value (and its serialization) are already available, there
is not much harm in passing them along.


> 4. This interface doesn't include either an init() or close() method. It
> should implement Closable and Configurable, right?
>

Right now the only application I can think of to have an init() and close()
is to read some state information (e.g., load information) that is
published on some external distributed storage (e.g., zookeeper) by the
brokers.
It might be useful also for reconfiguration and state migration.

I think it's not a very common use case right now, but if the added
complexity is not too much it might be worth to have support for these
methods.



> 5. What happens if the user both sets the partition id in the
> ProducerRecord and sets a partitioner? Does the partition id just get
> passed in to the partitioner (as sort of implied in this interface?). This
> is a bit weird since if you pass in the partition id you kind of expect it
> to get used, right? Or is it the case that if you specify a partition the
> partitioner isn't used at all (in which case no point in including
> partition in the Partitioner api).
>
>
The user should be able to override the partitioner on a per-record basis
by explicitly setting the partition id.
I don't think it makes sense for the partitioners to take "hints" on the
partition.

I would even go the extra step, and have a default logic that accepts both
key and partition id (current interface) and calls partition() only if the
partition id is not set. The partition() method does *not* take the
partition ID as input (only key-value).
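
Putting those points together, the shape would be roughly the following; this is
a sketch, not the KIP-22 interface verbatim:

import java.io.Closeable;
import java.util.Map;

interface Partitioner extends Closeable {
    void configure(Map<String, ?> configs); // init()-style hook
    int partition(Object key, Object value, int numPartitions);
    void close();                           // e.g. release external state
}

final class PartitionDispatch {
    // An explicit partition id on the record always wins; the pluggable
    // partitioner is consulted only when no id was set, and never sees it.
    static int partitionFor(Integer recordPartition, Object key, Object value,
                            int numPartitions, Partitioner partitioner) {
        if (recordPartition != null)
            return recordPartition;
        return partitioner.partition(key, value, numPartitions);
    }
}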


Cheers,
--
Gianmarco



> Cheers,
>
> -Jay
>
> On Thu, Apr 23, 2015 at 6:55 AM, Sriharsha Chintalapani 
> wrote:
>
> > Hi,
> > Here is the KIP for adding a partitioner interface for producer.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer
> > There is one open question about how interface should look like. Please
> > take a look and let me know if you prefer one way or the other.
> >
> > Thanks,
> > Harsha
> >
> >
>


[jira] [Updated] (KAFKA-2105) NullPointerException in client on MetadataRequest

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-2105:
---
Attachment: guard-from-null.patch

The patch checks that the argument of partitionsFor is not null.
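
Presumably the guard is along these lines (a sketch, not the attached patch;
metadataLookup stands in for the real metadata fetch):

{noformat}
// In KafkaProducer (sketch): fail fast on the caller's thread rather than
// letting the background I/O thread hit the NPE while serializing the
// metadata request.
public List<PartitionInfo> partitionsFor(String topic) {
    if (topic == null)
        throw new NullPointerException("topic cannot be null");
    return metadataLookup(topic); // hypothetical stand-in for the real lookup
}
{noformat}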

> NullPointerException in client on MetadataRequest
> -
>
> Key: KAFKA-2105
> URL: https://issues.apache.org/jira/browse/KAFKA-2105
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.1
>Reporter: Roger Hoover
>Priority: Minor
> Attachments: guard-from-null.patch
>
>
> With the new producer, if you accidentally call 
> KafkaProducer.partitionsFor(null), it will cause the I/O thread to throw an NPE.
> Uncaught error in kafka producer I/O thread: 
> java.lang.NullPointerException
>   at org.apache.kafka.common.utils.Utils.utf8Length(Utils.java:174)
>   at org.apache.kafka.common.protocol.types.Type$5.sizeOf(Type.java:176)
>   at 
> org.apache.kafka.common.protocol.types.ArrayOf.sizeOf(ArrayOf.java:55)
>   at org.apache.kafka.common.protocol.types.Schema.sizeOf(Schema.java:81)
>   at org.apache.kafka.common.protocol.types.Struct.sizeOf(Struct.java:218)
>   at 
> org.apache.kafka.common.requests.RequestSend.serialize(RequestSend.java:35)
>   at 
> org.apache.kafka.common.requests.RequestSend.<init>(RequestSend.java:29)
>   at 
> org.apache.kafka.clients.NetworkClient.metadataRequest(NetworkClient.java:369)
>   at 
> org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:391)
>   at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:188)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>   at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Rajini Sivaram
Have there been any discussions around separating out authentication and
encryption protocols for Kafka endpoints to enable different combinations?
In our deployment environment, we would like to use TLS for encryption, but
we don't necessarily want to use certificate-based authentication of
clients. With the current design, if we want to use an authentication
mechanism like SASL/plain, it looks like we need to define a new security
protocol in Kafka which combines SASL/Plain authentication with TLS
encryption. In KIP-12, it looks like the protocols defined are PLAINTEXT
(no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos)
and SSL(SSL auth/no client auth, SSL encryption). While not all
combinations of authentication and encryption protocols are likely to be
useful, the ability to combine different mechanisms without modifying Kafka
to create combined protocols would make it easier to grow the support for
new protocols. I wanted to check if this has already been discussed in the
past.



Thank you,

Rajini



On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

>   Harsha,
>
> Thank you for the quick response. (Sorry had missed sending this reply to
> the dev-list earlier)..
>
>
>1. I am not sure what the new server-side code is going to look like
>after refactoring under KAFKA-1928. But I was assuming that there would be
>only one Channel implementation that would be shared by both clients and
>server. So the ability to run delegated tasks on a different thread would
>be useful in any case. Even with the server, I imagine the Processor thread
>is shared by multiple connections with thread affinity for connections, so
>it might be better not to run potentially long running delegated tasks on
>that thread.
>2. You may be right that Kafka doesn't need to support renegotiation.
>The usecase I was thinking of was slightly different from the one you
>described. Periodic renegotiation is used sometimes to refresh encryption
>keys especially with ciphers that are weak. Kafka may not have a
>requirement to support this at the moment.
>3. Graceful close needs close handshake messages to be
>sent/received to shut down the SSL engine and this requires managing
>selection interest based on SSL engine close state. It will be good if the
>base channel/selector class didn't need to be aware of this.
>4. Yes, I agree that the choice is between bringing some
>selection-related code into the channel or some channel related code into
>selector. We found the code neater with the former when the three cases
>above were implemented. But it is possible that you can handle it
>differently with the latter, so I am happy to wait until your patch is
>ready.
>
> Regards,
>
> Rajini
>
>
> On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani 
> wrote:
>
>>1. *Support for running potentially long-running delegated tasks
>> outside
>> the network thread*: It is recommended that delegated tasks indicated by
>> a handshake status of NEED_TASK are run on a separate thread since they
>> may
>> block (
>> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
>> It is easier to encapsulate this in SSLChannel without any changes to
>> common code if selection keys are managed within the Channel.
>>
>>
>>  This makes sense I can change code to not do it on the network thread.
>>
>> Right now we are doing the handshake as part of the processor ( it
>> shouldn’t be in acceptor) and we have multiple processors thread. Do we
>> still see this as an issue if it happens on the same thread as processor? .
>>
>>
>>
>>
>>  --
>> Harsha
>> Sent with Airmail
>>
>> On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (
>> harsh...@fastmail.fm) wrote:
>>
>>   Hi Rajini,
>>Thanks for the details. I did go through your code. There was a
>> discussion before about not putting selector-related code into the channel
>> or extending the selector itself.
>>
>>  1. *Support for running potentially long-running delegated tasks
>> outside
>> the network thread*: It is recommended that delegated tasks indicated by
>> a handshake status of NEED_TASK are run on a separate thread since they
>> may
>> block (
>> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
>> It is easier to encapsulate this in SSLChannel without any changes to
>> common code if selection keys are managed within the Channel.
>>
>>
>>  This makes sense I can change code to not do it on the network thread.
>>
>>
>>  2. *Renegotiation handshake*: During a read operation, handshake status
>> may indicate that renegotiation is required. It will be good to
>> encapsulate
>> this state change (and any knowledge of these SSL-specific state
>> transitions) within SSLChannel. Our experience was that managing keys and
>> state within the SSLChannel rather than in Selector made this code
>> neater.
>>
>> Do 

[jira] [Updated] (KAFKA-1940) Initial checkout and build failing

2015-04-24 Thread Daneel Yaitskov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daneel Yaitskov updated KAFKA-1940:
---
Attachment: zinc-upgrade.patch

I tried upgrading the zinc library to 0.3.7 and the issue disappeared.
With the patch applied, the Scala tests launch successfully.
However, 4 of these tests appear broken. I don't know whether that is related 
to the upgrade, because before the change the Scala tests were not runnable.

{noformat}
kafka.server.LogRecoveryTest > testHWCheckpointWithFailuresMultipleLogSegments 
FAILED
junit.framework.AssertionFailedError: Failed to update high watermark for 
follower after timeout
at junit.framework.Assert.fail(Assert.java:47)
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:619)
at 
kafka.server.LogRecoveryTest.testHWCheckpointWithFailuresMultipleLogSegments(LogRecoveryTest.scala:214)

kafka.server.LogRecoveryTest > testHWCheckpointNoFailuresMultipleLogSegments 
FAILED
junit.framework.AssertionFailedError: Failed to update high watermark for 
follower after timeout
at junit.framework.Assert.fail(Assert.java:47)
at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:619)
at 
kafka.server.LogRecoveryTest.testHWCheckpointNoFailuresMultipleLogSegments(LogRecoveryTest.scala:168)
{noformat}

{noformat}
kafka.producer.AsyncProducerTest > testFailedSendRetryLogic FAILED
kafka.common.FailedToSendMessageException: Failed to send messages after 3 
tries.
at 
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:91)
at 
kafka.producer.AsyncProducerTest.testFailedSendRetryLogic(AsyncProducerTest.scala:415)

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testNoBroker FAILED
org.scalatest.junit.JUnitTestFailedError: Should fail with 
FailedToSendMessageException
at 
org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:101)
at 
org.scalatest.junit.JUnit3Suite.newAssertionFailedException(JUnit3Suite.scala:149)
at org.scalatest.Assertions$class.fail(Assertions.scala:711)
at org.scalatest.junit.JUnit3Suite.fail(JUnit3Suite.scala:149)
at 
kafka.producer.AsyncProducerTest.testNoBroker(AsyncProducerTest.scala:300)
{noformat}
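
For context, a sketch of the kind of build change such a patch would make, 
assuming the zinc version is pinned through the Gradle Scala plugin's zinc 
configuration (the actual patch may wire this differently):

{code}
// Hypothetical build.gradle fragment: pin the zinc incremental compiler
// used by Gradle's Scala plugin to 0.3.7, the version this report found
// to make compileScala succeed.
dependencies {
    zinc 'com.typesafe.zinc:zinc:0.3.7'
}
{code}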

> Initial checkout and build failing
> --
>
> Key: KAFKA-1940
> URL: https://issues.apache.org/jira/browse/KAFKA-1940
> Project: Kafka
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.8.1.2
> Environment: Groovy:   1.8.6
> Ant:  Apache Ant(TM) version 1.9.2 compiled on July 8 2013
> Ivy:  2.2.0
> JVM:  1.8.0_25 (Oracle Corporation 25.25-b02)
> OS:   Windows 7 6.1 amd64
>Reporter: Martin Lemanski
>  Labels: build
> Attachments: zinc-upgrade.patch
>
>
> When performing `gradle wrapper` and `gradlew build` as a "new" developer, I 
> get an exception: 
> {code}
> C:\development\git\kafka>gradlew build --stacktrace
> <...>
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':core:compileScala'.
> > com.typesafe.zinc.Setup.create(Lcom/typesafe/zinc/ScalaLocation;Lcom/typesafe/zinc/SbtJars;Ljava/io/File;)Lcom/typesaf
> e/zinc/Setup;
> {code}
> Details: https://gist.github.com/mlem/ddff83cc8a25b040c157
> Current Commit:
> {code}
> C:\development\git\kafka>git rev-parse --verify HEAD
> 71602de0bbf7727f498a812033027f6cbfe34eb8
> {code}
> I am evaluating Kafka for my company and wanted to run some tests with it, 
> but couldn't due to this error. I know Gradle can be tricky and it's not easy 
> to set everything up correctly, but this kind of bug turns possible 
> committers/users off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

2015-04-24 Thread Rajini Sivaram
  Harsha,

Thank you for the quick response. (Sorry, I had missed sending this reply to
the dev-list earlier.)


   1. I am not sure what the new server-side code is going to look like
   after refactoring under KAFKA-1928. But I was assuming that there would be
   only one Channel implementation that would be shared by both clients and
   server. So the ability to run delegated tasks on a different thread would
   be useful in any case. Even with the server, I imagine the Processor thread
   is shared by multiple connections with thread affinity for connections, so
   it might be better not to run potentially long-running delegated tasks on
   that thread.
   2. You may be right that Kafka doesn't need to support renegotiation.
   The usecase I was thinking of was slightly different from the one you
   described. Periodic renegotiation is used sometimes to refresh encryption
   keys especially with ciphers that are weak. Kafka may not have a
   requirement to support this at the moment.
   3. Graceful close needs close handshake messages to be sent/received
   to shut down the SSL engine, and this requires managing selection interest
   based on SSL engine close state. It will be good if the base
   channel/selector class didn't need to be aware of this.
   4. Yes, I agree that the choice is between bringing some
   selection-related code into the channel or some channel related code into
   selector. We found the code neater with the former when the three cases
   above were implemented. But it is possible that you can handle it
   differently with the latter, so I am happy to wait until your patch is
   ready.
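
To make point 1 concrete, a minimal sketch of draining SSLEngine delegated
tasks on a separate executor instead of the network thread (the helper and
callback names are illustrative, not Kafka's actual API):

{code}
import java.util.concurrent.ExecutorService;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

final class DelegatedTaskSketch {
    // Run all pending delegated tasks off the network thread; once they
    // finish, the channel re-registers interest and resumes the handshake.
    static void runDelegatedTasks(final SSLEngine engine, ExecutorService executor,
                                  final Runnable resumeHandshake) {
        if (engine.getHandshakeStatus() == HandshakeStatus.NEED_TASK) {
            executor.submit(new Runnable() {
                public void run() {
                    Runnable task;
                    while ((task = engine.getDelegatedTask()) != null)
                        task.run();
                    resumeHandshake.run();  // e.g. re-enable OP_READ on the key
                }
            });
        }
    }
}
{code}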

Regards,

Rajini


On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani 
wrote:

>1. *Support for running potentially long-running delegated tasks
> outside
> the network thread*: It is recommended that delegated tasks indicated by
> a handshake status of NEED_TASK are run on a separate thread since they
> may
> block (
> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
> It is easier to encapsulate this in SSLChannel without any changes to
> common code if selection keys are managed within the Channel.
>
>
>  This makes sense; I can change the code to not do it on the network thread.
>
> Right now we are doing the handshake as part of the processor (it
> shouldn’t be in the acceptor) and we have multiple processor threads. Do we
> still see this as an issue if it happens on the same thread as the processor?
>
>
>
>
>  --
> Harsha
> Sent with Airmail
>
> On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (
> harsh...@fastmail.fm) wrote:
>
>   Hi Rajini,
>Thanks for the details. I did go through your code. There was a
> discussion before about not putting selector-related code into the channel
> or extending the selector itself.
>
>  1. *Support for running potentially long-running delegated tasks outside
> the network thread*: It is recommended that delegated tasks indicated by
> a handshake status of NEED_TASK are run on a separate thread since they
> may
> block (
> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
> It is easier to encapsulate this in SSLChannel without any changes to
> common code if selection keys are managed within the Channel.
>
>
>  This makes sense; I can change the code to not do it on the network thread.
>
>
>  2. *Renegotiation handshake*: During a read operation, handshake status
> may indicate that renegotiation is required. It will be good to
> encapsulate
> this state change (and any knowledge of these SSL-specific state
> transitions) within SSLChannel. Our experience was that managing keys and
> state within the SSLChannel rather than in Selector made this code neater.
>
> Do we even want to support renegotiation? This is a case where a user/client
> handshakes with the server anonymously
>
> but later wants to present its identity and establish a new
> SSL session. Our producers and consumers either present their identity
> (two-way auth) or not. Since these are long-running processes, I don’t see
> a case where they would initially establish the session and
> later present their identity.
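
For reference, a hedged sketch of how the condition in point 2 surfaces on the
read path with the plain JSSE API (a standalone illustration, not Kafka's
channel code):

{code}
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;
import javax.net.ssl.SSLException;

final class RenegotiationSketch {
    // A normal unwrap() can report that the peer has (re)started a
    // handshake; the caller must then re-enter its handshake state machine
    // before treating the channel as plainly readable again.
    static boolean unwrapNeedsHandshake(SSLEngine engine, ByteBuffer netIn,
                                        ByteBuffer appIn) throws SSLException {
        SSLEngineResult result = engine.unwrap(netIn, appIn);
        HandshakeStatus hs = result.getHandshakeStatus();
        return hs != HandshakeStatus.NOT_HANDSHAKING
            && hs != HandshakeStatus.FINISHED;
    }
}
{code}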
>
>
>  3. *Graceful shutdown of the SSL connections*: Our experience was that
> we could encapsulate all of the logic for shutting down SSLEngine
> gracefully within SSLChannel when the selection key and state are owned and
> managed by SSLChannel.
>
>
> Can’t this be done when channel.close() is called? Any reason to own the
> selection key?
>
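
A minimal sketch of what the graceful close amounts to at the JSSE level
(hand-rolled names; whether this belongs in channel.close() or needs ownership
of the selection key is exactly the open question here):

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import javax.net.ssl.SSLEngine;

final class GracefulCloseSketch {
    // Queue a close_notify alert and flush it before closing the socket.
    // A fully graceful close would also keep reading until the peer's
    // close_notify arrives, which is where selection-interest management
    // comes back into the picture.
    static void closeGracefully(SSLEngine engine, SocketChannel socket,
                                ByteBuffer netOut) throws IOException {
        engine.closeOutbound();
        while (!engine.isOutboundDone()) {
            netOut.clear();
            engine.wrap(ByteBuffer.allocate(0), netOut);  // emits close_notify
            netOut.flip();
            while (netOut.hasRemaining())
                socket.write(netOut);
        }
        socket.close();
    }
}
{code}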
>  4. *And finally a minor point:* We found that by managing selection key
> and selection interests within SSLChannel, protocol-independent Selector
> didn't need the concept of handshake at all and all channel state
> management and handshake related code could be held in protocol-specific
> classes. This may be worth taking into consideration since it makes it
> easier for common network layer code to be maintained without any
> understanding of the details of in

[jira] [Updated] (KAFKA-2106) Partition balance tool between brokers

2015-04-24 Thread chenshangan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenshangan updated KAFKA-2106:
---
Attachment: KAFKA-2106.patch

Implement basic rehash assignment algorithm

> Partition balance tool between brokers
> --
>
> Key: KAFKA-2106
> URL: https://issues.apache.org/jira/browse/KAFKA-2106
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.8.3
>Reporter: chenshangan
> Attachments: KAFKA-2106.patch
>
>
> The default partition assignment algorithm works well in a static Kafka 
> cluster (where the number of brokers seldom changes). In production, however, 
> the number of brokers keeps growing with the business data. When new 
> brokers are added to the cluster, it's better to provide a tool that can help 
> move existing data to the new brokers. Currently, users need to choose topics 
> or partitions manually and use the Reassign Partitions Tool 
> (kafka-reassign-partitions.sh) to achieve the goal. It's a time-consuming 
> task when there are a lot of topics in the cluster.
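
For reference, the manual flow described above looks roughly like this today
(flags from the 0.8.x reassignment tool; hosts and JSON file contents are
illustrative):

{code}
# generate a candidate plan for the topics chosen by hand
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --topics-to-move-json-file topics.json --broker-list "5,6" --generate
# after reviewing the proposed assignment, apply it and then verify
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file plan.json --execute
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file plan.json --verify
{code}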



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2106) Partition balance tool between brokers

2015-04-24 Thread chenshangan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenshangan updated KAFKA-2106:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: Open)

> Partition balance tool between brokers
> --
>
> Key: KAFKA-2106
> URL: https://issues.apache.org/jira/browse/KAFKA-2106
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.8.3
>Reporter: chenshangan
>
> The default partition assignment algorithm works well in a static Kafka 
> cluster (where the number of brokers seldom changes). In production, however, 
> the number of brokers keeps growing with the business data. When new 
> brokers are added to the cluster, it's better to provide a tool that can help 
> move existing data to the new brokers. Currently, users need to choose topics 
> or partitions manually and use the Reassign Partitions Tool 
> (kafka-reassign-partitions.sh) to achieve the goal. It's a time-consuming 
> task when there are a lot of topics in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2106) Partition balance tool between brokers

2015-04-24 Thread chenshangan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510614#comment-14510614
 ] 

chenshangan commented on KAFKA-2106:


When adding new brokers to the cluster, the existing replica assignment should 
be recomputed to get a better distribution. The basic thought is to reassign 
each topic across the new cluster while starting at its original broker 
starting index; the process is like rehashing the existing data.
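
A hedged sketch of the rehash idea, as a plain round-robin reassignment that
keeps each topic's original starting broker (illustrative only; the attached
patch may differ):

{code}
import java.util.ArrayList;
import java.util.List;

final class RehashAssignmentSketch {
    // Reassign a topic's partitions round-robin over the expanded broker
    // list, starting from the same broker index the topic used before the
    // expansion, so the reassignment behaves like rehashing existing data.
    static List<List<Integer>> reassign(int numPartitions, int replicationFactor,
                                        List<Integer> brokers, int startIndex) {
        List<List<Integer>> assignment = new ArrayList<List<Integer>>();
        for (int p = 0; p < numPartitions; p++) {
            List<Integer> replicas = new ArrayList<Integer>();
            for (int r = 0; r < replicationFactor; r++)
                replicas.add(brokers.get((startIndex + p + r) % brokers.size()));
            assignment.add(replicas);
        }
        return assignment;
    }
}
{code}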

> Partition balance tool between brokers
> --
>
> Key: KAFKA-2106
> URL: https://issues.apache.org/jira/browse/KAFKA-2106
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.8.3
>Reporter: chenshangan
>
> The default partition assignment algorithm works well in a static Kafka 
> cluster (where the number of brokers seldom changes). In production, however, 
> the number of brokers keeps growing with the business data. When new 
> brokers are added to the cluster, it's better to provide a tool that can help 
> move existing data to the new brokers. Currently, users need to choose topics 
> or partitions manually and use the Reassign Partitions Tool 
> (kafka-reassign-partitions.sh) to achieve the goal. It's a time-consuming 
> task when there are a lot of topics in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-2106) Partition balance tool between brokers

2015-04-24 Thread chenshangan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenshangan updated KAFKA-2106:
---
Comment: was deleted

(was: Topics in the cluster can be divided into two categories:
1. nPartitions > nBrokersBeforeExpand
2. nPartitions < nBrokersBeforeExpand
When adding new brokers to the cluster:
In case 1, partitions should be reassigned and spread over as many brokers as 
possible.
In case 2, calculate nPartitions on each broker, sort the brokers, and move 
partitions from the larger brokers to the smaller ones.
Finally, nPartitions will be spread almost evenly over all brokers.

)

> Partition balance tool between brokers
> --
>
> Key: KAFKA-2106
> URL: https://issues.apache.org/jira/browse/KAFKA-2106
> Project: Kafka
>  Issue Type: New Feature
>  Components: admin
>Affects Versions: 0.8.3
>Reporter: chenshangan
>
> The default partition assignment algorithm works well in a static Kafka 
> cluster (where the number of brokers seldom changes). In production, however, 
> the number of brokers keeps growing with the business data. When new 
> brokers are added to the cluster, it's better to provide a tool that can help 
> move existing data to the new brokers. Currently, users need to choose topics 
> or partitions manually and use the Reassign Partitions Tool 
> (kafka-reassign-partitions.sh) to achieve the goal. It's a time-consuming 
> task when there are a lot of topics in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)