Re: Review Request 33049: Patch for KAFKA-2084

2015-05-15 Thread Dong Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33049/#review84025
---



core/src/main/scala/kafka/server/KafkaApis.scala


Do you mean RequestKeys.nameForKey(RequestKeys.FetchKey)?


- Dong Lin


On May 11, 2015, 11:17 p.m., Aditya Auradkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33049/
> ---
> 
> (Updated May 11, 2015, 11:17 p.m.)
> 
> 
> Review request for kafka, Joel Koshy and Jun Rao.
> 
> 
> Bugs: KAFKA-2084
> https://issues.apache.org/jira/browse/KAFKA-2084
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Updated patch for quotas. This patch does the following: 
> 1. Add per-client metrics for both producer and consumers 
> 2. Add configuration for quotas 
> 3. Compute delay times in the metrics package and return the delay times in 
> QuotaViolationException 
> 4. Add a DelayQueue in KafkaApis that can be used to throttle any type of 
> request (a sketch follows this message). Implemented request throttling for 
> produce and fetch requests. 
> 5. Added unit and integration test cases.
> 6. This doesn't include a system test. There is a separate ticket for that
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/MetricConfig.java 
> dfa1b0a11042ad9d127226f0e0cec8b1d42b8441 
>   clients/src/main/java/org/apache/kafka/common/metrics/Quota.java 
> d82bb0c055e631425bc1ebbc7d387baac76aeeaa 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/QuotaViolationException.java
>  a451e5385c9eca76b38b425e8ac856b2715fcffe 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> ca823fd4639523018311b814fde69b6177e73b97 
>   clients/src/test/java/org/apache/kafka/common/utils/MockTime.java  
>   core/src/main/scala/kafka/server/ClientQuotaMetrics.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 417960dd1ab407ebebad8fdb0e97415db3e91a2f 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> b7d2a2842e17411a823b93bdedc84657cbd62be1 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
>   core/src/main/scala/kafka/server/ThrottledRequest.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ShutdownableThread.scala 
> fc226c863095b7761290292cd8755cd7ad0f155c 
>   core/src/test/scala/integration/kafka/api/QuotasTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/ClientQuotaMetricsTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 
> 8014a5a6c362785539f24eb03d77278434614fe6 
>   core/src/test/scala/unit/kafka/server/ThrottledRequestExpirationTest.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33049/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aditya Auradkar
> 
>
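
For illustration, the DelayQueue-based throttling described in item 4 of the quoted
description could be sketched roughly as below. The class and method names are
hypothetical and the delay is hard-coded; this is not the patch's actual code.

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // Hypothetical sketch: a response whose delivery is deferred until the quota
    // delay (e.g. the one carried by QuotaViolationException) has elapsed.
    public class ThrottleSketch {

        static class ThrottledOperation implements Delayed {
            final long releaseTimeMs;
            final Runnable onComplete;

            ThrottledOperation(long delayMs, Runnable onComplete) {
                this.releaseTimeMs = System.currentTimeMillis() + delayMs;
                this.onComplete = onComplete;
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(releaseTimeMs - System.currentTimeMillis(),
                                    TimeUnit.MILLISECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                    other.getDelay(TimeUnit.MILLISECONDS));
            }
        }

        public static void main(String[] args) throws InterruptedException {
            DelayQueue<ThrottledOperation> queue = new DelayQueue<>();

            // The delay would normally be computed by the metrics package; 100 ms
            // here is arbitrary.
            queue.put(new ThrottledOperation(100, () -> System.out.println("send delayed response")));

            // An expiration thread drains the queue; take() blocks until the head's
            // delay elapses, which is what holds back the throttled response.
            ThrottledOperation op = queue.take();
            op.onComplete.run();
        }
    }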



[jira] [Updated] (KAFKA-2190) Incorporate close(timeout) to Mirror Maker

2015-05-15 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-2190:

Attachment: KAFKA-2190_2015-05-15_19:50:42.patch

> Incorporate close(timeout) to Mirror Maker
> --
>
> Key: KAFKA-2190
> URL: https://issues.apache.org/jira/browse/KAFKA-2190
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Attachments: KAFKA-2190.patch, KAFKA-2190_2015-05-15_19:50:42.patch
>
>
> Use close(0) when mirror maker exits accidentally to avoid reordering.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2190) Incorporate close(timeout) to Mirror Maker

2015-05-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546524#comment-14546524
 ] 

Jiangjie Qin commented on KAFKA-2190:
-

Updated reviewboard https://reviews.apache.org/r/34241/diff/
 against branch origin/trunk

> Incorporate close(timeout) to Mirror Maker
> --
>
> Key: KAFKA-2190
> URL: https://issues.apache.org/jira/browse/KAFKA-2190
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Attachments: KAFKA-2190.patch, KAFKA-2190_2015-05-15_19:50:42.patch
>
>
> Use close(0) when mirror maker exits accidentally to avoid reordering.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34241: Patch for KAFKA-2190

2015-05-15 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34241/
---

(Updated May 16, 2015, 2:50 a.m.)


Review request for kafka.


Bugs: KAFKA-2190
https://issues.apache.org/jira/browse/KAFKA-2190


Repository: kafka


Description (updated)
---

Should flush before committing offsets when the mirror maker thread exits on an exception.
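
The intended ordering can be sketched as below: flush the producer so every
mirrored record is acknowledged by the target cluster before the consumer's
offsets are committed. This is a minimal illustration with hypothetical names,
not the actual MirrorMaker code.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import kafka.javaapi.consumer.ConsumerConnector;

    // Hedged sketch: called from the thread's exception/exit path so an abnormal
    // exit cannot commit offsets past data that was never flushed to the target.
    public class FlushThenCommitSketch {
        public static void flushThenCommit(KafkaProducer<byte[], byte[]> producer,
                                           ConsumerConnector connector) {
            producer.flush();          // block until in-flight sends complete
            connector.commitOffsets(); // only then record the consumed positions
        }
    }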


Diffs (updated)
-

  core/src/main/scala/kafka/tools/MirrorMaker.scala 
954852170d67cbb8ff4f113301816d2a2daf5e91 

Diff: https://reviews.apache.org/r/34241/diff/


Testing
---


Thanks,

Jiangjie Qin



Re: [DISCUSS] Add missing API to old high level consumer

2015-05-15 Thread Jiangjie Qin
Sounds good. Thanks, Jun.

Jiangjie (Becket) Qin

On 5/15/15, 5:21 PM, "Jun Rao"  wrote:

>Hi, Jiangjie,
>
>It seems that api is already in
>kafka.javaapi.consumer.ZookeeperConsumerConnector. We just need to add it
>to kafka.javaapi.consumer.ConsumerConnector. This is fine and I don't
>think
>we need a KIP. Since this is only added in trunk, it will be part of the
>0.8.3 release. I agree with Joe that since we are developing the new java
>consumer, it would be good not to make changes to the old scala consumer.
>
>As for the 0.8.3 release, we agreed that it will include at least the
>new java consumer. If other features (e.g., security, admin, etc) can be
>done at or before that, they will be part of the 0.8.3 release too.
>Updated
>the release plan wiki.
>
>Thanks,
>
>Jun
>
>
>On Wed, May 13, 2015 at 3:01 PM, Jiangjie Qin 
>wrote:
>
>> Add the DISCUSS prefix to the email title : )
>>
>> From: Jiangjie Qin <j...@linkedin.com>
>> Date: Tuesday, May 12, 2015 at 4:51 PM
>> To: "dev@kafka.apache.org" <dev@kafka.apache.org>
>> Subject: Add missing API to old high level consumer
>>
>> Hi,
>>
>> I just noticed that in KAFKA-1650 (which is before we use KIP) we added
>>an
>> offset commit method in high level consumer that commits offsets using a
>> user provided offset map.
>>
>> public void commitOffsets(Map<TopicAndPartition, OffsetAndMetadata>
>> offsetsToCommit, boolean retryOnFailure);
>>
>> This method was added to all the Scala classes but I forgot to add it to
>> Java API of ConsumerConnector. (Already regretting now. . .)
>> This method is very useful in several cases and has been asked for from
>> time to time. For example, people have several threads consuming
>>messages
>> and processing them. Without this method, one thread will unexpectedly
>> commit offsets for another thread, thus might lose some messages if
>> something goes wrong.
>>
>> I created KAFKA-2186 and hope we can add this missing method into the
>>Java
>> API of old high level consumer (literally a one-line change).
>> Although this method should have been there since KAFKA-1650, adding
>>this
>> method to Java API now is a public API change, just want to see if
>>people
>> think we need a KIP for this.
>>
>> Thanks.
>>
>> Jiangjie (Becket) Qin
>>



Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy


> On May 16, 2015, 12:07 a.m., Joel Koshy wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java,
> >  line 35
> > 
> >
> > Can we just use a fixed thread pool?
> > 
> > A more general question/concern though:
> > 
> > If there are multiple inbound connections all of which end up needing 
> > delegated tasks, wouldn't they block? i.e., since the pool size is one?
> > 
> > From my reading of the JSSE ref guide this would typically be when you 
> > have a custom trust or key manager. Are there other use-cases?

Capturing offline discussion: I'm actually less concerned about this on the 
client side than the server side. On the client side the net effect of blocking 
here is that the producer/consumer would not produce/consume during the 
handshake. On the server side, the current patch (KAFKA-1684) does this in the 
acceptor threads. So on a server bounce several clients could initiate 
handshakes on new partition leaders. If there are several delegated tasks then 
all processors could eventually block.
Not sure if this is something we need to worry about. I wonder why SSLEngine 
does not expose a callback-based model for executing delegated tasks.
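
As a rough illustration of the fixed-pool suggestion (the pool size and names here
are assumptions, not the patch's code), delegated tasks could be run off the
network threads so that concurrent handshakes do not serialize behind a single
worker:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.net.ssl.SSLEngine;

    // Hedged sketch: run each connection's outstanding delegated tasks on a
    // shared fixed-size pool and signal the handshake loop when they finish.
    public class DelegatedTaskPoolSketch {
        private final ExecutorService pool = Executors.newFixedThreadPool(4);

        public void runDelegatedTasks(SSLEngine engine, Runnable resumeHandshake) {
            pool.execute(() -> {
                Runnable task;
                while ((task = engine.getDelegatedTask()) != null) {
                    task.run();        // typically key/trust manager work
                }
                resumeHandshake.run(); // let the selector continue wrap/unwrap
            });
        }
    }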


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83998
---


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProt

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy


> On May 15, 2015, 10:54 p.m., Joel Koshy wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 153
> > 
> >
> > I think Michael meant the following which I think is valid right?
> > Line 146: handshakeWrap completes and the status changes to NEED_UNWRAP 
> > (since SSLEngine has finished wrapping), but the netBuffer has not yet been 
> > flushed. So on line 153 we would fall through to the NEED_UNWRAP case 
> > without doing the flush.

Discussed offline. I misread the code. This logic is fine. Michael, let us know 
if you still think there is an issue.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83993
---


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/utils/Utils.java 
> f73eedb030987f018d8446bb1dcd98d19fa97331 
>   clients/src/test/java/org/apache/kafka/common/network/EchoServer.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kaf

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review84012
---



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Sriharsha, this is the line from the ref guide under blocking tasks I was 
referring to:
"The engine will block future wrap/unwrap calls until all of the 
outstanding tasks are completed"


- Joel Koshy


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/utils/Utils.java 
> f73eedb030987f018d8446bb1dcd98d19fa97331 
>   clients/src/test/java/org/apache/kafka/common/network/EchoServer.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SSLFactoryTest.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SSLSelectorTest.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> d5b306b026e788b4e5479f3419805aa49ae889f3 
>   clients/src/test/java/org/apache/kafka/common/utils/UtilsTest.java 
> 2ebe3c21f6

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review84006
---



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


From the ref guide:
"In addition to an application explicitly closing the SSLEngine, the 
SSLEngine might be closed by the peer (via receipt of a close message while it 
is processing handshake data), or by the SSLEngine encountering an error while 
processing application or handshake data, indicated by throwing an 
SSLException. In such cases, the application should invoke SSLEngine.wrap() to 
get the close message and send it to the peer until SSLEngine.isOutboundDone() 
returns true, as shown in the previous example, or the 
SSLEngineResult.getStatus() returns CLOSED."

Do we need to do this? i.e., it appears we are not handling this during the 
handshake.
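
For reference, the close sequence the ref guide describes could be sketched as
below; this is a simplified illustration (buffer handling and error paths
omitted), not the patch's code.

    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;
    import javax.net.ssl.SSLEngine;
    import javax.net.ssl.SSLEngineResult;

    // Hedged sketch: keep wrapping and flushing close messages until the engine
    // reports its outbound side done or the wrap status becomes CLOSED.
    public class SslCloseSketch {
        public static void sendCloseMessages(SSLEngine engine, SocketChannel channel) throws Exception {
            engine.closeOutbound();
            ByteBuffer empty = ByteBuffer.allocate(0);
            ByteBuffer net = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
            while (!engine.isOutboundDone()) {
                net.clear();
                SSLEngineResult result = engine.wrap(empty, net); // produces close_notify
                net.flip();
                while (net.hasRemaining()) {
                    channel.write(net);                           // flush it to the peer
                }
                if (result.getStatus() == SSLEngineResult.Status.CLOSED) {
                    break;
                }
            }
        }
    }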


- Joel Koshy


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/utils/Utils.java 
> f73eedb030987f018d8446bb1dcd98d19fa97331 

Re: [DISCUSS] Add missing API to old high level consumer

2015-05-15 Thread Jun Rao
Hi, Jiangjie,

It seems that api is already in
kafka.javaapi.consumer.ZookeeperConsumerConnector. We just need to add it
to kafka.javaapi.consumer.ConsumerConnector. This is fine and I don't think
we need a KIP. Since this is only added in trunk, it will be part of the
0.8.3 release. I agree with Joe that since we are developing the new java
consumer, it would be good not to make changes to the old scala consumer.

As for the 0.8.3 release, we agreed that it will include at least the
new java consumer. If other features (e.g., security, admin, etc) can be
done at or before that, they will be part of the 0.8.3 release too. Updated
the release plan wiki.

Thanks,

Jun


On Wed, May 13, 2015 at 3:01 PM, Jiangjie Qin 
wrote:

> Add the DISCUSS prefix to the email title : )
>
> From: Jiangjie Qin <j...@linkedin.com>
> Date: Tuesday, May 12, 2015 at 4:51 PM
> To: "dev@kafka.apache.org" <dev@kafka.apache.org>
> Subject: Add missing API to old high level consumer
>
> Hi,
>
> I just noticed that in KAFKA-1650 (which is before we use KIP) we added an
> offset commit method in high level consumer that commits offsets using a
> user provided offset map.
>
> public void commitOffsets(Map<TopicAndPartition, OffsetAndMetadata>
> offsetsToCommit, boolean retryOnFailure);
>
> This method was added to all the Scala classes but I forgot to add it to
> Java API of ConsumerConnector. (Already regretting now. . .)
> This method is very useful in several cases and has been asked for from
> time to time. For example, people have several threads consuming messages
> and processing them. Without this method, one thread will unexpectedly
> commit offsets for another thread, thus might lose some messages if
> something goes wrong.
>
> I created KAFKA-2186 and hope we can add this missing method into the Java
> API of old high level consumer (literally a one-line change).
> Although this method should have been there since KAFKA-1650, adding this
> method to Java API now is a public API change, just want to see if people
> think we need a KIP for this.
>
> Thanks.
>
> Jiangjie (Becket) Qin
>
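
The one-line addition discussed above would look roughly like the following in
kafka.javaapi.consumer.ConsumerConnector. This is an illustrative fragment; the
parameter types follow the method KAFKA-1650 added on the Scala side.

    import java.util.Map;
    import kafka.common.OffsetAndMetadata;
    import kafka.common.TopicAndPartition;

    public interface ConsumerConnector {
        // ... existing methods elided ...

        // Proposed addition (KAFKA-2186): commit a caller-supplied offset map so
        // each processing thread can commit only the partitions it owns.
        void commitOffsets(Map<TopicAndPartition, OffsetAndMetadata> offsetsToCommit,
                           boolean retryOnFailure);
    }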


Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83998
---


A few more minor comments.

General comment on javadocs (this can wait): the current patch has some, but 
hopefully the final patch will have more uniformly thorough javadocs. A similar 
comment applies to logging; e.g., should we add trace messages at various 
stages of the handshake?

BTW thanks a lot for the patch - this is a lot of work!


clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java


May be able to remove values() if we avoid passing in the full config as 
mentioned in Selector



clients/src/main/java/org/apache/kafka/common/network/Authenticator.java


I'm also unclear on this API. Will discuss offline with you.



clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java


These can be private.



clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java


May want to just have a final member and return that.



clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java


Can we just use a fixed thread pool?

A more general question/concern though:

If there are multiple inbound connections all of which end up needing 
delegated tasks, wouldn't they block? i.e., since the pool size is one?

From my reading of the JSSE ref guide this would typically be when you have 
a custom trust or key manager. Are there other use-cases?



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


`if (handshakeResult.getStatus() == Status.OK && handshakeStatus == 
HandshakeStatus.NEED_TASK)`



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Should this be unwrapped into the appReadBuffer?



clients/src/main/java/org/apache/kafka/common/network/Selector.java


Rather than leak in the entire config here, should we just initialize 
SSLFactory outside and pass that in instead?

We can add constructors for different authentication mechanisms as needed.
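
As a rough sketch of this suggestion (the shapes below are assumptions, not the
patch's actual signatures), the SSL machinery would be built from the config once
at the call site and only that collaborator handed to the Selector:

    import javax.net.ssl.SSLEngine;

    // Hedged sketch of constructor injection instead of passing the full config.
    public class SelectorSketch {
        interface SslFactoryLike {                     // stand-in for SSLFactory
            SSLEngine createEngine(String host, int port);
        }

        private final SslFactoryLike sslFactory;       // null for plaintext-only use

        public SelectorSketch() {                      // plaintext constructor
            this(null);
        }

        public SelectorSketch(SslFactoryLike sslFactory) {
            this.sslFactory = sslFactory;
        }
    }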



clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java


typo



clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java


1, "SSL"



clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java


@Override annotations on this and a couple of other methods.


- Joel Koshy


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/Producer

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Jun Rao
Aditya,

In the following, we should encode the config properties as a json map to
be consistent with topic config.

"Internally, the znodes are comma-separated key-value pairs where key
represents the configuration property to change.
{"version": x, "config" : {X1=Y1, X2=Y2..}}"

Thanks,

Jun

On Fri, May 15, 2015 at 1:09 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Yes we did. I just overlooked that line.. cleaning it up now.
>
> Aditya
>
> 
> From: Gwen Shapira [gshap...@cloudera.com]
> Sent: Friday, May 15, 2015 12:55 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-21 Configuration Management
>
> The wiki says:
> "There will be 3 paths within config
> /config/clients/
> /config/topics/
> /config/brokers/
> Didn't we decide that brokers will not be configured dynamically, rather we
> will keep the config in the file?
>
> On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Updated the wiki to capture our recent discussions. Please read.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> >
> > Thanks,
> > Aditya
> >
> > 
> > From: Joel Koshy [jjkosh...@gmail.com]
> > Sent: Tuesday, May 12, 2015 1:09 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-21 Configuration Management
> >
> > The lack of audit could be addressed to some degree by an internal
> > __config_changes topic which can have very long retention. Also, per
> > the hangout summary that Gwen sent out it appears that we decided
> > against supporting SIGHUP/dynamic configs for the broker.
> >
> > Thanks,
> >
> > Joel
> >
> > On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:
> > > Thanks for chiming in, Todd!
> > >
> > > Agree that the lack of audit and rollback is a major downside of moving
> > all
> > > configs to ZooKeeper. Being able to configure dynamically created
> > entities
> > > in Kafka is required though. So I think what Todd suggested is a good
> > > solution to managing all configs - catching SIGHUP for broker configs
> and
> > > storing dynamic configs in ZK like we do today.
> > >
> > > On Tue, May 12, 2015 at 10:30 AM, Jay Kreps 
> wrote:
> > >
> > > > Hmm, here is how I think we can change the split brain proposal to
> > make it
> > > > a bit better:
> > > > 1. Get rid of broker overrides, this is just done in the config file.
> > This
> > > > makes the precedence chain a lot clearer (e.g. zk always overrides
> > file on
> > > > a per-entity basis).
> > > > 2. Get rid of the notion of dynamic configs in ConfigDef and in the
> > broker.
> > > > All overrides are dynamic and all server configs are static.
> > > > 3. Create an equivalent of LogConfig for ClientConfig and any future
> > config
> > > > type we make.
> > > > 4. Generalize the TopicConfigManager to handle multiple types of
> > overrides.
> > > >
> > > > What we haven't done is try to think through how the pure zk approach
> > would
> > > > work.
> > > >
> > > > -Jay
> > > >
> > > >
> > > >
> > > > On Mon, May 11, 2015 at 10:53 PM, Ashish Singh 
> > > > wrote:
> > > >
> > > > > I agree with the Joel's suggestion on keeping broker's configs in
> > > > > config file and clients/topics config in ZK. Few other projects,
> > Apache
> > > > > Solr for one, also does something similar for its configurations.
> > > > >
> > > > > On Monday, May 11, 2015, Gwen Shapira 
> wrote:
> > > > >
> > > > > > I like this approach (obviously).
> > > > > > I am also OK with supporting broker re-read of config file based
> > on ZK
> > > > > > watch instead of SIGHUP, if we see this as more consistent with
> the
> > > > rest
> > > > > of
> > > > > > our code base.
> > > > > >
> > > > > > Either is fine by me as long as brokers keep the file and just do
> > > > refresh
> > > > > > :)
> > > > > >
> > > > > > On Tue, May 12, 2015 at 2:54 AM, Joel Koshy  > > > > > > wrote:
> > > > > >
> > > > > > > So the general concern here is the dichotomy of configs (which
> we
> > > > > > > already have - i.e., in the form of broker config files vs
> topic
> > > > > > > configs in zookeeper). We (at LinkedIn) had some discussions on
> > this
> > > > > > > last week and had this very question for the operations team
> > whose
> > > > > > > opinion is I think to a large degree a touchstone for this
> > decision:
> > > > > > > "Has the operations team at LinkedIn experienced any pain so
> far
> > with
> > > > > > > managing topic configs in ZooKeeper (while broker configs are
> > > > > > > file-based)?" It turns out that ops overwhelmingly favors the
> > current
> > > > > > > approach. i.e., service configs as file-based configs and
> > > > client/topic
> > > > > > > configs in ZooKeeper is intuitive and works great. This may be
> > > > > > > somewhat counter-intuitive to devs, but this is one of those
> > > > decisions
> > > > > > > for which ops input is very critical - because for all
>

Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Jun Rao
+1

Thanks,

Jun

On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt <
pbrahmbh...@hortonworks.com> wrote:

> Hi,
>
> Opening the voting thread for KIP-11.
>
> Link to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
> Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
>
> Thanks
> Parth
>


[jira] [Commented] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth

2015-05-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546356#comment-14546356
 ] 

Jun Rao commented on KAFKA-2147:


Interesting, looking at the code again, it seems that the cleaning logic may 
not be triggered on time if the requests in the delay queue keep getting 
satisfied before they are expired. I am attaching a new patch. Could you give 
it a try?

> Unbalanced replication can cause extreme purgatory growth
> -
>
> Key: KAFKA-2147
> URL: https://issues.apache.org/jira/browse/KAFKA-2147
> Project: Kafka
>  Issue Type: Bug
>  Components: purgatory, replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jun Rao
> Attachments: KAFKA-2147.patch, KAFKA-2147_2015-05-14_09:41:56.patch, 
> KAFKA-2147_2015-05-15_16:14:44.patch, purgatory.log, watch-lists.log
>
>
> Apologies in advance, this is going to be a bit of complex description, 
> mainly because we've seen this issue several different ways and we're still 
> tying them together in terms of root cause and analysis.
> It is worth noting now that we have all our producers set up to send 
> RequiredAcks==-1, and that this includes all our MirrorMakers.
> I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully 
> that will incidentally fix this issue, or at least render it moot.
> h4. Symptoms
> Fetch request purgatory on a broker or brokers grows rapidly and steadily at 
> a rate of roughly 1-5K requests per second. Heap memory used also grows to 
> keep pace. When 4-5 million requests have accumulated in purgatory, the 
> purgatory is drained, causing a substantial latency spike. The node will tend 
> to drop leadership, replicate, and recover.
> h5. Case 1 - MirrorMaker
> We first noticed this case when enabling mirrormaker. We had one primary 
> cluster already, with many producers and consumers. We created a second, 
> identical cluster and enabled replication from the original to the new 
> cluster on some topics using mirrormaker. This caused all six nodes in the 
> new cluster to exhibit the symptom in lockstep - their purgatories would all 
> grow together, and get drained within about 20 seconds of each other. The 
> cluster-wide latency spikes at this time caused several problems for us.
> Turning MM on and off turned the problem on and off very precisely. When we 
> stopped MM, the purgatories would all drop to normal levels immediately, and 
> would start climbing again when we restarted it.
> Note that this is the *fetch* purgatories on the brokers that MM was 
> *producing* to, which indicates fairly strongly that this is a replication 
> issue, not a MM issue.
> This particular cluster and MM setup was abandoned for other reasons before 
> we could make much progress debugging.
> h5. Case 2 - Broker 6
> The second time we saw this issue was on the newest broker (broker 6) in the 
> original cluster. For a long time we were running with five nodes, and 
> eventually added a sixth to handle the increased load. At first, we moved 
> only a handful of higher-volume partitions to this broker. Later, we created 
> a group of new topics (totalling around 100 partitions) for testing purposes 
> that were spread automatically across all six nodes. These topics saw 
> occasional traffic, but were generally unused. At this point broker 6 had 
> leadership for about an equal number of high-volume and unused partitions, 
> about 15-20 of each.
> Around this time (we don't have detailed enough data to prove real 
> correlation unfortunately), the issue started appearing on this broker as 
> well, but not on any of the other brokers in the cluster.
> h4. Debugging
> The first thing we tried was to reduce the 
> `fetch.purgatory.purge.interval.requests` from the default of 1000 to a much 
> lower value of 200. This had no noticeable effect at all.
> We then enabled debug logging on broker06 and started looking through that. I 
> can attach complete log samples if necessary, but the thing that stood out 
> for us was a substantial number of the following lines:
> {noformat}
> [2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with 
> correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory 
> (kafka.server.KafkaApis)
> {noformat}
> The volume of these lines seemed to match (approximately) the fetch purgatory 
> growth on that broker.
> At this point we developed a hypothesis (detailed below) which guided our 
> subsequent debugging tests:
> - Setting a daemon up to produce regular random data to all of the topics led 
> by kafka06 (specifically the ones which otherwise would receive no data) 
> substantially alleviated the problem.
> - Doing an additional rebalance of the cluster in order to move a number of 
> other topi

[jira] [Updated] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth

2015-05-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-2147:
---
Attachment: KAFKA-2147_2015-05-15_16:14:44.patch

> Unbalanced replication can cause extreme purgatory growth
> -
>
> Key: KAFKA-2147
> URL: https://issues.apache.org/jira/browse/KAFKA-2147
> Project: Kafka
>  Issue Type: Bug
>  Components: purgatory, replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jun Rao
> Attachments: KAFKA-2147.patch, KAFKA-2147_2015-05-14_09:41:56.patch, 
> KAFKA-2147_2015-05-15_16:14:44.patch, purgatory.log, watch-lists.log
>
>
> Apologies in advance, this is going to be a bit of complex description, 
> mainly because we've seen this issue several different ways and we're still 
> tying them together in terms of root cause and analysis.
> It is worth noting now that we have all our producers set up to send 
> RequiredAcks==-1, and that this includes all our MirrorMakers.
> I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully 
> that will incidentally fix this issue, or at least render it moot.
> h4. Symptoms
> Fetch request purgatory on a broker or brokers grows rapidly and steadily at 
> a rate of roughly 1-5K requests per second. Heap memory used also grows to 
> keep pace. When 4-5 million requests have accumulated in purgatory, the 
> purgatory is drained, causing a substantial latency spike. The node will tend 
> to drop leadership, replicate, and recover.
> h5. Case 1 - MirrorMaker
> We first noticed this case when enabling mirrormaker. We had one primary 
> cluster already, with many producers and consumers. We created a second, 
> identical cluster and enabled replication from the original to the new 
> cluster on some topics using mirrormaker. This caused all six nodes in the 
> new cluster to exhibit the symptom in lockstep - their purgatories would all 
> grow together, and get drained within about 20 seconds of each other. The 
> cluster-wide latency spikes at this time caused several problems for us.
> Turning MM on and off turned the problem on and off very precisely. When we 
> stopped MM, the purgatories would all drop to normal levels immediately, and 
> would start climbing again when we restarted it.
> Note that this is the *fetch* purgatories on the brokers that MM was 
> *producing* to, which indicates fairly strongly that this is a replication 
> issue, not a MM issue.
> This particular cluster and MM setup was abandoned for other reasons before 
> we could make much progress debugging.
> h5. Case 2 - Broker 6
> The second time we saw this issue was on the newest broker (broker 6) in the 
> original cluster. For a long time we were running with five nodes, and 
> eventually added a sixth to handle the increased load. At first, we moved 
> only a handful of higher-volume partitions to this broker. Later, we created 
> a group of new topics (totalling around 100 partitions) for testing purposes 
> that were spread automatically across all six nodes. These topics saw 
> occasional traffic, but were generally unused. At this point broker 6 had 
> leadership for about an equal number of high-volume and unused partitions, 
> about 15-20 of each.
> Around this time (we don't have detailed enough data to prove real 
> correlation unfortunately), the issue started appearing on this broker as 
> well, but not on any of the other brokers in the cluster.
> h4. Debugging
> The first thing we tried was to reduce the 
> `fetch.purgatory.purge.interval.requests` from the default of 1000 to a much 
> lower value of 200. This had no noticeable effect at all.
> We then enabled debug logging on broker06 and started looking through that. I 
> can attach complete log samples if necessary, but the thing that stood out 
> for us was a substantial number of the following lines:
> {noformat}
> [2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with 
> correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory 
> (kafka.server.KafkaApis)
> {noformat}
> The volume of these lines seemed to match (approximately) the fetch purgatory 
> growth on that broker.
> At this point we developed a hypothesis (detailed below) which guided our 
> subsequent debugging tests:
> - Setting a daemon up to produce regular random data to all of the topics led 
> by kafka06 (specifically the ones which otherwise would receive no data) 
> substantially alleviated the problem.
> - Doing an additional rebalance of the cluster in order to move a number of 
> other topics with regular data to kafka06 appears to have solved the problem 
> completely.
> h4. Hypothesis
> Current versions (0.8.2.1 and earlier) have issues with the replica fetcher 
> not backing off correctly (KAFKA-1461, KAFKA-2082 and others). I

[jira] [Commented] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth

2015-05-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546355#comment-14546355
 ] 

Jun Rao commented on KAFKA-2147:


Updated reviewboard https://reviews.apache.org/r/34125/diff/
 against branch origin/0.8.2

> Unbalanced replication can cause extreme purgatory growth
> -
>
> Key: KAFKA-2147
> URL: https://issues.apache.org/jira/browse/KAFKA-2147
> Project: Kafka
>  Issue Type: Bug
>  Components: purgatory, replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jun Rao
> Attachments: KAFKA-2147.patch, KAFKA-2147_2015-05-14_09:41:56.patch, 
> KAFKA-2147_2015-05-15_16:14:44.patch, purgatory.log, watch-lists.log
>
>
> Apologies in advance, this is going to be a bit of complex description, 
> mainly because we've seen this issue several different ways and we're still 
> tying them together in terms of root cause and analysis.
> It is worth noting now that we have all our producers set up to send 
> RequiredAcks==-1, and that this includes all our MirrorMakers.
> I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully 
> that will incidentally fix this issue, or at least render it moot.
> h4. Symptoms
> Fetch request purgatory on a broker or brokers grows rapidly and steadily at 
> a rate of roughly 1-5K requests per second. Heap memory used also grows to 
> keep pace. When 4-5 million requests have accumulated in purgatory, the 
> purgatory is drained, causing a substantial latency spike. The node will tend 
> to drop leadership, replicate, and recover.
> h5. Case 1 - MirrorMaker
> We first noticed this case when enabling mirrormaker. We had one primary 
> cluster already, with many producers and consumers. We created a second, 
> identical cluster and enabled replication from the original to the new 
> cluster on some topics using mirrormaker. This caused all six nodes in the 
> new cluster to exhibit the symptom in lockstep - their purgatories would all 
> grow together, and get drained within about 20 seconds of each other. The 
> cluster-wide latency spikes at this time caused several problems for us.
> Turning MM on and off turned the problem on and off very precisely. When we 
> stopped MM, the purgatories would all drop to normal levels immediately, and 
> would start climbing again when we restarted it.
> Note that this is the *fetch* purgatories on the brokers that MM was 
> *producing* to, which indicates fairly strongly that this is a replication 
> issue, not a MM issue.
> This particular cluster and MM setup was abandoned for other reasons before 
> we could make much progress debugging.
> h5. Case 2 - Broker 6
> The second time we saw this issue was on the newest broker (broker 6) in the 
> original cluster. For a long time we were running with five nodes, and 
> eventually added a sixth to handle the increased load. At first, we moved 
> only a handful of higher-volume partitions to this broker. Later, we created 
> a group of new topics (totalling around 100 partitions) for testing purposes 
> that were spread automatically across all six nodes. These topics saw 
> occasional traffic, but were generally unused. At this point broker 6 had 
> leadership for about an equal number of high-volume and unused partitions, 
> about 15-20 of each.
> Around this time (we don't have detailed enough data to prove real 
> correlation unfortunately), the issue started appearing on this broker as 
> well, but not on any of the other brokers in the cluster.
> h4. Debugging
> The first thing we tried was to reduce the 
> `fetch.purgatory.purge.interval.requests` from the default of 1000 to a much 
> lower value of 200. This had no noticeable effect at all.
> We then enabled debug logging on broker06 and started looking through that. I 
> can attach complete log samples if necessary, but the thing that stood out 
> for us was a substantial number of the following lines:
> {noformat}
> [2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with 
> correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory 
> (kafka.server.KafkaApis)
> {noformat}
> The volume of these lines seemed to match (approximately) the fetch purgatory 
> growth on that broker.
> At this point we developed a hypothesis (detailed below) which guided our 
> subsequent debugging tests:
> - Setting a daemon up to produce regular random data to all of the topics led 
> by kafka06 (specifically the ones which otherwise would receive no data) 
> substantially alleviated the problem.
> - Doing an additional rebalance of the cluster in order to move a number of 
> other topics with regular data to kafka06 appears to have solved the problem 
> completely.
> h4. Hypothesis
> Current versions (0.8.2.1 and earlier) have issu

Re: Review Request 34125: Patch for KAFKA-2147

2015-05-15 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34125/
---

(Updated May 15, 2015, 11:14 p.m.)


Review request for kafka.


Bugs: KAFKA-2147
https://issues.apache.org/jira/browse/KAFKA-2147


Repository: kafka


Description (updated)
---

add instrumentation


return from pollExpired() on expired items


Diffs (updated)
-

  core/src/main/scala/kafka/server/FetchRequestPurgatory.scala 
ed1318891253556cdf4d908033b704495acd5724 
  core/src/main/scala/kafka/server/ProducerRequestPurgatory.scala 
e7ff411b6dd98732cb2b33f6d8b65d96bbac688e 
  core/src/main/scala/kafka/server/RequestPurgatory.scala 
87ee3bec9c57d7322f1b5d8caa74b1c8415bcf49 

Diff: https://reviews.apache.org/r/34125/diff/


Testing
---


Thanks,

Jun Rao



Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83993
---



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


I think Michael meant the following which I think is valid right?
Line 146: handshakeWrap completes and the status changes to NEED_UNWRAP 
(since SSLEngine has finished wrapping), but the netBuffer has not yet been 
flushed. So on line 153 we would fall through to the NEED_UNWRAP case without 
doing the flush.



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Actually, can you describe how this would be done (say, for dealing with 
revoked certificates after a client authenticates)? Per your jira comment we 
can use an authorizer to block the client in this case, but if you have a 
proposal on handling periodic renegotiation it would be useful to discuss that. 
I agree we don't need to implement it now.


- Joel Koshy


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  

Re: [DISCUSS] KIP-19 Add a request timeout to NetworkClient

2015-05-15 Thread Mayuresh Gharat
+1 on creating a new connection.
So from what I understand the request timeout should be greater than
the replication timeout in any case.

If the broker is slow or not responding and the request times out, we will
treat it the same way we treat disconnections: update the metadata and try
sending the request to the new leader, or to the same broker on a new connection.

Thanks,

Mayuresh
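
For illustration, a rough sketch of the behavior described above (hypothetical
names and values, not the actual NetworkClient code): a request that has been in
flight longer than request.timeout.ms is treated exactly like a disconnection.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Rough sketch only: an in-flight request older than request.timeout.ms is handled
    // like a disconnection -- drop the connection, mark metadata stale, and let the retry
    // go to whichever broker the refreshed metadata points at.
    class RequestTimeoutSketch {
        static final long REQUEST_TIMEOUT_MS = 30_000L;    // hypothetical default value

        private final Deque<Long> inFlightSendTimes = new ArrayDeque<>();
        private boolean connected = true;
        private boolean metadataRefreshNeeded = false;

        void onSend(long nowMs) {
            inFlightSendTimes.addFirst(nowMs);              // newest first, oldest at the tail
        }

        void pollTimeouts(long nowMs) {
            Long oldest = inFlightSendTimes.peekLast();
            if (oldest != null && nowMs - oldest > REQUEST_TIMEOUT_MS) {
                inFlightSendTimes.clear();                  // fail everything on this connection
                connected = false;                          // same handling as a disconnection
                metadataRefreshNeeded = true;               // retry goes to the (possibly new) leader
            }
        }
    }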

On Fri, May 15, 2015 at 2:14 PM, Jiangjie Qin 
wrote:

> I modified the WIKI page to incorporate the feedbacks from mailing list
> and KIP hangout.
>
> - Added the deprecation plan for TIMEOUT_CONFIG
> - Added the actions to take after request timeout
>
> I finally chose to create a new connection if requests timeout. The reason
> is:
> 1. In most cases, if a broker is just slow, as long as we set request
> timeout to be a reasonable value, we should not see many new connections
> get created.
> 2. If a broker is down, hopefully metadata refresh will find the new
> broker and we will not try to reconnect to the broker anymore.
>
> Comments are welcome!
>
> Thanks.
>
> Jiangjie (Becket) Qin
>
> On 5/12/15, 2:59 PM, "Mayuresh Gharat"  wrote:
>
> >+1 Becket. That would give enough time for clients to move. We should make
> >this change very clear.
> >
> >Thanks,
> >
> >Mayuresh
> >
> >On Tue, May 12, 2015 at 1:45 PM, Jiangjie Qin 
> >wrote:
> >
> >> Hey Ewen,
> >>
> >> Very good summary about the compatibility. What you proposed makes
> >>sense.
> >> So basically we can do the following:
> >>
> >> In next release, i.e. 0.8.3:
> >> 1. Add REPLICATION_TIMEOUT_CONFIG (“replication.timeout.ms”)
> >> 2. Mark TIMEOUT_CONFIG as deprecated
> >> 3. Override REPLICATION_TIMEOUT_CONFIG with TIMEOUT_CONFIG if it is
> >> defined and give a warning about deprecation.
> >> In the release after 0.8.3, we remove TIMEOUT_CONFIG.
> >>
> >> This should give enough buffer for this change.
> >>
> >> Request timeout is a complete new thing we add to fix a bug, I’m with
> >>you
> >> it does not make sense to have it maintain the old buggy behavior. So we
> >> can set it to a reasonable value instead of infinite.
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On 5/12/15, 12:03 PM, "Ewen Cheslack-Postava" 
> wrote:
> >>
> >> >I think my confusion is coming from this:
> >> >
> >> >> So in this KIP, we only address (3). The only public interface change
> >> >>is a
> >> >> new configuration of request timeout (and maybe change the
> >>configuration
> >> >> name of TIMEOUT_CONFIG to REPLICATION_TIMEOUT_CONFIG).
> >> >
> >> >There are 3 possible compatibility issues I see here:
> >> >
> >> >* I assumed this meant the constants also change, so "timeout.ms"
> >>becomes
> >> >"
> >> >replication.timeout.ms". This breaks config files that worked on the
> >> >previous version and the only warning would be in release notes. We do
> >> >warn
> >> >about unused configs so they might notice the problem.
> >> >
> >> >* Binary and source compatibility if someone configures their client in
> >> >code and uses the TIMEOUT_CONFIG variable. Renaming it will cause
> >>existing
> >> >jars to break if you try to run against an updated client (which seems
> >>not
> >> >very significant since I doubt people upgrade these without recompiling
> >> >but
> >> >maybe I'm wrong about that). And it breaks builds without having
> >>deprecated
> >> >that field first, which again, is probably not the biggest issue but is
> >> >annoying for users and when we accidentally changed the API we
> >>received a
> >> >complaint about breaking builds.
> >> >
> >> >* Behavior compatibility as Jay mentioned on the call -- setting the
> >> >config
> >> >(even if the name changed) doesn't have the same effect it used to.
> >> >
> >> >One solution, which admittedly is more painful to implement and
> >>maintain,
> >> >would be to maintain the timeout.ms config, have it override the
> others
> >> if
> >> >it is specified (including an infinite request timeout I guess?), and
> >>if
> >> >it
> >> >isn't specified, we can just use the new config variables. Given a real
> >> >deprecation schedule, users would have better warning of changes and a
> >> >window to make the changes.
> >> >
> >> >I actually think it might not be necessary to maintain the old behavior
> >> >precisely, although maybe for some code it is an issue if they start
> >> >seeing
> >> >timeout exceptions that they wouldn't have seen before?
> >> >
> >> >-Ewen
> >> >
> >> >On Wed, May 6, 2015 at 6:06 PM, Jun Rao  wrote:
> >> >
> >> >> Jiangjie,
> >> >>
> >> >> Yes, I think using metadata timeout to expire batches in the record
> >> >> accumulator makes sense.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Jun
> >> >>
> >> >> On Mon, May 4, 2015 at 10:32 AM, Jiangjie Qin
> >> >>
> >> >> wrote:
> >> >>
> >> >> > I incorporated Ewen and Guozhang’s comments in the KIP page. Want
> >>to
> >> >> speed
> >> >> > up on this KIP because currently we experience mirror-maker hung
> >>very
> >> >> > likely when a broker is down.
> >> >> >
> >> >> > I also took a shot to solve KAFKA-1788 in KAFKA

[jira] [Commented] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546320#comment-14546320
 ] 

Joel Koshy commented on KAFKA-2160:
---

I would be surprised if the global RW lock is a problem since it is all in 
memory and purged only at an interval. This works well for the coordinator use 
case where you have empty watchers created slowly. As we discussed offline 
earlier in the week, I suggested this for simplicity. If in future there are 
other delayed operation purgatories in which these get created very quickly 
then this approach may not work as well. However, I doubt we will have such 
use-cases.
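
For illustration only, a minimal Java sketch (not the actual Scala purgatory code) of the 
locking scheme described above: watchers are added under the shared read lock on the hot 
path, and empty watcher lists are pruned under the exclusive write lock only during the 
periodic purge. All names here are hypothetical.

    import java.util.Queue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    class WatcherPoolSketch<K, O> {
        private final ConcurrentHashMap<K, Queue<O>> watchersForKey = new ConcurrentHashMap<>();
        private final ReentrantReadWriteLock removalLock = new ReentrantReadWriteLock();

        // Hot path: many threads may add watchers concurrently under the shared read lock.
        void watch(K key, O operation) {
            removalLock.readLock().lock();
            try {
                watchersForKey.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(operation);
            } finally {
                removalLock.readLock().unlock();
            }
        }

        // Runs only at an interval: the exclusive write lock keeps an add from racing
        // with the removal of the (momentarily empty) list it is about to reuse.
        void purgeEmptyWatcherLists() {
            removalLock.writeLock().lock();
            try {
                watchersForKey.entrySet().removeIf(e -> e.getValue().isEmpty());
            } finally {
                removalLock.writeLock().unlock();
            }
        }
    }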

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546290#comment-14546290
 ] 

Guozhang Wang commented on KAFKA-2160:
--

[~junrao][~jjkoshy] Could you take a look at the most recent approach? If it 
looks promising to you I will pursue this direction and make some more 
optimizations on the coordinator's usage of the purgatories.

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546287#comment-14546287
 ] 

Guozhang Wang edited comment on KAFKA-2160 at 5/15/15 10:27 PM:


Ran kafka.TestPurgatoryPerformance to compare the performance of purgatory 
against trunk.

Setup: request rate = 1000 / sec, latency pct50 = 50ms, pct75 = 75ms, each 
request has 3 keys. Test with varying total key space size (with more keys it 
is more likely that some key's watch list gets empty and hence can be removed). 

The following table shows elapsed time (ms):

||#.keys||100||1000||1||1||10||
|trunk (without global lock)|100339|100616|100018|100280|100324|
|K2160 (with global lock)|100157|100270|100330|99934|99867|


was (Author: guozhang):
Ran kafka.TestPurgatoryPerformance to compare the performance of purgatory 
against trunk.

Setup: request rate = 1000 / sec, latency pct50 = 50ms, pct75 = 75ms, each 
request has 3 keys. Test with varying total key space size (with more keys it 
is more likely that some key's watch list gets empty and hence can be removed). 

The following table shows elapsed time (ms):

||#.keys||100||1000||1||1||10||
|trunk (without global lock)|100339|100616|100018|100280|100324|
|K2160 (with global lock)|100157|100270||100330|99934|99867|

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546287#comment-14546287
 ] 

Guozhang Wang commented on KAFKA-2160:
--

Ran kafka.TestPurgatoryPerformance to compare the performance of purgatory 
against trunk.

Setup: request rate = 1000 / sec, latency pct50 = 50ms, pct75 = 75ms, each 
request has 3 keys. Test with varying total key space size (with more keys it 
is more likely that some key's watch list gets empty and hence can be removed). 

The following table shows elapsed time (ms):

||#.keys||100||1000||1||1||10||
|trunk (without global lock)|100339|100616|100018|100280|100324|
|K2160 (with global lock)|100157|100270||100330|99934|99867|

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33065: Patch for KAFKA-1928

2015-05-15 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33065/#review83959
---


Thanks for the patch. There seem to be some compilation errors. Perhaps it needs a 
rebase.

:core:compileScala/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/ConsumerMetadataRequest.scala:68:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, errorResponse)))
  ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/ControlledShutdownRequest.scala:65:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, errorResponse)))
  ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/FetchRequest.scala:152:
 type mismatch;
 found   : kafka.api.FetchResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new RequestChannel.Response(request, new 
FetchResponseSend(request.connectionId, errorResponse)))
 ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/LeaderAndIsrRequest.scala:189:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, errorResponse)))
  ^

/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/OffsetCommitRequest.scala:78:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/OffsetCommitRequest.scala:167:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, commitResponse)))
  ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/OffsetFetchRequest.scala:99:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, errorResponse)))
  ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/api/OffsetRequest.scala:121:
 type mismatch;
 found   : kafka.network.RequestOrResponseSend
 required: kafka.network.Send
requestChannel.sendResponse(new Response(request, new 
RequestOrResponseSend(request.connectionId, errorResponse)))
  ^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/network/RequestChannel.scala:29:
 imported `Send' is permanently hidden by definition of trait Send in package 
network
import org.apache.kafka.common.network.Send


clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java


We don't need this since idle connections only need to be closed on the 
server side.



clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java


We don't need this since idle connections only need to be closed on the 
server side.



clients/src/main/java/org/apache/kafka/common/network/MultiSend.java


Actually, with MultiSend, we will be sending a 4-byte size plus the 
payload, the sum of which could be a bit larger than max int. So, I think we 
need to make writeTo and size return long instead of int in Send. Sorry for the 
incorrect suggestion earlier.
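
A small self-contained sketch of that suggestion (a hypothetical interface, not the one 
in this patch), with size() and writeTo() returning long so the summed size of a 
MultiSend cannot overflow an int:

    import java.io.IOException;
    import java.nio.channels.GatheringByteChannel;
    import java.util.List;

    // Hedged sketch only -- not the interface in this patch: size() and writeTo()
    // return long so a MultiSend whose parts add up to more than Integer.MAX_VALUE
    // bytes is still reported correctly.
    interface LongSend {
        long size();
        long writeTo(GatheringByteChannel channel) throws IOException;
        boolean completed();
    }

    class MultiSendSketch implements LongSend {
        private final List<LongSend> sends;
        private int current = 0;

        MultiSendSketch(List<LongSend> sends) { this.sends = sends; }

        @Override
        public long size() {
            long total = 0;                       // long accumulator: an int sum could overflow
            for (LongSend s : sends)
                total += s.size();
            return total;
        }

        @Override
        public long writeTo(GatheringByteChannel channel) throws IOException {
            long written = 0;
            while (current < sends.size()) {
                written += sends.get(current).writeTo(channel);
                if (!sends.get(current).completed())
                    break;                        // the channel can't take more right now
                current++;
            }
            return written;
        }

        @Override
        public boolean completed() {
            return current >= sends.size();
        }
    }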



clients/src/main/java/org/apache/kafka/common/network/MultiSend.java


To be consistent, there is no need to wrap a single-line statement in {}.



clients/src/main/java/org/apache/kafka/common/network/Selector.java


It seems that we don't really need IdentityHashMap to optimize performance. 
In the following link, HashMap gives comparable performance to IdentityHashMap 
on String keys since String caches the hashcode.
http://java-performance.info/java-util-identityhashmap/



clients/src/main/java/org/apache/kafka/common/network/Selector.java


We only need 

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Sriharsha Chintalapani


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java,
> >  line 29
> > 
> >
> > Will it be "pluggable"; i.e. can individual sites provide their own 
> > implementations?

This has been made into a pluggable class. Please check revision 6 of the patch.


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83807
---


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/utils/Utils.java 
> f73eedb030987f018d8446bb1dcd98d19fa97331 
>   clients/src/test/java/org/apache/kafka/common/network/EchoServer.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SSLFactoryTest.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SSLSelectorTest.java 
> PRE-CREATION 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> d5b306b026e788b4e5479f3419805aa49ae889

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Sriharsha Chintalapani


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java, 
> > line 34
> > 
> >
> > This is probably just my ignorance of the Kafka codebase, but what does 
> > this interface represent? It looks a lot like a socket to me...

It does represent a socket, but instead of extending SocketChannel it acts as a 
delegation layer to the socket, allowing me to provide read/write methods without 
extending SocketChannel. Do you see this as an issue?


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  lines 404-410
> > 
> >
> > This is likely to be unhelpful in many cases; according to the javadoc 
> > for class X500Principal, the name will be derived from the Subject Name in 
> > the peer certificate. Typically, the Subject Name is the hostname of the 
> > cert to permit peer validation when tools like curl or a browser are used 
> > to hit the endpoint (for debugging purposes, say).
> > 
> > At LinkedIn, we use the Subject Alternative Names field to embed 
> > service identity.

In the latest version there is a pluggable PrincipalBuilder class which takes in 
both the authenticator and the transportLayer. It's up to the user which one 
they want to use.


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 251
> > 
> >
> > I'm not familiar with Kafka's threading model, but I know throughput is 
> > very important to you. Do you want to be blocking execution on this thread 
> > while you carry out these tasks?

The tasks() method runs the delegated tasks using an ExecutorService, so they won't 
run on the selector thread. But Rajini brought up an issue about muting the 
SelectionKey; I am working on it.


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 215
> > 
> >
> > Did you remember to flip `appWriteBuffer` before calling `wrap`?

not necessary here.


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 153
> > 
> >
> > I don't know if this is right. Suppose we return from `handshakeWrap` 
> > with data still to be written from `netOutBuffer` to the socket. The status 
> > will still be NEEDS_UNWRAP, because as far as SSLEngine is concerned, it's 
> > wrapped everything & ready for input. In this case, you'll fall through 
> > without flushing everything from `netOutBuffer`.

Not sure if I understood the comment: "Suppose we return from handshakeWrap 
with data still to be written from netOutBuffer to the socket." 
If we still have data in netOutBuffer, we return from the handshake process 
with the WRITE interest op registered for the next iteration. We don't 
fall through without flushing everything from netOutBuffer.
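
As an aside, a simplified sketch (hypothetical code, not the patch itself) of a wrap 
step that keeps OP_WRITE registered whenever netOutBuffer still holds pending handshake 
bytes, regardless of what the engine's handshake status says:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.SocketChannel;
    import javax.net.ssl.SSLEngine;
    import javax.net.ssl.SSLEngineResult;

    // Hedged sketch of the wrap step being debated, not the patch's actual code: after
    // wrap() the engine may already report NEED_UNWRAP, but if netOutBuffer was not
    // fully flushed we do NOT fall through to the unwrap case -- we keep OP_WRITE
    // registered and finish the flush on the next selector iteration.
    class HandshakeWrapSketch {
        private final SSLEngine sslEngine;
        private final SocketChannel socketChannel;
        private final ByteBuffer netOutBuffer;

        HandshakeWrapSketch(SSLEngine sslEngine, SocketChannel socketChannel, int netBufferSize) {
            this.sslEngine = sslEngine;
            this.socketChannel = socketChannel;
            this.netOutBuffer = ByteBuffer.allocate(netBufferSize);
        }

        /** Returns the interest ops to register for the next selector iteration. */
        int wrapStep(ByteBuffer appOutBuffer) throws IOException {
            netOutBuffer.clear();
            SSLEngineResult result = sslEngine.wrap(appOutBuffer, netOutBuffer);
            netOutBuffer.flip();
            socketChannel.write(netOutBuffer);          // may write only part of the buffer

            if (netOutBuffer.hasRemaining()) {
                // Handshake data still pending: stay in the write phase and flush the
                // rest next time around, whatever getHandshakeStatus() says.
                return SelectionKey.OP_WRITE;
            }
            switch (result.getHandshakeStatus()) {
                case NEED_UNWRAP:
                    return SelectionKey.OP_READ;        // now it is safe to move on to unwrap
                case NEED_WRAP:
                    return SelectionKey.OP_WRITE;
                default:
                    return 0;                           // NEED_TASK / FINISHED / NOT_HANDSHAKING
            }
        }
    }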


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  lines 112-121
> > 
> >
> > Sorry, but I'm a little confused here. Why would the caller be passing 
> > `read` & `write`? I mean, you own the underlying socket, so you know that 
> > state it's in. And you also have the handshake status, so you know what 
> > you're expecting, right?
> > 
> > Update: I've stepped through your logic, and it seems to me that you're 
> > figuring out at the end of each pass whether you want to read or write, 
> > returning that information to the caller, and asking the caller to pass it 
> > back in the next invocation of `handshake`. Why? Why not just record your 
> > state in a member variable?

I am depending on the SelectionKey interest ops. SocketChannel 
http://docs.oracle.com/javase/7/docs/api/java/nio/channels/SocketChannel.html 
doesn't tell me if it's ready for write or read. We depend on the Selector's 
SelectionKey ops to figure out if the SocketChannel is ready for write or read 
ops.


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 76
> > 
> >
> > I'm confused on the difference between `startHandshake` & `handshake` 
> > (inherited from interface `TransportLayer`).

startHandshake is more of a prepare method for handshake process and its only 
SSLTransport

Re: [DISCUSS] KIP-19 Add a request timeout to NetworkClient

2015-05-15 Thread Jiangjie Qin
I modified the WIKI page to incorporate the feedbacks from mailing list
and KIP hangout.

- Added the deprecation plan for TIMEOUT_CONFIG
- Added the actions to take after request timeout

I finally chose to create a new connection if requests timeout. The reason
is:
1. In most cases, if a broker is just slow, as long as we set request
timeout to be a reasonable value, we should not see many new connections
get created. 
2. If a broker is down, hopefully metadata refresh will find the new
broker and we will not try to reconnect to the broker anymore.

Comments are welcome!

Thanks.

Jiangjie (Becket) Qin

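A minimal sketch of the override-and-warn step in the deprecation plan mentioned above 
(the two config key names are from the discussion; the class and method names are 
illustrative only):

    import java.util.Map;

    // Hedged sketch: if the deprecated key is present it wins and a warning is printed;
    // otherwise the new key (or the default) is used.
    class ReplicationTimeoutCompat {
        static final String TIMEOUT_CONFIG = "timeout.ms";                        // deprecated
        static final String REPLICATION_TIMEOUT_CONFIG = "replication.timeout.ms";

        static int replicationTimeoutMs(Map<String, String> userConfig, int defaultMs) {
            if (userConfig.containsKey(TIMEOUT_CONFIG)) {
                System.err.println("Config '" + TIMEOUT_CONFIG + "' is deprecated; use '"
                        + REPLICATION_TIMEOUT_CONFIG + "' instead.");
                return Integer.parseInt(userConfig.get(TIMEOUT_CONFIG));           // old key wins
            }
            if (userConfig.containsKey(REPLICATION_TIMEOUT_CONFIG))
                return Integer.parseInt(userConfig.get(REPLICATION_TIMEOUT_CONFIG));
            return defaultMs;
        }
    }
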
On 5/12/15, 2:59 PM, "Mayuresh Gharat"  wrote:

>+1 Becket. That would give enough time for clients to move. We should make
>this change very clear.
>
>Thanks,
>
>Mayuresh
>
>On Tue, May 12, 2015 at 1:45 PM, Jiangjie Qin 
>wrote:
>
>> Hey Ewen,
>>
>> Very good summary about the compatibility. What you proposed makes
>>sense.
>> So basically we can do the following:
>>
>> In next release, i.e. 0.8.3:
>> 1. Add REPLICATION_TIMEOUT_CONFIG (“replication.timeout.ms”)
>> 2. Mark TIMEOUT_CONFIG as deprecated
>> 3. Override REPLICATION_TIMEOUT_CONFIG with TIMEOUT_CONFIG if it is
>> defined and give a warning about deprecation.
>> In the release after 0.8.3, we remove TIMEOUT_CONFIG.
>>
>> This should give enough buffer for this change.
>>
>> Request timeout is a complete new thing we add to fix a bug, I’m with
>>you
>> it does not make sense to have it maintain the old buggy behavior. So we
>> can set it to a reasonable value instead of infinite.
>>
>> Jiangjie (Becket) Qin
>>
>> On 5/12/15, 12:03 PM, "Ewen Cheslack-Postava"  wrote:
>>
>> >I think my confusion is coming from this:
>> >
>> >> So in this KIP, we only address (3). The only public interface change
>> >>is a
>> >> new configuration of request timeout (and maybe change the
>>configuration
>> >> name of TIMEOUT_CONFIG to REPLICATION_TIMEOUT_CONFIG).
>> >
>> >There are 3 possible compatibility issues I see here:
>> >
>> >* I assumed this meant the constants also change, so "timeout.ms"
>>becomes
>> >"
>> >replication.timeout.ms". This breaks config files that worked on the
>> >previous version and the only warning would be in release notes. We do
>> >warn
>> >about unused configs so they might notice the problem.
>> >
>> >* Binary and source compatibility if someone configures their client in
>> >code and uses the TIMEOUT_CONFIG variable. Renaming it will cause
>>existing
>> >jars to break if you try to run against an updated client (which seems
>>not
>> >very significant since I doubt people upgrade these without recompiling
>> >but
>> >maybe I'm wrong about that). And it breaks builds without having
>>deprecated
>> >that field first, which again, is probably not the biggest issue but is
>> >annoying for users and when we accidentally changed the API we
>>received a
>> >complaint about breaking builds.
>> >
>> >* Behavior compatibility as Jay mentioned on the call -- setting the
>> >config
>> >(even if the name changed) doesn't have the same effect it used to.
>> >
>> >One solution, which admittedly is more painful to implement and
>>maintain,
>> >would be to maintain the timeout.ms config, have it override the others
>> if
>> >it is specified (including an infinite request timeout I guess?), and
>>if
>> >it
>> >isn't specified, we can just use the new config variables. Given a real
>> >deprecation schedule, users would have better warning of changes and a
>> >window to make the changes.
>> >
>> >I actually think it might not be necessary to maintain the old behavior
>> >precisely, although maybe for some code it is an issue if they start
>> >seeing
>> >timeout exceptions that they wouldn't have seen before?
>> >
>> >-Ewen
>> >
>> >On Wed, May 6, 2015 at 6:06 PM, Jun Rao  wrote:
>> >
>> >> Jiangjie,
>> >>
>> >> Yes, I think using metadata timeout to expire batches in the record
>> >> accumulator makes sense.
>> >>
>> >> Thanks,
>> >>
>> >> Jun
>> >>
>> >> On Mon, May 4, 2015 at 10:32 AM, Jiangjie Qin
>> >>
>> >> wrote:
>> >>
>> >> > I incorporated Ewen and Guozhang’s comments in the KIP page. Want
>>to
>> >> speed
>> >> > up on this KIP because currently we experience mirror-maker hung
>>very
>> >> > likely when a broker is down.
>> >> >
>> >> > I also took a shot to solve KAFKA-1788 in KAFKA-2142. I used
>>metadata
>> >> > timeout to expire the batches which are sitting in accumulator
>>without
>> >> > leader info. I did that because the situation there is essentially
>> >> missing
>> >> > metadata.
>> >> >
>> >> > As a summary of what I am thinking about the timeout in new
>>Producer:
>> >> >
>> >> > 1. Metadata timeout:
>> >> >   - used in send(), blocking
>> >> >   - used in accumulator to expire batches with timeout exception.
>> >> > 2. Linger.ms
>> >> >   - Used in accumulator to ready the batch for drain
>> >> > 3. Request timeout
>> >> >   - Used in NetworkClient to expire a batch and retry if no
>>response
>> >>is
>> >> > received for a request 

Re: Review Request 34170: Patch for KAFKA-2191

2015-05-15 Thread Jay Kreps


> On May 13, 2015, 11:50 p.m., Aditya Auradkar wrote:
> > clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java,
> >  line 45
> > 
> >
> > I think this is a good catch.
> > 
> > Just so I understand, this is adding new samples for blank intervals of 
> > time when record was not called. As an optimization, I think you should 
> > only add "MetricConfig.samples" number of samples. In the rare case, there 
> > is no activity for 10 mins (say), this will add 10sec*6*10 = 600 samples 
> > which will be purged immediately on the next record call.
> 
> Jay Kreps wrote:
> This is a really good point. It is totally possible for a metric to track 
> activity on a topic that has no writes for a month, the first write would 
> then cause you to cycle through a month of samples. The logic around 
> correctly skipping these windows and calculating the correct window boundary 
> needs to be carefully worked out.
> 
> Jay Kreps wrote:
> Actually maybe you can elaborate on why this is needed at all? In the 
> current code if the current sample is complete we add a new sample. Your code 
> adds lots of samples. But why do we need that? Isn't purging obsolete samples 
> handled on measurement already? I think that is more elegant. You must see 
> some issue there, maybe you can explain?
> 
> Dong Lin wrote:
> Adi: Good catch! I will update the patch to fix the problem.
> 
> Jay: Sure. Here is the problem I find with the current code.
> 
> - The current code implements SampledStat.advance in a way that is 
> probably different from users' expectation: typically when rate is sampled, 
> time will be divided into consecutive slots of equal length and samples are 
> expired in unit of these time slots, which assures the user the rate is 
> measured on samples collected in the past expireAge period. The current 
> implementation doesn't provide such an easy-to-understand interpretation, 
> since the effective sampled period can be anywhere between 0 and expireAge.
> 
> - As one extreme example, say we call rate.measure after expireAge (i.e. 
> config.samples() * config.timeWindowMs()) has passed since last sample. Then 
> purgeObsoleteSamples will reset all samples and elapsed will be 0 as a 
> result. This is problematic: rate should instead keep the information that we 
> haven’t observed any sample during the past expireAge.
> 
> Does it make sense?
> 
> Jay Kreps wrote:
> Hey Dong, I think what you are saying is that the last sample is partial, 
> that is it doesn't cover a full window yet. This is true and is by design. 
> That is the whole reason there are multiple samples--to stabalize the partial 
> estimate. The ideal window would be a backwards looking sliding window of N 
> ms from the time of measurement but that would be computationally unfeasible.
> 
> I don't understand your example. If no events have occurred in 
> config.samples() * config.timeWindowMs() then the observed rate is 0, right? 
> I think you are saying that the time estimate should be the full window, but 
> that is fixed by changing the other formula as I suggested.
> 
> Dong Lin wrote:
> Hey Jay, thanks for the reply. The problem I wanted to explain with this 
> example is that, if no events have occurred in config.samples() * 
> config.timeWindowMs(), oldest == now and the measured rate will be infinite 
> (which should be 0 instead).
> 
> The example explained above is only relevant to the current code. As you 
> said, using the other formula you suggested will fix the problem and avoid 
> the need to make this change here.
> 
> Just a couple of questions towards an acceptable patch. How about we use 
> the other formula you suggested in Rate.measure()? When the user calls 
> rate.measure() with number of samples <= 1 (i.e. elapsed <= timeWindowMs), 
> should we throw an exception or return n/timeWindowMs? And should we keep the 
> existing SampledStat.oldest()?
> 
> Thanks much,

Ah gotcha, yeah I think it is easiest to fix in measure() rather than in 
advance.

You definitely don't want to throw an exception in measure because it is 
totally correct to have any number of samples when measure is called. I think 
the proposed change was to require the configured number of samples (i.e. the max 
samples) to be at least 2 and then always use the formula we discussed in 
measure.

But here's the thing: this logic is very precise and has to be 
carefully thought out with each change. So don't listen to me too much; work it 
through, see if you agree that it makes sense, and if you don't, let's talk 
about alternatives.
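
To make that concrete, a hedged sketch of one way measure() could bound the elapsed time 
from below; this illustrates the idea under discussion and is not the committed patch:

    // Hedged sketch: the elapsed time used in the denominator is bounded below by
    // (numSamples - 1) * timeWindowMs, so a single partial window (or a long idle gap
    // that purged every sample) can never produce an infinite or wildly inflated rate.
    class RateSketch {
        private final int numSamples;      // configured number of samples, required to be >= 2
        private final long timeWindowMs;   // length of each sample window

        RateSketch(int numSamples, long timeWindowMs) {
            if (numSamples < 2)
                throw new IllegalArgumentException("at least 2 samples are required");
            this.numSamples = numSamples;
            this.timeWindowMs = timeWindowMs;
        }

        double ratePerSecond(double totalObserved, long oldestWindowStartMs, long nowMs) {
            long elapsedMs = nowMs - oldestWindowStartMs;
            long minElapsedMs = (numSamples - 1) * timeWindowMs;   // floor on the denominator
            return totalObserved / Math.max(elapsedMs, minElapsedMs) * 1000.0;
        }
    }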


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34170/#review83687
---


On M

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Sriharsha Chintalapani


> On May 15, 2015, 8:26 p.m., Michael Herstine wrote:
> > clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java,
> >  line 295
> > 
> >
> > Pursuant to my last comment-- what exactly does this config item do? In 
> > SSL, the server always presents a certificate; it is the _client_ cert 
> > which is optional. So I don't understand what a config item "client require 
> > cert" means.

It's meant for requiring a client-side cert, i.e. the client needs to provide a 
certificate.


On May 15, 2015, 8:26 p.m., Sriharsha Chintalapani wrote:
> > Thanks Srisharsha-- this is a big job, and non-blocking SSL is quite 
> > complex.

Thanks Michael for the review. It looks like you commented on revision 5, 
whereas the latest one is revision 6. I'll go through the comments anyway and 
address them.


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83807
---


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
> dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kaf

Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Jay Kreps
+1

-Jay

On Fri, May 15, 2015 at 9:18 AM, Parth Brahmbhatt <
pbrahmbh...@hortonworks.com> wrote:

> Hi,
>
> Opening the voting thread for KIP-11.
>
> Link to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
> Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
>
> Thanks
> Parth
>


Re: Review Request 34170: Patch for KAFKA-2191

2015-05-15 Thread Dong Lin


> On May 13, 2015, 11:50 p.m., Aditya Auradkar wrote:
> > clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java,
> >  line 45
> > 
> >
> > I think this is a good catch.
> > 
> > Just so I understand, this is adding new samples for blank intervals of 
> > time when record was not called. As an optimization, I think you should 
> > only add "MetricConfig.samples" number of samples. In the rare case, there 
> > is no activity for 10 mins (say), this will add 10sec*6*10 = 600 samples 
> > which will be purged immediately on the next record call.
> 
> Jay Kreps wrote:
> This is a really good point. It is totally possible for a metric to track 
> activity on a topic that has no writes for a month, the first write would 
> then cause you to cycle through a month of samples. The logic around 
> correctly skipping these windows and calculating the correct window boundary 
> needs to be carefully worked out.
> 
> Jay Kreps wrote:
> Actually maybe you can elaborate on why this is needed at all? In the 
> current code if the current sample is complete we add a new sample. Your code 
> adds lots of samples. But why do we need that? Isn't purging obsolete samples 
> handled on measurement already? I think that is more elegant. You must see 
> some issue there, maybe you can explain?
> 
> Dong Lin wrote:
> Adi: Good catch! I will update the patch to fix the problem.
> 
> Jay: Sure. Here is the problem I find with the current code.
> 
> - The current code implements SampledStat.advance in a way that is 
> probably different from users' expectation: typically when rate is sampled, 
> time will be divided into consecutive slots of equal length and samples are 
> expired in unit of these time slots, which assures the user the rate is 
> measured on samples collected in the past expireAge period. The current 
> implementation doesn't provide such an easy-to-understand interpretation, 
> since the effective sampled period can be anywhere between 0 and expireAge.
> 
> - As one extreme example, say we call rate.measure after expireAge (i.e. 
> config.samples() * config.timeWindowMs()) has passed since last sample. Then 
> purgeObsoleteSamples will reset all samples and elapsed will be 0 as a 
> result. This is problematic: rate should instead keep the information that we 
> haven’t observed any sample during the past expireAge.
> 
> Does it make sense?
> 
> Jay Kreps wrote:
> Hey Dong, I think what you are saying is that the last sample is partial, 
> that is it doesn't cover a full window yet. This is true and is by design. 
> That is the whole reason there are multiple samples--to stabilize the partial 
> estimate. The ideal window would be a backwards looking sliding window of N 
> ms from the time of measurement but that would be computationally unfeasible.
> 
> I don't understand your example. If no events have occurred in 
> config.samples() * config.timeWindowMs() then the observed rate is 0, right? 
> I think you are saying that the time estimate should be the full window, but 
> that is fixed by changing the other formula as I suggested.

Hey Jay, thanks for the reply. The problem I wanted to explain with this example is 
that, if no events have occurred in config.samples() * config.timeWindowMs(), 
oldest == now and the measured rate will be infinite (when it should be 0 
instead).

The example explained above is only relevant to the current code. As you said, 
using the other formula you suggested will fix the problem and avoid the need 
to make this change here.

Just a couple of questions towards an acceptable patch. How about we use the 
other formula you suggested in Rate.measure()? When the user calls 
rate.measure() with number of samples <= 1 (i.e. elapsed <= timeWindowMs), 
should we throw an exception or return n/timeWindowMs? And should we keep the 
existing SampledStat.oldest()?

Thanks much,


- Dong


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34170/#review83687
---


On May 14, 2015, 7:34 a.m., Dong Lin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34170/
> ---
> 
> (Updated May 14, 2015, 7:34 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2191
> https://issues.apache.org/jira/browse/KAFKA-2191
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2191; Measured rate should not be infinite
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 98429da34418f7f1deba1b5e44e2e6025212edb3 
>   
> clients/src/main/java/org/apache/kafka/common/metrics

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Michael Herstine

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83807
---



clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java


"Need client auth" is generally a _server_-side setting. Should this appear 
in a file named CommonClientConfigs.java?

If it should, are you not using "want client auth" as well as "require 
client auth"?



clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java


Pursuant to my last comment-- what exactly does this config item do? In 
SSL, the server always presents a certificate; it is the _client_ cert which is 
optional. So I don't understand what a config item "client require cert" means.



clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java


Just as a general aside: when you can't implement a method in a concrete 
subclass, and are reduced to throwing an exception to so indicate, you're 
violating the Liskov Substitution Principle-- the idea that a subclass can be 
used anywhere a superclass is.

To put this another way, this is likely to surprise client code written in 
terms of interface `TransportLayer`.



clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java


Thanks for making these configurable; I think a lot of sites will have 
strong opinions on what protocol versions & cipher suites they'll want to 
enable.



clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java


This is only applicable on the server-side.



clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java


This is interesting; I don't see a corresponding `createSSLSocketFactory`-- 
did I miss it? Besides, I thought Kafka used NIO (in which case you wouldn't be 
using `SSLServerSocketFactory`, which IIRC is blocking). Who calls this?



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Do you want to make your nomenclature more uniform? The network buffers are 
referred to in terms of "in" & "out" while the application buffers are referred 
to as "read" & "write".

On a related note, do you need read & write for each? At any given point in 
time, you're doing one or the other, so why not just have a "network" buffer & 
an "application" buffer?



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


I'm confused on the difference between `startHandshake` & `handshake` 
(inherited from interface `TransportLayer`).



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Do you want to set your buffers' limits to 0, here? Is not

netOutBuffer.clear()
netInBuffer.clear()

what you really want (i.e. current position at 0, limit at capacity, 
indicating that the buffers are available for write up to their capacities)

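For reference, a small standalone example of the two buffer states being compared above 
(plain JDK behavior, nothing Kafka-specific):

    import java.nio.ByteBuffer;

    public class BufferStateDemo {
        public static void main(String[] args) {
            ByteBuffer buf = ByteBuffer.allocate(8);

            buf.clear();                  // position = 0, limit = capacity (8): ready for writes
            System.out.println("after clear():   position=" + buf.position() + " limit=" + buf.limit());

            buf.position(0).limit(0);     // position = 0, limit = 0: nothing can be read or written
            System.out.println("after limit(0):  position=" + buf.position() + " limit=" + buf.limit());
        }
    }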


clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Sorry, but I'm a little confused here. Why would the caller be passing 
`read` & `write`? I mean, you own the underlying socket, so you know that state 
it's in. And you also have the handshake status, so you know what you're 
expecting, right?

Update: I've stepped through your logic, and it seems to me that you're 
figuring out at the end of each pass whether you want to read or write, 
returning that information to the caller, and asking the caller to pass it back 
in the next invocation of `handshake`. Why? Why not just record your state in a 
member variable?



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


I don't know if this is right. Suppose we return from `handshakeWrap` with 
data still to be written from `netOutBuffer` to the socket. The status will 
still be NEEDS_UNWRAP, because as far as SSLEngine is concerned, it's wrapped 
everything & ready for input. In this case, you'll fall through without 
flushing everything from `netOutBuffer`.



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


Did you remember to flip `appWriteBuffer` before calling `wrap`?



clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java


RE: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Aditya Auradkar
Yes we did. I just overlooked that line... cleaning it up now.

Aditya


From: Gwen Shapira [gshap...@cloudera.com]
Sent: Friday, May 15, 2015 12:55 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

The wiki says:
"There will be 3 paths within config
/config/clients/
/config/topics/
/config/brokers/"
Didn't we decide that brokers will not be configured dynamically, rather we
will keep the config in the file?

On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Updated the wiki to capture our recent discussions. Please read.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
>
> Thanks,
> Aditya
>
> 
> From: Joel Koshy [jjkosh...@gmail.com]
> Sent: Tuesday, May 12, 2015 1:09 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-21 Configuration Management
>
> The lack of audit could be addressed to some degree by an internal
> __config_changes topic which can have very long retention. Also, per
> the hangout summary that Gwen sent out it appears that we decided
> against supporting SIGHUP/dynamic configs for the broker.
>
> Thanks,
>
> Joel
>
> On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:
> > Thanks for chiming in, Todd!
> >
> > Agree that the lack of audit and rollback is a major downside of moving
> all
> > configs to ZooKeeper. Being able to configure dynamically created
> entities
> > in Kafka is required though. So I think what Todd suggested is a good
> > solution to managing all configs - catching SIGHUP for broker configs and
> > storing dynamic configs in ZK like we do today.
> >
> > On Tue, May 12, 2015 at 10:30 AM, Jay Kreps  wrote:
> >
> > > Hmm, here is how I think we can change the split brain proposal to
> make it
> > > a bit better:
> > > 1. Get rid of broker overrides, this is just done in the config file.
> This
> > > makes the precedence chain a lot clearer (e.g. zk always overrides
> file on
> > > a per-entity basis).
> > > 2. Get rid of the notion of dynamic configs in ConfigDef and in the
> broker.
> > > All overrides are dynamic and all server configs are static.
> > > 3. Create an equivalent of LogConfig for ClientConfig and any future
> config
> > > type we make.
> > > 4. Generalize the TopicConfigManager to handle multiple types of
> overrides.
> > >
> > > What we haven't done is try to think through how the pure zk approach
> would
> > > work.
> > >
> > > -Jay
> > >
> > >
> > >
> > > On Mon, May 11, 2015 at 10:53 PM, Ashish Singh 
> > > wrote:
> > >
> > > > I agree with the Joel's suggestion on keeping broker's configs in
> > > > config file and clients/topics config in ZK. Few other projects,
> Apache
> > > > Solr for one, also does something similar for its configurations.
> > > >
> > > > On Monday, May 11, 2015, Gwen Shapira  wrote:
> > > >
> > > > > I like this approach (obviously).
> > > > > I am also OK with supporting broker re-read of config file based
> on ZK
> > > > > watch instead of SIGHUP, if we see this as more consistent with the
> > > rest
> > > > of
> > > > > our code base.
> > > > >
> > > > > Either is fine by me as long as brokers keep the file and just do
> > > refresh
> > > > > :)
> > > > >
> > > > > On Tue, May 12, 2015 at 2:54 AM, Joel Koshy  > > > > > wrote:
> > > > >
> > > > > > So the general concern here is the dichotomy of configs (which we
> > > > > > already have - i.e., in the form of broker config files vs topic
> > > > > > configs in zookeeper). We (at LinkedIn) had some discussions on
> this
> > > > > > last week and had this very question for the operations team
> whose
> > > > > > opinion is I think to a large degree a touchstone for this
> decision:
> > > > > > "Has the operations team at LinkedIn experienced any pain so far
> with
> > > > > > managing topic configs in ZooKeeper (while broker configs are
> > > > > > file-based)?" It turns out that ops overwhelmingly favors the
> current
> > > > > > approach. i.e., service configs as file-based configs and
> > > client/topic
> > > > > > configs in ZooKeeper is intuitive and works great. This may be
> > > > > > somewhat counter-intuitive to devs, but this is one of those
> > > decisions
> > > > > > for which ops input is very critical - because for all practical
> > > > > > purposes, they are the users in this discussion.
> > > > > >
> > > > > > If we continue with this dichotomy and need to support dynamic
> config
> > > > > > for client/topic configs as well as select service configs then
> there
> > > > > > will need to be dichotomy in the config change mechanism as well.
> > > > > > i.e., client/topic configs will change via (say) a ZooKeeper
> watch
> > > and
> > > > > > the service config will change via a config file re-read (on
> SIGHUP)
> > > > > > after config changes have been pushed out to local files. Is
> this a
> > > > > > bad thing? Personally, I don't think it is - i.e. I'

Re: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Gwen Shapira
The wiki says:
"There will be 3 paths within config
/config/clients/
/config/topics/
/config/brokers/"
Didn't we decide that brokers will not be configured dynamically, rather we
will keep the config in the file?
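
For reference, a tiny sketch of the per-entity precedence discussed further down the 
thread ("zk always overrides file on a per-entity basis") as it would apply to client 
and topic configs; the names here are illustrative only.

    import java.util.HashMap;
    import java.util.Map;

    // Hedged sketch: static file-based defaults, with dynamic per-entity overrides
    // from ZooKeeper taking precedence.
    class ConfigPrecedenceSketch {
        static Map<String, String> effectiveConfig(Map<String, String> fileDefaults,
                                                   Map<String, String> zkOverridesForEntity) {
            Map<String, String> merged = new HashMap<>(fileDefaults);
            merged.putAll(zkOverridesForEntity);   // per-entity ZK overrides win
            return merged;
        }
    }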

On Fri, May 15, 2015 at 10:46 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Updated the wiki to capture our recent discussions. Please read.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
>
> Thanks,
> Aditya
>
> 
> From: Joel Koshy [jjkosh...@gmail.com]
> Sent: Tuesday, May 12, 2015 1:09 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-21 Configuration Management
>
> The lack of audit could be addressed to some degree by an internal
> __config_changes topic which can have very long retention. Also, per
> the hangout summary that Gwen sent out it appears that we decided
> against supporting SIGHUP/dynamic configs for the broker.
>
> Thanks,
>
> Joel
>
> On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:
> > Thanks for chiming in, Todd!
> >
> > Agree that the lack of audit and rollback is a major downside of moving
> all
> > configs to ZooKeeper. Being able to configure dynamically created
> entities
> > in Kafka is required though. So I think what Todd suggested is a good
> > solution to managing all configs - catching SIGHUP for broker configs and
> > storing dynamic configs in ZK like we do today.
> >
> > On Tue, May 12, 2015 at 10:30 AM, Jay Kreps  wrote:
> >
> > > Hmm, here is how I think we can change the split brain proposal to
> make it
> > > a bit better:
> > > 1. Get rid of broker overrides, this is just done in the config file.
> This
> > > makes the precedence chain a lot clearer (e.g. zk always overrides
> file on
> > > a per-entity basis).
> > > 2. Get rid of the notion of dynamic configs in ConfigDef and in the
> broker.
> > > All overrides are dynamic and all server configs are static.
> > > 3. Create an equivalent of LogConfig for ClientConfig and any future
> config
> > > type we make.
> > > 4. Generalize the TopicConfigManager to handle multiple types of
> overrides.
> > >
> > > What we haven't done is try to think through how the pure zk approach
> would
> > > work.
> > >
> > > -Jay
> > >
> > >
> > >
> > > On Mon, May 11, 2015 at 10:53 PM, Ashish Singh 
> > > wrote:
> > >
> > > > I agree with Joel's suggestion on keeping broker's configs in
> > > > config file and clients/topics config in ZK. Few other projects,
> Apache
> > > > Solr for one, also does something similar for its configurations.
> > > >
> > > > On Monday, May 11, 2015, Gwen Shapira  wrote:
> > > >
> > > > > I like this approach (obviously).
> > > > > I am also OK with supporting broker re-read of config file based
> on ZK
> > > > > watch instead of SIGHUP, if we see this as more consistent with the
> > > rest
> > > > of
> > > > > our code base.
> > > > >
> > > > > Either is fine by me as long as brokers keep the file and just do
> > > refresh
> > > > > :)
> > > > >
> > > > > On Tue, May 12, 2015 at 2:54 AM, Joel Koshy  > > > > > wrote:
> > > > >
> > > > > > So the general concern here is the dichotomy of configs (which we
> > > > > > already have - i.e., in the form of broker config files vs topic
> > > > > > configs in zookeeper). We (at LinkedIn) had some discussions on
> this
> > > > > > last week and had this very question for the operations team
> whose
> > > > > > opinion is I think to a large degree a touchstone for this
> decision:
> > > > > > "Has the operations team at LinkedIn experienced any pain so far
> with
> > > > > > managing topic configs in ZooKeeper (while broker configs are
> > > > > > file-based)?" It turns out that ops overwhelmingly favors the
> current
> > > > > > approach. i.e., service configs as file-based configs and
> > > client/topic
> > > > > > configs in ZooKeeper is intuitive and works great. This may be
> > > > > > somewhat counter-intuitive to devs, but this is one of those
> > > decisions
> > > > > > for which ops input is very critical - because for all practical
> > > > > > purposes, they are the users in this discussion.
> > > > > >
> > > > > > If we continue with this dichotomy and need to support dynamic
> config
> > > > > > for client/topic configs as well as select service configs then
> there
> > > > > > will need to be dichotomy in the config change mechanism as well.
> > > > > > i.e., client/topic configs will change via (say) a ZooKeeper
> watch
> > > and
> > > > > > the service config will change via a config file re-read (on
> SIGHUP)
> > > > > > after config changes have been pushed out to local files. Is
> this a
> > > > > > bad thing? Personally, I don't think it is - i.e. I'm in favor of
> > > this
> > > > > > approach. What do others think?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Joel
> > > > > >
> > > > > > On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
> > > > > > > What Todd said :)
> > > > > > >
> > > > > 

RE: [DISCUSS] KIP-21 Configuration Management

2015-05-15 Thread Aditya Auradkar
Updated the wiki to capture our recent discussions. Please read.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

Thanks,
Aditya


From: Joel Koshy [jjkosh...@gmail.com]
Sent: Tuesday, May 12, 2015 1:09 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

The lack of audit could be addressed to some degree by an internal
__config_changes topic which can have very long retention. Also, per
the hangout summary that Gwen sent out it appears that we decided
against supporting SIGHUP/dynamic configs for the broker.

Thanks,

Joel

On Tue, May 12, 2015 at 11:05:06AM -0700, Neha Narkhede wrote:
> Thanks for chiming in, Todd!
>
> Agree that the lack of audit and rollback is a major downside of moving all
> configs to ZooKeeper. Being able to configure dynamically created entities
> in Kafka is required though. So I think what Todd suggested is a good
> solution to managing all configs - catching SIGHUP for broker configs and
> storing dynamic configs in ZK like we do today.
>
> On Tue, May 12, 2015 at 10:30 AM, Jay Kreps  wrote:
>
> > Hmm, here is how I think we can change the split brain proposal to make it
> > a bit better:
> > 1. Get rid of broker overrides, this is just done in the config file. This
> > makes the precedence chain a lot clearer (e.g. zk always overrides file on
> > a per-entity basis).
> > 2. Get rid of the notion of dynamic configs in ConfigDef and in the broker.
> > All overrides are dynamic and all server configs are static.
> > 3. Create an equivalent of LogConfig for ClientConfig and any future config
> > type we make.
> > 4. Generalize the TopicConfigManager to handle multiple types of overrides.
> >
> > What we haven't done is try to think through how the pure zk approach would
> > work.
> >
> > -Jay
> >
> >
> >
> > On Mon, May 11, 2015 at 10:53 PM, Ashish Singh 
> > wrote:
> >
> > > I agree with Joel's suggestion on keeping broker's configs in
> > > config file and clients/topics config in ZK. Few other projects, Apache
> > > Solr for one, also does something similar for its configurations.
> > >
> > > On Monday, May 11, 2015, Gwen Shapira  wrote:
> > >
> > > > I like this approach (obviously).
> > > > I am also OK with supporting broker re-read of config file based on ZK
> > > > watch instead of SIGHUP, if we see this as more consistent with the
> > rest
> > > of
> > > > our code base.
> > > >
> > > > Either is fine by me as long as brokers keep the file and just do
> > refresh
> > > > :)
> > > >
> > > > On Tue, May 12, 2015 at 2:54 AM, Joel Koshy  > > > > wrote:
> > > >
> > > > > So the general concern here is the dichotomy of configs (which we
> > > > > already have - i.e., in the form of broker config files vs topic
> > > > > configs in zookeeper). We (at LinkedIn) had some discussions on this
> > > > > last week and had this very question for the operations team whose
> > > > > opinion is I think to a large degree a touchstone for this decision:
> > > > > "Has the operations team at LinkedIn experienced any pain so far with
> > > > > managing topic configs in ZooKeeper (while broker configs are
> > > > > file-based)?" It turns out that ops overwhelmingly favors the current
> > > > > approach. i.e., service configs as file-based configs and
> > client/topic
> > > > > configs in ZooKeeper is intuitive and works great. This may be
> > > > > somewhat counter-intuitive to devs, but this is one of those
> > decisions
> > > > > for which ops input is very critical - because for all practical
> > > > > purposes, they are the users in this discussion.
> > > > >
> > > > > If we continue with this dichotomy and need to support dynamic config
> > > > > for client/topic configs as well as select service configs then there
> > > > > will need to be dichotomy in the config change mechanism as well.
> > > > > i.e., client/topic configs will change via (say) a ZooKeeper watch
> > and
> > > > > the service config will change via a config file re-read (on SIGHUP)
> > > > > after config changes have been pushed out to local files. Is this a
> > > > > bad thing? Personally, I don't think it is - i.e. I'm in favor of
> > this
> > > > > approach. What do others think?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Joel
> > > > >
> > > > > On Mon, May 11, 2015 at 11:08:44PM +0300, Gwen Shapira wrote:
> > > > > > What Todd said :)
> > > > > >
> > > > > > (I think my ops background is showing...)
> > > > > >
> > > > > > On Mon, May 11, 2015 at 10:17 PM, Todd Palino  > > > > wrote:
> > > > > >
> > > > > > > I understand your point here, Jay, but I disagree that we can't
> > > have
> > > > > two
> > > > > > > configuration systems. We have two different types of
> > configuration
> > > > > > > information. We have configuration that relates to the service
> > > itself
> > > > > (the
> > > > > > > Kafka broker), and we have configuration that relates to the
> > > content
> > > > > within
> > > > > > > t

Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Tom Graves
+1 non-binding.
Tom Graves 


 On Friday, May 15, 2015 2:00 PM, Don Bosco Durai  wrote:
   

 +1 non-binding


On 5/15/15, 11:43 AM, "Gwen Shapira"  wrote:

>+1 non-binding
>
>On Fri, May 15, 2015 at 9:12 PM, Harsha  wrote:
>
>> +1 non-binding
>>
>>
>>
>>
>>
>>
>> On Fri, May 15, 2015 at 9:18 AM -0700, "Parth Brahmbhatt" <
>> pbrahmbh...@hortonworks.com> wrote:
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Hi,
>>
>> Opening the voting thread for KIP-11.
>>
>> Link to the KIP:
>> 
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+
>>Interface
>> Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
>>
>> Thanks
>> Parth
>>




   

[jira] [Updated] (KAFKA-2147) Unbalanced replication can cause extreme purgatory growth

2015-05-15 Thread Evan Huus (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Huus updated KAFKA-2147:
-
Attachment: purgatory.log

Purgatory log from the attached patch.

Based on my understanding, the check is working (it is seeing the right number
of elements); it just isn't waking up often enough.

> Unbalanced replication can cause extreme purgatory growth
> -
>
> Key: KAFKA-2147
> URL: https://issues.apache.org/jira/browse/KAFKA-2147
> Project: Kafka
>  Issue Type: Bug
>  Components: purgatory, replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Jun Rao
> Attachments: KAFKA-2147.patch, KAFKA-2147_2015-05-14_09:41:56.patch, 
> purgatory.log, watch-lists.log
>
>
> Apologies in advance, this is going to be a bit of a complex description, 
> mainly because we've seen this issue several different ways and we're still 
> tying them together in terms of root cause and analysis.
> It is worth noting now that we have all our producers set up to send 
> RequiredAcks==-1, and that this includes all our MirrorMakers.
> I understand the purgatory is being rewritten (again) for 0.8.3. Hopefully 
> that will incidentally fix this issue, or at least render it moot.
> h4. Symptoms
> Fetch request purgatory on a broker or brokers grows rapidly and steadily at 
> a rate of roughly 1-5K requests per second. Heap memory used also grows to 
> keep pace. When 4-5 million requests have accumulated in purgatory, the 
> purgatory is drained, causing a substantial latency spike. The node will tend 
> to drop leadership, replicate, and recover.
> h5. Case 1 - MirrorMaker
> We first noticed this case when enabling mirrormaker. We had one primary 
> cluster already, with many producers and consumers. We created a second, 
> identical cluster and enabled replication from the original to the new 
> cluster on some topics using mirrormaker. This caused all six nodes in the 
> new cluster to exhibit the symptom in lockstep - their purgatories would all 
> grow together, and get drained within about 20 seconds of each other. The 
> cluster-wide latency spikes at this time caused several problems for us.
> Turning MM on and off turned the problem on and off very precisely. When we 
> stopped MM, the purgatories would all drop to normal levels immediately, and 
> would start climbing again when we restarted it.
> Note that this is the *fetch* purgatories on the brokers that MM was 
> *producing* to, which indicates fairly strongly that this is a replication 
> issue, not a MM issue.
> This particular cluster and MM setup was abandoned for other reasons before 
> we could make much progress debugging.
> h5. Case 2 - Broker 6
> The second time we saw this issue was on the newest broker (broker 6) in the 
> original cluster. For a long time we were running with five nodes, and 
> eventually added a sixth to handle the increased load. At first, we moved 
> only a handful of higher-volume partitions to this broker. Later, we created 
> a group of new topics (totalling around 100 partitions) for testing purposes 
> that were spread automatically across all six nodes. These topics saw 
> occasional traffic, but were generally unused. At this point broker 6 had 
> leadership for about an equal number of high-volume and unused partitions, 
> about 15-20 of each.
> Around this time (we don't have detailed enough data to prove real 
> correlation unfortunately), the issue started appearing on this broker as 
> well, but not on any of the other brokers in the cluster.
> h4. Debugging
> The first thing we tried was to reduce the 
> `fetch.purgatory.purge.interval.requests` from the default of 1000 to a much 
> lower value of 200. This had no noticeable effect at all.
> We then enabled debug logging on broker06 and started looking through that. I 
> can attach complete log samples if necessary, but the thing that stood out 
> for us was a substantial number of the following lines:
> {noformat}
> [2015-04-23 20:05:15,196] DEBUG [KafkaApi-6] Putting fetch request with 
> correlation id 49939 from client ReplicaFetcherThread-0-6 into purgatory 
> (kafka.server.KafkaApis)
> {noformat}
> The volume of these lines seemed to match (approximately) the fetch purgatory 
> growth on that broker.
> At this point we developed a hypothesis (detailed below) which guided our 
> subsequent debugging tests:
> - Setting a daemon up to produce regular random data to all of the topics led 
> by kafka06 (specifically the ones which otherwise would receive no data) 
> substantially alleviated the problem.
> - Doing an additional rebalance of the cluster in order to move a number of 
> other topics with regular data to kafka06 appears to have solved the problem 
> completely.
> h4. Hypothesis
> Current versions (0.8.2.1 and ear

[jira] [Assigned] (KAFKA-2073) Replace TopicMetadata request/response with o.a.k.requests.metadata

2015-05-15 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi reassigned KAFKA-2073:
---

Assignee: Andrii Biletskyi  (was: Sriharsha Chintalapani)

> Replace TopicMetadata request/response with o.a.k.requests.metadata
> ---
>
> Key: KAFKA-2073
> URL: https://issues.apache.org/jira/browse/KAFKA-2073
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: Andrii Biletskyi
> Fix For: 0.8.3
>
>
> Replace TopicMetadata request/response with o.a.k.requests.metadata.
> Note, this is more challenging than it appears because while the wire 
> protocol is identical, the objects are completely different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Don Bosco Durai
+1 non-binding


On 5/15/15, 11:43 AM, "Gwen Shapira"  wrote:

>+1 non-binding
>
>On Fri, May 15, 2015 at 9:12 PM, Harsha  wrote:
>
>> +1 non-binding
>>
>>
>>
>>
>>
>>
>> On Fri, May 15, 2015 at 9:18 AM -0700, "Parth Brahmbhatt" <
>> pbrahmbh...@hortonworks.com> wrote:
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Hi,
>>
>> Opening the voting thread for KIP-11.
>>
>> Link to the KIP:
>> 
>>https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+
>>Interface
>> Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
>>
>> Thanks
>> Parth
>>




Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Gwen Shapira
+1 non-binding

On Fri, May 15, 2015 at 9:12 PM, Harsha  wrote:

> +1 non-binding
>
>
>
>
>
>
> On Fri, May 15, 2015 at 9:18 AM -0700, "Parth Brahmbhatt" <
> pbrahmbh...@hortonworks.com> wrote:
>
>
>
>
>
>
>
>
>
>
> Hi,
>
> Opening the voting thread for KIP-11.
>
> Link to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
> Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688
>
> Thanks
> Parth
>


[jira] [Commented] (KAFKA-2073) Replace TopicMetadata request/response with o.a.k.requests.metadata

2015-05-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545943#comment-14545943
 ] 

Sriharsha Chintalapani commented on KAFKA-2073:
---

[~abiletskyi] Please take it over. Thanks.

> Replace TopicMetadata request/response with o.a.k.requests.metadata
> ---
>
> Key: KAFKA-2073
> URL: https://issues.apache.org/jira/browse/KAFKA-2073
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
>
> Replace TopicMetadata request/response with o.a.k.requests.metadata.
> Note, this is more challenging than it appears because while the wire 
> protocol is identical, the objects are completely different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Harsha
+1 non-binding






On Fri, May 15, 2015 at 9:18 AM -0700, "Parth Brahmbhatt" 
 wrote:










Hi,

Opening the voting thread for KIP-11.

Link to the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

Thanks
Parth

[jira] [Commented] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545905#comment-14545905
 ] 

Guozhang Wang commented on KAFKA-2160:
--

Created reviewboard https://reviews.apache.org/r/34283/diff/
 against branch origin/trunk

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 
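
For illustration, a minimal Java sketch of the cleanup being proposed --
atomically dropping the key -> watcher-list pair once the list becomes empty
during checkAndComplete(). Names are hypothetical; the real change lives in
DelayedOperation.scala and uses Kafka's Pool, so this only shows the shape of
the idea.

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Sketch of a purgatory-like watcher registry that drops the map entry for a
// key once its watcher list becomes empty, so short-lived keys do not pile up.
final class WatcherRegistry<K, W> {
    private final ConcurrentHashMap<K, List<W>> watchersForKey = new ConcurrentHashMap<>();

    void watch(K key, W watcher) {
        watchersForKey.compute(key, (k, list) -> {
            List<W> l = (list == null) ? new ArrayList<>() : list;
            l.add(watcher);
            return l;
        });
    }

    // Called from checkAndComplete(): prune completed watchers and, if the list
    // ends up empty, remove the key -> list pair entirely (returning null from
    // computeIfPresent removes the mapping).
    void checkAndComplete(K key, Predicate<W> completed) {
        watchersForKey.computeIfPresent(key, (k, list) -> {
            list.removeIf(completed);
            return list.isEmpty() ? null : list;
        });
    }

    int watchedKeys() {
        return watchersForKey.size();
    }
}
{noformat}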



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2160) DelayedOperationPurgatory should remove the pair in watchersForKey with empty watchers list

2015-05-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2160:
-
Attachment: KAFKA-2160.patch

> DelayedOperationPurgatory should remove the pair in watchersForKey with empty 
> watchers list
> ---
>
> Key: KAFKA-2160
> URL: https://issues.apache.org/jira/browse/KAFKA-2160
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Attachments: KAFKA-2160.patch, KAFKA-2160.patch, 
> KAFKA-2160_2015-04-30_15:20:14.patch, KAFKA-2160_2015-05-06_16:31:48.patch
>
>
> With purgatory usage in consumer coordinator, it will be common that watcher 
> lists are very short and live only for a short time. So we'd better clean 
> them from the watchersForKey Pool once the list become empty in 
> checkAndComplete() calls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34283: Third Attempt: use a global read-write lock in purgatory

2015-05-15 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34283/
---

Review request for kafka.


Bugs: KAFKA-2160
https://issues.apache.org/jira/browse/KAFKA-2160


Repository: kafka


Description
---

Fix KAFKA-2160


Diffs
-

  core/src/main/scala/kafka/server/DelayedOperation.scala 
2ed9b467c2865e5717d7f6fd933cd09a5c5b22c0 

Diff: https://reviews.apache.org/r/34283/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Commented] (KAFKA-2194) Produce request failure after Kafka + Zookeeper restart

2015-05-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545873#comment-14545873
 ] 

Jiangjie Qin commented on KAFKA-2194:
-

Any reason to start Kafka before starting ZK?

> Produce request failure after Kafka + Zookeeper restart
> ---
>
> Key: KAFKA-2194
> URL: https://issues.apache.org/jira/browse/KAFKA-2194
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 0.8.2.1
> Environment: Windows Server 2012 R2
>Reporter: Ian Morgan
>Assignee: Jun Rao
>
> Trying to separate out Kafka-logs and Zookeeper data from the primary kafka 
> folder, so that distributed system can be distributed separately to the data 
> folders. Initialisation seems to succeed (e.g. old topics from kafka-logs is 
> loaded successfully).
> Steps to reproduce:
> 1. Start ZooKeeper
> 2. Start Kafka (write data to Kafka so that data is available in kafka-logs).
> 3. Kill Kafka
> 4. Kill Zookeeper
> 5. Start Kafka
> 6. Start Zookeeper
> 7. Try reading from Kafka
> Logs:
> Seeing the following in server.log (where LcmdSegments is the topic). 
> 2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request 
> with correlation id 148440 from client  on partition [LcmdSegments,14] failed 
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to 
> /10.44.18.75.
> 2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request 
> with correlation id 148519 from client  on partition [LcmdSegments,14] failed 
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to 
> /10.44.18.75.
> 2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request 
> with correlation id 148598 from client  on partition [LcmdSegments,14] failed 
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to 
> /10.44.18.75.
> 2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request 
> with correlation id 148677 from client  on partition [LcmdSegments,14] failed 
> due to Partition [LcmdSegments,14] doesn't exist on 1
> 2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to 
> /10.44.18.75.
> 2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request 
> with correlation id 148756 from client  on partition [LcmdSegments,14] failed 
> due to Partition [LcmdSegments,14] doesn't exist on 1
> And KafkaNet client returns:
> Topic:LcmdSegments returned error code of LeaderNotAvailable.  Retrying.
> Backing off metadata request retry.  Waiting for 62500ms.
> state-change.log shows an error:
> 2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated 
> state change for partition [LcmdSegments,14] from OfflinePartition to 
> OnlinePartition failed
> kafka.common.NoReplicaOnlineException: No replica for partition 
> [LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: 
> [List(1)]
>   at 
> kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
>   at 
> kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
>   at 
> kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
>   at 
> kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
>   at 
> kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
>   at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>   at 
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>   at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>   at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>   at 
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>   at 
> kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
>   at 
> kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
>   at 
> kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
>   at 
> kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
>   at 
> kaf

Re: Review Request 34170: Patch for KAFKA-2191

2015-05-15 Thread Jay Kreps


> On May 13, 2015, 11:50 p.m., Aditya Auradkar wrote:
> > clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java,
> >  line 45
> > 
> >
> > I think this is a good catch.
> > 
> > Just so I understand, this is adding new samples for blank intervals of 
> > time when record was not called. As an optimization, I think you should 
> > only add "MetricConfig.samples" number of samples. In the rare case, there 
> > is no activity for 10 mins (say), this will add 10sec*6*10 = 600 samples 
> > which will be purged immediately on the next record call.
> 
> Jay Kreps wrote:
> This is a really good point. It is totally possible for a metric to track 
> activity on a topic that has no writes for a month, the first write would 
> then cause you to cycle through a month of samples. The logic around 
> correctly skipping these windows and calculating the correct window boundary 
> needs to be carefully worked out.
> 
> Jay Kreps wrote:
> Actually maybe you can elaborate on why this is needed at all? In the 
> current code if the current sample is complete we add a new sample. Your code 
> adds lots of samples. But why do we need that? Isn't purging obsolete samples 
> handled on measurement already? I think that is more elegant. You must see 
> some issue there, maybe you can explain?
> 
> Dong Lin wrote:
> Adi: Good catch! I will update the patch to fix the problem.
> 
> Jay: Sure. Here is the problem I find with the current code.
> 
> - The current code implements SampledStat.advance in a way that is 
> probably different from users' expectation: typically when rate is sampled, 
> time will be divided into consecutive slots of equal length and samples are 
> expired in unit of these time slots, which assures the user the rate is 
> measured on samples collected in the past expireAge period. The current 
> implementation doesn't provide such an easy-to-understand interpretation, 
> since the effective sampled period can be anywhere between 0 and expireAge.
> 
> - As one extreme example, say we call rate.measure after expireAge (i.e. 
> config.samples() * config.timeWindowMs()) has passed since last sample. Then 
> purgeObsoleteSamples will reset all samples and elapsed will be 0 as a 
> result. This is problematic: rate should instead keep the information that we 
> haven’t observed any sample during the past expireAge.
> 
> Does it make sense?

Hey Dong, I think what you are saying is that the last sample is partial, that 
is, it doesn't cover a full window yet. This is true and is by design. That is 
the whole reason there are multiple samples--to stabilize the partial estimate. 
The ideal window would be a backwards-looking sliding window of N ms from the 
time of measurement, but that would be computationally infeasible.

I don't understand your example. If no events have occurred in config.samples() 
* config.timeWindowMs() then the observed rate is 0, right? I think you are 
saying that the time estimate should be the full window, but that is fixed by 
changing the other formula as I suggested.


- Jay
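
For reference, a minimal sketch of the windowed-rate scheme under discussion:
time is split into fixed windows of timeWindowMs, at most `samples` windows are
retained, obsolete windows are purged when measuring, and the elapsed time in
the denominator is floored so a long quiet period reads as a low rate rather
than an undefined or infinite one. The class, the window alignment, and the
exact clamping rule are illustrative assumptions, not the actual
Rate/SampledStat code.

{noformat}
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a windowed rate: fixed windows of windowMs, at most `samples`
// windows kept, obsolete windows purged at measurement time, and the elapsed
// time in the denominator floored so a long quiet period reads as a low rate.
final class WindowedRate {
    private static final class Window {
        final long startMs;
        double value;
        Window(long startMs) { this.startMs = startMs; }
    }

    private final long windowMs;
    private final int samples;
    private final Deque<Window> windows = new ArrayDeque<>();

    WindowedRate(long windowMs, int samples) {
        this.windowMs = windowMs;
        this.samples = samples;
    }

    void record(double value, long nowMs) {
        Window current = windows.peekLast();
        if (current == null || nowMs - current.startMs >= windowMs) {
            current = new Window(nowMs);            // start a new window
            windows.addLast(current);
            if (windows.size() > samples)
                windows.removeFirst();              // retain at most `samples` windows
        }
        current.value += value;
    }

    double measurePerSecond(long nowMs) {
        // Drop windows that ended before the retention period.
        long expireBefore = nowMs - (long) samples * windowMs;
        while (!windows.isEmpty() && windows.peekFirst().startMs + windowMs < expireBefore)
            windows.removeFirst();

        double total = 0.0;
        for (Window w : windows)
            total += w.value;

        // Floor the elapsed time at (samples - 1) full windows so the newest,
        // partially-filled window cannot make the rate blow up.
        long oldestStart = windows.isEmpty() ? nowMs : windows.peekFirst().startMs;
        long elapsedMs = Math.max(nowMs - oldestStart, (long) (samples - 1) * windowMs);
        return elapsedMs == 0 ? 0.0 : total * 1000.0 / elapsedMs;
    }
}
{noformat}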


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34170/#review83687
---


On May 14, 2015, 7:34 a.m., Dong Lin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34170/
> ---
> 
> (Updated May 14, 2015, 7:34 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2191
> https://issues.apache.org/jira/browse/KAFKA-2191
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2191; Measured rate should not be infinite
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 98429da34418f7f1deba1b5e44e2e6025212edb3 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java 
> b341b7daaa10204906d78b812fb05fd27bc69373 
> 
> Diff: https://reviews.apache.org/r/34170/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Dong Lin
> 
>



[jira] [Commented] (KAFKA-2073) Replace TopicMetadata request/response with o.a.k.requests.metadata

2015-05-15 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545841#comment-14545841
 ] 

Andrii Biletskyi commented on KAFKA-2073:
-

[~sriharsha] is there any update on this one?
This blocks me in KIP-4, where we will evolve TopicMetadataRequest. If you are 
not working on it, let me know; I can continue with this ticket. Thanks.

> Replace TopicMetadata request/response with o.a.k.requests.metadata
> ---
>
> Key: KAFKA-2073
> URL: https://issues.apache.org/jira/browse/KAFKA-2073
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
>
> Replace TopicMetadata request/response with o.a.k.requests.metadata.
> Note, this is more challenging than it appears because while the wire 
> protocol is identical, the objects are completely different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2169) Upgrade to zkclient-0.5

2015-05-15 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545817#comment-14545817
 ] 

Parth Brahmbhatt commented on KAFKA-2169:
-

[~junrao] updated the review with changes you requested.

> Upgrade to zkclient-0.5
> ---
>
> Key: KAFKA-2169
> URL: https://issues.apache.org/jira/browse/KAFKA-2169
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Neha Narkhede
>Assignee: Parth Brahmbhatt
>Priority: Critical
> Attachments: KAFKA-2169.patch, KAFKA-2169.patch, 
> KAFKA-2169_2015-05-11_13:52:57.patch, KAFKA-2169_2015-05-15_10:18:41.patch
>
>
> zkclient-0.5 is released 
> http://mvnrepository.com/artifact/com.101tec/zkclient/0.5 and has the fix for 
> KAFKA-824



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2169) Upgrade to zkclient-0.5

2015-05-15 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545814#comment-14545814
 ] 

Parth Brahmbhatt commented on KAFKA-2169:
-

Updated reviewboard https://reviews.apache.org/r/34050/diff/
 against branch origin/trunk

> Upgrade to zkclient-0.5
> ---
>
> Key: KAFKA-2169
> URL: https://issues.apache.org/jira/browse/KAFKA-2169
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Neha Narkhede
>Assignee: Parth Brahmbhatt
>Priority: Critical
> Attachments: KAFKA-2169.patch, KAFKA-2169.patch, 
> KAFKA-2169_2015-05-11_13:52:57.patch, KAFKA-2169_2015-05-15_10:18:41.patch
>
>
> zkclient-0.5 is released 
> http://mvnrepository.com/artifact/com.101tec/zkclient/0.5 and has the fix for 
> KAFKA-824



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2169) Upgrade to zkclient-0.5

2015-05-15 Thread Parth Brahmbhatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Brahmbhatt updated KAFKA-2169:

Attachment: KAFKA-2169_2015-05-15_10:18:41.patch

> Upgrade to zkclient-0.5
> ---
>
> Key: KAFKA-2169
> URL: https://issues.apache.org/jira/browse/KAFKA-2169
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Neha Narkhede
>Assignee: Parth Brahmbhatt
>Priority: Critical
> Attachments: KAFKA-2169.patch, KAFKA-2169.patch, 
> KAFKA-2169_2015-05-11_13:52:57.patch, KAFKA-2169_2015-05-15_10:18:41.patch
>
>
> zkclient-0.5 is released 
> http://mvnrepository.com/artifact/com.101tec/zkclient/0.5 and has the fix for 
> KAFKA-824



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34050: Patch for KAFKA-2169

2015-05-15 Thread Parth Brahmbhatt

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34050/
---

(Updated May 15, 2015, 5:19 p.m.)


Review request for kafka.


Bugs: KAFKA-2169
https://issues.apache.org/jira/browse/KAFKA-2169


Repository: kafka


Description (updated)
---

Call System.exit instead of throwing a RuntimeException when ZooKeeper session 
establishment fails.


Removing the unnecessary @throws.


Consumer will only log when the ZK session cannot be established.
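
For illustration, a minimal sketch of the first change described above --
exiting the process when the initial ZooKeeper session cannot be established
instead of letting a RuntimeException propagate. The class and method are
hypothetical, not the actual patch; only the ZkClient constructor and
ZkException come from the zkclient library.

{noformat}
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.exception.ZkException;

// Hypothetical sketch: connect to ZooKeeper or exit the process on failure,
// rather than propagating a RuntimeException to the caller.
public final class ZkConnectOrExit {
    public static ZkClient connectOrExit(String zkConnect, int sessionTimeoutMs, int connectionTimeoutMs) {
        try {
            return new ZkClient(zkConnect, sessionTimeoutMs, connectionTimeoutMs);
        } catch (ZkException e) {
            System.err.println("FATAL: could not establish ZooKeeper session to " + zkConnect + ": " + e);
            System.exit(1);
            return null; // unreachable
        }
    }
}
{noformat}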


Diffs (updated)
-

  build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
aa8d9404a3e78a365df06404b79d0d8f694b4bd6 
  core/src/main/scala/kafka/consumer/ZookeeperTopicEventWatcher.scala 
38f4ec0bd1b388cc8fc04b38bbb2e7aaa1c3f43b 
  core/src/main/scala/kafka/controller/KafkaController.scala 
a6351163f5b6f080d6fa50bcc3533d445fcbc067 
  core/src/main/scala/kafka/server/KafkaHealthcheck.scala 
861b7f644941f88ce04a4e95f6b28d18bf1db16d 

Diff: https://reviews.apache.org/r/34050/diff/


Testing
---


Thanks,

Parth Brahmbhatt



[jira] [Commented] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545798#comment-14545798
 ] 

Ben Edwards commented on KAFKA-2197:


Closing: the issue is that while ZooKeeper can see the Kafka container as 
{{kafka}} via the link, without a hostname set to match the link name in the 
compose file the controller fails to resolve the link name, as it isn't written 
to {{/etc/hosts}}. Not a bug, doesn't need fixing. So ashamed!

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards resolved KAFKA-2197.

Resolution: Not A Problem

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards updated KAFKA-2197:
---
Description: 
I am using kafka on docker. When I try to create a topic the controller seems 
to get stuck and the topic is never usable for consumers or producers.

{noformat}
[2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
Controller 9092 epoch 1 fails to send request 
Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
 -> 
(LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
 to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{noformat}

Repro steps:

run docker-compose up with the attached docker-compose yaml file.

enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
<> bash}} to enter).

run the following 

{noformat}
cd /opt/kafka_2.10-0.8.2.1/
./bin/kafka-topics.sh --list --zookeeper zk:2181
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
--replication-factor 1 --partitions 1
./bin/kafka-topics.sh --list --zookeeper zk:2181
tail -f logs/controller.log 
{noformat}

This should allow you to observe the controller being upset. The zookeeper 
instance is definitely reachable, the hostnames are correct as far as I can 
tell. I am kind of at a loss as to what is happening.

  was:
I am using kafka on docker. When I try to create a topic the controller seems 
to get stuck and the topic is never usable for consumers or producers.

{noformat}
[2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
Controller 9092 epoch 1 fails to send request 
Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
 -> 
(LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
 to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{noformat}

Repro steps:

run docker-compose up with the attached docker-compose yaml file.

enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
<> bash}} to enter).

run the following 

{noformat}
cd /opt/kafka_2.10-0.8.2.1/
./bin/kafka-topics.sh --list --zookeeper zk:2181
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
--replicatior-factor 1 --partitions 1
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
--replication-factor 1 --partitions 1
./bin/kafka-topics.sh --list --zookeeper zk:2181
tail -f logs/controller.log 
{noformat}

This should allow you to observe the controller being upset. The zookeeper 
instance is definitely reachable, the hostnames are correct as far as I can 
tell. I am kind of at a loss as to what is happening.


> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092

[jira] [Commented] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545753#comment-14545753
 ] 

Ben Edwards commented on KAFKA-2197:


I tried the mailing list, but I will try again! Do you think I should just 
close this, or link to it?

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545751#comment-14545751
 ] 

Neha Narkhede commented on KAFKA-2197:
--

[~bedwards] Unassigning myself as I probably wouldn't get around to this, but 
you might have better visibility getting to the root cause on the mailing list 
(if you haven't already)

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-2197:
-
Assignee: (was: Neha Narkhede)

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards updated KAFKA-2197:
---
Attachment: (was: docker-compose.yml)

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
>Assignee: Neha Narkhede
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards updated KAFKA-2197:
---
Attachment: docker-compose-stripped.yml

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
>Assignee: Neha Narkhede
> Attachments: docker-compose-stripped.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[Vote] KIP-11 Authorization design for kafka security

2015-05-15 Thread Parth Brahmbhatt
Hi,

Opening the voting thread for KIP-11.

Link to the KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface
Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688

Thanks
Parth


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards updated KAFKA-2197:
---
Attachment: docker-compose.yml

> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
>Assignee: Neha Narkhede
> Attachments: docker-compose.yml
>
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
>  to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
> (kafka.controller.RequestSendThread)
> java.nio.channels.ClosedChannelException
>   at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
>   at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>   at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> Repro steps:
> run docker-compose up with the attached docker-compose yaml file.
> enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
> <> bash}} to enter).
> run the following 
> {noformat}
> cd /opt/kafka_2.10-0.8.2.1/
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
> --replicatior-factor 1 --partitions 1
> ./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
> --replication-factor 1 --partitions 1
> ./bin/kafka-topics.sh --list --zookeeper zk:2181
> tail -f logs/controller.log 
> {noformat}
> This should allow you to observe the controller being upset. The zookeeper 
> instance is definitely reachable, the hostnames are correct as far as I can 
> tell. I am kind of at a loss as to what is happening.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Edwards updated KAFKA-2197:
---
Description: 
I am using kafka on docker. When I try to create a topic the controller seems 
to get stuck and the topic is never usable for consumers or producers.

{noformat}
[2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
Controller 9092 epoch 1 fails to send request 
Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
 -> 
(LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
 to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{noformat}

Repro steps:

run docker-compose up with the attached docker-compose yaml file.

enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
<> bash}} to enter).

run the following 

{noformat}
cd /opt/kafka_2.10-0.8.2.1/
./bin/kafka-topics.sh --list --zookeeper zk:2181
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
--replicatior-factor 1 --partitions 1
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
--replication-factor 1 --partitions 1
./bin/kafka-topics.sh --list --zookeeper zk:2181
tail -f logs/controller.log 
{noformat}

This should allow you to observe the controller being upset. The zookeeper 
instance is definitely reachable, the hostnames are correct as far as I can 
tell. I am kind of at a loss as to what is happening.

  was:
I am using kafka on docker. When I try to create a topic the controller seems 
to get stuck and the topic is never usable for consumers or producers.

{noformat}
[2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
Controller 9092 epoch 1 fails to send request 
Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
 -> 
(LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
 to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{noformat}

Repro steps:

run docker-compose up with the attached docker-compose yaml file.

enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
<> bash}} to enter).

run the following 

{monospace}
cd /opt/kafka_2.10-0.8.2.1/
./bin/kafka-topics.sh --list --zookeeper zk:2181
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
--replicatior-factor 1 --partitions 1
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
--replication-factor 1 --partitions 1
./bin/kafka-topics.sh --list --zookeeper zk:2181
tail -f logs/controller.log 
{monospace}

This should allow you to observe the controller being upset. The zookeeper 
instance is definitely reachable, the hostnames are correct as far as I can 
tell. I am kind of at a loss as to what is happening.


> Controller not able to update state for broker on the same machine
> --
>
> Key: KAFKA-2197
> URL: https://issues.apache.org/jira/browse/KAFKA-2197
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.2.1
> Environment: docker 1.5, 64 bit Linux (4.0.1-1).
>Reporter: Ben Edwards
>Assignee: Neha Narkhede
>
> I am using kafka on docker. When I try to create a topic the controller seems 
> to get stuck and the topic is never usable for consumers or producers.
> {noformat}
> [2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
> Controller 9092 epoch 1 fails to send request 
> Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
>  -> 
> (LeaderAndIsrInfo:(Leader:9092,ISR:9092,Lead

[jira] [Created] (KAFKA-2197) Controller not able to update state for broker on the same machine

2015-05-15 Thread Ben Edwards (JIRA)
Ben Edwards created KAFKA-2197:
--

 Summary: Controller not able to update state for broker on the 
same machine
 Key: KAFKA-2197
 URL: https://issues.apache.org/jira/browse/KAFKA-2197
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.8.2.1
 Environment: docker 1.5, 64 bit Linux (4.0.1-1).
Reporter: Ben Edwards
Assignee: Neha Narkhede


I am using kafka on docker. When I try to create a topic the controller seems 
to get stuck and the topic is never usable for consumers or producers.

{noformat}
[2015-05-15 15:51:10,201] WARN [Controller-9092-to-broker-9092-send-thread], 
Controller 9092 epoch 1 fails to send request 
Name:LeaderAndIsrRequest;Version:0;Controller:9092;ControllerEpoch:1;CorrelationId:7;ClientId:id_9092-host_null-port_9092;Leaders:id:9092,host:kafka,port:9092;PartitionState:(lol,0)
 -> 
(LeaderAndIsrInfo:(Leader:9092,ISR:9092,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:9092)
 to broker id:9092,host:kafka,port:9092. Reconnecting to broker. 
(kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:100)
at 
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
at 
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{noformat}

Repro steps:

run docker-compose up with the attached docker-compose yaml file.

enter the kafka container (get the name with {{docker ps}}, {{docker exec -it 
<> bash}} to enter).

run the following 

{monospace}
cd /opt/kafka_2.10-0.8.2.1/
./bin/kafka-topics.sh --list --zookeeper zk:2181
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic lol 
--replicatior-factor 1 --partitions 1
./bin/kafka-topics.sh --create --zookeeper zk:2181 --topic testing 
--replication-factor 1 --partitions 1
./bin/kafka-topics.sh --list --zookeeper zk:2181
tail -f logs/controller.log 
{monospace}

This should allow you to observe the controller being upset. The zookeeper 
instance is definitely reachable, the hostnames are correct as far as I can 
tell. I am kind of at a loss as to what is happening.
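
For reference, these are the stock 0.8.2 server.properties keys that control which
host and port the broker registers in ZooKeeper (and therefore which address the
controller tries to dial). Whether they are actually involved here is only a guess
on my part; the values below are illustrative and not taken from the attached
compose file:

{noformat}
broker.id=9092
port=9092
# If advertised.host.name is unset, the broker registers the hostname it
# resolves for itself, which inside a container may not be reachable as "kafka".
advertised.host.name=kafka
advertised.port=9092
{noformat}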



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2015-05-15 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-2196:

Status: Patch Available  (was: Open)

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2015-05-15 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-2196:

Attachment: KAFKA-2196.patch

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 34273: remove roundrobin identical topic constraint in consumer coordinator

2015-05-15 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34273/
---

Review request for kafka.


Bugs: KAFKA-2196
https://issues.apache.org/jira/browse/KAFKA-2196


Repository: kafka


Description
---

roundrobin doesn't need to make all consumers have identical topic 
subscriptions.

todo:
- run this and range through some simulations
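
To make the intended behavior concrete, here is a rough sketch of the idea in
plain Java (hypothetical names, not the PartitionAssignor.scala change itself):
every (topic, partition) pair is walked in a fixed order and handed to the next
consumer in the circle that actually subscribes to that topic, so subscriptions
no longer need to be identical across consumers.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Rough sketch only -- hypothetical names, not the actual assignor code.
public class RoundRobinSketch {
    // subscriptions: consumerId -> topics it subscribes to
    // partitionCounts: topic -> number of partitions
    static Map<String, List<String>> assign(Map<String, List<String>> subscriptions,
                                            Map<String, Integer> partitionCounts) {
        List<String> consumers = new ArrayList<>(new TreeMap<>(subscriptions).keySet());
        Map<String, List<String>> assignment = new HashMap<>();
        for (String c : consumers)
            assignment.put(c, new ArrayList<String>());

        int cursor = 0;
        // Walk every (topic, partition) pair in a deterministic order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(partitionCounts).entrySet()) {
            for (int p = 0; p < e.getValue(); p++) {
                // Hand the partition to the next consumer in the circle that is
                // subscribed to this topic, then advance the cursor past it.
                for (int i = 0; i < consumers.size(); i++) {
                    String candidate = consumers.get((cursor + i) % consumers.size());
                    if (subscriptions.get(candidate).contains(e.getKey())) {
                        assignment.get(candidate).add(e.getKey() + "-" + p);
                        cursor = (cursor + i + 1) % consumers.size();
                        break;
                    }
                }
            }
        }
        return assignment;
    }
}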


Diffs
-

  core/src/main/scala/kafka/coordinator/PartitionAssignor.scala 
106982286ce7a9e4f0e9722da2812e3a8e7a6cc3 
  core/src/test/scala/unit/kafka/coordinator/PartitionAssignorTest.scala 
ba6d5cd85b89214247209d974701eb6c9eb1e2b2 

Diff: https://reviews.apache.org/r/34273/diff/


Testing
---


Thanks,

Onur Karaman



[jira] [Commented] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2015-05-15 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545688#comment-14545688
 ] 

Onur Karaman commented on KAFKA-2196:
-

Created reviewboard https://reviews.apache.org/r/34273/diff/
 against branch origin/trunk

> remove roundrobin identical topic constraint in consumer coordinator
> 
>
> Key: KAFKA-2196
> URL: https://issues.apache.org/jira/browse/KAFKA-2196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Attachments: KAFKA-2196.patch
>
>
> roundrobin doesn't need to make all consumers have identical topic 
> subscriptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2196) remove roundrobin identical topic constraint in consumer coordinator

2015-05-15 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-2196:
---

 Summary: remove roundrobin identical topic constraint in consumer 
coordinator
 Key: KAFKA-2196
 URL: https://issues.apache.org/jira/browse/KAFKA-2196
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman
Assignee: Onur Karaman


roundrobin doesn't need to make all consumers have identical topic 
subscriptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: validate configuration while parsing ConfigDef

2015-05-15 Thread gprabhat90
GitHub user gprabhat90 opened a pull request:

https://github.com/apache/kafka/pull/66

validate configuration while parsing ConfigDef

In the Java client, ConfigDef's parse() function should return parsed and
validated values (if a validator is present), but it does not validate the
configuration. For example, I can create a KafkaProducer instance with a negative
value for batch.size, and it then throws a NullPointerException while sending,
which is hard to debug.

This commit fixes that.
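
Sketched standalone below (this is not the actual ConfigDef code, just an
illustration of the "validate while parsing" idea, with made-up names):

import java.util.HashMap;
import java.util.Map;

// Standalone illustration -- not the real ConfigDef.
public class ParseAndValidateSketch {
    interface Validator { void ensureValid(String name, Object value); }

    // An example validator a caller could register for a key like batch.size.
    static final Validator NON_NEGATIVE = new Validator() {
        public void ensureValid(String name, Object value) {
            if (((Number) value).longValue() < 0)
                throw new IllegalArgumentException(name + " must be >= 0, got " + value);
        }
    };

    // Run the validators as part of parse(), so a bad value such as a negative
    // batch.size fails right here instead of surfacing later as an NPE.
    static Map<String, Object> parse(Map<String, Object> props,
                                     Map<String, Validator> validators) {
        Map<String, Object> parsed = new HashMap<>(props);
        for (Map.Entry<String, Validator> v : validators.entrySet())
            if (parsed.containsKey(v.getKey()))
                v.getValue().ensureValid(v.getKey(), parsed.get(v.getKey()));
        return parsed;
    }
}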

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gprabhat90/kafka fixConfigValidation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #66


commit 71443fe330976e22113a2267e829a136d778410a
Author: prabhat 
Date:   2015-05-15T15:13:15Z

validate configuration while parsing ConfigDef




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Sriharsha Chintalapani


> On May 13, 2015, 3:40 p.m., Jun Rao wrote:
> > clients/src/main/java/org/apache/kafka/common/network/Channel.java, lines 
> > 70-71
> > 
> >
> > This is a general question. For ssl and sasl, does authentication only 
> > happen at connection time for ssl and sasl? What happens when a 
> > certificate/credential is removed? Do we have to re-authenticate on the 
> > existing connection?

Addressed this on JIRA comments.


> On May 13, 2015, 3:40 p.m., Jun Rao wrote:
> > clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java,
> >  lines 287-289
> > 
> >
> > Should we give those fields a default? The user only needs to configure 
> > them if ssl is enabled.

Right now we are checking for the absence of these on the client side to configure
the SSLFactory. If we give them defaults then the client side will throw an exception.
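
A minimal sketch of that check (hypothetical key name, not the patch's actual
config constants):

import java.util.Map;

// Absence of the SSL settings is the signal that SSL is disabled, so giving
// them defaults would make every plaintext client try to build an SSLFactory.
public class SslConfigCheckSketch {
    static boolean sslConfigured(Map<String, ?> configs) {
        return configs.get("ssl.keystore.location") != null;   // hypothetical key
    }
}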


> On May 13, 2015, 3:40 p.m., Jun Rao wrote:
> > clients/src/test/java/org/apache/kafka/common/network/SSLFactoryTest.java, 
> > line 39
> > 
> >
> > Does that bind the port? If so, we will need to select a random port.

It doesn't bind to a port. It's a hint to the SSLEngine.
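
For illustration, this is plain JDK usage (not the patch's code): the peer
host/port arguments never open or bind a socket, they only serve as hints for
things like session reuse.

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Plain JDK illustration -- no socket is opened or bound here.
public class SslEngineHintSketch {
    public static void main(String[] args) throws Exception {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, null, null);   // default key/trust managers
        // "localhost"/9093 are advisory hints, not a bind address.
        SSLEngine engine = context.createSSLEngine("localhost", 9093);
        engine.setUseClientMode(true);
    }
}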


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83322
---


On May 15, 2015, 2:18 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 15, 2015, 2:18 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> PrincipalBuilder.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Addressing 
> reviews.
> 
> 
> Diffs
> -
> 
>   build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
>   checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
>   checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
>   clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
> 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
>   clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
> cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
>   clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
> bdff518b732105823058e6182f445248b45dc388 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
> c4fa058692f50abb4f47bd344119d805c60123f5 
>   clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Channel.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
>  PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
> b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
>   clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
> PRE-CREATION 
>   
> clients/src/main/java/org/ap

[jira] [Updated] (KAFKA-1690) new java producer needs ssl support as a client

2015-05-15 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Status: Patch Available  (was: In Progress)

> new java producer needs ssl support as a client
> ---
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1690) new java producer needs ssl support as a client

2015-05-15 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-1690:
--
Attachment: KAFKA-1690_2015-05-15_07:18:21.patch

> new java producer needs ssl support as a client
> ---
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client

2015-05-15 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545554#comment-14545554
 ] 

Sriharsha Chintalapani commented on KAFKA-1690:
---

Updated reviewboard https://reviews.apache.org/r/33620/diff/
 against branch origin/trunk

> new java producer needs ssl support as a client
> ---
>
> Key: KAFKA-1690
> URL: https://issues.apache.org/jira/browse/KAFKA-1690
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Joe Stein
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.3
>
> Attachments: KAFKA-1690.patch, KAFKA-1690.patch, 
> KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, 
> KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, 
> KAFKA-1690_2015-05-15_07:18:21.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/
---

(Updated May 15, 2015, 2:18 p.m.)


Review request for kafka.


Bugs: KAFKA-1690
https://issues.apache.org/jira/browse/KAFKA-1690


Repository: kafka


Description (updated)
---

KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client.


KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.


KAFKA-1690. new java producer needs ssl support as a client. Added 
PrincipalBuilder.


KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.


Diffs (updated)
-

  build.gradle fef515b3b2276b1f861e7cc2e33e74c3ce5e405b 
  checkstyle/checkstyle.xml a215ff36e9252879f1e0be5a86fef9a875bb8f38 
  checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
  clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 
0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a 
  clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 
cf32e4e7c40738fe6d8adc36ae0cfad459ac5b0b 
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
bdff518b732105823058e6182f445248b45dc388 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
8e336a3aa96c73f52beaeb56b931baf4b026cf21 
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
187d0004c8c46b6664ddaffecc6166d4b47351e5 
  clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java 
c4fa058692f50abb4f47bd344119d805c60123f5 
  clients/src/main/java/org/apache/kafka/common/config/SecurityConfigs.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Authenticator.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Channel.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
  clients/src/main/java/org/apache/kafka/common/network/Selector.java 
57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
  clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java 
dab1a94dd29563688b6ecf4eeb0e180b06049d3f 
  
clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java 
PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/security/auth/PrincipalBuilder.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/utils/Utils.java 
f73eedb030987f018d8446bb1dcd98d19fa97331 
  clients/src/test/java/org/apache/kafka/common/network/EchoServer.java 
PRE-CREATION 
  clients/src/test/java/org/apache/kafka/common/network/SSLFactoryTest.java 
PRE-CREATION 
  clients/src/test/java/org/apache/kafka/common/network/SSLSelectorTest.java 
PRE-CREATION 
  clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
d5b306b026e788b4e5479f3419805aa49ae889f3 
  clients/src/test/java/org/apache/kafka/common/utils/UtilsTest.java 
2ebe3c21f611dc133a2dbb8c7dfb0845f8c21498 
  clients/src/test/java/org/apache/kafka/test/TestSSLUtils.java PRE-CREATION 

Diff: https://reviews.apache.org/r/33620/diff/


Testing
---


Thanks,

Sriharsha Chintalapani



[jira] [Created] (KAFKA-2195) Add versionId to AbstractRequest.getErrorResponse and AbstractRequest.getRequest

2015-05-15 Thread Andrii Biletskyi (JIRA)
Andrii Biletskyi created KAFKA-2195:
---

 Summary: Add versionId to AbstractRequest.getErrorResponse and 
AbstractRequest.getRequest
 Key: KAFKA-2195
 URL: https://issues.apache.org/jira/browse/KAFKA-2195
 Project: Kafka
  Issue Type: Sub-task
Reporter: Andrii Biletskyi
Assignee: Andrii Biletskyi


This is needed to support versioning.
1) getRequest: to parse bytes into the correct schema you need to know its
version; by default it's the latest version (current_schema)

2) getErrorResponse: after filling the map with error codes you need to create
the respective Response message, which depends on the versionId
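
A rough sketch of the intended shape for (1), illustrative code only (the final
signatures may differ):

{noformat}
import java.nio.ByteBuffer;
import org.apache.kafka.common.protocol.ApiKeys;
import org.apache.kafka.common.requests.AbstractRequest;
import org.apache.kafka.common.requests.OffsetCommitRequest;

// Illustrative sketch, not the final API: dispatch on both the api key and the
// header's api version so old wire versions are parsed with the schema they
// were actually written with.
public class VersionedGetRequestSketch {
    public static AbstractRequest getRequest(int requestId, int versionId, ByteBuffer buffer) {
        ApiKeys api = ApiKeys.forId(requestId);
        switch (api) {
            case OFFSET_COMMIT:
                return OffsetCommitRequest.parse(buffer, versionId);
            default:
                throw new IllegalArgumentException("Unhandled api key " + api);
        }
    }
}
{noformat}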



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Discussion] Using Client Requests and Responses in Server

2015-05-15 Thread Andrii Biletskyi
Okay,
I can pick that. I'll create a sub-task under KAFKA-2044.

Thanks,
Andrii Biletskyi

On Fri, May 15, 2015 at 4:27 PM, Gwen Shapira  wrote:

> Agree that you need version in getErrorResponse too (so you'll get the
> correct error), which means you'll need to add versionId to constructors of
> every response object...
>
> You'll want to keep two interfaces, one with version and one with
> CURR_VERSION as default, so you won't need to modify every single call...
>
> On Fri, May 15, 2015 at 4:03 PM, Andrii Biletskyi <
> andrii.bilets...@stealth.ly> wrote:
>
> > Correct, I think we are on the same page.
> > This way we can fix RequestChannel part (where it uses
> > AbstractRequest.getRequest)
> >
> > But would it be okay to add versionId to AbstractRequest.getErrorResponse
> > signature too?
> > I'm a bit lost with all those Abstract... objects hierarchy and not sure
> > whether it's
> > the right solution.
> >
> > Thanks,
> > Andrii Biletskyi
> >
> > On Fri, May 15, 2015 at 3:47 PM, Gwen Shapira 
> > wrote:
> >
> > > I agree, we currently don't handle versions correctly when
> de-serializing
> > > into java objects. This will be an isssue for every req/resp we move to
> > use
> > > the java objects.
> > >
> > > It looks like this requires:
> > > 1. Add versionId parameter to all parse functions in Java req/resp
> > objects
> > > 2. Modify getRequest to pass it along
> > > 3. Modify RequestChannel to get the version out of the header and use
> it
> > > when de-serializing the body.
> > >
> > > Did I get that correct? I want to make sure we are talking about the
> same
> > > issue.
> > >
> > > Gwen
> > >
> > > On Fri, May 15, 2015 at 1:45 PM, Andrii Biletskyi <
> > > andrii.bilets...@stealth.ly> wrote:
> > >
> > > > Gwen,
> > > >
> > > > I didn't find this in answers above so apologies if this was
> discussed.
> > > > It's about the way we would like to handle request versions.
> > > >
> > > > As I understood from Jun's answer we generally should try using the
> > same
> > > > java object while evolving the request. I believe the only example of
> > > > evolved
> > > > request now - OffsetCommitRequest follows this approach.
> > > >
> > > > I'm trying to evolve MetadataRequest to the next version as part of
> > KIP-4
> > > > and not sure current AbstractRequest api (which is a basis for ported
> > to
> > > > java requests)
> > > > is sufficient.
> > > >
> > > > The problem is: in order to deserialize bytes into correct correct
> > object
> > > > you need
> > > > to know it's version. Suppose KafkaApi serves OffsetCommitRequestV0
> and
> > > V2
> > > > (current).
> > > > For such cases OffsetCommitRequest class has two constructors:
> > > >
> > > > public static OffsetCommitRequest parse(ByteBuffer buffer, int
> > versionId)
> > > > AND
> > > > public static OffsetCommitRequest parse(ByteBuffer buffer)
> > > >
> > > > The latter one will simply pick the "current" schema version.
> > > > Now AbstractRequest.getRequest which is an entry point for
> > deserializing
> > > > request
> > > > for KafkaApi matches only on RequestHeader.apiKey (and thus uses the
> > > second
> > > > OffsetCommitRequest constructor) which is not sufficient because we
> > also
> > > > need
> > > > RequestHeader.apiVersion in case old request version.
> > > >
> > > > The same problem appears in
> AbstractRequest.getErrorResponse(Throwable
> > > e) -
> > > > to construct the right error response object we need to know to which
> > > > apiVersion
> > > > to respond.
> > > >
> > > > I think this can affect other tasks under KAFKA-1927 - replacing
> > separate
> > > > RQ/RP,
> > > > so maybe it makes sense to decide/fix it once.
> > > >
> > > > Thanks,
> > > > Andrii Bieltskyi
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Mar 25, 2015 at 12:42 AM, Gwen Shapira <
> gshap...@cloudera.com>
> > > > wrote:
> > > >
> > > > > OK, I posted a working patch on KAFKA-2044 and
> > > > > https://reviews.apache.org/r/32459/diff/.
> > > > >
> > > > > There are few decisions there than can be up to discussion (factory
> > > > method
> > > > > on AbstractRequestResponse, the new handleErrors in request API),
> but
> > > as
> > > > > far as support for o.a.k.common requests in core goes, it does what
> > it
> > > > > needs to do.
> > > > >
> > > > > Please review!
> > > > >
> > > > > Gwen
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Mar 24, 2015 at 10:59 AM, Gwen Shapira <
> > gshap...@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I uploaded a (very) preliminary patch with my idea.
> > > > > >
> > > > > > One thing thats missing:
> > > > > > RequestResponse had  handleError method that all requests
> > > implemented,
> > > > > > typically generating appropriate error Response for the request
> and
> > > > > sending
> > > > > > it along. Its used by KafkaApis to handle all protocol errors for
> > > valid
> > > > > > requests that are not handled elsewhere.
> > > > > > AbstractRequestResponse doesn't have such met

Re: [Discussion] Using Client Requests and Responses in Server

2015-05-15 Thread Gwen Shapira
Agree that you need version in getErrorResponse too (so you'll get the
correct error), which means you'll need to add versionId to constructors of
every response object...

You'll want to keep two interfaces, one with version and one with
CURR_VERSION as default, so you won't need to modify every single call...
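
Something like this, as a sketch (hypothetical class, not the real API): the
versioned overload does the work and the old signature just delegates with the
current version, so existing call sites don't have to change.

import org.apache.kafka.common.requests.AbstractRequestResponse;

public abstract class VersionedErrorSketch {
    protected static final int CURRENT_VERSION = 1;   // placeholder; per-request in reality

    // New, version-aware entry point.
    public abstract AbstractRequestResponse getErrorResponse(int versionId, Throwable e);

    // Old signature kept as a convenience default.
    public AbstractRequestResponse getErrorResponse(Throwable e) {
        return getErrorResponse(CURRENT_VERSION, e);
    }
}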

On Fri, May 15, 2015 at 4:03 PM, Andrii Biletskyi <
andrii.bilets...@stealth.ly> wrote:

> Correct, I think we are on the same page.
> This way we can fix RequestChannel part (where it uses
> AbstractRequest.getRequest)
>
> But would it be okay to add versionId to AbstractRequest.getErrorResponse
> signature too?
> I'm a bit lost with all those Abstract... objects hierarchy and not sure
> whether it's
> the right solution.
>
> Thanks,
> Andrii Biletskyi
>
> On Fri, May 15, 2015 at 3:47 PM, Gwen Shapira 
> wrote:
>
> > I agree, we currently don't handle versions correctly when de-serializing
> > into java objects. This will be an isssue for every req/resp we move to
> use
> > the java objects.
> >
> > It looks like this requires:
> > 1. Add versionId parameter to all parse functions in Java req/resp
> objects
> > 2. Modify getRequest to pass it along
> > 3. Modify RequestChannel to get the version out of the header and use it
> > when de-serializing the body.
> >
> > Did I get that correct? I want to make sure we are talking about the same
> > issue.
> >
> > Gwen
> >
> > On Fri, May 15, 2015 at 1:45 PM, Andrii Biletskyi <
> > andrii.bilets...@stealth.ly> wrote:
> >
> > > Gwen,
> > >
> > > I didn't find this in answers above so apologies if this was discussed.
> > > It's about the way we would like to handle request versions.
> > >
> > > As I understood from Jun's answer we generally should try using the
> same
> > > java object while evolving the request. I believe the only example of
> > > evolved
> > > request now - OffsetCommitRequest follows this approach.
> > >
> > > I'm trying to evolve MetadataRequest to the next version as part of
> KIP-4
> > > and not sure current AbstractRequest api (which is a basis for ported
> to
> > > java requests)
> > > is sufficient.
> > >
> > > The problem is: in order to deserialize bytes into correct correct
> object
> > > you need
> > > to know it's version. Suppose KafkaApi serves OffsetCommitRequestV0 and
> > V2
> > > (current).
> > > For such cases OffsetCommitRequest class has two constructors:
> > >
> > > public static OffsetCommitRequest parse(ByteBuffer buffer, int
> versionId)
> > > AND
> > > public static OffsetCommitRequest parse(ByteBuffer buffer)
> > >
> > > The latter one will simply pick the "current" schema version.
> > > Now AbstractRequest.getRequest which is an entry point for
> deserializing
> > > request
> > > for KafkaApi matches only on RequestHeader.apiKey (and thus uses the
> > second
> > > OffsetCommitRequest constructor) which is not sufficient because we
> also
> > > need
> > > RequestHeader.apiVersion in case old request version.
> > >
> > > The same problem appears in AbstractRequest.getErrorResponse(Throwable
> > e) -
> > > to construct the right error response object we need to know to which
> > > apiVersion
> > > to respond.
> > >
> > > I think this can affect other tasks under KAFKA-1927 - replacing
> separate
> > > RQ/RP,
> > > so maybe it makes sense to decide/fix it once.
> > >
> > > Thanks,
> > > Andrii Bieltskyi
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Mar 25, 2015 at 12:42 AM, Gwen Shapira 
> > > wrote:
> > >
> > > > OK, I posted a working patch on KAFKA-2044 and
> > > > https://reviews.apache.org/r/32459/diff/.
> > > >
> > > > There are few decisions there than can be up to discussion (factory
> > > method
> > > > on AbstractRequestResponse, the new handleErrors in request API), but
> > as
> > > > far as support for o.a.k.common requests in core goes, it does what
> it
> > > > needs to do.
> > > >
> > > > Please review!
> > > >
> > > > Gwen
> > > >
> > > >
> > > >
> > > > On Tue, Mar 24, 2015 at 10:59 AM, Gwen Shapira <
> gshap...@cloudera.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I uploaded a (very) preliminary patch with my idea.
> > > > >
> > > > > One thing thats missing:
> > > > > RequestResponse had  handleError method that all requests
> > implemented,
> > > > > typically generating appropriate error Response for the request and
> > > > sending
> > > > > it along. Its used by KafkaApis to handle all protocol errors for
> > valid
> > > > > requests that are not handled elsewhere.
> > > > > AbstractRequestResponse doesn't have such method.
> > > > >
> > > > > I can, of course, add it.
> > > > > But before I jump into this, I'm wondering if there was another
> plan
> > on
> > > > > handling Api errors.
> > > > >
> > > > > Gwen
> > > > >
> > > > > On Mon, Mar 23, 2015 at 6:16 PM, Jun Rao  wrote:
> > > > >
> > > > >> I think what you are saying is that in RequestChannel, we can
> start
> > > > >> generating header/body for new request types and leave requestObj
> > > null.
> > > > >> For
> > > > >>

Re: [Discussion] Using Client Requests and Responses in Server

2015-05-15 Thread Andrii Biletskyi
Correct, I think we are on the same page.
This way we can fix the RequestChannel part (where it uses
AbstractRequest.getRequest).

But would it be okay to add versionId to the AbstractRequest.getErrorResponse
signature too?
I'm a bit lost in the hierarchy of all those Abstract... objects and not sure
whether it's the right solution.

Thanks,
Andrii Biletskyi

On Fri, May 15, 2015 at 3:47 PM, Gwen Shapira  wrote:

> I agree, we currently don't handle versions correctly when de-serializing
> into java objects. This will be an isssue for every req/resp we move to use
> the java objects.
>
> It looks like this requires:
> 1. Add versionId parameter to all parse functions in Java req/resp objects
> 2. Modify getRequest to pass it along
> 3. Modify RequestChannel to get the version out of the header and use it
> when de-serializing the body.
>
> Did I get that correct? I want to make sure we are talking about the same
> issue.
>
> Gwen
>
> On Fri, May 15, 2015 at 1:45 PM, Andrii Biletskyi <
> andrii.bilets...@stealth.ly> wrote:
>
> > Gwen,
> >
> > I didn't find this in answers above so apologies if this was discussed.
> > It's about the way we would like to handle request versions.
> >
> > As I understood from Jun's answer we generally should try using the same
> > java object while evolving the request. I believe the only example of
> > evolved
> > request now - OffsetCommitRequest follows this approach.
> >
> > I'm trying to evolve MetadataRequest to the next version as part of KIP-4
> > and not sure current AbstractRequest api (which is a basis for ported to
> > java requests)
> > is sufficient.
> >
> > The problem is: in order to deserialize bytes into correct correct object
> > you need
> > to know it's version. Suppose KafkaApi serves OffsetCommitRequestV0 and
> V2
> > (current).
> > For such cases OffsetCommitRequest class has two constructors:
> >
> > public static OffsetCommitRequest parse(ByteBuffer buffer, int versionId)
> > AND
> > public static OffsetCommitRequest parse(ByteBuffer buffer)
> >
> > The latter one will simply pick the "current" schema version.
> > Now AbstractRequest.getRequest which is an entry point for deserializing
> > request
> > for KafkaApi matches only on RequestHeader.apiKey (and thus uses the
> second
> > OffsetCommitRequest constructor) which is not sufficient because we also
> > need
> > RequestHeader.apiVersion in case old request version.
> >
> > The same problem appears in AbstractRequest.getErrorResponse(Throwable
> e) -
> > to construct the right error response object we need to know to which
> > apiVersion
> > to respond.
> >
> > I think this can affect other tasks under KAFKA-1927 - replacing separate
> > RQ/RP,
> > so maybe it makes sense to decide/fix it once.
> >
> > Thanks,
> > Andrii Bieltskyi
> >
> >
> >
> >
> >
> > On Wed, Mar 25, 2015 at 12:42 AM, Gwen Shapira 
> > wrote:
> >
> > > OK, I posted a working patch on KAFKA-2044 and
> > > https://reviews.apache.org/r/32459/diff/.
> > >
> > > There are few decisions there than can be up to discussion (factory
> > method
> > > on AbstractRequestResponse, the new handleErrors in request API), but
> as
> > > far as support for o.a.k.common requests in core goes, it does what it
> > > needs to do.
> > >
> > > Please review!
> > >
> > > Gwen
> > >
> > >
> > >
> > > On Tue, Mar 24, 2015 at 10:59 AM, Gwen Shapira 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I uploaded a (very) preliminary patch with my idea.
> > > >
> > > > One thing thats missing:
> > > > RequestResponse had  handleError method that all requests
> implemented,
> > > > typically generating appropriate error Response for the request and
> > > sending
> > > > it along. Its used by KafkaApis to handle all protocol errors for
> valid
> > > > requests that are not handled elsewhere.
> > > > AbstractRequestResponse doesn't have such method.
> > > >
> > > > I can, of course, add it.
> > > > But before I jump into this, I'm wondering if there was another plan
> on
> > > > handling Api errors.
> > > >
> > > > Gwen
> > > >
> > > > On Mon, Mar 23, 2015 at 6:16 PM, Jun Rao  wrote:
> > > >
> > > >> I think what you are saying is that in RequestChannel, we can start
> > > >> generating header/body for new request types and leave requestObj
> > null.
> > > >> For
> > > >> existing requests, header/body will be null initially. Gradually, we
> > can
> > > >> migrate each type of requests by populating header/body, instead of
> > > >> requestObj. This makes sense to me since it serves two purposes (1)
> > not
> > > >> polluting the code base with duplicated request/response objects for
> > new
> > > >> types of requests and (2) allowing the refactoring of existing
> > requests
> > > to
> > > >> be done in smaller pieces.
> > > >>
> > > >> Could you try that approach and perhaps just migrate one existing
> > > request
> > > >> type (e.g. HeartBeatRequest) as an example? We probably need to
> rewind
> > > the
> > > >> buffer after reading the requestId when deserializing the header
> >

Re: [Discussion] Using Client Requests and Responses in Server

2015-05-15 Thread Gwen Shapira
I agree, we currently don't handle versions correctly when de-serializing
into java objects. This will be an issue for every req/resp we move to use
the java objects.

It looks like this requires:
1. Add versionId parameter to all parse functions in Java req/resp objects
2. Modify getRequest to pass it along
3. Modify RequestChannel to get the version out of the header and use it
when de-serializing the body.

Did I get that correct? I want to make sure we are talking about the same
issue.

Gwen

On Fri, May 15, 2015 at 1:45 PM, Andrii Biletskyi <
andrii.bilets...@stealth.ly> wrote:

> Gwen,
>
> I didn't find this in answers above so apologies if this was discussed.
> It's about the way we would like to handle request versions.
>
> As I understood from Jun's answer we generally should try using the same
> java object while evolving the request. I believe the only example of
> evolved
> request now - OffsetCommitRequest follows this approach.
>
> I'm trying to evolve MetadataRequest to the next version as part of KIP-4
> and not sure current AbstractRequest api (which is a basis for ported to
> java requests)
> is sufficient.
>
> The problem is: in order to deserialize bytes into correct correct object
> you need
> to know it's version. Suppose KafkaApi serves OffsetCommitRequestV0 and V2
> (current).
> For such cases OffsetCommitRequest class has two constructors:
>
> public static OffsetCommitRequest parse(ByteBuffer buffer, int versionId)
> AND
> public static OffsetCommitRequest parse(ByteBuffer buffer)
>
> The latter one will simply pick the "current" schema version.
> Now AbstractRequest.getRequest which is an entry point for deserializing
> request
> for KafkaApi matches only on RequestHeader.apiKey (and thus uses the second
> OffsetCommitRequest constructor) which is not sufficient because we also
> need
> RequestHeader.apiVersion in case old request version.
>
> The same problem appears in AbstractRequest.getErrorResponse(Throwable e) -
> to construct the right error response object we need to know to which
> apiVersion
> to respond.
>
> I think this can affect other tasks under KAFKA-1927 - replacing separate
> RQ/RP,
> so maybe it makes sense to decide/fix it once.
>
> Thanks,
> Andrii Bieltskyi
>
>
>
>
>
> On Wed, Mar 25, 2015 at 12:42 AM, Gwen Shapira 
> wrote:
>
> > OK, I posted a working patch on KAFKA-2044 and
> > https://reviews.apache.org/r/32459/diff/.
> >
> > There are few decisions there than can be up to discussion (factory
> method
> > on AbstractRequestResponse, the new handleErrors in request API), but as
> > far as support for o.a.k.common requests in core goes, it does what it
> > needs to do.
> >
> > Please review!
> >
> > Gwen
> >
> >
> >
> > On Tue, Mar 24, 2015 at 10:59 AM, Gwen Shapira 
> > wrote:
> >
> > > Hi,
> > >
> > > I uploaded a (very) preliminary patch with my idea.
> > >
> > > One thing thats missing:
> > > RequestResponse had  handleError method that all requests implemented,
> > > typically generating appropriate error Response for the request and
> > sending
> > > it along. Its used by KafkaApis to handle all protocol errors for valid
> > > requests that are not handled elsewhere.
> > > AbstractRequestResponse doesn't have such method.
> > >
> > > I can, of course, add it.
> > > But before I jump into this, I'm wondering if there was another plan on
> > > handling Api errors.
> > >
> > > Gwen
> > >
> > > On Mon, Mar 23, 2015 at 6:16 PM, Jun Rao  wrote:
> > >
> > >> I think what you are saying is that in RequestChannel, we can start
> > >> generating header/body for new request types and leave requestObj
> null.
> > >> For
> > >> existing requests, header/body will be null initially. Gradually, we
> can
> > >> migrate each type of requests by populating header/body, instead of
> > >> requestObj. This makes sense to me since it serves two purposes (1)
> not
> > >> polluting the code base with duplicated request/response objects for
> new
> > >> types of requests and (2) allowing the refactoring of existing
> requests
> > to
> > >> be done in smaller pieces.
> > >>
> > >> Could you try that approach and perhaps just migrate one existing
> > request
> > >> type (e.g. HeartBeatRequest) as an example? We probably need to rewind
> > the
> > >> buffer after reading the requestId when deserializing the header
> (since
> > >> the
> > >> header includes the request id).
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Mon, Mar 23, 2015 at 4:52 PM, Gwen Shapira 
> > >> wrote:
> > >>
> > >> > I'm thinking of a different approach, that will not fix everything,
> > but
> > >> > will allow adding new requests without code duplication (and
> therefore
> > >> > unblock KIP-4):
> > >> >
> > >> > RequestChannel.request currently takes a buffer and parses it into
> an
> > >> "old"
> > >> > request object. Since the objects are byte-compatibly, we should be
> > >> able to
> > >> > parse existing requests into both old and new objects. New requests
> > will
> > >> > only be par

[jira] [Updated] (KAFKA-2194) Produce request failure after Kafka + Zookeeper restart

2015-05-15 Thread Ian Morgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Morgan updated KAFKA-2194:
--
Description: 
Trying to separate out the kafka-logs and Zookeeper data from the primary Kafka
folder, so that the distributed system can be distributed separately from the data
folders. Initialisation seems to succeed (e.g. old topics from kafka-logs are
loaded successfully).

Steps to reproduce:

1. Start ZooKeeper
2. Start Kafka (write data to Kafka so that data is available in kafka-logs).
3. Kill Kafka
4. Kill Zookeeper
5. Start Kafka
6. Start Zookeeper
7. Try reading from Kafka

Logs:

Seeing the following in server.log (where LcmdSegments is the topic). 

2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148440 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148519 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148598 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148677 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148756 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1

And KafkaNet client returns:

Topic:LcmdSegments returned error code of LeaderNotAvailable.  Retrying.
Backing off metadata request retry.  Waiting for 62500ms.

state-change.log shows an error:

2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated 
state change for partition [LcmdSegments,14] from OfflinePartition to 
OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition 
[LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: 
[List(1)]
at 
kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
at 
kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
at 
kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
at 
kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
at 
kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
at 
kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
at 
kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at 
kafka.server.ZookeeperLeaderElector.startup(ZookeeperLeaderElector.scala:47)
at 
kafka.controller.KafkaController$$anonfun$startup$1.apply$mcV$sp(KafkaController.scala:650)
at 
kafka.controller

[jira] [Created] (KAFKA-2194) Produce request failure after Kafka + Zookeeper restart

2015-05-15 Thread Ian Morgan (JIRA)
Ian Morgan created KAFKA-2194:
-

 Summary: Produce request failure after Kafka + Zookeeper restart
 Key: KAFKA-2194
 URL: https://issues.apache.org/jira/browse/KAFKA-2194
 Project: Kafka
  Issue Type: Bug
  Components: clients, producer 
Affects Versions: 0.8.2.1
 Environment: Windows Server 2012 R2
Reporter: Ian Morgan
Assignee: Jun Rao


Trying to separate out the kafka-logs and Zookeeper data from the primary Kafka
folder, so that the distributed system can be distributed separately from the data
folders. Initialisation seems to succeed (e.g. old topics from kafka-logs are
loaded successfully).

Steps to reproduce:

1. Kill Kafka
2. Kill Zookeeper
3. Start Kafka
4. Start Zookeeper

Logs:

Seeing the following in server.log (where LcmdSegments is the topic). 

2015-05-15 12:06:38,290 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148440 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:38,384 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,384 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148519 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,493 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,493 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148598 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,603 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,603 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148677 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1
2015-05-15 12:06:39,696 INFO KAFKA.Processor: Closing socket connection to 
/10.44.18.75.
2015-05-15 12:06:39,696 WARN KAFKA.KafkaApis: [KafkaApi-1] Produce request with 
correlation id 148756 from client  on partition [LcmdSegments,14] failed due to 
Partition [LcmdSegments,14] doesn't exist on 1

And KafkaNet client returns:

Topic:LcmdSegments returned error code of LeaderNotAvailable.  Retrying.
Backing off metadata request retry.  Waiting for 62500ms.

state-change.log shows an error:

2015-05-15 11:44:18,110 ERROR KAFKA.logger: Controller 1 epoch 2 initiated 
state change for partition [LcmdSegments,14] from OfflinePartition to 
OnlinePartition failed
kafka.common.NoReplicaOnlineException: No replica for partition 
[LcmdSegments,14] is alive. Live brokers are: [Set()], Assigned replicas are: 
[List(1)]
at 
kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:75)
at 
kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:357)
at 
kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:206)
at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:120)
at 
kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:117)
at 
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at 
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at 
kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:117)
at 
kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:70)
at 
kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:314)
at 
kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:161)
at 
kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:81)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:49)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
at 
kafka.server.ZookeeperLeaderElector$$anonfun$startup$1.apply(ZookeeperLeaderElector.scala:47)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at 
kafka.server.ZookeeperLeaderElector.startup(ZookeeperLead

Re: [Discussion] Using Client Requests and Responses in Server

2015-05-15 Thread Andrii Biletskyi
Gwen,

I didn't find this in answers above so apologies if this was discussed.
It's about the way we would like to handle request versions.

As I understood from Jun's answer, we should generally try to use the same
Java object while evolving a request. I believe the only example of an
evolved request so far - OffsetCommitRequest - follows this approach.

I'm trying to evolve MetadataRequest to the next version as part of KIP-4
and I'm not sure the current AbstractRequest API (which is the basis for the
requests ported to Java) is sufficient.

The problem is: in order to deserialize bytes into the correct object you
need to know its version. Suppose KafkaApis serves OffsetCommitRequest V0 and
V2 (current). For such cases the OffsetCommitRequest class has two static
parse methods:

public static OffsetCommitRequest parse(ByteBuffer buffer, int versionId)
AND
public static OffsetCommitRequest parse(ByteBuffer buffer)

The latter simply picks the "current" schema version.
Now AbstractRequest.getRequest, which is the entry point for deserializing
requests for KafkaApis, matches only on RequestHeader.apiKey (and thus uses
the second parse method above), which is not sufficient - we also need
RequestHeader.apiVersion in the case of an old request version.

The same problem appears in AbstractRequest.getErrorResponse(Throwable e) -
to construct the right error response object we need to know which apiVersion
to respond with.
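
(For illustration, a minimal sketch of the kind of version-aware entry point
described above - not the current AbstractRequest code; the extra versionId
parameter and the two-argument MetadataRequest.parse overload are assumptions:)

    // Assumes the usual imports: java.nio.ByteBuffer,
    // org.apache.kafka.common.protocol.ApiKeys and org.apache.kafka.common.requests.*
    public static AbstractRequest getRequest(int requestId, int versionId, ByteBuffer buffer) {
        switch (ApiKeys.forId(requestId)) {
            case OFFSET_COMMIT:
                // picks the schema of the requested version instead of the current one
                return OffsetCommitRequest.parse(buffer, versionId);
            case METADATA:
                return MetadataRequest.parse(buffer, versionId); // assumed overload
            default:
                throw new IllegalArgumentException("Unsupported api key " + requestId);
        }
    }

The same versionId would presumably have to be threaded into
AbstractRequest.getErrorResponse() as well, so that the error response is built
with the schema of the version that was actually requested.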

I think this can affect other tasks under KAFKA-1927 (replacing the separate
scala request/response classes), so maybe it makes sense to decide/fix this
once.

Thanks,
Andrii Biletskyi





On Wed, Mar 25, 2015 at 12:42 AM, Gwen Shapira 
wrote:

> OK, I posted a working patch on KAFKA-2044 and
> https://reviews.apache.org/r/32459/diff/.
>
> There are few decisions there than can be up to discussion (factory method
> on AbstractRequestResponse, the new handleErrors in request API), but as
> far as support for o.a.k.common requests in core goes, it does what it
> needs to do.
>
> Please review!
>
> Gwen
>
>
>
> On Tue, Mar 24, 2015 at 10:59 AM, Gwen Shapira 
> wrote:
>
> > Hi,
> >
> > I uploaded a (very) preliminary patch with my idea.
> >
> > One thing that's missing:
> > RequestResponse had  handleError method that all requests implemented,
> > typically generating appropriate error Response for the request and
> sending
> > it along. Its used by KafkaApis to handle all protocol errors for valid
> > requests that are not handled elsewhere.
> > AbstractRequestResponse doesn't have such method.
> >
> > I can, of course, add it.
> > But before I jump into this, I'm wondering if there was another plan on
> > handling Api errors.
> >
> > Gwen
> >
> > On Mon, Mar 23, 2015 at 6:16 PM, Jun Rao  wrote:
> >
> >> I think what you are saying is that in RequestChannel, we can start
> >> generating header/body for new request types and leave requestObj null.
> >> For
> >> existing requests, header/body will be null initially. Gradually, we can
> >> migrate each type of requests by populating header/body, instead of
> >> requestObj. This makes sense to me since it serves two purposes (1) not
> >> polluting the code base with duplicated request/response objects for new
> >> types of requests and (2) allowing the refactoring of existing requests
> to
> >> be done in smaller pieces.
> >>
> >> Could you try that approach and perhaps just migrate one existing
> request
> >> type (e.g. HeartBeatRequest) as an example? We probably need to rewind
> the
> >> buffer after reading the requestId when deserializing the header (since
> >> the
> >> header includes the request id).
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Mon, Mar 23, 2015 at 4:52 PM, Gwen Shapira 
> >> wrote:
> >>
> >> > I'm thinking of a different approach, that will not fix everything,
> but
> >> > will allow adding new requests without code duplication (and therefore
> >> > unblock KIP-4):
> >> >
> >> > RequestChannel.request currently takes a buffer and parses it into an
> >> "old"
> >> > request object. Since the objects are byte-compatible, we should be
> >> able to
> >> > parse existing requests into both old and new objects. New requests
> will
> >> > only be parsed into new objects.
> >> >
> >> > Basically:
> >> > val requestId = buffer.getShort()
> >> > if (requestId in keyToNameAndDeserializerMap) {
> >> >requestObj = RequestKeys.deserializerForKey(requestId)(buffer)
> >> >header: RequestHeader = RequestHeader.parse(buffer)
> >> >body: Struct =
> >> >
> >>
> ProtoUtils.currentRequestSchema(apiKey).read(buffer).asInstanceOf[Struct]
> >> > } else {
> >> >requestObj = null
> >> > header: RequestHeader = RequestHeader.parse(buffer)
> >> >body: Struct =
> >> >
> >>
> ProtoUtils.currentRequestSchema(apiKey).read(buffer).asInstanceOf[Struct]
> >> > }
> >> >
> >> > This way existing KafkaApis will keep working as normal. The new Apis
> >> can
> >> > implement just the new header/body requests.
> >> > We'll do the same on the send-side: BoundedByteBufferSend can have a
> >> > c

Re: Review Request 33620: Patch for KAFKA-1690

2015-05-15 Thread Rajini Sivaram


> On May 14, 2015, 10:21 a.m., Rajini Sivaram wrote:
> > clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java,
> >  line 190
> > 
> >
> > I think when delegated tasks are run asynchronously, selection 
> > interestOps should be managed within the transport layer. Otherwise, 
> > packets may be processed out of order. InterestOps should be set to zero 
> > before tasks are executed, and changes to interestOps due to packets being 
> > ready for transmission etc. should be cached and set after the delegated 
> > tasks are executed. Not sure if the current transport layer interface is 
> > sufficient to handle this correctly.
> 
> Sriharsha Chintalapani wrote:
> I am not sure how packets can be processed out of order here. If the 
> handshakeStatus is NEED_TASK I call the tasks() method, which hands over any 
> delegated tasks to the executorService. As long as these tasks are not finished 
> handshakeStatus stays NEED_TASK; only after these tasks are finished do we 
> resume the handshake. There is no further reading/writing to the socketChannel 
> until these tasks are finished. Can you elaborate on why you need interestOps 
> to be set to 0?

I think what happens at the moment is that a handshake with a delegated task 
doesn't change interestOps, so it results in a tight loop when interestOps has 
write enabled. In this case, tasks() is called over and over again when poll() 
returns, until the delegated task is complete. This is undesirable, but doesn't 
lead to errors.
In the case of delegated tasks in read/write, I am not sure the code checks for 
NEED_TASK upfront, so it is not clear that they don't lead to packets being 
processed out of order because interestOps are not reset. 
Basically we want to avoid processing any packets while potentially 
long-running delegated tasks are executing on a different thread. It feels like 
this is being achieved with a tight polling loop at the moment, which negates 
the benefit of running these tasks on a different thread. Setting interestOps 
to zero and resetting them when the task is complete would avoid the polling 
overhead and simplify the code in read/write, which otherwise needs to check 
whether another thread is handling a delegated task.
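
(A minimal sketch of the interestOps handling described above - not the code in
the patch; the method name, field names and the executorService are assumptions
for illustration, and caching of interim interestOps changes is omitted:)

    // Assumes java.nio.channels.SelectionKey, javax.net.ssl.SSLEngine and
    // java.util.concurrent.ExecutorService.
    private void runDelegatedTasks(final SelectionKey key, final SSLEngine engine,
                                   ExecutorService executorService) {
        final int savedOps = key.interestOps();
        key.interestOps(0);                      // stop selecting this channel while tasks run
        executorService.submit(new Runnable() {
            public void run() {
                Runnable task;
                while ((task = engine.getDelegatedTask()) != null)
                    task.run();                  // potentially long-running handshake work
                key.interestOps(savedOps);       // restore the saved interest once tasks are done
                key.selector().wakeup();         // let poll() observe the restored interest
            }
        });
    }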


> On May 14, 2015, 10:21 a.m., Rajini Sivaram wrote:
> > clients/src/main/java/org/apache/kafka/common/network/Authenticator.java, 
> > line 47
> > 
> >
> > Don't think authenticator interface should handle selection 
> > interestOps. Needs better encapsulation.
> 
> Sriharsha Chintalapani wrote:
> SaslAuth works by reading the socketChannel and doing accept-token on 
> incoming data on the server side, and similarly the sasl client sends a new 
> token. Why do you think the authenticator shouldn't handle interestOps, since 
> it's reading and writing to the socketChannel? What's your proposal?
> 
> Rajini Sivaram wrote:
> At the moment management of interestOps is spread across Selector, 
> TransportLayer and Authenticator. I would have hoped that they could be 
> contained within the transport layer. My concern is that with transport layer 
> setting interestOps from multiple threads (for delegated tasks in SSL), it 
> may be too messy to handle interestOps within a pluggable authenticator as 
> well. Maybe it is better to address this when SaslAuth implementation is 
> ready.
> 
> Sriharsha Chintalapani wrote:
> Only the selector is the one setting the interestOps, not the transportLayer 
> or the authenticator - they just return an interestOp. How are multiple 
> threads an issue here?

May not be an issue if re-negotiation and full graceful shutdown are not 
required. Will wait to see how interestOps are handled for delegated tasks (in 
the other issue raised) before closing this one.


- Rajini


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33620/#review83741
---


On May 12, 2015, 11:20 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33620/
> ---
> 
> (Updated May 12, 2015, 11:20 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1690
> https://issues.apache.org/jira/browse/KAFKA-1690
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
> 
> 
> KAFKA-1690. new java producer needs ssl support as a client. Added 
> Princip

[jira] [Commented] (KAFKA-1928) Move kafka.network over to using the network classes in org.apache.kafka.common.network

2015-05-15 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545084#comment-14545084
 ] 

Gwen Shapira commented on KAFKA-1928:
-

Updated reviewboard https://reviews.apache.org/r/33065/diff/
 against branch trunk

> Move kafka.network over to using the network classes in 
> org.apache.kafka.common.network
> ---
>
> Key: KAFKA-1928
> URL: https://issues.apache.org/jira/browse/KAFKA-1928
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: Gwen Shapira
> Attachments: KAFKA-1928.patch, KAFKA-1928_2015-04-28_00:09:40.patch, 
> KAFKA-1928_2015-04-30_17:48:33.patch, KAFKA-1928_2015-05-01_15:45:24.patch, 
> KAFKA-1928_2015-05-12_12:00:37.patch, KAFKA-1928_2015-05-12_12:57:57.patch, 
> KAFKA-1928_2015-05-15.patch, KAFKA-1928_2015-05-15_03.patch, 
> KAFKA-1928_2015-05-15_10:30:31.patch
>
>
> As part of the common package we introduced a bunch of network related code 
> and abstractions.
> We should look into replacing a lot of what is in kafka.network with this 
> code. Duplicate classes include things like Receive, Send, etc. It is likely 
> possible to also refactor the SocketServer to make use of Selector which 
> should significantly simplify its code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1928) Move kafka.network over to using the network classes in org.apache.kafka.common.network

2015-05-15 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-1928:

Attachment: KAFKA-1928_2015-05-15_10:30:31.patch

> Move kafka.network over to using the network classes in 
> org.apache.kafka.common.network
> ---
>
> Key: KAFKA-1928
> URL: https://issues.apache.org/jira/browse/KAFKA-1928
> Project: Kafka
>  Issue Type: Sub-task
>  Components: security
>Reporter: Jay Kreps
>Assignee: Gwen Shapira
> Attachments: KAFKA-1928.patch, KAFKA-1928_2015-04-28_00:09:40.patch, 
> KAFKA-1928_2015-04-30_17:48:33.patch, KAFKA-1928_2015-05-01_15:45:24.patch, 
> KAFKA-1928_2015-05-12_12:00:37.patch, KAFKA-1928_2015-05-12_12:57:57.patch, 
> KAFKA-1928_2015-05-15.patch, KAFKA-1928_2015-05-15_03.patch, 
> KAFKA-1928_2015-05-15_10:30:31.patch
>
>
> As part of the common package we introduced a bunch of network related code 
> and abstractions.
> We should look into replacing a lot of what is in kafka.network with this 
> code. Duplicate classes include things like Receive, Send, etc. It is likely 
> possible to also refactor the SocketServer to make use of Selector which 
> should significantly simplify its code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33065: Patch for KAFKA-1928

2015-05-15 Thread Gwen Shapira

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33065/
---

(Updated May 15, 2015, 7:30 a.m.)


Review request for kafka.


Bugs: 1928 and KAFKA-1928
https://issues.apache.org/jira/browse/1928
https://issues.apache.org/jira/browse/KAFKA-1928


Repository: kafka


Description (updated)
---

first pass on replacing Send


implement maxSize and improved docs


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
KAFKA-1928-v2


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
KAFKA-1928-v2

Conflicts:
core/src/main/scala/kafka/network/RequestChannel.scala

moved selector out of abstract thread


mid-way through putting selector in SocketServer


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
KAFKA-1928-v2

Also, SocketServer is now using Selector. Still a bit messy - but all tests pass.

Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
KAFKA-1928-v2


renamed requestKey to connectionId to reflect new use and changed type from Any 
to String


Following Jun's comments - moved MultiSend to client. Cleaned up destinations 
as well


removed reify and remaining from send/receive API, per Jun. moved 
maybeCloseOldest() to Selector per Jay


added idString to node API, changed written to int in Send API


cleaning up MultiSend, added size() to Send interface


fixed some issues with multisend


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
KAFKA-1928-v2


fixed metric thingies


fixed response order bug


error handling for illegal selector state and fix metrics bug


optimized selection key lookup with identity hash


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/clients/ClusterConnectionStates.java 
da76cc257b4cfe3c4bce7120a1f14c7f31ef8587 
  clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java 
936487b16e7ac566f8bdcd39a7240ceb619fd30e 
  clients/src/main/java/org/apache/kafka/clients/KafkaClient.java 
1311f85847b022efec8cb05c450bb18231db6979 
  clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
435fbb5116e80302eba11ed1d3069cb577dbdcbd 
  clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 
bdff518b732105823058e6182f445248b45dc388 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 e55ab11df4db0b0084f841a74cbcf819caf780d5 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
ef9dd5238fbc771496029866ece1d85db6d7b7a5 
  clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
8e336a3aa96c73f52beaeb56b931baf4b026cf21 
  clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
187d0004c8c46b6664ddaffecc6166d4b47351e5 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
1e943d621732889a1c005b243920dc32cea7af66 
  clients/src/main/java/org/apache/kafka/common/Node.java 
f4e4186c7602787e58e304a2f1c293a633114656 
  clients/src/main/java/org/apache/kafka/common/network/ByteBufferReceive.java 
129ae827bccbd982ad93d56e46c6f5c46f147fe0 
  clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java 
c8213e156ec9c9af49ee09f5238492318516aaa3 
  clients/src/main/java/org/apache/kafka/common/network/MultiSend.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 
fc0d168324aaebb97065b0aafbd547a1994d76a7 
  clients/src/main/java/org/apache/kafka/common/network/NetworkSend.java 
68327cd3a734fd429966d3e2016a2488dbbb19e5 
  clients/src/main/java/org/apache/kafka/common/network/Receive.java 
4e33078c1eec834bd74aabcb5fc69f18c9d6d52a 
  clients/src/main/java/org/apache/kafka/common/network/Selectable.java 
b5f8d83e89f9026dc0853e5f92c00b2d7f043e22 
  clients/src/main/java/org/apache/kafka/common/network/Selector.java 
57de0585e5e9a53eb9dcd99cac1ab3eb2086a302 
  clients/src/main/java/org/apache/kafka/common/network/Send.java 
5d321a09e470166a1c33639cf0cab26a3bce98ec 
  clients/src/main/java/org/apache/kafka/common/requests/RequestSend.java 
27cbf390c7f148ffa8c5abc154c72cbf0829715c 
  clients/src/main/java/org/apache/kafka/common/requests/ResponseSend.java 
PRE-CREATION 
  clients/src/test/java/org/apache/kafka/clients/MockClient.java 
5e3fab13e3c02eb351558ec973b949b3d1196085 
  clients/src/test/java/org/apache/kafka/clients/NetworkClientTest.java 
8b278892883e63899b53e15efb9d8c926131e858 
  clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
d5b306b026e788b4e5479f3419805aa49ae889f3 
  clients/src/test/java/org/apache/kafka/test/MockSelector.java 
ea89b06a4c9e5bb351201299cd3037f5226f0e6c 
  core/src/main/scala/kafka/admin/ConsumerGroupComma

Re: [DISCUSS] Add missing API to old high level consumer

2015-05-15 Thread Gwen Shapira
Makes sense, thanks!

When we added Authz to Sqoop, we tested with "mock" authorizers that would
either allow everything or deny everything and would log every call.
Perhaps this will make sense for Kafka too. Parth has the patch up, so we
can see how he tested it.
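
(As an illustration of the mock-authorizer idea - a self-contained sketch; the
Authorizer interface below is a stand-in assumption, not Kafka's actual
pluggable interface:)

    // Hypothetical interface, for test sketching only.
    interface Authorizer {
        boolean authorize(String principal, String operation, String resource);
    }

    // Allows every request but logs each call so tests can assert on what was asked.
    class AllowAllAuthorizer implements Authorizer {
        public boolean authorize(String principal, String operation, String resource) {
            System.out.printf("authorize(%s, %s, %s) -> allow%n", principal, operation, resource);
            return true;
        }
    }

    // Denies every request, again logging each call.
    class DenyAllAuthorizer implements Authorizer {
        public boolean authorize(String principal, String operation, String resource) {
            System.out.printf("authorize(%s, %s, %s) -> deny%n", principal, operation, resource);
            return false;
        }
    }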

Gwen

On Fri, May 15, 2015 at 10:02 AM, Joe Stein  wrote:

> I peeked at the Java producer SSL changes, haven't tried it yet though. I
> can see about getting a Go version to help testing compatibility done in
> the next few weeks.
>
> I still don't understand the Auth pieces, I haven't been able to make the
> KIP lately I need to try to attend like every other or something.
>
> I will re-read Auth this weekend. I guess my question still is how do I
> test it. With SSL it is configure and it works or it doesn't. Auth is not
> so straightforward and more opinionated.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Fri, May 15, 2015 at 2:27 AM, Gwen Shapira 
> wrote:
>
> > I thought we wanted security on 0.8.3 too... the SSL + Authz patches seem
> > close to ready, no?
> >
> > On Fri, May 15, 2015 at 3:56 AM, Joe Stein  wrote:
> >
> > > Hey Becket, yeah good point. Officially there is no 0.8.3
> > > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
> > > release
> > > planned.
> > >
> > > I agree we should have the new consumer beta and a patch for the old
> one.
> > > If we do that in 0.8.3 that makes good sense, yup. We should also
> include
> > > https://issues.apache.org/jira/browse/KAFKA-1694 server side admin in
> > > 0.8.3
> > > too. We are testing that next week in a few languages to get it through
> > > testing first before committing.
> > >
> > > Maybe we branch 0.8.3 in the near future so that can go through getting
> > > released while the security changes for 0.9 get on trunk.
> > >
> > > ~ Joe Stein
> > >
> > > On Thu, May 14, 2015 at 8:00 PM, Jiangjie Qin
>  > >
> > > wrote:
> > >
> > > > Hey Joe,
> > > >
> > > > Actually this API was asked for before, and we have several use cases
> > in
> > > > LinkedIn as well. I thought we have added that in KAFKA-1650 but
> > > obviously
> > > > I forgot to do that.
> > > >
> > > > My understanding is that we won't really deprecate high level
> consumer
> > > > until we move to 0.9.0. So we can have this API either in 0.8.3 or
> > > > 0.8.2.2. Do you mean we only add them to those releases but not put
> it
> > > > into trunk? Any specific concern on that?
> > > >
> > > > Considering this API has already been provided in new consumer.
> Adding
> > > > this method probably won't cause any API compatibility issue even if
> > > > people move to new consumer later.
> > > > Given it is both backward and forward compatible and is a one line
> > > change,
> > > > I think it is probably OK to have it added.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > >
> > > >
> > > > On 5/13/15, 3:18 PM, "Joe Stein"  wrote:
> > > >
> > > > >My gut reaction is that this isn't that important for folks
> otherwise
> > > they
> > > > >would have complained already. If it is a blocker for folks
> upgrading
> > to
> > > > >0.8.2.1 then we should do a 0.8.2.2 release with this fix in it. For
> > > > >0.9.0.
> > > > >we are pushing for folks to start using the new consumer and that is
> > the
> > > > >upgrade path we should continue on, imho. If we are going to phase
> out
> > > the
> > > > >scala clients then we need to strive to not be making changes to
> them
> > on
> > > > >trunk.
> > > > >
> > > > >~ Joe Stein
> > > > >- - - - - - - - - - - - - - - - -
> > > > >
> > > > >  http://www.stealth.ly
> > > > >- - - - - - - - - - - - - - - - -
> > > > >
> > > > >On Wed, May 13, 2015 at 6:01 PM, Jiangjie Qin
> >  > > >
> > > > >wrote:
> > > > >
> > > > >> Add the DISCUSS prefix to the email title : )
> > > > >>
> > > > >> From: Jiangjie Qin mailto:j...@linkedin.com>>
> > > > >> Date: Tuesday, May 12, 2015 at 4:51 PM
> > > > >> To: "dev@kafka.apache.org" <
> > > > >> dev@kafka.apache.org>
> > > > >> Subject: Add missing API to old high level consumer
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> I just noticed that in KAFKA-1650 (which is before we use KIP) we
> > > added
> > > > >>an
> > > > >> offset commit method in high level consumer that commits offsets
> > > using a
> > > > >> user provided offset map.
> > > > >>
> > > > >> public void commitOffsets(Map<TopicAndPartition, OffsetAndMetadata>
> > > > >> offsetsToCommit, boolean retryOnFailure);
> > > > >>
> > > > >> This method was added to all the Scala classes but I forgot to add
> > it
> > > to
> > > > >> Java API of ConsumerConnector. (Already regretting now. . .)
> > > > >> This method is very useful in several cases and has been asked for
> > > from
> > > > >> time to time. For example, people have several threads consuming
> > > > >>messages
> > > > >> and processing them. Without this method, one thread will
> > unexpect

Re: [DISCUSS] Add missing API to old high level consumer

2015-05-15 Thread Joe Stein
I peeked at the Java producer SSL changes, haven't tried it yet though. I
can see about getting a Go version to help testing compatibility done in
the next few weeks.

I still don't understand the Auth pieces, I haven't been able to make the
KIP lately I need to try to attend like every other or something.

I will re-read Auth this weekend. I guess my question still is how do I
test it. With SSL it is configure and it works or it doesn't. Auth is not
so straightforward and more opinionated.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Fri, May 15, 2015 at 2:27 AM, Gwen Shapira  wrote:

> I thought we wanted security on 0.8.3 too... the SSL + Authz patches seem
> close to ready, no?
>
> On Fri, May 15, 2015 at 3:56 AM, Joe Stein  wrote:
>
> > Hey Becket, yeah good point. Officially there is no 0.8.3
> > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
> > release
> > planned.
> >
> > I agree we should have the new consumer beta and a patch for the old one.
> > If we do that in 0.8.3 that makes good sense, yup. We should also include
> > https://issues.apache.org/jira/browse/KAFKA-1694 server side admin in
> > 0.8.3
> > too. We are testing that next week in a few languages to get it through
> > testing first before committing.
> >
> > Maybe we branch 0.8.3 in the near future so that can go through getting
> > released while the security changes for 0.9 get on trunk.
> >
> > ~ Joe Stein
> >
> > On Thu, May 14, 2015 at 8:00 PM, Jiangjie Qin  >
> > wrote:
> >
> > > Hey Joe,
> > >
> > > Actually this API was asked for before, and we have several use cases
> in
> > > LinkedIn as well. I thought we have added that in KAFKA-1650 but
> > obviously
> > > I forgot to do that.
> > >
> > > My understanding is that we won't really deprecate high level consumer
> > > until we move to 0.9.0. So we can have this API either in 0.8.3 or
> > > 0.8.2.2. Do you mean we only add them to those releases but not put it
> > > into trunk? Any specific concern on that?
> > >
> > > Considering this API has already been provided in new consumer. Adding
> > > this method probably won't cause any API compatibility issue even if
> > > people move to new consumer later.
> > > Given it is both backward and forward compatible and is a one line
> > change,
> > > I think it is probably OK to have it added.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > >
> > > On 5/13/15, 3:18 PM, "Joe Stein"  wrote:
> > >
> > > >My gut reaction is that this isn't that important for folks otherwise
> > they
> > > >would have complained already. If it is a blocker for folks upgrading
> to
> > > >0.8.2.1 then we should do a 0.8.2.2 release with this fix in it. For
> > > >0.9.0.
> > > >we are pushing for folks to start using the new consumer and that is
> the
> > > >upgrade path we should continue on, imho. If we are going to phase out
> > the
> > > >scala clients then we need to strive to not be making changes to them
> on
> > > >trunk.
> > > >
> > > >~ Joe Stein
> > > >- - - - - - - - - - - - - - - - -
> > > >
> > > >  http://www.stealth.ly
> > > >- - - - - - - - - - - - - - - - -
> > > >
> > > >On Wed, May 13, 2015 at 6:01 PM, Jiangjie Qin
>  > >
> > > >wrote:
> > > >
> > > >> Add the DISCUSS prefix to the email title : )
> > > >>
> > > >> From: Jiangjie Qin mailto:j...@linkedin.com>>
> > > >> Date: Tuesday, May 12, 2015 at 4:51 PM
> > > >> To: "dev@kafka.apache.org" <
> > > >> dev@kafka.apache.org>
> > > >> Subject: Add missing API to old high level consumer
> > > >>
> > > >> Hi,
> > > >>
> > > >> I just noticed that in KAFKA-1650 (which is before we use KIP) we
> > added
> > > >>an
> > > >> offset commit method in high level consumer that commits offsets
> > using a
> > > >> user provided offset map.
> > > >>
> > > >> public void commitOffsets(Map<TopicAndPartition, OffsetAndMetadata>
> > > >> offsetsToCommit, boolean retryOnFailure);
> > > >>
> > > >> This method was added to all the Scala classes but I forgot to add
> it
> > to
> > > >> Java API of ConsumerConnector. (Already regretting now. . .)
> > > >> This method is very useful in several cases and has been asked for
> > from
> > > >> time to time. For example, people have several threads consuming
> > > >>messages
> > > >> and processing them. Without this method, one thread will
> unexpectedly
> > > >> commit offsets for another thread, thus might lose some messages if
> > > >> something goes wrong.
> > > >>
> > > >> I created KAFKA-2186 and hope we can add this missing method into
> the
> > > >>Java
> > > >> API of old high level consumer (literally a one-line change).
> > > >> Although this method should have been there since KAFKA-1650,
> adding
> > > >>this
> > > >> method to Java API now is a public API change, just want to see if
> > > >>people
> > > >> think we need a KIP for this.
> > > >>
> > > >> Thanks.
> > > >>
> > > >> Jiangjie (Becket) Qin
> > > >>
> > >
> > >
>