[jira] [Commented] (KAFKA-4414) Unexpected "Halting because log truncation is not allowed"

2016-11-15 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669513#comment-15669513
 ] 

huxi commented on KAFKA-4414:
-

Could you reproduce the issue by setting a smaller value for 
"zookeeper.session.timeout.ms"?

> Unexpected "Halting because log truncation is not allowed"
> --
>
> Key: KAFKA-4414
> URL: https://issues.apache.org/jira/browse/KAFKA-4414
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Meyer Kizner
>
> Our Kafka installation runs with unclean leader election disabled, so brokers 
> halt when they find that their message offset is ahead of the leader's offset 
> for a topic. We had two brokers halt today with this issue. After much time 
> spent digging through the logs, I believe the following timeline describes 
> what occurred and points to a plausible hypothesis as to what happened.
> * B1, B2, and B3 are replicas of a topic, all in the ISR. B2 is currently the 
> leader, but B1 is the preferred leader. The controller runs on B3.
> * B1 fails, but the controller does not detect the failure immediately.
> * B2 receives a message from a producer and B3 fetches it to stay up to 
> date. B2 has not committed the message, because B1 is down and so has not 
> acknowledged it.
> * The controller triggers a preferred leader election, making B1 the leader, 
> and notifies all replicas.
> * Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper 
> expires, so the controller reassigns B2 to be leader again and notifies all 
> replicas.
> * Because B3 is the controller, while B2 is on another box, B3 hears about 
> both of these events before B2 hears about either. B3 truncates its log to 
> the high water mark (before the pending message) and resumes fetching from B2.
> * B3 fetches the pending message from B2 again.
> * B2 learns that it has been displaced and then reelected, and truncates its 
> log to the high water mark, before the pending message.
> * The next time B3 tries to fetch from B2, it sees that B2 is missing the 
> pending message and halts.
> In this case, there was no data loss or inconsistency. I haven't fully 
> thought through whether either would be possible, but it seems likely that 
> they would be, especially if there had been multiple producers to this topic.
> I'm not completely certain about this timeline, but this sequence of events 
> appears to at least be possible. Looking a bit through the controller code, 
> there doesn't seem to be anything that forces {{LeaderAndIsrRequest}} to be 
> sent in a particular order. If someone with more knowledge of the code base 
> believes this is incorrect, I'd be happy to post the logs and/or do some more 
> digging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2141: KAFKA-4415; Reduce time to create and send Metadat...

2016-11-15 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2141

KAFKA-4415; Reduce time to create and send MetadataUpdateRequest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2141.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2141


commit a317b6de574842298015121b8cd836512728141e
Author: Dong Lin 
Date:   2016-11-16T05:30:03Z

KAFKA-4415; Reduce time to create and send MetadataUpdateRequest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4415) Reduce time to create and send MetadataUpdateRequest

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669497#comment-15669497
 ] 

ASF GitHub Bot commented on KAFKA-4415:
---

GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/2141

KAFKA-4415; Reduce time to create and send MetadataUpdateRequest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-4415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2141.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2141


commit a317b6de574842298015121b8cd836512728141e
Author: Dong Lin 
Date:   2016-11-16T05:30:03Z

KAFKA-4415; Reduce time to create and send MetadataUpdateRequest




> Reduce time to create and send MetadataUpdateRequest
> 
>
> Key: KAFKA-4415
> URL: https://issues.apache.org/jira/browse/KAFKA-4415
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>
> In the current implementation, when the controller receives a 
> ControlledShutdownRequest, it will 1) for every broker in the cluster, for 
> every partition on the broker that is shutting down, create an instance of 
> PartitionStateInfo and add it to 
> ControllerBrokerRequestBatch.updateMetadataRequestMap; and 2) for every 
> broker, for every follower partition on the broker that is shutting down, 
> send one MetadataUpdateRequest to that broker.
> To shut down a broker, the controller thus needs to instantiate 
> O(partitionNum * brokerNum) PartitionStateInfo objects and send 
> O(partitionNum * brokerNum) of them. This is not efficient. The controller 
> should only need to instantiate O(partitionNum) PartitionStateInfo objects 
> and send O(brokerNum) MetadataUpdateRequests.
> Micro-benchmark results show that this optimization can reduce the time to 
> process a ControlledShutdownRequest by 30%.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4415) Reduce time to create and send MetadataUpdateRequest

2016-11-15 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-4415:
---

 Summary: Reduce time to create and send MetadataUpdateRequest
 Key: KAFKA-4415
 URL: https://issues.apache.org/jira/browse/KAFKA-4415
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin


In the current implementation, when the controller receives a 
ControlledShutdownRequest, it will 1) for every broker in the cluster, for 
every partition on the broker that is shutting down, create an instance of 
PartitionStateInfo and add it to 
ControllerBrokerRequestBatch.updateMetadataRequestMap; and 2) for every 
broker, for every follower partition on the broker that is shutting down, 
send one MetadataUpdateRequest to that broker.

To shut down a broker, the controller thus needs to instantiate 
O(partitionNum * brokerNum) PartitionStateInfo objects and send 
O(partitionNum * brokerNum) of them. This is not efficient. The controller 
should only need to instantiate O(partitionNum) PartitionStateInfo objects 
and send O(brokerNum) MetadataUpdateRequests.

Micro-benchmark results show that this optimization can reduce the time to 
process a ControlledShutdownRequest by 30%.
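For illustration, a minimal sketch of the shape of this optimization, with 
made-up class and method names rather than the actual controller code:

{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative names only -- not the actual controller classes.
class PartitionStateInfo {}

class MetadataBatcher {
    // Build O(partitionNum) PartitionStateInfo objects once, keyed by
    // topic-partition, instead of one per (broker, partition) pair.
    Map<String, PartitionStateInfo> buildStates(List<String> partitions) {
        Map<String, PartitionStateInfo> states = new HashMap<>();
        for (String topicPartition : partitions) {
            states.put(topicPartition, new PartitionStateInfo());
        }
        return states;
    }

    // Send one batched request per broker: O(brokerNum) sends in total,
    // each carrying the shared partition states.
    void sendUpdates(List<Integer> brokers, Map<String, PartitionStateInfo> states) {
        for (int brokerId : brokers) {
            send(brokerId, states);
        }
    }

    private void send(int brokerId, Map<String, PartitionStateInfo> states) {
        // network I/O would go here
    }
}
{code}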



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4390) Replace MessageSet usage with client-side equivalents

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669354#comment-15669354
 ] 

ASF GitHub Bot commented on KAFKA-4390:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/2140

KAFKA-4390: Replace MessageSet usage with client-side alternatives



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA4390

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2140.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2140


commit 22353ad7719c2586e1756c0214187196a3323029
Author: Jason Gustafson 
Date:   2016-10-28T08:24:01Z

KAFKA-4390: Replace MessageSet usage with client-side alternatives




> Replace MessageSet usage with client-side equivalents
> -
>
> Key: KAFKA-4390
> URL: https://issues.apache.org/jira/browse/KAFKA-4390
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> Currently we have two separate implementations of Kafka's message format and 
> log structure, one on the client side and one on the server side. Once 
> KAFKA-2066 is merged, we will only be using the client-side objects for 
> direct serialization/deserialization in the request APIs, but we will still 
> be using the server-side MessageSet objects everywhere else. Ideally, we can 
> update this code to use the client objects everywhere so that future message 
> format changes only need to be made in one place. This would eliminate the 
> potential for implementation differences and give us a uniform API for 
> accessing the low-level log structure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2140: KAFKA-4390: Replace MessageSet usage with client-s...

2016-11-15 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/2140

KAFKA-4390: Replace MessageSet usage with client-side alternatives



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA4390

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2140.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2140


commit 22353ad7719c2586e1756c0214187196a3323029
Author: Jason Gustafson 
Date:   2016-10-28T08:24:01Z

KAFKA-4390: Replace MessageSet usage with client-side alternatives




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4391) On Windows, Kafka server stops with uncaught exception after coming back from sleep

2016-11-15 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669074#comment-15669074
 ] 

huxi commented on KAFKA-4391:
-

Focusing on the renameTo issue: that is an infamous problem on the Windows 
platform, where File.renameTo is notoriously error-prone. That's why Kafka 
0.10 changed to use the java.nio.file.Files.move method for renaming files.
[~guozhang] Do you think we should backport this fix to the 0.9.x code stream?
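For context, a minimal self-contained Java sketch of the difference (file 
names are illustrative):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class RenameDemo {
    public static void main(String[] args) throws IOException {
        Path src = Paths.get("server.log");
        Path dst = Paths.get("server.log.2016-11-08-18");

        // File.renameTo just returns false on failure, with no diagnostic;
        // on Windows it commonly fails while the source file is still open.
        boolean renamed = src.toFile().renameTo(dst.toFile());

        if (!renamed) {
            // Files.move (Java 7+) throws a descriptive IOException instead
            // of silently returning false, and can request an atomic move.
            Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
        }
    }
}
{code}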

> On Windows, Kafka server stops with uncaught exception after coming back from 
> sleep
> ---
>
> Key: KAFKA-4391
> URL: https://issues.apache.org/jira/browse/KAFKA-4391
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
> Environment: Windows 10, jdk1.8.0_111
>Reporter: Yiquan Zhou
>
> Steps to reproduce:
> 1. start the zookeeper
> $ bin\windows\zookeeper-server-start.bat config/zookeeper.properties
> 2. start the Kafka server with the default properties
> $ bin\windows\kafka-server-start.bat config/server.properties
> 3. put Windows into sleep mode for 1-2 hours
> 4. activate Windows again, an exception occurs in Kafka server console and 
> the server is stopped:
> {code:title=kafka console log}
> [2016-11-08 21:45:35,185] INFO Client session timed out, have not heard from 
> server in 10081379ms for sessionid 0x1584514da47, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:40,698] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-11-08 21:45:43,029] INFO Opening socket connection to server 
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,044] INFO Socket connection established to 
> 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,158] INFO Unable to reconnect to ZooKeeper service, 
> session 0x1584514da47 has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,158] INFO zookeeper state changed (Expired) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-11-08 21:45:43,236] INFO Initiating client connection, 
> connectString=localhost:2181 sessionTimeout=6000 
> watcher=org.I0Itec.zkclient.ZkClient@11ca437b (org.apache.zookeeper.ZooKeeper)
> [2016-11-08 21:45:43,280] INFO EventThread shut down 
> (org.apache.zookeeper.ClientCnxn)
> log4j:ERROR Failed to rename [/controller.log] to 
> [/controller.log.2016-11-08-18].
> [2016-11-08 21:45:43,421] INFO Opening socket connection to server 
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,483] INFO Socket connection established to 
> 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,811] INFO Session establishment complete on server 
> 127.0.0.1/127.0.0.1:2181, sessionid = 0x1584514da470001, negotiated timeout = 
> 6000 (org.apache.zookeeper.ClientCnxn)
> [2016-11-08 21:45:43,827] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> log4j:ERROR Failed to rename [/server.log] to [/server.log.2016-11-08-18].
> [2016-11-08 21:45:43,827] INFO Creating /controller (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,014] INFO Result of znode creation is: OK 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,014] INFO 0 successfully elected as leader 
> (kafka.server.ZookeeperLeaderElector)
> log4j:ERROR Failed to rename [/state-change.log] to 
> [/state-change.log.2016-11-08-18].
> [2016-11-08 21:45:44,421] INFO re-registering broker info in ZK for broker 0 
> (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:44,436] INFO Creating /brokers/ids/0 (is it secure? false) 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,686] INFO Result of znode creation is: OK 
> (kafka.utils.ZKCheckedEphemeral)
> [2016-11-08 21:45:44,686] INFO Registered broker 0 at path /brokers/ids/0 
> with addresses: PLAINTEXT -> EndPoint(192.168.0.15,9092,PLAINTEXT) 
> (kafka.utils.ZkUtils)
> [2016-11-08 21:45:44,686] INFO done re-registering broker 
> (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:44,686] INFO Subscribing to /brokers/topics path to watch 
> for new topics (kafka.server.KafkaHealthcheck)
> [2016-11-08 21:45:45,046] INFO [ReplicaFetcherManager on broker 0] Removed 
> fetcher for partitions [test,0] (kafka.server.ReplicaFetcherManager)
> [2016-11-08 21:45:45,061] INFO New leader is 0 
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> [2016-11-08 21:45:47,325] ERROR Uncaught exception in scheduled task 
> 'kafka-recovery-point-checkpoint' (kafka.utils.KafkaScheduler)
> 

[jira] [Commented] (KAFKA-4161) Decouple flush and offset commits

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668923#comment-15668923
 ] 

ASF GitHub Bot commented on KAFKA-4161:
---

GitHub user shikhar opened a pull request:

https://github.com/apache/kafka/pull/2139

KAFKA-4161: KIP-89: Allow sink connectors to decouple flush and offset 
commit



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shikhar/kafka kafka-4161-deux

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2139


commit 706bcc860fed939a00171ebf61fdab8639d99b06
Author: Shikhar Bhushan 
Date:   2016-11-15T00:29:43Z

KAFKA-4161: KIP-89: Allow sink connectors to decouple flush and offset 
commit




> Decouple flush and offset commits
> -
>
> Key: KAFKA-4161
> URL: https://issues.apache.org/jira/browse/KAFKA-4161
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
>  Labels: needs-kip
>
> It is desirable to have, in addition to the time-based flush interval, 
> volume- or size-based commits. E.g. a sink connector that buffers in terms 
> of number of records may want to request a flush when the buffer is full, or 
> when a sufficient amount of data has been buffered in a file.
> Having a method like, say, {{requestFlush()}} on the {{SinkTaskContext}} 
> would allow connectors to have flexible policies around flushes. This would 
> be in addition to the time-interval-based flushes that are controlled with 
> {{offset.flush.interval.ms}}, for which the clock should be reset when any 
> kind of flush happens.
> We should probably also support requesting flushes via the 
> {{SourceTaskContext}} for consistency, though a use case doesn't come to 
> mind off the bat.
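To make the proposal concrete, here is a hedged sketch of how a buffering 
sink task might use such a method; {{requestFlush()}} is the API proposed 
above, not an existing one, and the class names and threshold are made up:

{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class BufferingSinkTask {
    private static final int MAX_BUFFERED = 10_000; // illustrative threshold
    private final List<Object> buffer = new ArrayList<>();
    private final ProposedContext context = new ProposedContext();

    void put(Collection<Object> records) {
        buffer.addAll(records);
        if (buffer.size() >= MAX_BUFFERED) {
            // Volume-based commit: ask the framework to flush and commit
            // offsets now rather than waiting for offset.flush.interval.ms.
            context.requestFlush();
        }
    }

    void flush() {
        // write buffered records to the sink, then clear the buffer
        buffer.clear();
    }

    // Stand-in for the proposed addition to SinkTaskContext.
    static class ProposedContext {
        void requestFlush() { /* framework would flush and commit offsets */ }
    }
}
{code}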



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2139: KAFKA-4161: KIP-89: Allow sink connectors to decou...

2016-11-15 Thread shikhar
GitHub user shikhar opened a pull request:

https://github.com/apache/kafka/pull/2139

KAFKA-4161: KIP-89: Allow sink connectors to decouple flush and offset 
commit



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shikhar/kafka kafka-4161-deux

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2139


commit 706bcc860fed939a00171ebf61fdab8639d99b06
Author: Shikhar Bhushan 
Date:   2016-11-15T00:29:43Z

KAFKA-4161: KIP-89: Allow sink connectors to decouple flush and offset 
commit




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4331) Kafka Streams resetter is slow because it joins the same group for each topic

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668885#comment-15668885
 ] 

ASF GitHub Bot commented on KAFKA-4331:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2138

KAFKA-4331: Kafka Streams resetter is slow because it joins the same group 
for each topic

  - bug-fix follow up
  - Resetter fails if no intermediate topic is used because seekToEnd() 
commits ALL partitions to EOL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-4331-streams-resetter-bugfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2138


commit 87b8048b562ef6f4453f340e7e21ec4320a8ebf7
Author: Matthias J. Sax 
Date:   2016-11-16T00:19:41Z

KAFKA-4331: Kafka Streams resetter is slow because it joins the same group 
for each topic
  - bug-fix follow up
  - Resetter fails if no intermediate topic is used because seekToEnd() 
commits ALL partitions to EOL




> Kafka Streams resetter is slow because it joins the same group for each topic
> -
>
> Key: KAFKA-4331
> URL: https://issues.apache.org/jira/browse/KAFKA-4331
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.0, 0.10.0.1
>Reporter: Roger Hoover
>Assignee: Matthias J. Sax
> Fix For: 0.10.2.0
>
>
> The resetter joins the same group for each topic, which takes ~10 seconds 
> in my testing. This makes the reset very slow when you have a lot of topics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2138: KAFKA-4331: Kafka Streams resetter is slow because...

2016-11-15 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/2138

KAFKA-4331: Kafka Streams resetter is slow because it joins the same group 
for each topic

  - bug-fix follow up
  - Resetter fails if no intermediate topic is used because seekToEnd() 
commits ALL partitions to EOL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-4331-streams-resetter-bugfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2138


commit 87b8048b562ef6f4453f340e7e21ec4320a8ebf7
Author: Matthias J. Sax 
Date:   2016-11-16T00:19:41Z

KAFKA-4331: Kafka Streams resetter is slow because it joins the same group 
for each topic
  - bug-fix follow up
  - Resetter fails if no intermediate topic is used because seekToEnd() 
commits ALL partitions to EOL




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4414) Unexpected "Halting because log truncation is not allowed"

2016-11-15 Thread Meyer Kizner (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Meyer Kizner updated KAFKA-4414:

Description: 
Our Kafka installation runs with unclean leader election disabled, so brokers 
halt when they find that their message offset is ahead of the leader's offset 
for a topic. We had two brokers halt today with this issue. After much time 
spent digging through the logs, I believe the following timeline describes what 
occurred and points to a plausible hypothesis as to what happened.

* B1, B2, and B3 are replicas of a topic, all in the ISR. B2 is currently the 
leader, but B1 is the preferred leader. The controller runs on B3.
* B1 fails, but the controller does not detect the failure immediately.
* B2 receives a message from a producer and B3 fetches it to stay up to date. 
B2 has not committed the message, because B1 is down and so has not 
acknowledged it.
* The controller triggers a preferred leader election, making B1 the leader, 
and notifies all replicas.
* Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper 
expires, so the controller reassigns B2 to be leader again and notifies all 
replicas.
* Because B3 is the controller, while B2 is on another box, B3 hears about both 
of these events before B2 hears about either. B3 truncates its log to the high 
water mark (before the pending message) and resumes fetching from B2.
* B3 fetches the pending message from B2 again.
* B2 learns that it has been displaced and then reelected, and truncates its 
log to the high water mark, before the pending message.
* The next time B3 tries to fetch from B2, it sees that B2 is missing the 
pending message and halts.

In this case, there was no data loss or inconsistency. I haven't fully thought 
through whether either would be possible, but it seems likely that they would 
be, especially if there had been multiple producers to this topic.

I'm not completely certain about this timeline, but this sequence of events 
appears to at least be possible. Looking a bit through the controller code, 
there doesn't seem to be anything that forces {{LeaderAndIsrRequest}} to be 
sent in a particular order. If someone with more knowledge of the code base 
believes this is incorrect, I'd be happy to post the logs and/or do some more 
digging.

  was: (identical to the above, except the final paragraph read 
"{{LeaderAndIsrRequest}}s" rather than "{{LeaderAndIsrRequest}}")


> Unexpected "Halting because log truncation is not allowed"
> --
>
> Key: KAFKA-4414
> URL: https://issues.apache.org/jira/browse/KAFKA-4414
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Meyer Kizner
>
> Our Kafka installation runs with unclean leader election 

[jira] [Created] (KAFKA-4414) Unexpected "Halting because log truncation is not allowed"

2016-11-15 Thread Meyer Kizner (JIRA)
Meyer Kizner created KAFKA-4414:
---

 Summary: Unexpected "Halting because log truncation is not allowed"
 Key: KAFKA-4414
 URL: https://issues.apache.org/jira/browse/KAFKA-4414
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Meyer Kizner


Our Kafka installation runs with unclean leader election disabled, so brokers 
halt when they find that their message offset is ahead of the leader's offset 
for a topic. We had two brokers halt today with this issue. After much time 
spent digging through the logs, I believe the following timeline describes what 
occurred and points to a plausible hypothesis as to what happened.

* B1, B2, and B3 are replicas of a topic, all in the ISR. B2 is currently the 
leader, but B1 is the preferred leader. The controller runs on B3.
* B1 fails, but the controller does not detect the failure immediately.
* B2 receives a message from a producer and B3 fetches it to stay up to date. 
B2 has not committed the message, because B1 is down and so has not 
acknowledged it.
* The controller triggers a preferred leader election, making B1 the leader, 
and notifies all replicas.
* Very shortly afterwards (~200ms), B1's broker registration in ZooKeeper 
expires, so the controller reassigns B2 to be leader again and notifies all 
replicas.
* Because B3 is the controller, while B2 is on another box, B3 hears about both 
of these events before B2 hears about either. B3 truncates its log to the high 
water mark (before the pending message) and resumes fetching from B2.
* B3 fetches the pending message from B2 again.
* B2 learns that it has been displaced and then reelected, and truncates its 
log to the high water mark, before the pending message.
* The next time B3 tries to fetch from B2, it sees that B2 is missing the 
pending message and halts.

In this case, there was no data loss or inconsistency. I haven't fully thought 
through whether either would be possible, but it seems likely that they would 
be, especially if there had been multiple producers to this topic.

I'm not completely certain about this timeline, but this sequence of events 
appears to at least be possible. Looking a bit through the controller code, 
there doesn't seem to be anything that forces {{LeaderAndIsrRequest}}s to be 
sent in a particular order. If someone with more knowledge of the code base 
believes this is incorrect, I'd be happy to post the logs and/or do some more 
digging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4413) Kafka should support default SSLContext

2016-11-15 Thread Wenjie Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenjie Zhang updated KAFKA-4413:

Description: Currently, to enable SSL in either the consumer or the producer, 
we have to provide a trustStore file and password. Ideally, if the Kafka 
server is configured with a CA-signed certificate, then since the JRE ships 
certain CA root certs inside "cacerts", Kafka should support SSL without any 
trustStore file. Basically, we should update 
`org.apache.kafka.common.security.ssl.SslFactory.createSSLContext` to use 
`SSLContext.getDefault()` when no trustStore file is needed; I am not sure 
whether any other places need to be updated for this enhancement.  (was: 
Currently, to enable SSL in either the consumer or the producer, we have to 
provide a trustStore file and password. Ideally, if the Kafka server is 
configured with a CA-signed certificate, then since the JRE ships certain CA 
root certs inside "cacerts", Kafka should support using 
`SSLContext.getDefault()` when creating the `SSLContext`; the changes need to 
be made in `org.apache.kafka.common.security.ssl.SslFactory.createSSLContext`. 
I am not sure whether any other places need to be updated for this 
enhancement.)
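For reference, a minimal sketch of the proposed fallback (simplified and with 
illustrative names; not the actual SslFactory code):

{code}
import java.security.SecureRandom;
import javax.net.ssl.KeyManager;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;

class SslContextFactory {
    SSLContext createSslContext(String trustStorePath,
                                KeyManager[] keyManagers,
                                TrustManager[] trustManagers) throws Exception {
        if (trustStorePath == null) {
            // No trustStore configured: fall back to the JRE's default
            // context, which trusts the CA root certs shipped in "cacerts".
            return SSLContext.getDefault();
        }
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(keyManagers, trustManagers, new SecureRandom());
        return ctx;
    }
}
{code}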

> Kafka should support default SSLContext
> ---
>
> Key: KAFKA-4413
> URL: https://issues.apache.org/jira/browse/KAFKA-4413
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.10.0.1
> Environment: All
>Reporter: Wenjie Zhang
>  Labels: SSLContext, SslFactory, https, ssl
>
> Currently, to enable SSL in either the consumer or the producer, we have to 
> provide a trustStore file and password. Ideally, if the Kafka server is 
> configured with a CA-signed certificate, then since the JRE ships certain CA 
> root certs inside "cacerts", Kafka should support SSL without any trustStore 
> file. Basically, we should update 
> `org.apache.kafka.common.security.ssl.SslFactory.createSSLContext` to use 
> `SSLContext.getDefault()` when no trustStore file is needed; I am not sure 
> whether any other places need to be updated for this enhancement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4413) Kafka should support default SSLContext

2016-11-15 Thread Wenjie Zhang (JIRA)
Wenjie Zhang created KAFKA-4413:
---

 Summary: Kafka should support default SSLContext
 Key: KAFKA-4413
 URL: https://issues.apache.org/jira/browse/KAFKA-4413
 Project: Kafka
  Issue Type: Improvement
  Components: security
Affects Versions: 0.10.0.1
 Environment: All
Reporter: Wenjie Zhang


Currently, to enable SSL in either the consumer or the producer, we have to 
provide a trustStore file and password. Ideally, if the Kafka server is 
configured with a CA-signed certificate, then since the JRE ships certain CA 
root certs inside "cacerts", Kafka should support using 
`SSLContext.getDefault()` when creating the `SSLContext`; the changes need to 
be made in `org.apache.kafka.common.security.ssl.SslFactory.createSSLContext`. 
I am not sure whether any other places need to be updated for this enhancement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update

2016-11-15 Thread Vahid S Hashemian
Hi Jason,

I updated the KIP one more time to make use of AdminClient to access the 
updated API.
I made a note of the previous two versions of the KIP under "Rejected 
Alternatives".

Thanks.
--Vahid



From:   Vahid S Hashemian/Silicon Valley/IBM@IBMUS
To: dev@kafka.apache.org
Date:   11/09/2016 11:21 AM
Subject:Re: [DISCUSS] KIP 88: OffsetFetch Protocol Update



Jason,
For some reason I did not receive your earlier response to the thread.
I just saw it when I went to 
https://www.mail-archive.com/dev@kafka.apache.org/msg59608.html
In the updated KIP I exposed the capability via KafkaConsumer (your first 
suggestion), but would be happy to look into adding it to AdminClient in 
the next round if you think that's the better approach.
Thanks.
--Vahid

[jira] [Commented] (KAFKA-4396) Seeing offsets not resetting even when reset policy is configured explicitly

2016-11-15 Thread Justin Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668217#comment-15668217
 ] 

Justin Miller commented on KAFKA-4396:
--

Any updates on this? Thanks!

> Seeing offsets not resetting even when reset policy is configured explicitly
> 
>
> Key: KAFKA-4396
> URL: https://issues.apache.org/jira/browse/KAFKA-4396
> Project: Kafka
>  Issue Type: Bug
>Reporter: Justin Miller
>
> I've been seeing a curious error with kafka 0.10 (spark 2.11); these may be 
> two separate errors, I'm not sure. What's puzzling is that I'm setting 
> auto.offset.reset to latest and it's still throwing an 
> OffsetOutOfRangeException, behavior that's contrary to the code. Please 
> help! :)
> {code}
> val kafkaParams = Map[String, Object](
>   "group.id" -> consumerGroup,
>   "bootstrap.servers" -> bootstrapServers,
>   "key.deserializer" -> classOf[ByteArrayDeserializer],
>   "value.deserializer" -> classOf[MessageRowDeserializer],
>   "auto.offset.reset" -> "latest",
>   "enable.auto.commit" -> (false: java.lang.Boolean),
>   "max.poll.records" -> persisterConfig.maxPollRecords,
>   "request.timeout.ms" -> persisterConfig.requestTimeoutMs,
>   "session.timeout.ms" -> persisterConfig.sessionTimeoutMs,
>   "heartbeat.interval.ms" -> persisterConfig.heartbeatIntervalMs,
>   "connections.max.idle.ms"-> persisterConfig.connectionsMaxIdleMs
> )
> {code}
> {code}
> 16/11/09 23:10:17 INFO BlockManagerInfo: Added broadcast_154_piece0 in memory 
> on xyz (size: 146.3 KB, free: 8.4 GB)
> 16/11/09 23:10:23 WARN TaskSetManager: Lost task 15.0 in stage 151.0 (TID 
> 38837, xyz): org.apache.kafka.clients.consumer.OffsetOutOfRangeException: 
> Offsets out of range with no configured reset policy for partitions: 
> {topic=231884473}
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.parseFetchedData(Fetcher.java:588)
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:354)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1000)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
> at 
> org.apache.spark.streaming.kafka010.CachedKafkaConsumer.poll(CachedKafkaConsumer.scala:99)
> at 
> org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:70)
> at 
> org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:227)
> at 
> org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:193)
> at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
> at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:397)
> at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
> at org.apache.spark.scheduler.Task.run(Task.scala:85)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 16/11/09 23:10:29 INFO TaskSetManager: Finished task 10.0 in stage 154.0 (TID 
> 39388) in 12043 ms on xyz (1/16)
> 16/11/09 23:10:31 INFO TaskSetManager: Finished task 0.0 in stage 154.0 (TID 
> 39375) in 13444 ms on xyz (2/16)
> 16/11/09 23:10:44 WARN TaskSetManager: Lost task 1.0 in stage 151.0 (TID 
> 38843, xyz): java.util.ConcurrentModificationException: KafkaConsumer is not 
> safe for multi-threaded access
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:1431)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:929)
> at 
> org.apache.spark.streaming.kafka010.CachedKafkaConsumer.poll(CachedKafkaConsumer.scala:99)
> at 
> 

Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread Rajini Sivaram
Radai,

I don't have a strong objection to using a more verbose format. But the
reasons for choosing the cryptic s=,t=,... format:

   1. Unlike other properties, such as quotas stored in ZooKeeper, which need
   to be human-readable in order to query the values, these values only need
   to be parsed by code. The base64-encoded arrays of bytes don't really mean
   anything to a human. You would only ever want to check whether the user
   has credentials for a mechanism; you can't really tell what the
   credentials are from the value stored in ZK.
   2. Single-letter keys save space. Agreed, it is not much, but since a
   more verbose format doesn't add much value, it feels like wasted space in
   ZK to store long key names for each property for each user for each
   mechanism.
   3. SCRAM authentication messages defined in RFC 5802 use comma-separated
   key=value pairs with single-letter keys. s= (salt) and i= (iteration
   count) appear exactly like that in SCRAM messages. Server key and stored
   key are not exchanged, so I chose two unused letters. The same parser used
   for SCRAM messages is used to parse this persisted value as well, since
   the format is the same (see the illustrative example below).
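
For illustration only (the base64 values below are made up, and the thread
does not spell out which of t/k holds the stored key versus the server key),
a persisted value would look something like:

    s=QSXCR+Q6sek8bf92,t=c3RvcmVkLWtleS1ieXRlcw==,k=c2VydmVyLWtleS1ieXRlcw==,i=4096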


On Tue, Nov 15, 2016 at 5:02 PM, radai  wrote:

> small nitpick - given that s,t,k and i are used as part of a rather large
> CSV format, what is the gain in having them be single letter aliases?
> in other words - why not salt=... , serverKey=... , storedKey=... ,
> iterations=... ?
>
> On Tue, Nov 15, 2016 at 7:26 AM, Mickael Maison 
> wrote:
>
> > +1
> >
> > On Tue, Nov 15, 2016 at 10:57 AM, Rajini Sivaram
> >  wrote:
> > > Jun,
> > >
> > > Thank you, I have made the updates to the KIP.
> > >
> > > On Tue, Nov 15, 2016 at 12:34 AM, Jun Rao  wrote:
> > >
> > >> Hi, Rajini,
> > >>
> > >> Thanks for the proposal. +1. A few minor comments.
> > >>
> > >> 30. Could you add that the broker config sasl.enabled.mechanisms can
> now
> > >> take more values?
> > >>
> > >> 31. Could you document the meaning of s,t,k,i used in
> > /config/users/alice
> > >> in ZK?
> > >>
> > >> 32. In the rejected section, could you document why we decided not to
> > bump
> > >> up the version of SaslHandshakeRequest?
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram <
> > >> rajinisiva...@googlemail.com> wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > I would like to initiate the voting process for *KIP-84: Support
> > >> SASL/SCRAM
> > >> > mechanisms*:
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >> > 84%3A+Support+SASL+SCRAM+mechanisms
> > >> >
> > >> > This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256,
> > >> SHA-384
> > >> > and SHA-512) for SASL authentication, giving more choice for users
> to
> > >> > configure security. When delegation token support is added to Kafka,
> > >> SCRAM
> > >> > will also support secure authentication using delegation tokens.
> > >> >
> > >> > Thank you...
> > >> >
> > >> > Regards,
> > >> >
> > >> > Rajini
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Rajini
> >
>



-- 
Regards,

Rajini


[jira] [Commented] (KAFKA-3959) __consumer_offsets wrong number of replicas at startup

2016-11-15 Thread Todd Palino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668010#comment-15668010
 ] 

Todd Palino commented on KAFKA-3959:


As noted, I just want the config enforced. If RF=3 is configured, that's what 
we should get. If you need RF=1 for testing, or for specific use cases, set it. 
Even make the default 1 if that's really what we want. But if I explicitly set 
RF=3, that's what I should get. And if it causes errors, and I've explicitly 
set it, that's on me as the user.

> __consumer_offsets wrong number of replicas at startup
> --
>
> Key: KAFKA-3959
> URL: https://issues.apache.org/jira/browse/KAFKA-3959
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, offset manager, replication
>Affects Versions: 0.9.0.1, 0.10.0.0
> Environment: Brokers of 3 kafka nodes running Red Hat Enterprise 
> Linux Server release 7.2 (Maipo)
>Reporter: Alban Hurtaud
>
> When creating a stack of 3 Kafka brokers, the consumer starts faster than 
> the Kafka nodes, and when it tries to read a topic, only one Kafka node is 
> available.
> So __consumer_offsets is created with a replication factor of 1 (instead of 
> the configured 3):
> offsets.topic.replication.factor=3
> default.replication.factor=3
> min.insync.replicas=2
> Then the other Kafka nodes come up, and exceptions are thrown because the 
> replica count for __consumer_offsets is 1 while min insync is 2.
> What I missed is: why is __consumer_offsets created with replication factor 
> 1 (when 1 broker is running) whereas in server.properties it is set to 3?
> To reproduce:
> - Prepare 3 Kafka nodes with the 3 lines above added to server.properties.
> - Run one Kafka broker.
> - Run one consumer (the __consumer_offsets topic is created with replicas = 1).
> - Run 2 more Kafka nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4412) Replication fetch stuck in loop on offset null

2016-11-15 Thread Xavier Lange (JIRA)
Xavier Lange created KAFKA-4412:
---

 Summary: Replication fetch stuck in loop on offset null
 Key: KAFKA-4412
 URL: https://issues.apache.org/jira/browse/KAFKA-4412
 Project: Kafka
  Issue Type: Bug
  Components: replication
Reporter: Xavier Lange


I kicked off a cluster rebalance and it never completed. I had to look at node 
eth0 traffic to see there was a constant 60MB/s (I'm usually at about 5MB/s 
ingest). The /kafka/logs/server.log was looping like this, then I deleted the 
topic in question to make it stop:

{quote}
[2016-11-15 18:21:27,745] ERROR Found invalid messages during fetch for 
partition [cisco-2016.11.13,19] offset 861323 error null 
(kafka.server.ReplicaFetcherThread)
[2016-11-15 18:21:27,755] ERROR Found invalid messages during fetch for 
partition [cisco-2016.11.13,19] offset 861323 error null 
(kafka.server.ReplicaFetcherThread)
[2016-11-15 18:21:27,773] ERROR Found invalid messages during fetch for 
partition [cisco-2016.11.13,19] offset 861323 error null 
(kafka.server.ReplicaFetcherThread)
[2016-11-15 18:21:27,788] ERROR Found invalid messages during fetch for 
partition [cisco-2016.11.13,19] offset 861323 error null 
(kafka.server.ReplicaFetcherThread)
[2016-11-15 18:21:27,847] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,19] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,852] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,11] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,853] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,9] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,855] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,3] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,856] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,16] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,857] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,2] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:27,858] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,19] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,012] INFO Deleting index 
/data/cisco-2016.11.13-19/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,016] INFO Deleted log for partition [cisco-2016.11.13,19] 
in /data/cisco-2016.11.13-19. (kafka.log.LogManager)
[2016-11-15 18:21:28,024] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,11] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,165] INFO Deleting index 
/data/cisco-2016.11.13-11/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,165] INFO Deleted log for partition [cisco-2016.11.13,11] 
in /data/cisco-2016.11.13-11. (kafka.log.LogManager)
[2016-11-15 18:21:28,167] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,9] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,232] INFO Deleting index 
/data/cisco-2016.11.13-9/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,232] INFO Deleted log for partition [cisco-2016.11.13,9] 
in /data/cisco-2016.11.13-9. (kafka.log.LogManager)
[2016-11-15 18:21:28,242] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,3] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,341] INFO Deleting index 
/data/cisco-2016.11.13-3/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,342] INFO Deleted log for partition [cisco-2016.11.13,3] 
in /data/cisco-2016.11.13-3. (kafka.log.LogManager)
[2016-11-15 18:21:28,375] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,16] 
(kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,465] INFO Deleting index 
/data/cisco-2016.11.13-16/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,466] INFO Deleted log for partition [cisco-2016.11.13,16] 
in /data/cisco-2016.11.13-16. (kafka.log.LogManager)
[2016-11-15 18:21:28,469] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions [cisco-2016.11.13,2] (kafka.server.ReplicaFetcherManager)
[2016-11-15 18:21:28,486] INFO Deleting index 
/data/cisco-2016.11.13-2/.index (kafka.log.OffsetIndex)
[2016-11-15 18:21:28,486] INFO Deleted log for partition [cisco-2016.11.13,2] 
in /data/cisco-2016.11.13-2. (kafka.log.LogManager)
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread Gwen Shapira
+1

On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram
 wrote:
> Hi all,
>
> I would like to initiate the voting process for *KIP-84: Support SASL/SCRAM
> mechanisms*:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-84%3A+Support+SASL+SCRAM+mechanisms
>
> This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256, SHA-384
> and SHA-512) for SASL authentication, giving more choice for users to
> configure security. When delegation token support is added to Kafka, SCRAM
> will also support secure authentication using delegation tokens.
>
> Thank you...
>
> Regards,
>
> Rajini



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread Edoardo Comar
+1 (non-binding)
--
Edoardo Comar
IBM MessageHub
eco...@uk.ibm.com
IBM UK Ltd, Hursley Park, SO21 2JN

IBM United Kingdom Limited Registered in England and Wales with number 
741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 
3AU



From:   Rajini Sivaram 
To: dev@kafka.apache.org
Date:   15/11/2016 11:00
Subject:Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms



Jun,

Thank you, I have made the updates to the KIP.

On Tue, Nov 15, 2016 at 12:34 AM, Jun Rao  wrote:

> Hi, Rajini,
>
> Thanks for the proposal. +1. A few minor comments.
>
> 30. Could you add that the broker config sasl.enabled.mechanisms can now
> take more values?
>
> 31. Could you document the meaning of s,t,k,i used in 
/config/users/alice
> in ZK?
>
> 32. In the rejected section, could you document why we decided not to 
bump
> up the version of SaslHandshakeRequest?
>
> Jun
>
>
> On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
> > Hi all,
> >
> > I would like to initiate the voting process for *KIP-84: Support
> SASL/SCRAM
> > mechanisms*:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 84%3A+Support+SASL+SCRAM+mechanisms
> >
> > This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256,
> SHA-384
> > and SHA-512) for SASL authentication, giving more choice for users to
> > configure security. When delegation token support is added to Kafka,
> SCRAM
> > will also support secure authentication using delegation tokens.
> >
> > Thank you...
> >
> > Regards,
> >
> > Rajini
> >
>



-- 
Regards,

Rajini



Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


[jira] [Comment Edited] (KAFKA-559) Garbage collect old consumer metadata entries

2016-11-15 Thread tony mancill (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15667548#comment-15667548
 ] 

tony mancill edited comment on KAFKA-559 at 11/15/16 5:20 PM:
--

Hi [~ewencp].  The attachment 
[KAFKA-559.patch|https://issues.apache.org/jira/secure/attachment/12677332/KAFKA-559.patch]
 deletes bin/kafka-cleanup-obsolete-zk-entires.sh in the 3rd patch - I don't 
think this is intentional, is it?

{code}
From 1351639d8e7dc2c4b5e869b05960951a82cc629a Mon Sep 17 00:00:00 2001
From: Ewen Cheslack-Postava 
Date: Thu, 23 Oct 2014 15:03:10 -0700
Subject: [PATCH 3/4] Fix naming: entires -> entries.

---
 bin/kafka-cleanup-obsolete-zk-entires.sh   |  19 --
 .../kafka/tools/CleanupObsoleteZkEntires.scala | 331 -
 .../kafka/tools/CleanupObsoleteZkEntries.scala | 331 +
 3 files changed, 331 insertions(+), 350 deletions(-)
 delete mode 100755 bin/kafka-cleanup-obsolete-zk-entires.sh
 delete mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntires.scala
 create mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntries.scala

diff --git a/bin/kafka-cleanup-obsolete-zk-entires.sh 
b/bin/kafka-cleanup-obsolete-zk-entires.sh
deleted file mode 100755
index f2c0cb8..000
--- a/bin/kafka-cleanup-obsolete-zk-entires.sh
+++ /dev/null
{code}

Instead, the shell script should be updated for the correct classname - 
something like:

{code}
diff --git a/bin/kafka-cleanup-obsolete-zk-entires.sh 
b/bin/kafka-cleanup-obsolete-zk-entires.sh
index f2c0cb8..0625183 100755
--- a/bin/kafka-cleanup-obsolete-zk-entires.sh
+++ b/bin/kafka-cleanup-obsolete-zk-entires.sh
@@ -16,4 +16,4 @@

 base_dir=$(dirname $0)
 export KAFKA_OPTS="-Xmx512M -server -Dcom.sun.management.jmxremote 
-Dlog4j.configuration=file:$base_dir/kafka-console-consumer-log4j.properties"
-$base_dir/kafka-run-class.sh kafka.tools.CleanupObsoleteZkEntires $@
+$base_dir/kafka-run-class.sh kafka.tools.CleanupObsoleteZkEntries $@
{code}


was (Author: tmancill):
Hi [~ewencp].  The attachment 
[KAFKA-559.patch|https://issues.apache.org/jira/secure/attachment/12677332/KAFKA-559.patch]
 deletes bin/kafka-cleanup-obsolete-zk-entires.sh in the 3rd patch - I don't 
think this is intentional, is it?

{code}
From 1351639d8e7dc2c4b5e869b05960951a82cc629a Mon Sep 17 00:00:00 2001
From: Ewen Cheslack-Postava 
Date: Thu, 23 Oct 2014 15:03:10 -0700
Subject: [PATCH 3/4] Fix naming: entires -> entries.

---
 bin/kafka-cleanup-obsolete-zk-entires.sh   |  19 --
 .../kafka/tools/CleanupObsoleteZkEntires.scala | 331 -
 .../kafka/tools/CleanupObsoleteZkEntries.scala | 331 +
 3 files changed, 331 insertions(+), 350 deletions(-)
 delete mode 100755 bin/kafka-cleanup-obsolete-zk-entires.sh
 delete mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntires.scala
 create mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntries.scala

diff --git a/bin/kafka-cleanup-obsolete-zk-entires.sh 
b/bin/kafka-cleanup-obsolete-zk-entires.sh
deleted file mode 100755
index f2c0cb8..000
--- a/bin/kafka-cleanup-obsolete-zk-entires.sh
+++ /dev/null
{code}

> Garbage collect old consumer metadata entries
> -
>
> Key: KAFKA-559
> URL: https://issues.apache.org/jira/browse/KAFKA-559
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Jay Kreps
>Assignee: Ewen Cheslack-Postava
>  Labels: newbie, project
> Attachments: KAFKA-559.patch, KAFKA-559.v1.patch, KAFKA-559.v2.patch
>
>
> Many use cases involve tranient consumers. These consumers create entries 
> under their consumer group in zk and maintain offsets there as well. There is 
> currently no way to delete these entries. It would be good to have a tool 
> that did something like
>   bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] 
> --zookeeper [zk_connect]
> This would scan through consumer group entries and delete any that had no 
> offset update since the given date.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread radai
small nitpick - given that s,t,k and i are used as part of a rather large
CSV format, what is the gain in having them be single letter aliases?
in other words - why not salt=... , serverKey=... , storedKey=... ,
iterations=... ?

On Tue, Nov 15, 2016 at 7:26 AM, Mickael Maison 
wrote:

> +1
>
> On Tue, Nov 15, 2016 at 10:57 AM, Rajini Sivaram
>  wrote:
> > Jun,
> >
> > Thank you, I have made the updates to the KIP.
> >
> > On Tue, Nov 15, 2016 at 12:34 AM, Jun Rao  wrote:
> >
> >> Hi, Rajini,
> >>
> >> Thanks for the proposal. +1. A few minor comments.
> >>
> >> 30. Could you add that the broker config sasl.enabled.mechanisms can now
> >> take more values?
> >>
> >> 31. Could you document the meaning of s,t,k,i used in
> /config/users/alice
> >> in ZK?
> >>
> >> 32. In the rejected section, could you document why we decided not to
> bump
> >> up the version of SaslHandshakeRequest?
> >>
> >> Jun
> >>
> >>
> >> On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram <
> >> rajinisiva...@googlemail.com> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I would like to initiate the voting process for *KIP-84: Support
> >> SASL/SCRAM
> >> > mechanisms*:
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> > 84%3A+Support+SASL+SCRAM+mechanisms
> >> >
> >> > This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256,
> >> SHA-384
> >> > and SHA-512) for SASL authentication, giving more choice for users to
> >> > configure security. When delegation token support is added to Kafka,
> >> SCRAM
> >> > will also support secure authentication using delegation tokens.
> >> >
> >> > Thank you...
> >> >
> >> > Regards,
> >> >
> >> > Rajini
> >> >
> >>
> >
> >
> >
> > --
> > Regards,
> >
> > Rajini
>


[jira] [Commented] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667577#comment-15667577
 ] 

Rajini Sivaram commented on KAFKA-4406:
---

The PR adds the new configuration to clients, so I had assumed that you were 
updating providers in client VMs. A few comments:
* Having two different ways of configuring clients and brokers for the same 
property doesn't sound good.
* I think the PR adds arbitrary security providers, not just SSL ones, so the 
configuration option name {{ssl.provider.classes}} is misleading.
* Not sure the solution is generic enough. The PR adds a security provider 
to the end of the list provided by the JVM, configured system properties, etc. 
That works in this case, where you are adding a new provider type, but not in the 
case where you want to replace a provider (then you are back to fixing it in 
the standard Java way for the JVM); see the sketch below. Perhaps an interface or 
a generic broker interceptor would be better?
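
To make the append-versus-replace distinction concrete, here is a minimal sketch 
using the standard {{java.security.Security}} API (the provider class name is 
whatever the user supplies; "SunJSSE" is just an example of a provider one might 
want to replace):

{code}
import java.security.Provider;
import java.security.Security;

public class ProviderSetup {
    public static void main(String[] args) throws Exception {
        // Load a provider by class name, as a config-driven approach would.
        Provider custom = (Provider) Class.forName(args[0])
                .getDeclaredConstructor().newInstance();

        // Appending, as the PR does: the new provider is consulted last.
        Security.addProvider(custom);

        // Replacing an existing provider instead requires removing it and
        // inserting at its (1-based) position -- the case the PR does not cover.
        Security.removeProvider("SunJSSE");
        Security.insertProviderAt(custom, 1);
    }
}
{code}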


> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-559) Garbage collect old consumer metadata entries

2016-11-15 Thread tony mancill (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667548#comment-15667548
 ] 

tony mancill commented on KAFKA-559:


Hi [~ewencp].  The attachment 
[KAFKA-559.patch|https://issues.apache.org/jira/secure/attachment/12677332/KAFKA-559.patch]
 deletes bin/kafka-cleanup-obsolete-zk-entries.sh in the 3rd patch - I don't 
think this is intentional, is it?

{code}
From 1351639d8e7dc2c4b5e869b05960951a82cc629a Mon Sep 17 00:00:00 2001
From: Ewen Cheslack-Postava 
Date: Thu, 23 Oct 2014 15:03:10 -0700
Subject: [PATCH 3/4] Fix naming: entires -> entries.

---
 bin/kafka-cleanup-obsolete-zk-entires.sh   |  19 --
 .../kafka/tools/CleanupObsoleteZkEntires.scala | 331 -
 .../kafka/tools/CleanupObsoleteZkEntries.scala | 331 +
 3 files changed, 331 insertions(+), 350 deletions(-)
 delete mode 100755 bin/kafka-cleanup-obsolete-zk-entires.sh
 delete mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntires.scala
 create mode 100644 
core/src/main/scala/kafka/tools/CleanupObsoleteZkEntries.scala

diff --git a/bin/kafka-cleanup-obsolete-zk-entires.sh 
b/bin/kafka-cleanup-obsolete-zk-entires.sh
deleted file mode 100755
index f2c0cb8..000
--- a/bin/kafka-cleanup-obsolete-zk-entires.sh
+++ /dev/null
{code}

> Garbage collect old consumer metadata entries
> -
>
> Key: KAFKA-559
> URL: https://issues.apache.org/jira/browse/KAFKA-559
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Jay Kreps
>Assignee: Ewen Cheslack-Postava
>  Labels: newbie, project
> Attachments: KAFKA-559.patch, KAFKA-559.v1.patch, KAFKA-559.v2.patch
>
>
> Many use cases involve transient consumers. These consumers create entries 
> under their consumer group in zk and maintain offsets there as well. There is 
> currently no way to delete these entries. It would be good to have a tool 
> that did something like
>   bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] 
> --zookeeper [zk_connect]
> This would scan through consumer group entries and delete any that had no 
> offset update since the given date.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1933) Fine-grained locking in log append

2016-11-15 Thread Maxim Ivanov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Ivanov resolved KAFKA-1933.
-
Resolution: Won't Fix

This patch is not relevant anymore, as Kafka has finally matured enough to not 
recompress incoming batches just to set offsets.

> Fine-grained locking in log append
> --
>
> Key: KAFKA-1933
> URL: https://issues.apache.org/jira/browse/KAFKA-1933
> Project: Kafka
>  Issue Type: Improvement
>  Components: log
>Reporter: Maxim Ivanov
>Assignee: Maxim Ivanov
>Priority: Minor
> Attachments: KAFKA-1933.patch, KAFKA-1933_2015-02-09_12:27:06.patch
>
>
> This patch adds finer-grained locking when appending to the log. It breaks the
> global append lock into 2 sequential phases and 1 parallel phase.
> The basic idea is to allow every thread to "reserve" offsets in non-
> overlapping ranges, then do compression in parallel, and then
> "commit" the write to the log in the same order the offsets were reserved.
> Results on a server with 16 CPU cores available:
> gzip: 564.0 sec -> 45.2 sec (12.4x speedup)
> LZ4: 56.7 sec -> 9.9 sec (5.7x speedup)
> Kafka was configured to run 16 IO threads; data was pushed using 32 netcat 
> instances sending parallel batches of 200 messages of 6.2 kB each (3264 MB in 
> total).
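
A minimal sketch of the reserve / compress-in-parallel / commit-in-order idea 
described above (the structure and names are illustrative, not the actual patch):

{code}
import java.util.concurrent.atomic.AtomicLong;

class FineGrainedAppend {
    private final AtomicLong nextOffset = new AtomicLong(0);
    private final Object commitLock = new Object();
    private long committed = 0; // next reserved base offset allowed to commit

    // Stand-in for gzip/LZ4 compression; runs outside any lock.
    private byte[] compress(byte[] batch) {
        return batch;
    }

    void append(byte[] batch, int messageCount) throws InterruptedException {
        // Phase 1 (sequential): reserve a non-overlapping offset range.
        long base = nextOffset.getAndAdd(messageCount);

        // Phase 2 (parallel): compress concurrently with other appending threads.
        byte[] compressed = compress(batch);

        // Phase 3 (sequential): commit writes in the order offsets were reserved.
        synchronized (commitLock) {
            while (committed != base) {
                commitLock.wait();
            }
            // ... write `compressed` to the log at offset `base` here ...
            committed = base + messageCount;
            commitLock.notifyAll();
        }
    }
}
{code}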



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Magnus Reftel (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667481#comment-15667481
 ] 

Magnus Reftel commented on KAFKA-4406:
--

For client applications, I agree that it is better to just call 
`Security.addProvider` directly, which is what we do. The purpose of this 
change is to allow configuration files to add security providers to the Kafka 
brokers. The alternative there is, as far as I can see, only to add a 
`-Djava.security.properties`, guessing at a free provider index, and hoping 
that it stays unused over time. That doesn't seem like a reliable solution to 
me.
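
For illustration, the workaround described above looks roughly like this (the 
file path and provider class are hypothetical; `security.provider.<n>` is the 
standard java.security syntax):

{code}
# security.properties, passed to the broker JVM via e.g.
#   export KAFKA_OPTS="-Djava.security.properties=/path/to/security.properties"
# The index (10 here) has to be guessed to be free -- and hoped to stay free.
security.provider.10=com.example.MyProvider
{code}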

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4411) broker don't have access to kafka zookeeper nodes

2016-11-15 Thread Mohammed amine GARMES (JIRA)
Mohammed amine GARMES created KAFKA-4411:


 Summary: broker don't have access to kafka zookeeper nodes
 Key: KAFKA-4411
 URL: https://issues.apache.org/jira/browse/KAFKA-4411
 Project: Kafka
  Issue Type: Bug
  Components: admin, config
Affects Versions: 0.9.0.1
 Environment: Red Hat Enterprise Linux Server release 7.0 
Java 1.8.0_66-b17 
Kafka 0.9.0.1
Reporter: Mohammed amine GARMES
Priority: Critical


I have 2 Kafka servers configured to start with Kafka security. I try to start 
the Kafka servers with the JAAS configurations below:

Server 1:
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka1.test@test.net";
};

// ZooKeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka1.test@test.net";
};
Server 2:
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka2.test@test.net";
};

// ZooKeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka2.test@test.net";
};

The problem:

When I start Kafka server 1 everything is fine, but when I try to start the second 
server it fails because it does not have access to Kafka's ZooKeeper node 
(/brokers). The whole ZooKeeper path /brokers is locked by the first server, so 
the second server does not have the access rights to write under this path.

The ACL on /brokers is set to the FQDN of the first server; normally /brokers 
should be open to all, with only the path /brokers/ids/1 locked down. In that case 
the second server could write to /brokers and lock down /brokers/ids/2 for itself.

I found a solution, but I am not sure it is the right one: I created a new 
Kafka Kerberos user, and all servers use that same user for the ZooKeeper client:

Server 1:
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka1.test@test.net";
};

// ZooKeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafkaZk.keytab"
principal="kafka/kafkazk.test@test.net";
};

Server 2:
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafka.keytab"
principal="kafka/kafka2.test@test.net";
};

// ZooKeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/opt/kafka/config/kafkaZk.keytab"
principal="kafka/kafkazk.test@test.net";
};


Can you help me, or clarify how I can use Kafka security correctly?




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-3469) kafka-topics lock down znodes with user principal when zk security is enabled.

2016-11-15 Thread Mohammed amine GARMES (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammed amine GARMES updated KAFKA-3469:
-
Comment: was deleted

(was: Hello, 
I spent much time with this problem before I find this discussion, maybe you 
can clarify this point in your documentation :

http://kafka.apache.org/090/documentation.html#security_sasl 

Indeed, we need to use the same principal in the "Zookeeper client 
authentication" configuration for all workers 

// Zookeeper client authentication
Client {
   com.sun.security.auth.module.Krb5LoginModule required
   useKeyTab=true
   storeKey=true
   keyTab="/etc/security/keytabs/kafka_server.keytab"
   principal="kafka/zkad...@example.com";
};

bests regards)

> kafka-topics lock down znodes with user principal when zk security is enabled.
> --
>
> Key: KAFKA-3469
> URL: https://issues.apache.org/jira/browse/KAFKA-3469
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> In envs where ZK is kerberized, if a user, other than user running kafka 
> processes, creates a topic, ZkUtils will lock down corresponding znodes for 
> the user. Kafka will not be able to modify those znodes and that leaves the 
> topic unusable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread Mickael Maison
+1

On Tue, Nov 15, 2016 at 10:57 AM, Rajini Sivaram
 wrote:
> Jun,
>
> Thank you, I have made the updates to the KIP.
>
> On Tue, Nov 15, 2016 at 12:34 AM, Jun Rao  wrote:
>
>> Hi, Rajini,
>>
>> Thanks for the proposal. +1. A few minor comments.
>>
>> 30. Could you add that the broker config sasl.enabled.mechanisms can now
>> take more values?
>>
>> 31. Could you document the meaning of s,t,k,i used in /config/users/alice
>> in ZK?
>>
>> 32. In the rejected section, could you document why we decided not to bump
>> up the version of SaslHandshakeRequest?
>>
>> Jun
>>
>>
>> On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram <
>> rajinisiva...@googlemail.com> wrote:
>>
>> > Hi all,
>> >
>> > I would like to initiate the voting process for *KIP-84: Support
>> SASL/SCRAM
>> > mechanisms*:
>> >
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 84%3A+Support+SASL+SCRAM+mechanisms
>> >
>> > This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256,
>> SHA-384
>> > and SHA-512) for SASL authentication, giving more choice for users to
>> > configure security. When delegation token support is added to Kafka,
>> SCRAM
>> > will also support secure authentication using delegation tokens.
>> >
>> > Thank you...
>> >
>> > Regards,
>> >
>> > Rajini
>> >
>>
>
>
>
> --
> Regards,
>
> Rajini


RE: Kafka 0.10 Monitoring tool

2016-11-15 Thread Ghosh, Achintya (Contractor)
Thank you Otis for your reply.

Kafka Manager does not work during high load; it shows timeouts, and 
Burrow and KafkaOffsetMonitor do not return the group names properly even 
under load.

SPM is not open source, so do you have anything open source that works with 
Kafka 0.10?

Thanks
Achintya

-Original Message-
From: Otis Gospodnetić [mailto:otis.gospodne...@gmail.com] 
Sent: Monday, November 14, 2016 9:25 PM
To: us...@kafka.apache.org
Cc: dev@kafka.apache.org
Subject: Re: Kafka 0.10 Monitoring tool

Hi,

Why are these tools not working perfectly for you?
Does it *have to* be open-source?  If not, Sematext SPM collects a lot of Kafka 
metrics, with consumer lag being one of them -- 
https://sematext.com/blog/2016/06/07/kafka-consumer-lag-offsets-monitoring/

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch 
Consulting Support Training - http://sematext.com/


On Mon, Nov 14, 2016 at 5:16 PM, Ghosh, Achintya (Contractor) < 
achintya_gh...@comcast.com> wrote:

> Hi there,
> What is the best open source tool for Kafka monitoring, mainly to check 
> the offset lag? We tried the following tools:
>
>
> 1.   Burrow
>
> 2.   KafkaOffsetMonitor
>
> 3.   Prometheus and Grafana
>
> 4.   Kafka Manager
>
> But nothing is working perfectly. Please help us on this.
>
> Thanks
> Achintya
>
>


[GitHub] kafka pull request #2041: MINOR: Clarify how to fix conversion issues when p...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2041


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-4407) Java consumer does not always send LEAVE_GROUP request during shut down

2016-11-15 Thread Igor (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor resolved KAFKA-4407.
-
Resolution: Duplicate

Sorry for the duplicate, I'm closing this one. 

> Java consumer does not always send LEAVE_GROUP request during shut down
> ---
>
> Key: KAFKA-4407
> URL: https://issues.apache.org/jira/browse/KAFKA-4407
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0, 0.9.0.1
>Reporter: Igor
>
> Normally, the KafkaConsumer.close method sends a LEAVE_GROUP request to the broker 
> during shutdown, since the method AbstractCoordinator.maybeLeaveGroup is 
> called inside the AbstractCoordinator.close method. However, maybeLeaveGroup does 
> not actually check whether the request is sent, and since the network client is 
> closed shortly after the consumer coordinator during shutdown, the request may 
> never be sent under certain circumstances.
> As a result, the Kafka broker will wait for session.timeout to remove the 
> consumer from its group, and if the consumer reconnects within this time 
> interval, it won't receive messages that arrived in the meantime.
> If waiting for the LEAVE_GROUP request is not a desired option, could its timeout 
> at least be made configurable?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4407) Java consumer does not always send LEAVE_GROUP request during shut down

2016-11-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667173#comment-15667173
 ] 

Rajini Sivaram commented on KAFKA-4407:
---

KAFKA-3703 (PR https://github.com/apache/kafka/pull/1836) addresses this issue.

> Java consumer does not always send LEAVE_GROUP request during shut down
> ---
>
> Key: KAFKA-4407
> URL: https://issues.apache.org/jira/browse/KAFKA-4407
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.1.0, 0.9.0.1
>Reporter: Igor
>
> Normally, the KafkaConsumer.close method sends a LEAVE_GROUP request to the broker 
> during shutdown, since the method AbstractCoordinator.maybeLeaveGroup is 
> called inside the AbstractCoordinator.close method. However, maybeLeaveGroup does 
> not actually check whether the request is sent, and since the network client is 
> closed shortly after the consumer coordinator during shutdown, the request may 
> never be sent under certain circumstances.
> As a result, the Kafka broker will wait for session.timeout to remove the 
> consumer from its group, and if the consumer reconnects within this time 
> interval, it won't receive messages that arrived in the meantime.
> If waiting for the LEAVE_GROUP request is not a desired option, could its timeout 
> at least be made configurable?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-1548) Refactor the "replica_id" in requests

2016-11-15 Thread Balint Molnar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-1548 started by Balint Molnar.

> Refactor the "replica_id" in requests
> -
>
> Key: KAFKA-1548
> URL: https://issues.apache.org/jira/browse/KAFKA-1548
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Balint Molnar
>  Labels: newbie
> Fix For: 0.10.2.0
>
>
> Today in many requests like fetch and offset we have an integer replica_id 
> field: if the request is from a follower replica fetcher, it is the broker id of 
> that follower replica; if it is from a regular consumer, it could be one of 
> two values: "-1" for an ordinary consumer, or "-2" for a debugging consumer. 
> Hence this replica_id field serves two purposes:
> 1) Logging for troubleshooting in request logs, which can be helpful only 
> when the request is from a follower replica, and 
> 2) Deciding whether it is from a consumer or a replica, to logically handle the 
> request in different ways. For this purpose we do not really care about the 
> actual id value.
> We probably would like to make the following improvements:
> 1) Rename "replica_id" to something less confusing?
> 2) Change the request.toString() function based on the replica_id, whether it 
> is a non-negative integer (meaning from a broker replica fetcher) or -1/-2 
> (meaning from a regular consumer).
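
A tiny sketch of how the field is interpreted today, per the description above 
(the class and method names are illustrative, not Kafka's code):

{code}
// Sketch only: interpreting replica_id as described above.
final class ReplicaIds {
    static final int ORDINARY_CONSUMER = -1;
    static final int DEBUGGING_CONSUMER = -2;

    // Non-negative ids are broker ids, i.e. the request came from a follower
    // replica fetcher rather than a regular consumer.
    static boolean isFromFollower(int replicaId) {
        return replicaId >= 0;
    }
}
{code}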



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1548) Refactor the "replica_id" in requests

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667147#comment-15667147
 ] 

ASF GitHub Bot commented on KAFKA-1548:
---

GitHub user baluchicken opened a pull request:

https://github.com/apache/kafka/pull/2137

KAFKA-1548 Refactor the "replica_id" in requests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/baluchicken/kafka-1 KAFKA-1548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2137.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2137


commit 4e55654c288fa18ad294796b5d58c5bc8e198ccc
Author: Balint Molnar 
Date:   2016-11-15T13:27:09Z

KAFKA-1548 Refactor the "replica_id" in requests




> Refactor the "replica_id" in requests
> -
>
> Key: KAFKA-1548
> URL: https://issues.apache.org/jira/browse/KAFKA-1548
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Balint Molnar
>  Labels: newbie
> Fix For: 0.10.2.0
>
>
> Today in many requests like fetch and offset we have an integer replica_id 
> field: if the request is from a follower replica fetcher, it is the broker id of 
> that follower replica; if it is from a regular consumer, it could be one of 
> two values: "-1" for an ordinary consumer, or "-2" for a debugging consumer. 
> Hence this replica_id field serves two purposes:
> 1) Logging for troubleshooting in request logs, which can be helpful only 
> when the request is from a follower replica, and 
> 2) Deciding whether it is from a consumer or a replica, to logically handle the 
> request in different ways. For this purpose we do not really care about the 
> actual id value.
> We probably would like to make the following improvements:
> 1) Rename "replica_id" to something less confusing?
> 2) Change the request.toString() function based on the replica_id, whether it 
> is a non-negative integer (meaning from a broker replica fetcher) or -1/-2 
> (meaning from a regular consumer).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-1548) Refactor the "replica_id" in requests

2016-11-15 Thread Balint Molnar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balint Molnar reassigned KAFKA-1548:


Assignee: Balint Molnar  (was: Gwen Shapira)

> Refactor the "replica_id" in requests
> -
>
> Key: KAFKA-1548
> URL: https://issues.apache.org/jira/browse/KAFKA-1548
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Balint Molnar
>  Labels: newbie
> Fix For: 0.10.2.0
>
>
> Today in many requests like fetch and offset we have an integer replica_id 
> field: if the request is from a follower replica fetcher, it is the broker id of 
> that follower replica; if it is from a regular consumer, it could be one of 
> two values: "-1" for an ordinary consumer, or "-2" for a debugging consumer. 
> Hence this replica_id field serves two purposes:
> 1) Logging for troubleshooting in request logs, which can be helpful only 
> when the request is from a follower replica, and 
> 2) Deciding whether it is from a consumer or a replica, to logically handle the 
> request in different ways. For this purpose we do not really care about the 
> actual id value.
> We probably would like to make the following improvements:
> 1) Rename "replica_id" to something less confusing?
> 2) Change the request.toString() function based on the replica_id, whether it 
> is a non-negative integer (meaning from a broker replica fetcher) or -1/-2 
> (meaning from a regular consumer).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2137: KAFKA-1548 Refactor the "replica_id" in requests

2016-11-15 Thread baluchicken
GitHub user baluchicken opened a pull request:

https://github.com/apache/kafka/pull/2137

KAFKA-1548 Refactor the "replica_id" in requests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/baluchicken/kafka-1 KAFKA-1548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2137.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2137


commit 4e55654c288fa18ad294796b5d58c5bc8e198ccc
Author: Balint Molnar 
Date:   2016-11-15T13:27:09Z

KAFKA-1548 Refactor the "replica_id" in requests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666982#comment-15666982
 ] 

Ismael Juma commented on KAFKA-4406:


[~rsivaram] beat me to it. I share the concern about setting a JVM-wide setting 
via a Kafka config.

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666957#comment-15666957
 ] 

Rajini Sivaram commented on KAFKA-4406:
---

[~magnus.reftel] You will need to create a [Kafka Improvement 
Proposal|https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]
 to add a new configuration parameter.

The other SSL configuration options change a property on the SSL context used 
by Kafka. But this configuration is changing a JVM-wide setting. Since you can 
call {{Security.addProvider()}} in your application or set 
{{java.security.properties}} for the JVM without changing code, I am not sure 
if a Kafka-specific option is really necessary.

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2136: MINOR: Remove unused code

2016-11-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/2136

MINOR: Remove unused code



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka remove-unused-code

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2136.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2136


commit eefa5107914a2556cdaed177c479f4428464ca9d
Author: Ismael Juma 
Date:   2016-11-15T03:44:18Z

Remove unused `zkSessionTimeout` in `ControllerContext`

commit 9248489bc36f89a38e064d8e42dd3a0d10510ae0
Author: Ismael Juma 
Date:   2016-11-15T11:08:39Z

Remove unused `ByteBoundedBlockingQueue`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2135: KAFKA-3637: Added initial states

2016-11-15 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2135

KAFKA-3637: Added initial states



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-3637-streams-state

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2135


commit 80548e1179adfae5d41a90c9bd43cd60ae37d5bf
Author: Eno Thereska 
Date:   2016-11-15T11:42:34Z

Added initial states




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3637) Add method that checks if streams are initialised

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666928#comment-15666928
 ] 

ASF GitHub Bot commented on KAFKA-3637:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2135

KAFKA-3637: Added initial states



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-3637-streams-state

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2135


commit 80548e1179adfae5d41a90c9bd43cd60ae37d5bf
Author: Eno Thereska 
Date:   2016-11-15T11:42:34Z

Added initial states




> Add method that checks if streams are initialised
> -
>
> Key: KAFKA-3637
> URL: https://issues.apache.org/jira/browse/KAFKA-3637
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: newbie
> Fix For: 0.10.2.0
>
>
> Currently, when streams are initialised and started with streams.start(), 
> there is no way for the caller to know whether the initialisation procedure 
> (including starting tasks) is complete or not. Hence, the caller is forced to 
> guess how long to wait. It would be good to have a way to return the 
> state of the streams to the caller.
> One option would be to follow an approach similar to the one in Kafka Server 
> (BrokerStates.scala).
> It would be good, for example, to keep track of whether Kafka Streams is 
> starting/running/rebalancing.
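
As a rough illustration of that option, a state enum along the lines of 
BrokerStates.scala might look like this (the names are hypothetical, not an 
agreed design):

{code}
// Hypothetical Kafka Streams lifecycle states, mirroring the BrokerStates idea.
public enum StreamsState {
    CREATED,       // streams object built but not yet started
    STARTING,      // streams.start() called, tasks being initialised
    REBALANCING,   // partition assignment in progress
    RUNNING,       // all tasks initialised and processing
    SHUTTING_DOWN,
    SHUT_DOWN
}
{code}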



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag

2016-11-15 Thread Becket Qin
Hi Renu,

Technically speaking we may not need to bump up the magic value. A
tombstone is a message either with the tombstone bit set OR with a null
value. (Once all the clients have been upgraded, it automatically becomes
based only on the tombstone bit.) However, this leads to a few subtle issues.

1. Migration. In practice, the up-conversion may not be used until all the
clients are upgraded. Kafka requires the client version to be lower than
the server version for compatibility. That means the new server code is
always deployed while clients are still running old code. When rolling out
the new server code, we will keep the message format at 0.10.1. We cannot
use message format version 0.10.2 yet, because if we did, the brokers
would suddenly lose zero copy (this is because all the clients are still
running old code; if we bump up the message format version to 0.10.2, the
broker has to look at each message to see whether down conversion is needed).
Later on, when most of the consumers have upgraded, we will bump up the
message format version to 0.10.2. So the broker cannot always up-convert
and depend on the tombstone bit, even with the new code.

2. Long-term efficiency. If we don't bump up the magic byte, the broker
will always have to look at both the tombstone bit and the
value when doing compaction. Assuming we do not bump up the magic byte,
imagine the broker sees a message which does not have the tombstone bit set.
The broker does not know when the message was produced (i.e. whether the
message has been up-converted or not), so it has to take a further look at the
value to see if it is null in order to determine whether it is a
tombstone. The same logic has to be put on the consumer as well, because the
consumer does not know if the message has been up-converted or not. With a
magic value bump, the broker/consumer knows for sure there is no need to
look at the value if the tombstone bit is not set. So if we want to
eventually use only the tombstone bit, a magic value bump is necessary.
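
A compact sketch of the check this implies (the accessors are hypothetical, and
magic value 2 stands for the bumped format):

{code}
final class TombstoneCheck {
    // With a magic bump (>= 2) the tombstone bit is authoritative and the
    // value never needs to be inspected; in older formats there is no bit,
    // so a null value marks the tombstone.
    static boolean isTombstone(byte magic, boolean tombstoneBit, byte[] value) {
        return magic >= 2 ? tombstoneBit : value == null;
    }
}
{code}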

Thanks,

Jiangjie (Becket) Qin

On Mon, Nov 14, 2016 at 11:39 PM, Renu Tewari  wrote:

> Is upping the magic byte to 2 needed?
>
> In your example say
> For broker API version at or above 0.10.2 the tombstone bit will be used
> for log compaction deletion.
> If the ProduceRequest version is less than 0.10.2 and the message value is null,
> the broker will up-convert to set the tombstone bit on.
> If the ProduceRequest version is at or above 0.10.2 then the tombstone bit
> value is unchanged.
> Either way, the new version of the broker only uses the tombstone bit
> internally.
>
> thanks
> Renu
>
> On Mon, Nov 14, 2016 at 8:31 PM, Becket Qin  wrote:
>
> > If we follow the current way of doing this format change, it would work
> > the following way:
> >
> > 0. Bump up the magic byte to 2 to indicate the tombstone bit is used.
> >
> > 1. On receiving a ProduceRequest, the broker always converts the messages
> > to the configured message.format.version.
> > 1.1 If the message version does not match the configured
> > message.format.version, the broker will either up-convert or down-convert
> > the message. In that case, users pay the performance cost of
> > re-compression if needed.
> > 1.2 If the message version matches the configured message.format.version,
> > the broker will not do the format conversion, and users may save the
> > re-compression cost if the message.format.version is at or above 0.10.0.
> >
> > 2. On receiving a FetchRequest, the broker checks the FetchRequest version
> > to see if the consumer supports the configured message.format.version or
> > not. If the consumer does not support it, down conversion is required and
> > zero copy is lost. Otherwise zero copy is used to return the
> > FetchResponse.
> >
> > Notice that setting message.format.version to 0.10.2 is equivalent to
> > always up-converting if needed, but it also means always down-converting
> > if there is an old consumer.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Mon, Nov 14, 2016 at 1:43 PM, Michael Pearce 
> > wrote:
> >
> > > I like the idea of up converting and then just having the logic to look
> > > for tombstones. It makes that quite clean in nature.
> > >
> > > It's quite late here in the UK, so to confirm I fully understand what
> > > you propose, could you write it up on the KIP wiki or fully describe
> > > exactly how you see it working, so that I can read through it in the UK
> > > morning?
> > >
> > > Thanks all for the input on this it is appreciated.
> > >
> > >
> > > Sent using OWA for iPhone
> > > 
> > > From: Mayuresh Gharat 
> > > Sent: Monday, November 14, 2016 9:28:16 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> > >
> > > Hi Michael,
> > >
> > > Just another thing that came up during my discussion with Renu and I
> > wanted
> > > 

[jira] [Updated] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Magnus Reftel (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Magnus Reftel updated KAFKA-4406:
-
Status: Patch Available  (was: Open)

https://github.com/apache/kafka/pull/2134

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Magnus Reftel (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Magnus Reftel updated KAFKA-4406:
-
Comment: was deleted

(was: https://github.com/apache/kafka/pull/2134)

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4406) Add support for custom Java Security Providers in configuration

2016-11-15 Thread Magnus Reftel (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666899#comment-15666899
 ] 

Magnus Reftel commented on KAFKA-4406:
--

https://github.com/apache/kafka/pull/2134

> Add support for custom Java Security Providers in configuration
> ---
>
> Key: KAFKA-4406
> URL: https://issues.apache.org/jira/browse/KAFKA-4406
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: Magnus Reftel
>Priority: Minor
>
> Currently, the only way to add a custom security provider is through adding a 
> -Djava.security.properties= option to the command line, e.g. through 
> KAFKA_OPTS. It would be more convenient if this could be done through the 
> config file, like all the other SSL-related options.
> I propose adding a new configuration option, ssl.provider.classes, which 
> holds a list of names of security provider classes that will be loaded, 
> instantiated, and added before creating SSL contexts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2134: config: add ssl.provider.classes config option

2016-11-15 Thread reftel
GitHub user reftel opened a pull request:

https://github.com/apache/kafka/pull/2134

config: add ssl.provider.classes config option



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reftel/kafka feature/security_providers_config

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2134.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2134


commit 11bd07fd7a27ebc8e48b7c895e3a1ce987131041
Author: Magnus Reftel 
Date:   2016-11-14T12:45:18Z

config: add ssl.provider.classes config option




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-84: Support SASL SCRAM mechanisms

2016-11-15 Thread Rajini Sivaram
Jun,

Thank you, I have made the updates to the KIP.

On Tue, Nov 15, 2016 at 12:34 AM, Jun Rao  wrote:

> Hi, Rajini,
>
> Thanks for the proposal. +1. A few minor comments.
>
> 30. Could you add that the broker config sasl.enabled.mechanisms can now
> take more values?
>
> 31. Could you document the meaning of s,t,k,i used in /config/users/alice
> in ZK?
>
> 32. In the rejected section, could you document why we decided not to bump
> up the version of SaslHandshakeRequest?
>
> Jun
>
>
> On Mon, Nov 14, 2016 at 5:57 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
> > Hi all,
> >
> > I would like to initiate the voting process for *KIP-84: Support
> SASL/SCRAM
> > mechanisms*:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 84%3A+Support+SASL+SCRAM+mechanisms
> >
> > This KIP adds support for four SCRAM mechanisms (SHA-224, SHA-256,
> SHA-384
> > and SHA-512) for SASL authentication, giving more choice for users to
> > configure security. When delegation token support is added to Kafka,
> SCRAM
> > will also support secure authentication using delegation tokens.
> >
> > Thank you...
> >
> > Regards,
> >
> > Rajini
> >
>



-- 
Regards,

Rajini


[jira] [Updated] (KAFKA-3637) Add method that checks if streams are initialised

2016-11-15 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-3637:

Fix Version/s: 0.10.2.0

> Add method that checks if streams are initialised
> -
>
> Key: KAFKA-3637
> URL: https://issues.apache.org/jira/browse/KAFKA-3637
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: newbie
> Fix For: 0.10.2.0
>
>
> Currently, when streams are initialised and started with streams.start(), 
> there is no way for the caller to know whether the initialisation procedure 
> (including starting tasks) is complete or not. Hence, the caller is forced to 
> guess how long to wait. It would be good to have a way to return the 
> state of the streams to the caller.
> One option would be to follow an approach similar to the one in Kafka Server 
> (BrokerStates.scala).
> It would be good, for example, to keep track of whether Kafka Streams is 
> starting/running/rebalancing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3637) Add method that checks if streams are initialised

2016-11-15 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska updated KAFKA-3637:

Issue Type: Improvement  (was: Sub-task)
Parent: (was: KAFKA-2590)

> Add method that checks if streams are initialised
> -
>
> Key: KAFKA-3637
> URL: https://issues.apache.org/jira/browse/KAFKA-3637
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
>  Labels: newbie
>
> Currently, when streams are initialised and started with streams.start(), 
> there is no way for the caller to know whether the initialisation procedure 
> (including starting tasks) is complete or not. Hence, the caller is forced to 
> guess how long to wait. It would be good to have a way to return the 
> state of the streams to the caller.
> One option would be to follow an approach similar to the one in Kafka Server 
> (BrokerStates.scala).
> It would be good, for example, to keep track of whether Kafka Streams is 
> starting/running/rebalancing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4355) StreamThread intermittently dies with "Topic not found during partition assignment" when broker restarted

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666719#comment-15666719
 ] 

ASF GitHub Bot commented on KAFKA-4355:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2133

KAFKA-4355: Skip topics that have no partitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-4355-topic-not-found

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2133


commit 6acb95f2da291072f20b92b47bd078a47922c2e5
Author: Eno Thereska 
Date:   2016-11-15T10:03:21Z

Skip topics that have no partitions




> StreamThread intermittently dies with "Topic not found during partition 
> assignment" when broker restarted
> -
>
> Key: KAFKA-4355
> URL: https://issues.apache.org/jira/browse/KAFKA-4355
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0, 0.10.0.0
> Environment: kafka 0.10.0.0
> kafka 0.10.1.0
> uname -a
> Linux lp02485 4.4.0-34-generic #53~14.04.1-Ubuntu SMP Wed Jul 27 16:56:40 UTC 
> 2016 x86_64 x86_64 x86_64 GNU/Linux
> java -version
> java version "1.8.0_92"
> Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
>Reporter: Michal Borowiecki
>Assignee: Eno Thereska
>  Labels: architecture
>
> When (a) starting the kafka streams app before the broker, or
> (b) restarting the broker while the streams app is running,
> the stream thread intermittently dies with a "Topic not found during partition 
> assignment" StreamsException.
> This happens roughly between one in 5 and one in 10 times.
> Stack trace:
> {noformat}
> Exception in thread "StreamThread-2" 
> org.apache.kafka.streams.errors.StreamsException: Topic not found during 
> partition assignment: scheduler
>   at 
> org.apache.kafka.streams.processor.DefaultPartitionGrouper.maxNumPartitions(DefaultPartitionGrouper.java:81)
>   at 
> org.apache.kafka.streams.processor.DefaultPartitionGrouper.partitionGroups(DefaultPartitionGrouper.java:55)
>   at 
> org.apache.kafka.streams.processor.internals.StreamPartitionAssignor.assign(StreamPartitionAssignor.java:370)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:313)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:467)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1000(AbstractCoordinator.java:88)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:419)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:395)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:742)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:722)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
>   at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:479)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:316)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:256)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:180)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:308)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:277)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:259)
>   at 
> 

[GitHub] kafka pull request #2133: KAFKA-4355: Skip topics that have no partitions

2016-11-15 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/2133

KAFKA-4355: Skip topics that have no partitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-4355-topic-not-found

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2133


commit 6acb95f2da291072f20b92b47bd078a47922c2e5
Author: Eno Thereska 
Date:   2016-11-15T10:03:21Z

Skip topics that have no partitions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4410) KafkaController sends double the expected number of StopReplicaRequests during controlled shutdown

2016-11-15 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman updated KAFKA-4410:

Description: 
We expect KafkaController to send the broker undergoing controlled shutdown one 
StopReplicaRequest for each follower replica on that broker. Examining 
KafkaController.shutdownBroker, we see that this is not the case:
1. KafkaController.shutdownBroker itself sends the shutting down broker a 
StopReplicaRequest for each follower replica
2. KafkaController.shutdownBroker transitions every follower replica to 
OfflineReplica in its call to replicaStateMachine.handleStateChanges, which 
also sends the shutting down broker a StopReplicaRequest.

  was:
We expect KafkaController to send one StopReplicaRequest for each follower 
replica on the broker undergoing controlled shutdown. Examining 
KafkaController.shutdownBroker, we see that this is not the case:
1. KafkaController.shutdownBroker itself sends a StopReplicaRequest for each 
follower replica
2. KafkaController.shutdownBroker transitions every follower replica to 
OfflineReplica in its call to replicaStateMachine.handleStateChanges, which 
also sends a StopReplicaRequest.


> KafkaController sends double the expected number of StopReplicaRequests 
> during controlled shutdown
> --
>
> Key: KAFKA-4410
> URL: https://issues.apache.org/jira/browse/KAFKA-4410
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> We expect KafkaController to send the broker undergoing controlled shutdown 
> one StopReplicaRequest for each follower replica on that broker. Examining 
> KafkaController.shutdownBroker, we see that this is not the case:
> 1. KafkaController.shutdownBroker itself sends the shutting down broker a 
> StopReplicaRequest for each follower replica
> 2. KafkaController.shutdownBroker transitions every follower replica to 
> OfflineReplica in its call to replicaStateMachine.handleStateChanges, which 
> also sends the shutting down broker a StopReplicaRequest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4410) KafkaController sends double the expected number of StopReplicaRequests during controlled shutdown

2016-11-15 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666555#comment-15666555
 ] 

Onur Karaman commented on KAFKA-4410:
-

To reproduce the bug, spin up zookeeper and two kafka brokers:
{code}
> ./bin/zookeeper-server-start.sh config/zookeeper.properties
> export LOG_DIR=logs0 && ./bin/kafka-server-start.sh config/server0.properties
> export LOG_DIR=logs1 && ./bin/kafka-server-start.sh config/server1.properties
{code}
Create a topic with 100 partitions and replication factor 2. This should make each 
broker have 50 leader replicas and 50 follower replicas:
{code}
> ./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic t 
> --partition 100 --replication-factor 2
Created topic "t".
> ./bin/kafka-topics.sh --zookeeper localhost:2181 --describe | grep -o 
> "Leader: [0-9]" | sort | uniq -c
  50 Leader: 0
  50 Leader: 1
{code}
Initiate a controlled shutdown of the broker (I chose the non-controller, broker 
1). The request log shows 99 StopReplicaRequests, almost exactly double the 50 
follower replicas on broker 1.
{code}
> grep "api_key=5" logs1/kafka-request.log | wc -l
  99
{code}
The one replica that was not doubled (partition 75) had its duplicate request 
fail to go out because broker 1 had already begun disconnecting from the 
controller.
{code}
> grep "api_key=5" logs1/kafka-request.log | egrep -o "partition=\d+" | sort | 
> uniq -c
   2 partition=1
   2 partition=11
   2 partition=13
   2 partition=15
   2 partition=17
   2 partition=19
   2 partition=21
   2 partition=23
   2 partition=25
   2 partition=27
   2 partition=29
   2 partition=3
   2 partition=31
   2 partition=33
   2 partition=35
   2 partition=37
   2 partition=39
   2 partition=41
   2 partition=43
   2 partition=45
   2 partition=47
   2 partition=49
   2 partition=5
   2 partition=51
   2 partition=53
   2 partition=55
   2 partition=57
   2 partition=59
   2 partition=61
   2 partition=63
   2 partition=65
   2 partition=67
   2 partition=69
   2 partition=7
   2 partition=71
   2 partition=73
   1 partition=75
   2 partition=77
   2 partition=79
   2 partition=81
   2 partition=83
   2 partition=85
   2 partition=87
   2 partition=89
   2 partition=9
   2 partition=91
   2 partition=93
   2 partition=95
   2 partition=97
   2 partition=99

> grep "fails to send request" logs0/controller.log
[2016-11-15 00:29:42,930] WARN [Controller-0-to-broker-1-send-thread], 
Controller 0 epoch 1 fails to send request 
{controller_id=0,controller_epoch=1,delete_partitions=false,partitions=[{topic=t,partition=75}]}
 to broker localhost:9091 (id: 1 rack: null). Reconnecting to broker. 
(kafka.controller.RequestSendThread)
{code}
Factoring in the failed StopReplicaRequest, this results in 99 + 1 = 100 
StopReplicaRequests, or 2x the expected number of StopReplicaRequests.

> KafkaController sends double the expected number of StopReplicaRequests 
> during controlled shutdown
> --
>
> Key: KAFKA-4410
> URL: https://issues.apache.org/jira/browse/KAFKA-4410
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> We expect KafkaController to send one StopReplicaRequest for each follower 
> replica on the broker undergoing controlled shutdown. Examining 
> KafkaController.shutdownBroker, we see that this is not the case:
> 1. KafkaController.shutdownBroker itself sends a StopReplicaRequest for each 
> follower replica
> 2. KafkaController.shutdownBroker transitions every follower replica to 
> OfflineReplica in its call to replicaStateMachine.handleStateChanges, which 
> also sends a StopReplicaRequest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4410) KafkaController sends double the expected number of StopReplicaRequests during controlled shutdown

2016-11-15 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-4410:
---

 Summary: KafkaController sends double the expected number of 
StopReplicaRequests during controlled shutdown
 Key: KAFKA-4410
 URL: https://issues.apache.org/jira/browse/KAFKA-4410
 Project: Kafka
  Issue Type: Bug
Reporter: Onur Karaman
Assignee: Onur Karaman


We expect KafkaController to send one StopReplicaRequest for each follower 
replica on the broker undergoing controlled shutdown. Examining 
KafkaController.shutdownBroker, we see that this is not the case:
1. KafkaController.shutdownBroker itself sends a StopReplicaRequest for each 
follower replica
2. KafkaController.shutdownBroker transitions every follower replica to 
OfflineReplica in its call to replicaStateMachine.handleStateChanges, which 
also sends a StopReplicaRequest.
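
A schematic of the double send described above (method names and structure are 
illustrative stand-ins, not Kafka's actual Scala):

{code}
import java.util.List;

// Sketch only: both paths below end up issuing a StopReplicaRequest to the
// shutting-down broker for the same follower replica.
class ControlledShutdownSketch {
    void sendStopReplica(int brokerId, int partition) {
        System.out.printf("StopReplica -> broker %d, partition %d%n", brokerId, partition);
    }

    // Stand-in for replicaStateMachine.handleStateChanges(..., OfflineReplica),
    // which sends a StopReplicaRequest as a side effect.
    void transitionToOffline(int brokerId, int partition) {
        sendStopReplica(brokerId, partition);
    }

    void shutdownBroker(int brokerId, List<Integer> followerPartitions) {
        for (int p : followerPartitions) {
            sendStopReplica(brokerId, p);      // path 1: direct send
            transitionToOffline(brokerId, p);  // path 2: duplicate via state change
        }
    }
}
{code}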



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)