[jira] [Commented] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216356#comment-16216356
 ] 

ASF GitHub Bot commented on KAFKA-6105:
---

GitHub user cnZach opened a pull request:

https://github.com/apache/kafka/pull/4125

KAFKA-6105: load client properties in proper order for EndToEndLatency tool

Currently, the property file is loaded first, and later an auto-generated 
group.id is used:
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
System.currentTimeMillis())

so even if the user provides group.id in a property file, it is not picked up.

Change it to load client properties in the proper order: set default values 
first, then load any custom values from the client.properties file.
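
A minimal sketch of the intended ordering ({{Utils.loadProps}} is the 
client-side properties loader; the propsFile value here is a hypothetical 
stand-in for the tool's properties-file argument):

{code}
import java.util.Properties

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.utils.Utils

// Set generated defaults first, then overlay the user's file, so that file
// values such as group.id override the generated default.
val consumerProps = new Properties()
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + System.currentTimeMillis())
val propsFile = "client.properties" // hypothetical path to the user's file
consumerProps.putAll(Utils.loadProps(propsFile))
{code}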

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cnZach/kafka cnZach_KAFKA-6105

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4125


commit 448ea9df1f735da5362eb3204e9bd7a133516fb2
Author: Yuexin Zhang 
Date:   2017-10-24T05:48:04Z

load client properties in proper order: set default values first, then try 
to load the custom values set in client.properties file




> group.id is not picked by kafka.tools.EndToEndLatency
> -
>
> Key: KAFKA-6105
> URL: https://issues.apache.org/jira/browse/KAFKA-6105
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.0
>Reporter: Yuexin Zhang
>
> As per these lines:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67
> the property file is loaded first, and later an auto-generated group.id is 
> used:
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
> System.currentTimeMillis())
> so even if the user provides group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216355#comment-16216355
 ] 

ASF GitHub Bot commented on KAFKA-6105:
---

Github user cnZach closed the pull request at:

https://github.com/apache/kafka/pull/4116


> group.id is not picked by kafka.tools.EndToEndLatency
> -
>
> Key: KAFKA-6105
> URL: https://issues.apache.org/jira/browse/KAFKA-6105
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.0
>Reporter: Yuexin Zhang
>
> As per these lines:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67
> the property file is loaded first, and later an auto-generated group.id is 
> used:
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
> System.currentTimeMillis())
> so even if the user provides group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216308#comment-16216308
 ] 

ASF GitHub Bot commented on KAFKA-6074:
---

GitHub user tedyu opened a pull request:

https://github.com/apache/kafka/pull/4124

KAFKA-6074 Use ZookeeperClient in ReplicaManager and Partition



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tedyu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4124.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4124


commit 0164ff44b0e67cbec9e8b56efe6e139ef87e5d69
Author: tedyu 
Date:   2017-10-24T04:51:59Z

KAFKA-6074 Use ZookeeperClient in ReplicaManager and Partition




> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
> Attachments: 6074.v1.txt
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216276#comment-16216276
 ] 

Jun Rao commented on KAFKA-6074:


[~tedyu], thanks for the patch. A couple of comments.

1. You don't need to use ZooKeeperClientWrapper. We can just implement 
createSequentialPersistentPath() the way createPartitionReassignment() does.
2. Could you submit the patch as a git pull request? See 
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes for 
details.
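
For reference, a hedged sketch of what (1) amounts to, expressed against the 
raw ZooKeeper API (the real change would go through the async ZookeeperClient; 
this is illustrative only):

{code}
import org.apache.zookeeper.{CreateMode, ZooDefs, ZooKeeper}

// A sequential-persistent create returns the path ZooKeeper generated, i.e.
// the input path plus a monotonically increasing suffix.
def createSequentialPersistentPath(zk: ZooKeeper, path: String, data: Array[Byte]): String =
  zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL)
{code}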

> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
> Attachments: 6074.v1.txt
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216193#comment-16216193
 ] 

Ted Yu commented on KAFKA-6074:
---

{code}
  val zkPath = new ZkPath(zkClientWrap)
{code}
Before I pull ZkPath into KafkaControllerZkUtils, can I get some review?

> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
> Attachments: 6074.v1.txt
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-6074:
--
Attachment: 6074.v1.txt

> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
> Attachments: 6074.v1.txt
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216185#comment-16216185
 ] 

Jun Rao commented on KAFKA-6074:


[~tedyu], that's right.

> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6074) Use ZookeeperClient in ReplicaManager and Partition

2017-10-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216171#comment-16216171
 ] 

Ted Yu commented on KAFKA-6074:
---

In ReplicationUtils#propagateIsrChanges() (called by 
ReplicaManager#maybePropagateIsrChanges()):
{code}
val isrChangeNotificationPath: String = 
zkUtils.createSequentialPersistentPath(
{code}
Does this mean that createSequentialPersistentPath() should be added to 
KafkaControllerZkUtils?

> Use ZookeeperClient in ReplicaManager and Partition
> ---
>
> Key: KAFKA-6074
> URL: https://issues.apache.org/jira/browse/KAFKA-6074
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
> Fix For: 1.1.0
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in 
> ReplicaManager and Partition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6111) Tests for KafkaControllerZkUtils

2017-10-23 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-6111:
--

 Summary: Tests for KafkaControllerZkUtils
 Key: KAFKA-6111
 URL: https://issues.apache.org/jira/browse/KAFKA-6111
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ismael Juma
 Fix For: 1.1.0


KafkaControllerZkUtils has no tests at the moment, and we need to fix that.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6072) Use ZookeeperClient in GroupCoordinator and TransactionCoordinator

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-6072:
---
Fix Version/s: 1.1.0

> Use ZookeeperClient in GroupCoordinator and TransactionCoordinator
> --
>
> Key: KAFKA-6072
> URL: https://issues.apache.org/jira/browse/KAFKA-6072
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Assignee: Manikumar
> Fix For: 1.1.0
>
>
> We want to replace the usage of ZkUtils in GroupCoordinator and 
> TransactionCoordinator with ZookeeperClient.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-6073) Use ZookeeperClient in KafkaApis

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-6073:
--

Assignee: Mickael Maison

> Use ZookeeperClient in KafkaApis
> 
>
> Key: KAFKA-6073
> URL: https://issues.apache.org/jira/browse/KAFKA-6073
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Jun Rao
>Assignee: Mickael Maison
> Fix For: 1.1.0
>
>
> We want to replace the usage of ZkUtils with ZookeeperClient in KafkaApis.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5894) add the notion of max inflight requests to async ZookeeperClient

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5894:
---
Fix Version/s: 1.1.0

> add the notion of max inflight requests to async ZookeeperClient
> 
>
> Key: KAFKA-5894
> URL: https://issues.apache.org/jira/browse/KAFKA-5894
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Fix For: 1.1.0
>
>
> ZookeeperClient is a zookeeper client that encourages pipelined requests to 
> zookeeper. We want to add the notion of max inflight requests to the client 
> for several reasons:
> # to bound memory overhead associated with async requests on the client.
> # to not overwhelm the zookeeper ensemble with a burst of requests.
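
As a rough illustration of the first point, a bounded-inflight wrapper could 
look like the following sketch (the Request/Response types and the sendAsync 
callback are placeholders, not ZookeeperClient's actual API):

{code}
import java.util.concurrent.Semaphore

// Placeholder types standing in for the client's request/response classes.
case class Request(path: String)
case class Response(path: String, ok: Boolean)

class BoundedSender(maxInFlightRequests: Int,
                    sendAsync: (Request, Response => Unit) => Unit) {
  private val inFlight = new Semaphore(maxInFlightRequests)

  def send(request: Request)(handler: Response => Unit): Unit = {
    inFlight.acquire() // blocks callers once maxInFlightRequests are outstanding
    sendAsync(request, response => try handler(response) finally inFlight.release())
  }
}
{code}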



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5027) Kafka Controller Redesign

2017-10-23 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216153#comment-16216153
 ] 

Jun Rao commented on KAFKA-5027:


For the 1.1.0 release, we want to be able to complete at least the following 
sub-tasks. So, if anyone wants to help out, it would be great to pick up 
subtasks from the following list first.

10. KAFKA-5029
13. KAFKA-5473
15. KAFKA-5646
16. KAFKA-5647
17. KAFKA-5894 
18. KAFKA-6065
20. KAFKA-6072
21. KAFKA-6073
22. KAFKA-6074


> Kafka Controller Redesign
> -
>
> Key: KAFKA-5027
> URL: https://issues.apache.org/jira/browse/KAFKA-5027
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> The goal of this redesign is to improve controller performance, controller 
> maintainability, and cluster reliability.
> Documentation regarding what's being considered can be found 
> [here|https://docs.google.com/document/d/1rLDmzDOGQQeSiMANP0rC2RYp_L7nUGHzFD9MQISgXYM].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5212) Consumer ListOffsets request can starve group heartbeats

2017-10-23 Thread Richard Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216134#comment-16216134
 ] 

Richard Yu edited comment on KAFKA-5212 at 10/24/17 12:40 AM:
--

[~hachikuji] I am wondering how a pause can be introduced into the test so that a 
missed heartbeat can be seen without the fix.


was (Author: yohan123):
[~hachikuji] I am wondering how polls can be introduced into the test so that a 
missed heartbeat can be seen without the fix.

> Consumer ListOffsets request can starve group heartbeats
> 
>
> Key: KAFKA-5212
> URL: https://issues.apache.org/jira/browse/KAFKA-5212
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Richard Yu
> Fix For: 1.1.0, 1.0.1
>
> Attachments: 5212.patch
>
>
> The consumer is not able to send heartbeats while it is awaiting a 
> ListOffsets response. Typically this is not a problem because ListOffsets 
> requests are handled quickly, but in the worst case if the request takes 
> longer than the session timeout, the consumer will fall out of the group.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5212) Consumer ListOffsets request can starve group heartbeats

2017-10-23 Thread Richard Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216134#comment-16216134
 ] 

Richard Yu edited comment on KAFKA-5212 at 10/24/17 12:37 AM:
--

[~hachikuji] I am wondering how polls can be introduced into the test so that a 
missed heartbeat can be seen without the fix.


was (Author: yohan123):
@hacikuji Am I getting closer?

> Consumer ListOffsets request can starve group heartbeats
> 
>
> Key: KAFKA-5212
> URL: https://issues.apache.org/jira/browse/KAFKA-5212
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Richard Yu
> Fix For: 1.1.0, 1.0.1
>
> Attachments: 5212.patch
>
>
> The consumer is not able to send heartbeats while it is awaiting a 
> ListOffsets response. Typically this is not a problem because ListOffsets 
> requests are handled quickly, but in the worst case if the request takes 
> longer than the session timeout, the consumer will fall out of the group.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5212) Consumer ListOffsets request can starve group heartbeats

2017-10-23 Thread Richard Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216134#comment-16216134
 ] 

Richard Yu commented on KAFKA-5212:
---

@hacikuji Am I getting closer?

> Consumer ListOffsets request can starve group heartbeats
> 
>
> Key: KAFKA-5212
> URL: https://issues.apache.org/jira/browse/KAFKA-5212
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Reporter: Jason Gustafson
>Assignee: Richard Yu
> Fix For: 1.1.0, 1.0.1
>
> Attachments: 5212.patch
>
>
> The consumer is not able to send heartbeats while it is awaiting a 
> ListOffsets response. Typically this is not a problem because ListOffsets 
> requests are handled quickly, but in the worst case if the request takes 
> longer than the session timeout, the consumer will fall out of the group.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6110) Warning when running the broker on Windows

2017-10-23 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-6110:
---
Description: 
*This issue exists in 1.0.0-RC2.*

The following warning appears in the broker log at startup:
{code}
[2017-10-23 15:29:49,370] WARN Error processing 
kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs
 (com.yammer.metrics.reporting.JmxReporter)
javax.management.MalformedObjectNameException: Invalid character ':' in value 
part of property
at javax.management.ObjectName.construct(ObjectName.java:618)
at javax.management.ObjectName.<init>(ObjectName.java:1382)
at 
com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
at 
com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
at 
com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
at 
com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
at 
kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80)
at kafka.log.LogManager.newGauge(LogManager.scala:50)
at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:117)
at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:116)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at kafka.log.LogManager.<init>(LogManager.scala:116)
at kafka.log.LogManager$.apply(LogManager.scala:799)
at kafka.server.KafkaServer.startup(KafkaServer.scala:222)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:92)
at kafka.Kafka.main(Kafka.scala)
{code}
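
The ':' in the Windows log-directory value is what trips JMX; values containing 
such special characters must be quoted. A minimal illustration of the quoting 
behavior (not necessarily the eventual Kafka fix):

{code}
import javax.management.ObjectName

// Unquoted, the drive-letter colon is illegal in an ObjectName value:
//   new ObjectName("kafka.log:type=LogManager,logDirectory=C:\\tmp\\kafka-logs")
//   => javax.management.MalformedObjectNameException
// Quoting the value with ObjectName.quote makes it legal:
val safe = new ObjectName(
  "kafka.log:type=LogManager,logDirectory=" + ObjectName.quote("C:\\tmp\\kafka-logs"))
{code}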

  was:
The following warning appears in the broker log at startup:
{code}
[2017-10-23 15:29:49,370] WARN Error processing 
kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs
 (com.yammer.metrics.reporting.JmxReporter)
javax.management.MalformedObjectNameException: Invalid character ':' in value 
part of property
at javax.management.ObjectName.construct(ObjectName.java:618)
at javax.management.ObjectName.<init>(ObjectName.java:1382)
at 
com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
at 
com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
at 
com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
at 
com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
at 
kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80)
at kafka.log.LogManager.newGauge(LogManager.scala:50)
at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:117)
at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:116)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at kafka.log.LogManager.<init>(LogManager.scala:116)
at kafka.log.LogManager$.apply(LogManager.scala:799)
at kafka.server.KafkaServer.startup(KafkaServer.scala:222)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:92)
at kafka.Kafka.main(Kafka.scala)
{code}


> Warning when running the broker on Windows
> --
>
> Key: KAFKA-6110
> URL: https://issues.apache.org/jira/browse/KAFKA-6110
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vahid Hashemian
>Priority: Minor
>
> *This issue exists in 1.0.0-RC2.*
> The following warning appears in the broker log at startup:
> {code}
> [2017-10-23 15:29:49,370] WARN Error processing 
> kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs
>  (com.yammer.metrics.reporting.JmxReporter)
> javax.management.MalformedObjectNameException: Invalid character ':' in value 
> part of property
> at javax.management.ObjectName.construct(ObjectName.java:618)
> at javax.management.ObjectName.<init>(ObjectName.java:1382)
> at 
> com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
> at 
> com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
> at 
> com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
> at 
> com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
> at 
> kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80)
> at kafka.log.LogManager.newGauge(LogManager.scala:50)
> at 

[jira] [Commented] (KAFKA-6096) Add concurrent tests to exercise all paths in group/transaction managers

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215941#comment-16215941
 ] 

ASF GitHub Bot commented on KAFKA-6096:
---

GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/4122

KAFKA-6096: Add multi-threaded tests for group coordinator, txn manager



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-6096-deadlock-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4122.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4122


commit 5e39e73da84c0057ae4815b066cbc6e9113bc608
Author: Rajini Sivaram 
Date:   2017-10-23T20:59:04Z

KAFKA-6096: Add multi-threaded tests for group coordinator, txn manager




> Add concurrent tests to exercise all paths in group/transaction managers
> 
>
> Key: KAFKA-6096
> URL: https://issues.apache.org/jira/browse/KAFKA-6096
> Project: Kafka
>  Issue Type: Test
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Jason Gustafson
> Fix For: 1.1.0
>
>
> We don't have enough tests to test locking/deadlocks in GroupMetadataManager 
> and TransactionManager. Since we have had a lot of deadlocks (KAFKA-5970, 
> KAFKA-6042 etc.) which were not detected during testing, we should add more 
> mock tests with concurrency to verify the locking.
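
The general shape of such a test is to run conflicting operations from several 
threads and bound the total runtime, so a deadlock fails the run instead of 
hanging it (an illustrative sketch, not the PR's code):

{code}
import java.util.concurrent.{Executors, TimeUnit}

val pool = Executors.newFixedThreadPool(8)
(1 to 8).foreach { _ =>
  pool.submit(new Runnable {
    def run(): Unit = { /* exercise group/transaction manager paths here */ }
  })
}
pool.shutdown()
// A deadlock shows up as a timeout rather than an indefinite hang.
assert(pool.awaitTermination(30, TimeUnit.SECONDS), "possible deadlock: timed out")
{code}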



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6109) ResetIntegrationTest may fail due to IllegalArgumentException

2017-10-23 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215878#comment-16215878
 ] 

Guozhang Wang commented on KAFKA-6109:
--

[~tedyu] Please take a look at 
https://github.com/apache/kafka/pull/4096/commits/bad05511683aa111fc9dccc37dbfa7b64d022753

> ResetIntegrationTest may fail due to IllegalArgumentException
> -
>
> Key: KAFKA-6109
> URL: https://issues.apache.org/jira/browse/KAFKA-6109
> Project: Kafka
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
>
> From https://builds.apache.org/job/kafka-trunk-jdk7/2918 :
> {code}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.IllegalArgumentException: Setting the time to 1508791687000 
> while current time 1508791687475 is newer; this is not allowed
> at 
> org.apache.kafka.common.utils.MockTime.setCurrentTimeMs(MockTime.java:81)
> at 
> org.apache.kafka.streams.integration.AbstractResetIntegrationTest.beforePrepareTest(AbstractResetIntegrationTest.java:114)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.before(ResetIntegrationTest.java:55)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6109) ResetIntegrationTest may fail due to IllegalArgumentException

2017-10-23 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-6109:
-

 Summary: ResetIntegrationTest may fail due to 
IllegalArgumentException
 Key: KAFKA-6109
 URL: https://issues.apache.org/jira/browse/KAFKA-6109
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


From https://builds.apache.org/job/kafka-trunk-jdk7/2918 :
{code}
org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
java.lang.IllegalArgumentException: Setting the time to 1508791687000 while 
current time 1508791687475 is newer; this is not allowed
at 
org.apache.kafka.common.utils.MockTime.setCurrentTimeMs(MockTime.java:81)
at 
org.apache.kafka.streams.integration.AbstractResetIntegrationTest.beforePrepareTest(AbstractResetIntegrationTest.java:114)
at 
org.apache.kafka.streams.integration.ResetIntegrationTest.before(ResetIntegrationTest.java:55)
{code}
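
Per the stack trace, the guard lives in {{MockTime#setCurrentTimeMs}}: the mock 
clock may only move forward. A minimal reproduction sketch (assuming the no-arg 
MockTime constructor):

{code}
import org.apache.kafka.common.utils.MockTime

val time = new MockTime()
time.setCurrentTimeMs(time.milliseconds() + 1000) // moving forward is fine
// Moving backward is what the failing test hit:
// time.setCurrentTimeMs(time.milliseconds() - 1) // => IllegalArgumentException
{code}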



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6108) Synchronizing on commits and StandbyTasks can be improved

2017-10-23 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-6108:
-
Description: 
In Kafka Streams, we use an optimization that allows us to reuse a source topic 
as changelog topic (and thus, avoid unnecessary data duplication) if we read a 
topic directly as {{KTable}}. To guarantee that {{StandbyTasks}} provide a 
correct state, we need to synchronize the read progress of the {{StandbyTasks}} 
with the processing progress of the main {{StreamTask}} --- otherwise, the 
{{StandbyTasks}} might restore state too far into the future. For this, we 
limit the allowed restore offsets of the {{StandbyTasks}} to be no larger than 
the committed offsets of the {{StreamTask}}.

Furthermore, we buffer all data returned by the restore consumer that is beyond 
the allowed restore-offsets in memory.

To achieve both goals, we regularly update the max allowed restore offsets 
(this is done internally within the task) and we also use a flag 
{{processStandbyRecords}} within {{StreamThread}} so that we do not call 
{{poll()}} on the restore consumer if our in-memory buffer already has data 
beyond the allowed max restore offsets.

We should consider:
 - unifying both places in the code and putting the whole logic into a single 
place (suggestion is to use the {{StreamThread}}; a task does not need to know 
about this optimization)
 - feeding only the data that the task is allowed to restore into the task 
(instead of everything)

  was:
In Kafka Streams, we use an optimization that allows us to reuse a source topic 
as changelog topic (and thus, avoid unnecessary data duplication) if we read a 
topic directly as {{KTable}}. To guarantee that {{StandbyTasks}} provide a 
correct state, we need to synchronize the read progress of the {{StandbyTasks}} 
with the processing progress of the main {{StreamTask}} --- otherwise, the 
{{StandbyTasks}} might restore state too much into the future. For this, we 
limit the allowed restore offsets of the {{StandbyTasks}} to be not larger than 
the committed offsets of the {{StreamTask}}.

Furthermore, we buffer all data returned by the restore consumer that is beyond 
the allowed restore-offsets in-memory.

To achieve both goals, we regularly update the max allowed restore offsets 
(this is done task internally) and we also use a flag {{processStandbyRecords}} 
within {{StreamThread}} with the purpose to not call {{poll()}} on the restore 
consumer if our in-memory buffer has already data beyond the allowed max 
restore offsets.

We should consider:
 - unify both places in the code and put the whole logic into a single place 
(suggestion is to use the {{StreamThread}} -- a tasks, does not need to know 
about this optimization)
 - feed only those data into the task, that the task is allowed to restore 
(instead of everything)


> Synchronizing on commits and StandbyTasks can be improved
> -
>
> Key: KAFKA-6108
> URL: https://issues.apache.org/jira/browse/KAFKA-6108
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>
> In Kafka Streams, we use an optimization that allows us to reuse a source 
> topic as changelog topic (and thus, avoid unnecessary data duplication) if we 
> read a topic directly as {{KTable}}. To guarantee that {{StandbyTasks}} 
> provide a correct state, we need to synchronize the read progress of the 
> {{StandbyTasks}} with the processing progress of the main {{StreamTask}} --- 
> otherwise, the {{StandbyTasks}} might restore state too far into the future. 
> For this, we limit the allowed restore offsets of the {{StandbyTasks}} to be 
> no larger than the committed offsets of the {{StreamTask}}.
> Furthermore, we buffer all data returned by the restore consumer that is 
> beyond the allowed restore-offsets in memory.
> To achieve both goals, we regularly update the max allowed restore offsets 
> (this is done internally within the task) and we also use a flag 
> {{processStandbyRecords}} within {{StreamThread}} so that we do not call 
> {{poll()}} on the restore consumer if our in-memory buffer already has 
> data beyond the allowed max restore offsets.
> We should consider:
>  - unifying both places in the code and putting the whole logic into a single 
> place (suggestion is to use the {{StreamThread}}; a task does not need to know 
> about this optimization)
>  - feeding only the data that the task is allowed to restore into the task 
> (instead of everything)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6108) Synchronizing on commits and StandbyTasks can be improved

2017-10-23 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-6108:
--

 Summary: Synchronizing on commits and StandbyTasks can be improved
 Key: KAFKA-6108
 URL: https://issues.apache.org/jira/browse/KAFKA-6108
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.0.0
Reporter: Matthias J. Sax


In Kafka Streams, we use an optimization that allows us to reuse a source topic 
as changelog topic (and thus, avoid unnecessary data duplication) if we read a 
topic directly as {{KTable}}. To guarantee that {{StandbyTasks}} provide a 
correct state, we need to synchronize the read progress of the {{StandbyTasks}} 
with the processing progress of the main {{StreamTask}} --- otherwise, the 
{{StandbyTasks}} might restore state too far into the future. For this, we 
limit the allowed restore offsets of the {{StandbyTasks}} to be no larger than 
the committed offsets of the {{StreamTask}}.

Furthermore, we buffer all data returned by the restore consumer that is beyond 
the allowed restore-offsets in memory.

To achieve both goals, we regularly update the max allowed restore offsets 
(this is done internally within the task) and we also use a flag 
{{processStandbyRecords}} within {{StreamThread}} so that we do not call 
{{poll()}} on the restore consumer if our in-memory buffer already has data 
beyond the allowed max restore offsets.

We should consider:
 - unifying both places in the code and putting the whole logic into a single 
place (suggestion is to use the {{StreamThread}}; a task does not need to know 
about this optimization)
 - feeding only the data that the task is allowed to restore into the task 
(instead of everything)
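
A minimal sketch of the synchronization idea, with assumed names rather than 
the actual StandbyTask internals: cap how far a standby may restore at the 
active task's committed offset, and buffer everything beyond the cap.

{code}
import scala.collection.mutable

class StandbyRestoreLimiter {
  private val buffered = mutable.Queue.empty[(Long, Array[Byte])] // (offset, record)

  // Apply records up to the committed offset of the active StreamTask; keep
  // the rest in memory to replay once the committed offset advances.
  def restore(records: Seq[(Long, Array[Byte])], committedOffset: Long,
              applyRecord: (Long, Array[Byte]) => Unit): Unit = {
    val (allowed, beyond) = records.partition { case (offset, _) => offset <= committedOffset }
    allowed.foreach { case (offset, record) => applyRecord(offset, record) }
    buffered ++= beyond
  }
}
{code}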



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5140) Flaky ResetIntegrationTest

2017-10-23 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5140.
--
   Resolution: Fixed
Fix Version/s: (was: 0.11.0.0)
   0.11.0.2
   1.0.0

Issue resolved by pull request 4095
[https://github.com/apache/kafka/pull/4095]

> Flaky ResetIntegrationTest
> --
>
> Key: KAFKA-5140
> URL: https://issues.apache.org/jira/browse/KAFKA-5140
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
> Fix For: 1.0.0, 0.11.0.2
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
> KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), 
> KeyValue(2986681642155, 4), KeyValue(2986681642175, 3), 
> KeyValue(2986681642195, 3)]>
>  but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5140) Flaky ResetIntegrationTest

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215724#comment-16215724
 ] 

ASF GitHub Bot commented on KAFKA-5140:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4095


> Flaky ResetIntegrationTest
> --
>
> Key: KAFKA-5140
> URL: https://issues.apache.org/jira/browse/KAFKA-5140
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 0.10.2.0
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
> Fix For: 0.11.0.0
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3), 
> KeyValue(2986681642175, 3), KeyValue(2986681642195, 2), 
> KeyValue(2986681642155, 4), KeyValue(2986681642175, 3), 
> KeyValue(2986681642195, 3)]>
>  but: was <[KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642075, 1), KeyValue(2986681642035, 1), 
> KeyValue(2986681642095, 1), KeyValue(2986681642055, 1), 
> KeyValue(2986681642115, 1), KeyValue(2986681642075, 1), 
> KeyValue(2986681642075, 2), KeyValue(2986681642095, 2), 
> KeyValue(2986681642115, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642095, 2), KeyValue(2986681642115, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642135, 1), 
> KeyValue(2986681642115, 2), KeyValue(2986681642135, 2), 
> KeyValue(2986681642155, 1), KeyValue(2986681642175, 1), 
> KeyValue(2986681642135, 2), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 1), KeyValue(2986681642195, 1), 
> KeyValue(2986681642135, 3), KeyValue(2986681642155, 2), 
> KeyValue(2986681642175, 2), KeyValue(2986681642195, 1), 
> KeyValue(2986681642155, 3), KeyValue(2986681642175, 2), 
> KeyValue(2986681642195, 2), KeyValue(2986681642155, 3)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6107) SCRAM user add fails if Kafka has never been started

2017-10-23 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-6107:
--

 Summary: SCRAM user add fails if Kafka has never been started
 Key: KAFKA-6107
 URL: https://issues.apache.org/jira/browse/KAFKA-6107
 Project: Kafka
  Issue Type: Bug
  Components: tools, zkclient
Affects Versions: 0.11.0.0
Reporter: Dustin Cote
Priority: Minor


When trying to add a SCRAM user in ZooKeeper without ever having started 
Kafka, the kafka-configs tool does not handle it well. This is a common use 
case, because starting a new cluster that uses SCRAM for inter-broker 
communication will generally run into this problem. Today, the workaround is 
to start Kafka, add the user, then restart Kafka. Here's how to reproduce:

1) Start ZooKeeper
2) Run 
{code}
bin/kafka-configs --zookeeper localhost:2181 --alter --add-config 
'SCRAM-SHA-256=[iterations=8192,password=broker_pwd],SCRAM-SHA-512=[password=broker_pwd]'
 --entity-type users --entity-name broker
{code}

This will result in:
{code}
bin/kafka-configs --zookeeper localhost:2181 --alter --add-config 
'SCRAM-SHA-256=[iterations=8192,password=broker_pwd],SCRAM-SHA-512=[password=broker_pwd]'
 --entity-type users --entity-name broker
Error while executing config command 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /config/changes/config_change_
org.I0Itec.zkclient.exception.ZkNoNodeException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for /config/changes/config_change_
at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1001)
at org.I0Itec.zkclient.ZkClient.create(ZkClient.java:528)
at 
org.I0Itec.zkclient.ZkClient.createPersistentSequential(ZkClient.java:444)
at kafka.utils.ZkPath.createPersistentSequential(ZkUtils.scala:1045)
at kafka.utils.ZkUtils.createSequentialPersistentPath(ZkUtils.scala:527)
at 
kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$changeEntityConfig(AdminUtils.scala:600)
at 
kafka.admin.AdminUtils$.changeUserOrUserClientIdConfig(AdminUtils.scala:551)
at kafka.admin.AdminUtilities$class.changeConfigs(AdminUtils.scala:63)
at kafka.admin.AdminUtils$.changeConfigs(AdminUtils.scala:72)
at kafka.admin.ConfigCommand$.alterConfig(ConfigCommand.scala:101)
at kafka.admin.ConfigCommand$.main(ConfigCommand.scala:68)
at kafka.admin.ConfigCommand.main(ConfigCommand.scala)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /config/changes/config_change_
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.I0Itec.zkclient.ZkConnection.create(ZkConnection.java:100)
at org.I0Itec.zkclient.ZkClient$3.call(ZkClient.java:531)
at org.I0Itec.zkclient.ZkClient$3.call(ZkClient.java:528)
at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:991)
... 11 more
{code}

The command does not fail outright, but it throws an exception and returns an 
error.
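
A hedged sketch of one fix direction, assuming the failure is only the missing 
parent znode ({{makeSurePersistentPathExists}} is an existing ZkUtils helper; 
the wrapper name here is hypothetical):

{code}
// Ensure the parent path exists before creating the sequential change node.
def changeEntityConfigSafely(zkUtils: kafka.utils.ZkUtils, json: String): String = {
  zkUtils.makeSurePersistentPathExists("/config/changes")
  zkUtils.createSequentialPersistentPath("/config/changes/config_change_", json)
}
{code}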



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6100) Streams quick start crashes Java on Windows

2017-10-23 Thread Vahid Hashemian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vahid Hashemian updated KAFKA-6100:
---
Attachment: java.exe_171023_115335.dmp.zip

Attached a crash dump created with {{procdump}}.

Here is some basic info from this dump:
{code}
DUMP_CLASS: 2

DUMP_QUALIFIER: 400

CONTEXT:  (.ecxr)
rax=0001 rbx= rcx=0005
rdx= rsi=18c3b428 rdi=00928640
rip=7ffc0b39d658 rsp=18c3b1e0 rbp=0010
 r8=  r9= r10=009359d0
r11= r12=7ffc0b339620 r13=7ffc0b339560
r14= r15=0080
iopl=0 nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b efl=0202
ucrtbase!invoke_watson+0x18:
7ffc`0b39d658 cd29  int 29h
Resetting default scope

FAULTING_IP: 
ucrtbase!invoke_watson+18
7ffc`0b39d658 cd29  int 29h

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 7ffc0b39d658 (ucrtbase!invoke_watson+0x0018)
   ExceptionCode: c409 (Security check failure or stack buffer overrun)
  ExceptionFlags: 0001
NumberParameters: 1
   Parameter[0]: 0005
Subcode: 0x5 FAST_FAIL_INVALID_ARG

DEFAULT_BUCKET_ID:  FAIL_FAST_INVALID_ARG

PROCESS_NAME:  java.exe

ERROR_CODE: (NTSTATUS) 0xc409 - The system detected an overrun of a 
stack-based buffer in this application. This overrun could potentially allow a 
malicious user to gain control of this application.

EXCEPTION_CODE: (NTSTATUS) 0xc409 - The system detected an overrun of a 
stack-based buffer in this application. This overrun could potentially allow a 
malicious user to gain control of this application.

EXCEPTION_CODE_STR:  c409

EXCEPTION_PARAMETER1:  0005

WATSON_BKT_PROCSTAMP:  59ba508a

WATSON_BKT_PROCVER:  8.0.1520.16

PROCESS_VER_PRODUCT:  Java(TM) Platform SE 8

WATSON_BKT_MODULE:  ucrtbase.dll

WATSON_BKT_MODSTAMP:  59bf2b6f

WATSON_BKT_MODOFFSET:  6d658

WATSON_BKT_MODVER:  6.2.14393.1770

MODULE_VER_PRODUCT:  Microsoft® Windows® Operating System

BUILD_VERSION_STRING:  10.0.14393.1198 (rs1_release_sec.170427-1353)

MODLIST_WITH_TSCHKSUM_HASH:  a875db61e6293693921cd0f58006b89f200dd909

MODLIST_SHA1_HASH:  81bf4e1fbb4ade1b9d312304478f0499566307cf

COMMENT:  
*** "C:\Users\User\Downloads\Procdump\procdump64.exe" -accepteula -ma -j 
"c:\tmp\dumps" 6088 520 0247
*** Just-In-Time debugger. PID: 6088 Event Handle: 520 JIT Context: .jdinfo 
0x247

NTGLOBALFLAG:  0

PROCESS_BAM_CURRENT_THROTTLED: 0

PROCESS_BAM_PREVIOUS_THROTTLED: 0

APPLICATION_VERIFIER_FLAGS:  0

PRODUCT_TYPE:  1

SUITE_MASK:  272

DUMP_FLAGS:  8000c07

DUMP_TYPE:  3

ANALYSIS_SESSION_HOST:  WINDEV1610EVAL

ANALYSIS_SESSION_TIME:  10-23-2017 11:57:06.0320

ANALYSIS_VERSION: 10.0.16299.15 x86fre

THREAD_ATTRIBUTES: 
OS_LOCALE:  ENU

PROBLEM_CLASSES: 

ID: [0n270]
Type:   [FAIL_FAST]
Class:  Primary
Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
BUCKET_ID
Name:   Add
Data:   Omit
PID:[Unspecified]
TID:[Unspecified]
Frame:  [0]

ID: [0n257]
Type:   [INVALID_ARG]
Class:  Addendum
Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
BUCKET_ID
Name:   Add
Data:   Omit
PID:[Unspecified]
TID:[Unspecified]
Frame:  [0]

BUGCHECK_STR:  FAIL_FAST_INVALID_ARG

PRIMARY_PROBLEM_CLASS:  FAIL_FAST

LAST_CONTROL_TRANSFER:  from 7ffc0b39d521 to 7ffc0b39d658

STACK_TEXT:  
`18c3b1e0 7ffc`0b39d521 : ` `0010 
`18c3b428 7ffc`0b33be21 : ucrtbase!invoke_watson+0x18
`18c3b210 7ffc`0b39d5f9 : ` 7ffc`0b33a63d 
` `18c3b428 : ucrtbase!invalid_parameter+0x81
`18c3b250 7ffc`0b39751d : `0080 ` 
`0010 `18c3b428 : ucrtbase!invalid_parameter_noinfo+0x9
`18c3b290 7ffb`f18fd150 : `0004 `18c3b428 
`00935b70 `0098 : ucrtbase!aligned_offset_malloc_base+0xa1
`18c3b2c0 7ffb`f18fd082 : `00935b70 `00935b60 
`18c3b4f9 `18c3b428 : 
librocksdbjni4615894589067161782!Java_org_rocksdb_WriteBatchWithIndex_setSavePoint0+0x26120
`18c3b330 7ffb`f18fe909 : `00935b60 `18c3b448 
` `0008 : 
librocksdbjni4615894589067161782!Java_org_rocksdb_WriteBatchWithIndex_setSavePoint0+0x26052
`18c3b3a0 7ffb`f194f19f : `0094ee90 `0080 
7ffb`0004 `0094 : 
librocksdbjni4615894589067161782!Java_org_rocksdb_WriteBatchWithIndex_setSavePoint0+0x278d9
`18c3b410 7ffb`f190db83 : `00936560 7ffb`f1c26140 
`00931790 `00936560 : 

[jira] [Commented] (KAFKA-6106) Postpone normal processing of tasks within a thread until restoration of all tasks have completed

2017-10-23 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215564#comment-16215564
 ] 

Matthias J. Sax commented on KAFKA-6106:


IMHO, both scenarios (blocking all processing to speed up restoration, as well 
as starting partial processing asap) are valuable, and we should give users a 
config to choose between the two behaviors. This would require a KIP.

> Postpone normal processing of tasks within a thread until restoration of all 
> tasks have completed
> -
>
> Key: KAFKA-6106
> URL: https://issues.apache.org/jira/browse/KAFKA-6106
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.11.0.1, 1.0.0
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>
> Let's say a stream thread hosts multiple tasks, A and B. At the very 
> beginning, when A and B are assigned to the thread, the thread state is 
> {{TASKS_ASSIGNED}}, and the thread starts restoring these two tasks during 
> this state using the restore consumer, while using the normal consumer for 
> heartbeating.
> If task A's restoration completes earlier than task B's, the thread 
> will start processing A immediately, even while it is still in the 
> {{TASKS_ASSIGNED}} phase. But processing task A will slow down restoration of 
> task B, since the thread is single-threaded. So the thread's transition to 
> {{RUNNING}}, which happens when all of its assigned tasks have completed 
> restoring and can be processed, will be delayed.
> Note that the streams instance's state will only transit to {{RUNNING}} when 
> all of its threads have transited to {{RUNNING}}, so the instance's transition 
> will also be delayed by this scenario.
> It would be better not to start processing ready tasks immediately, but 
> instead focus on restoration during the {{TASKS_ASSIGNED}} state to shorten 
> the overall time of the instance's state transition.
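
A tiny sketch of the proposed gating, with assumed names rather than the actual 
Streams internals:

{code}
trait RestorableTask { def restorationCompleted: Boolean }

// Stay in TASKS_ASSIGNED (restore only, no normal processing) until every
// assigned task has finished restoring; only then transit to RUNNING.
def canTransitToRunning(tasks: Iterable[RestorableTask]): Boolean =
  tasks.forall(_.restorationCompleted)
{code}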



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6100) Streams quick start crashes Java on Windows

2017-10-23 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215490#comment-16215490
 ] 

Guozhang Wang commented on KAFKA-6100:
--

[~vahid] If you could paste the stack trace as Ted suggested, we could 
coordinate with the RocksDB community on tracking this issue.

> Streams quick start crashes Java on Windows 
> 
>
> Key: KAFKA-6100
> URL: https://issues.apache.org/jira/browse/KAFKA-6100
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
> Environment: Windows 10 VM
>Reporter: Vahid Hashemian
> Attachments: Screen Shot 2017-10-20 at 11.53.14 AM.png
>
>
> *This issue was detected in 1.0.0 RC2.*
> The following step in streams quick start crashes Java on Windows 10:
> {{bin/kafka-run-class.sh 
> org.apache.kafka.streams.examples.wordcount.WordCountDemo}}
> I tracked this down to [this 
> change|https://github.com/apache/kafka/commit/196bcfca0c56420793f85514d1602bde564b0651#diff-6512f838e273b79676cac5f72456127fR67],
>  and it seems the new version of RocksDB is to blame. I tried the quick start 
> with the previous version of RocksDB (5.7.3) and did not run into this issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6100) Streams quick start crashes Java on Windows

2017-10-23 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215481#comment-16215481
 ] 

Guozhang Wang commented on KAFKA-6100:
--

Pasting the related information from the mailing list:

{code}
org.apache.kafka.streams.examples.wordcount.WordCountDemo
constantly crashed Java for me, with the error: "An unhandled win32
exception occurred in java.exe".

This is what shows up in the broker log before the crash:
[2017-10-17 15:09:13,948] INFO [GroupCoordinator 0]: Preparing to
rebalance group streams-wordcount with old generation 0
(__consumer_offsets-35) (kafka.coordinator.group.GroupCoordinator)
[2017-10-17 15:09:13,964] INFO [GroupCoordinator 0]: Stabilized group
streams-wordcount generation 1 (__consumer_offsets-35)
(kafka.coordinator.group.GroupCoordinator)
[2017-10-17 15:09:14,214] INFO [GroupCoordinator 0]: Assignment received
from leader for group streams-wordcount for generation 1
(kafka.coordinator.group.GroupCoordinator)
{code}

> Streams quick start crashes Java on Windows 
> 
>
> Key: KAFKA-6100
> URL: https://issues.apache.org/jira/browse/KAFKA-6100
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
> Environment: Windows 10 VM
>Reporter: Vahid Hashemian
> Attachments: Screen Shot 2017-10-20 at 11.53.14 AM.png
>
>
> *This issue was detected in 1.0.0 RC2.*
> The following step in streams quick start crashes Java on Windows 10:
> {{bin/kafka-run-class.sh 
> org.apache.kafka.streams.examples.wordcount.WordCountDemo}}
> I tracked this down to [this 
> change|https://github.com/apache/kafka/commit/196bcfca0c56420793f85514d1602bde564b0651#diff-6512f838e273b79676cac5f72456127fR67],
>  and it seems the new version of RocksDB is to blame. I tried the quick start 
> with the previous version of RocksDB (5.7.3) and did not run into this issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-5846) Use singleton NoOpConsumerRebalanceListener in subscribe() call where listener is not specified

2017-10-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191405#comment-16191405
 ] 

Ted Yu edited comment on KAFKA-5846 at 10/23/17 5:21 PM:
-

Patch looks good.


was (Author: yuzhih...@gmail.com):
Patch looks good.

> Use singleton NoOpConsumerRebalanceListener in subscribe() call where 
> listener is not specified
> ---
>
> Key: KAFKA-5846
> URL: https://issues.apache.org/jira/browse/KAFKA-5846
> Project: Kafka
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Kamal Chandraprakash
>Priority: Minor
>
> Currently KafkaConsumer creates an instance of NoOpConsumerRebalanceListener 
> for each subscribe() call where a ConsumerRebalanceListener is not specified:
> {code}
> public void subscribe(Pattern pattern) {
> subscribe(pattern, new NoOpConsumerRebalanceListener());
> {code}
> We can create a singleton NoOpConsumerRebalanceListener to be used in such 
> scenarios.
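
The proposed change amounts to one shared instance; a sketch in Scala form (the 
consumer itself is Java, where a static final field plays the same role):

{code}
import org.apache.kafka.clients.consumer.internals.NoOpConsumerRebalanceListener

// One shared, stateless listener instead of a fresh allocation per subscribe().
object NoOpListener {
  val Instance = new NoOpConsumerRebalanceListener()
}
// subscribe(pattern) would then delegate as:
//   subscribe(pattern, NoOpListener.Instance)
{code}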



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-6070) ducker-ak: add ipaddress and enum34 dependencies to docker image

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6070.

   Resolution: Fixed
Fix Version/s: 1.1.0
   1.0.0

> ducker-ak: add ipaddress and enum34 dependencies to docker image
> 
>
> Key: KAFKA-6070
> URL: https://issues.apache.org/jira/browse/KAFKA-6070
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 1.0.0, 1.1.0
>
>
> ducker-ak: add ipaddress and enum34 dependencies to docker image



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-4991) KerberosLogin#login should probably be synchronized

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4991:
---
Fix Version/s: 1.1.0

> KerberosLogin#login should probably be synchronized
> ---
>
> Key: KAFKA-4991
> URL: https://issues.apache.org/jira/browse/KAFKA-4991
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>  Labels: newbie
> Fix For: 1.1.0
>
>
> KerberosLogin#login should probably be synchronized, since it is modifying 
> {{loginContext}} and {{lastLogin}}, which are normally only accessed under 
> the lock.
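
A simplified sketch of the concern (names abbreviated; not the actual 
KerberosLogin code):

{code}
// login() mutates state that other methods read while holding the lock, so it
// should synchronize on the same monitor.
class LoginLike {
  private val lock = new Object
  private var lastLogin: Long = 0L

  def login(): Unit = lock.synchronized {
    lastLogin = System.currentTimeMillis() // now mutated under the lock
  }

  def lastLoginMs: Long = lock.synchronized { lastLogin }
}
{code}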



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6070) ducker-ak: add ipaddress and enum34 dependencies to docker image

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215317#comment-16215317
 ] 

ASF GitHub Bot commented on KAFKA-6070:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4084


> ducker-ak: add ipaddress and enum34 dependencies to docker image
> 
>
> Key: KAFKA-6070
> URL: https://issues.apache.org/jira/browse/KAFKA-6070
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>
> ducker-ak: add ipaddress and enum34 dependencies to docker image



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-4991) KerberosLogin#login should probably be synchronized

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4991:
---
Labels: newbie  (was: )

> KerberosLogin#login should probably be synchronized
> ---
>
> Key: KAFKA-4991
> URL: https://issues.apache.org/jira/browse/KAFKA-4991
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>  Labels: newbie
> Fix For: 1.1.0
>
>
> KerberosLogin#login should probably be synchronized, since it is modifying 
> {{loginContext}} and {{lastLogin}}, which are normally only accessed under 
> the lock.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5743) All ducktape services should store their files in subdirectories of /mnt

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-5743.

   Resolution: Fixed
Fix Version/s: 1.0.0

> All ducktape services should store their files in subdirectories of /mnt
> 
>
> Key: KAFKA-5743
> URL: https://issues.apache.org/jira/browse/KAFKA-5743
> Project: Kafka
>  Issue Type: Improvement
>  Components: system tests
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
> Fix For: 1.0.0
>
>
> Currently, some ducktape services like KafkaService store their files 
> directly in /mnt.  This means that cleanup involves running {{rm -rf 
> /mnt/*}}.  It would be better if services stored their files in 
> subdirectories of /mnt.  For example, KafkaService could store its files in 
> /mnt/kafka.  This would make cleanup simpler and avoid the need to remove all 
> of /mnt.  It would also make running multiple services on the same node 
> possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-6101.

Resolution: Fixed
  Assignee: Ted Yu

> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
>Assignee: Ted Yu
> Fix For: 1.0.0, 1.1.0
>
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.
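
For context, the reconnect delay is normally derived from the per-node failure count, roughly as in the hedged sketch below (an illustration of the general scheme, not the exact NetworkClient code; the names and exact formula are assumptions). With failedAttempts pinned at 0, the computed delay never rises above the base value, which matches the flood described above:

    // Illustrative only; names and the precise formula are assumptions.
    public final class ReconnectBackoffSketch {
        static long delayMs(long baseMs, long maxMs, int failedAttempts) {
            // Exponential growth in the consecutive-failure count, capped at maxMs.
            double exp = baseMs * Math.pow(2, failedAttempts);
            return (long) Math.min(exp, maxMs);
        }

        public static void main(String[] args) {
            // With failedAttempts stuck at 0, every delay is just baseMs (50 ms here).
            for (int attempts = 0; attempts <= 5; attempts++)
                System.out.println(attempts + " failures -> " + delayMs(50, 60000, attempts) + " ms");
        }
    }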



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-6101:
---
Fix Version/s: 1.1.0

> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
> Fix For: 1.0.0, 1.1.0
>
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.
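
For completeness, the two client settings involved are reconnect.backoff.ms (the base delay) and reconnect.backoff.max.ms (the cap on the exponential delay, added in 0.11.0.0). A hypothetical client configuration raising the cap, with placeholder values, might look like:

    import java.util.Properties;

    // Hypothetical client configuration; all values are placeholders.
    public class BackoffConfigSketch {
        public static Properties build() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("reconnect.backoff.ms", "50");        // initial reconnect delay
            props.put("reconnect.backoff.max.ms", "60000"); // upper bound for the exponential delay
            return props;
        }
    }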



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-6101:
---
Fix Version/s: 1.0.0

> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
> Fix For: 1.0.0, 1.1.0
>
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215283#comment-16215283
 ] 

ASF GitHub Bot commented on KAFKA-6101:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4118


> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215160#comment-16215160
 ] 

ASF GitHub Bot commented on KAFKA-6101:
---

GitHub user tedyu opened a pull request:

https://github.com/apache/kafka/pull/4118

KAFKA-6101 Reconnecting to broker does not exponentially backoff



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tedyu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4118


commit cd928d3867e42774bb167b3aaf11ca6a8dd8d48f
Author: tedyu 
Date:   2017-10-23T13:49:27Z

KAFKA-6101 Reconnecting to broker does not exponentially backoff




> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6101) Reconnecting to broker does not exponentially backoff

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215148#comment-16215148
 ] 

ASF GitHub Bot commented on KAFKA-6101:
---

Github user tedyu closed the pull request at:

https://github.com/apache/kafka/pull/4108


> Reconnecting to broker does not exponentially backoff
> -
>
> Key: KAFKA-6101
> URL: https://issues.apache.org/jira/browse/KAFKA-6101
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.11.0.0
>Reporter: Sean Rohead
> Attachments: 6101.v2.txt, 6101.v3.txt, text.html
>
>
> I am using com.typesafe.akka:akka-stream-kafka:0.17, which relies on 
> kafka-clients:0.11.0.0.
> I have set the reconnect.backoff.max.ms property to 6.
> When I start the application without kafka running, I see a flood of the 
> following log message:
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be 
> established. Broker may not be available.
> The log messages occur several times a second, and their frequency does not 
> decrease over time as would be expected if exponential backoff were working 
> properly.
> I set a breakpoint in the debugger at ClusterConnectionStates:188 and noticed 
> that every time this breakpoint is hit, nodeState.failedAttempts is always 0. 
> This is why the delay does not increase exponentially. It also appears that 
> the breakpoint is hit on a different instance every time, so even though 
> failedAttempts is incremented, we never hit the breakpoint for the same 
> instance more than once.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6042) Kafka Request Handler deadlocks and brings down the cluster.

2017-10-23 Thread Ben Corlett (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215143#comment-16215143
 ] 

Ben Corlett commented on KAFKA-6042:


I've deployed 1.0.0-SNAPSHOT to broker 25. I'll let you know how we get on.

> Kafka Request Handler deadlocks and brings down the cluster.
> 
>
> Key: KAFKA-6042
> URL: https://issues.apache.org/jira/browse/KAFKA-6042
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.0, 0.11.0.1, 1.0.0
> Environment: kafka version: 0.11.0.1
> client versions: 0.8.2.1-0.10.2.1
> platform: aws (eu-west-1a)
> nodes: 36 x r4.xlarge
> disk storage: 2.5 tb per node (~73% usage per node)
> topics: 250
> number of partitions: 48k (approx)
> os: ubuntu 14.04
> jvm: Java(TM) SE Runtime Environment (build 1.8.0_131-b11), Java HotSpot(TM) 
> 64-Bit Server VM (build 25.131-b11, mixed mode)
>Reporter: Ben Corlett
>Assignee: Rajini Sivaram
>Priority: Blocker
> Fix For: 1.0.0
>
> Attachments: thread_dump.txt.gz
>
>
> We have been experiencing a deadlock that happens on one particular server 
> within our cluster, currently multiple times a week. It first started 
> happening when we upgraded to 0.11.0.0. Sadly, 0.11.0.1 failed to resolve 
> the issue.
> Sequence of events:
> At a seemingly random time broker 125 goes into a deadlock. As soon as it is 
> deadlocked it removes all the ISRs for any partition it is the leader 
> for.
> [2017-10-10 00:06:10,061] INFO Partition [XX,24] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,073] INFO Partition [XX,974] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,079] INFO Partition [XX,64] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,081] INFO Partition [XX,21] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,084] INFO Partition [XX,12] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,085] INFO Partition [XX,61] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,086] INFO Partition [XX,53] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,088] INFO Partition [XX,27] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,090] INFO Partition [XX,182] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> [2017-10-10 00:06:10,091] INFO Partition [XX,16] on broker 125: 
> Shrinking ISR from 117,125 to 125 (kafka.cluster.Partition)
> 
> The other nodes fail to connect to node 125:
> [2017-10-10 00:08:42,318] WARN [ReplicaFetcherThread-0-125]: Error in fetch 
> to broker 125, request (type=FetchRequest, replicaId=101, maxWait=500, 
> minBytes=1, maxBytes=10485760, fetchData={XX-94=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-22=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-58=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-11=(offset=78932482, 
> logStartOffset=50881481, maxBytes=1048576), XX-55=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-19=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-91=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-5=(offset=903857106, 
> logStartOffset=0, maxBytes=1048576), XX-80=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-88=(offset=0, 
> logStartOffset=0, maxBytes=1048576), XX-34=(offset=308, 
> logStartOffset=308, maxBytes=1048576), XX-7=(offset=369990, 
> logStartOffset=369990, maxBytes=1048576), XX-0=(offset=57965795, 
> logStartOffset=0, maxBytes=1048576)}) (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 125 was disconnected before the response 
> was read
> at 
> org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:93)
> at 
> kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:93)
> at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:207)
> at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
> at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:151)
> at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:112)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> As node 125 removed all the ISRs as it was 

[jira] [Commented] (KAFKA-5637) Document compatibility and release policies

2017-10-23 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16215132#comment-16215132
 ] 

Ismael Juma commented on KAFKA-5637:


Thanks [~sliebau]! And no rush, family comes first. :)

> Document compatibility and release policies
> ---
>
> Key: KAFKA-5637
> URL: https://issues.apache.org/jira/browse/KAFKA-5637
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Ismael Juma
>Assignee: Sönke Liebau
> Fix For: 1.1.0
>
>
> We should document our compatibility and release policies in one place so 
> that people have the correct expectations. This is generally important, but 
> more so now that we are releasing 1.0.0.
> I extracted the following topics from the mailing list thread as the ones 
> that should be documented as a minimum: 
> *Code stability*
> * Explanation of stability annotations and their implications
> * Explanation of what public apis are
> * *Discussion point:* Do we want to keep the _unstable_ annotation, or is 
> _evolving_ sufficient going forward?
> *Support duration*
> * How long are versions supported?
> * How far are bugfixes backported?
> * How far are security fixes backported?
> * How long are protocol versions supported by subsequent code versions?
> * How long are older clients supported?
> * How long are older brokers supported?
> I will create an initial pull request to add a section to the documentation 
> as a basis for further discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6097) Kafka ssl.endpoint.identification.algorithm=HTTPS not working

2017-10-23 Thread Damyan Petev Manev (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damyan Petev Manev updated KAFKA-6097:
--
Description: 
When ssl.endpoint.identification.algorithm is set to HTTPS and I have a SAN 
extension on my server certificate, clients do not verify the server's fully 
qualified domain name (FQDN) against it.
Client certificate authentication works. With the following SAN extension, 
dns:some.thing.here, I expect the connection to fail, because according to 
http://kafka.apache.org/documentation.html#security_ssl :
"clients will verify the server's fully qualified domain name (FQDN) against 
one of the following two fields
Common Name (CN)
Subject Alternative Name (SAN)",
but messages are produced and consumed successfully.

I am using the kafka 0.10.2.1 command line tools.




  was:
When ssl.endpoint.identification.algorithm is set to HTTPS and I have a SAN 
extension on my server certificate, clients do not verify the server's fully 
qualified domain name (FQDN) against it.
Client certificate authentication works. With the following SAN extension, 
dns:some.thing.here, I expect the connection to fail, because according to 
http://kafka.apache.org/documentation.html#security_ssl :
"clients will verify the server's fully qualified domain name (FQDN) against 
one of the following two fields
Common Name (CN)
Subject Alternative Name (SAN)",
but messages are produced and consumed successfully.





> Kafka ssl.endpoint.identification.algorithm=HTTPS not working
> -
>
> Key: KAFKA-6097
> URL: https://issues.apache.org/jira/browse/KAFKA-6097
> Project: Kafka
>  Issue Type: Bug
>Reporter: Damyan Petev Manev
> Attachments: kafka-certificates-script.sh
>
>
> When ssl.endpoint.identification.algorithm is set to HTTPS and I have a SAN 
> extension on my server certificate, clients do not verify the server's fully 
> qualified domain name (FQDN) against it.
> Client certificate authentication works. With the following SAN extension, 
> dns:some.thing.here, I expect the connection to fail, because according to 
> http://kafka.apache.org/documentation.html#security_ssl :
> "clients will verify the server's fully qualified domain name (FQDN) against 
> one of the following two fields
> Common Name (CN)
> Subject Alternative Name (SAN)",
> but messages are produced and consumed successfully.
> I am using the kafka 0.10.2.1 command line tools.
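
For reference, the setting under discussion is a client-side property. A minimal, hypothetical SSL client configuration enabling hostname verification (the broker address, store paths and password below are placeholders, not values from this report) could look like:

    import java.util.Properties;

    // Hypothetical client config; host, paths and password are placeholders.
    public class SslClientConfigSketch {
        public static Properties build() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker.example.com:9093");
            props.put("security.protocol", "SSL");
            props.put("ssl.truststore.location", "/path/to/client.truststore.jks");
            props.put("ssl.truststore.password", "changeit");
            // With HTTPS set here, the client is expected to verify the broker's
            // FQDN against the certificate's CN or SAN entries.
            props.put("ssl.endpoint.identification.algorithm", "HTTPS");
            return props;
        }
    }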



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214801#comment-16214801
 ] 

ASF GitHub Bot commented on KAFKA-6105:
---

GitHub user cnZach opened a pull request:

https://github.com/apache/kafka/pull/4116

KAFKA-6105: load client properties in proper order for EndToEndLatency tool

Currently, the property file is loaded first, and later an auto-generated 
group.id is used:
`consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
System.currentTimeMillis())`

so even if the user gives the group.id in a property file, it is not picked up.

Change it to load client properties in the proper order: set the default 
values first, then load any custom values from the client.properties file.
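
The intended ordering can be sketched in a few lines; this is a hypothetical Java illustration (the actual tool is Scala, and the file handling here is simplified). The generated default goes in first, then the user's file is loaded over it, so a group.id from the file wins:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    // Hypothetical illustration of the defaults-first ordering.
    public class ClientPropsSketch {
        public static Properties load(String propsFile) throws IOException {
            Properties props = new Properties();
            // 1. Default values first.
            props.put("group.id", "test-group-" + System.currentTimeMillis());
            // 2. User-supplied file second; a group.id set there overrides the default.
            try (FileInputStream in = new FileInputStream(propsFile)) {
                props.load(in);
            }
            return props;
        }
    }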

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cnZach/kafka cnZach_KAFKA-6105

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4116.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4116


commit 99b6ce136d1c4bafa0f74828583d6b4af6cb0785
Author: Yuexin Zhang 
Date:   2017-10-23T08:11:45Z

load client properties in proper order: set default values first, then try 
to load the custom values set in client.properties file




> group.id is not picked by kafka.tools.EndToEndLatency
> -
>
> Key: KAFKA-6105
> URL: https://issues.apache.org/jira/browse/KAFKA-6105
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.0
>Reporter: Yuexin Zhang
>
> As per these lines:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67
> the property file is loaded first, and later an auto-generated group.id is 
> used:
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
> System.currentTimeMillis())
> so even if the user gives the group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214772#comment-16214772
 ] 

ASF GitHub Bot commented on KAFKA-6105:
---

Github user cnZach closed the pull request at:

https://github.com/apache/kafka/pull/4115


> group.id is not picked by kafka.tools.EndToEndLatency
> -
>
> Key: KAFKA-6105
> URL: https://issues.apache.org/jira/browse/KAFKA-6105
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.0
>Reporter: Yuexin Zhang
>
> As per these lines:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67
> the property file is loaded first, and later an auto-generated group.id is 
> used:
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
> System.currentTimeMillis())
> so even if the user gives the group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5574) kafka-consumer-perf-test.sh report header has one less column in show-detailed-stats mode

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214758#comment-16214758
 ] 

ASF GitHub Bot commented on KAFKA-5574:
---

Github user cnZach closed the pull request at:

https://github.com/apache/kafka/pull/3512


> kafka-consumer-perf-test.sh report header has one less column in 
> show-detailed-stats mode
> -
>
> Key: KAFKA-5574
> URL: https://issues.apache.org/jira/browse/KAFKA-5574
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.9.0.0, 0.10.0.0
>Reporter: Yuexin Zhang
>
> time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
> 2017-07-09 21:40:40:369, 0, 0.1492, 2.6176, 5000, 87719.2982
> 2017-07-09 21:40:40:386, 0, 0.2983, 149.0479, 1, 500.
> 2017-07-09 21:40:40:387, 0, 0.4473, 149.0812, 15000, 500.
> there's one extra column between "time" and "data.consumed.in.MB"; it's 
> currently set to 0:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerPerformance.scala#L158
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerPerformance.scala#L175
> Is it a thread id? What is this id used for?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214757#comment-16214757
 ] 

ASF GitHub Bot commented on KAFKA-6105:
---

GitHub user cnZach opened a pull request:

https://github.com/apache/kafka/pull/4115

KAFKA-6105: load client properties in proper order for 
kafka.tools.EndToEndLatency

Currently, the property file is loaded first, and later an auto-generated 
group.id is used:
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
System.currentTimeMillis())
so even if the user gives the group.id in a property file, it is not picked up.

We need to load client properties in the proper order, allowing the user to 
specify group.id and other properties, and excluding only the properties 
provided in the argument list.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cnZach/kafka cnZach_KAFKA-6105

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4115.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4115


commit d83f84e14c556fffddde2a74469e5be43fc99b10
Author: Yuexin Zhang 
Date:   2017-10-23T07:13:10Z

load client properties in proper order, so that we allow the user to specify 
group.id and other properties, excluding only the properties provided in the 
argument list




> group.id is not picked by kafka.tools.EndToEndLatency
> -
>
> Key: KAFKA-6105
> URL: https://issues.apache.org/jira/browse/KAFKA-6105
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.0
>Reporter: Yuexin Zhang
>
> As per these lines:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67
> the property file is loaded first, and later an auto-generated group.id is 
> used:
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
> System.currentTimeMillis())
> so even if the user gives the group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6105) group.id is not picked by kafka.tools.EndToEndLatency

2017-10-23 Thread Yuexin Zhang (JIRA)
Yuexin Zhang created KAFKA-6105:
---

 Summary: group.id is not picked by kafka.tools.EndToEndLatency
 Key: KAFKA-6105
 URL: https://issues.apache.org/jira/browse/KAFKA-6105
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 0.11.0.0
Reporter: Yuexin Zhang


As per these lines:

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/EndToEndLatency.scala#L64-L67

the property file is loaded first, and later an auto-generated group.id is used:
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group-" + 
System.currentTimeMillis())

so even if the user gives the group.id in a property file, it is not picked up.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)