[jira] [Commented] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have a common KeyChangingNode

2019-07-24 Thread Hiroshi Nakahara (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892339#comment-16892339 ]

Hiroshi Nakahara commented on KAFKA-8705:
-

[~bbejeck] Thank you for working on it quickly!  I'm happy to hear about that :)

> NullPointerException was thrown by topology optimization when two MergeNodes 
> have a common KeyChangingNode
> -
>
> Key: KAFKA-8705
> URL: https://issues.apache.org/jira/browse/KAFKA-8705
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Hiroshi Nakahara
>Assignee: Bill Bejeck
>Priority: Major
>
> NullPointerException was thrown by topology optimization when two MergeNodes 
> have a common KeyChangingNode.
> Kafka Streams version: 2.3.0
> h3. Code
> {code:java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.StreamsBuilder;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.Consumed;
> import org.apache.kafka.streams.kstream.KStream;
>
> import java.util.Properties;
>
> public class Main {
>     public static void main(String[] args) {
>         final StreamsBuilder streamsBuilder = new StreamsBuilder();
>         final KStream<Integer, Integer> parentStream = streamsBuilder
>             .stream("parentTopic", Consumed.with(Serdes.Integer(), Serdes.Integer()))
>             .selectKey(Integer::sum);  // to make parentStream a key-changing node
>         final KStream<Integer, Integer> childStream1 = parentStream.mapValues(v -> v + 1);
>         final KStream<Integer, Integer> childStream2 = parentStream.mapValues(v -> v + 2);
>         final KStream<Integer, Integer> childStream3 = parentStream.mapValues(v -> v + 3);
>         childStream1
>             .merge(childStream2)
>             .merge(childStream3)
>             .to("outputTopic");
>         final Properties properties = new Properties();
>         properties.setProperty(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
>         streamsBuilder.build(properties);
>     }
> }
> {code}
> h3. Expected result
> streamsBuilder.build should create the Topology without throwing an 
> exception. The expected topology is:
> {code:java}
> Topologies:
>    Sub-topology: 0
>     Source: KSTREAM-SOURCE-0000000000 (topics: [parentTopic])
>       --> KSTREAM-KEY-SELECT-0000000001
>     Processor: KSTREAM-KEY-SELECT-0000000001 (stores: [])
>       --> KSTREAM-MAPVALUES-0000000002, KSTREAM-MAPVALUES-0000000003, KSTREAM-MAPVALUES-0000000004
>       <-- KSTREAM-SOURCE-0000000000
>     Processor: KSTREAM-MAPVALUES-0000000002 (stores: [])
>       --> KSTREAM-MERGE-0000000005
>       <-- KSTREAM-KEY-SELECT-0000000001
>     Processor: KSTREAM-MAPVALUES-0000000003 (stores: [])
>       --> KSTREAM-MERGE-0000000005
>       <-- KSTREAM-KEY-SELECT-0000000001
>     Processor: KSTREAM-MAPVALUES-0000000004 (stores: [])
>       --> KSTREAM-MERGE-0000000006
>       <-- KSTREAM-KEY-SELECT-0000000001
>     Processor: KSTREAM-MERGE-0000000005 (stores: [])
>       --> KSTREAM-MERGE-0000000006
>       <-- KSTREAM-MAPVALUES-0000000002, KSTREAM-MAPVALUES-0000000003
>     Processor: KSTREAM-MERGE-0000000006 (stores: [])
>       --> KSTREAM-SINK-0000000007
>       <-- KSTREAM-MERGE-0000000005, KSTREAM-MAPVALUES-0000000004
>     Sink: KSTREAM-SINK-0000000007 (topic: outputTopic)
>       <-- KSTREAM-MERGE-0000000006
> {code}
> h3. Actual result
> NullPointerException was thrown with the following stacktrace.
> {code:java}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>   at org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeUpdateKeyChangingRepartitionNodeMap(InternalStreamsBuilder.java:397)
>   at org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeOptimizeRepartitionOperations(InternalStreamsBuilder.java:315)
>   at org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybePerformOptimizations(InternalStreamsBuilder.java:304)
>   at org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.buildAndOptimizeTopology(InternalStreamsBuilder.java:275)
>   at org.apache.kafka.streams.StreamsBuilder.build(StreamsBuilder.java:558)
>   at Main.main(Main.java:24){code}
> h3. Cause
> This exception occurs in 
> InternalStreamsBuilder#maybeUpdateKeyChangingRepartitionNodeMap.
> {code:java}
> private void maybeUpdateKeyChangingRepartitionNodeMap() {
>     final Map<StreamsGraphNode, Set<StreamsGraphNode>> mergeNodesToKeyChangers = new HashMap<>();
>     for (final StreamsGraphNode mergeNode : mergeNodes) {
>         mergeNodesToKeyChangers.put(mergeNode, new LinkedHashSet<>());
>         final Collection<StreamsGraphNode> keys = 
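The quoted method is cut off in the archive. As a minimal, self-contained sketch of the failure pattern implied by the stack trace and by the fix's title ("remove parent node after leaving loop"); the names below are placeholders, not the actual Streams internals. Handling the first merge node removes the shared key-changing entry inside the loop, so the lookup for the second merge node yields null and addAll(null) throws.

{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MergeNodeNpeSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for the builder's key-changer bookkeeping:
        // one key-changing node shared by two merge nodes.
        final Map<String, Set<String>> keyChangerMap = new HashMap<>();
        keyChangerMap.put("KSTREAM-KEY-SELECT-0000000001",
                new LinkedHashSet<>(Collections.singleton("repartition-node")));

        final List<String> mergeNodes =
                Arrays.asList("KSTREAM-MERGE-0000000005", "KSTREAM-MERGE-0000000006");
        for (final String mergeNode : mergeNodes) {
            // Removing the shared entry inside the loop, as the pre-fix code
            // effectively did, leaves nothing for the second merge node ...
            final Set<String> repartitionNodes =
                    keyChangerMap.remove("KSTREAM-KEY-SELECT-0000000001");
            final Set<String> collected = new LinkedHashSet<>();
            // ... so addAll(null) throws on the second pass, matching the
            // AbstractCollection.addAll frame at the top of the stack trace.
            collected.addAll(repartitionNodes);
        }
    }
}
{code}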

[jira] [Commented] (KAFKA-8078) Flaky Test TableTableJoinIntegrationTest#testInnerInner

2019-07-24 Thread Boyang Chen (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892326#comment-16892326 ]

Boyang Chen commented on KAFKA-8078:


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/23646/consoleFull]

*22:05:50* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > testInnerLeft[caching enabled = true] STARTED
*22:06:32* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerLeft[caching enabled = true] failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11@2/streams/build/reports/testOutput/org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerLeft[caching enabled = true].test.stdout
*22:06:32* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > testInnerLeft[caching enabled = true] FAILED
*22:06:32* java.lang.AssertionError: Condition not met within timeout 15000. Never received expected final result.
*22:06:32*     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:376)
*22:06:32*     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:336)
*22:06:32*     at org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:255)
*22:06:32*     at org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerLeft(TableTableJoinIntegrationTest.java:244)

*22:06:32* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > testOuterInner[caching enabled = true] STARTED
*22:07:16* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterInner[caching enabled = true] failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11@2/streams/build/reports/testOutput/org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterInner[caching enabled = true].test.stdout
*22:07:16* org.apache.kafka.streams.integration.TableTableJoinIntegrationTest > testOuterInner[caching enabled = true] FAILED
*22:07:16* java.lang.AssertionError: Condition not met within timeout 15000. Never received expected final result.
*22:07:16*     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:376)
*22:07:16*     at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:336)
*22:07:16*     at org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:255)
*22:07:16*     at org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testOuterInner(TableTableJoinIntegrationTest.java:482)

> Flaky Test TableTableJoinIntegrationTest#testInnerInner
> ---
>
> Key: KAFKA-8078
> URL: https://issues.apache.org/jira/browse/KAFKA-8078
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3445/tests]
> {quote}java.lang.AssertionError: Condition not met within timeout 15000. 
> Never received expected final result.
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:365)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:325)
> at 
> org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:246)
> at 
> org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner(TableTableJoinIntegrationTest.java:196){quote}
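For context on the assertion: the failure comes from a polling test helper that retries a condition until a timeout elapses. A schematic version of such a helper (not the actual org.apache.kafka.test.TestUtils, whose signatures differ) that fails with the same "Condition not met within timeout" message:

{code:java}
import java.util.function.BooleanSupplier;

public final class WaitUtil {
    // Schematic polling helper: retry until the condition holds or the
    // timeout elapses, then fail with the message seen in the logs above.
    public static void waitForCondition(final BooleanSupplier condition,
                                        final long timeoutMs,
                                        final String message) throws InterruptedException {
        final long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError(
                        "Condition not met within timeout " + timeoutMs + ". " + message);
            }
            Thread.sleep(100);  // poll interval; a placeholder value
        }
    }
}
{code}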





[jira] [Commented] (KAFKA-7937) Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup

2019-07-24 Thread Boyang Chen (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892325#comment-16892325 ]

Boyang Chen commented on KAFKA-7937:


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/23646/consoleFull]

> Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup
> 
>
> Key: KAFKA-7937
> URL: https://issues.apache.org/jira/browse/KAFKA-7937
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, clients, unit tests
>Affects Versions: 2.2.0, 2.1.1, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Gwen Shapira
>Priority: Critical
> Fix For: 2.4.0
>
> Attachments: log-job6122.txt
>
>
> To get stable nightly builds for the `2.2` release, I am creating tickets for 
> all observed test failures.
> https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/19/pipeline
> {quote}kafka.admin.ResetConsumerGroupOffsetTest > 
> testResetOffsetsNotExistingGroup FAILED 
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
> coordinator is not available. at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
>  at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306)
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89)
>  Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.{quote}





[jira] [Commented] (KAFKA-8179) Incremental Rebalance Protocol for Kafka Consumer

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892297#comment-16892297 ]

ASF GitHub Bot commented on KAFKA-8179:
---

ableegoldman commented on pull request #7100: KAFKA-8179: KIP-429, new 
PartitionAssignor interface
URL: https://github.com/apache/kafka/pull/7100
 
 
   
 



> Incremental Rebalance Protocol for Kafka Consumer
> -
>
> Key: KAFKA-8179
> URL: https://issues.apache.org/jira/browse/KAFKA-8179
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
>
> Recently the Kafka community has been promoting cooperative rebalancing to 
> mitigate the pain points of the stop-the-world rebalancing protocol. This 
> ticket initiates that idea in the Kafka consumer client, which will benefit 
> heavily stateful consumers such as Kafka Streams applications.
> In short, the scope of this ticket includes reducing unnecessary rebalance 
> latency due to heavy partition migration, i.e. partitions being revoked and 
> re-assigned. This would make the built-in consumer assignors (range, 
> round-robin, etc.) aware of previously assigned partitions and sticky on a 
> best-effort basis.
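A minimal sketch of opting a plain consumer into the cooperative, sticky behavior this ticket describes, using the CooperativeStickyAssignor that KIP-429 eventually shipped in Kafka 2.4 (bootstrap address, group id, and topic below are placeholders):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CooperativeConsumerSketch {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Cooperative rebalancing: only the partitions that actually move are
        // revoked, instead of stop-the-world revocation of every partition.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            // poll loop elided
        }
    }
}
{code}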





[jira] [Commented] (KAFKA-8179) Incremental Rebalance Protocol for Kafka Consumer

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892296#comment-16892296 ]

ASF GitHub Bot commented on KAFKA-8179:
---

ableegoldman commented on pull request #7107: KAFKA-8179: 
PartitionAssignorAdapter for backwards compatibility
URL: https://github.com/apache/kafka/pull/7107
 
 
   
 








[jira] [Updated] (KAFKA-8703) Move PartitionAssignor to public API

2019-07-24 Thread Sophie Blee-Goldman (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman updated KAFKA-8703:
---
Description: 
Currently the PartitionAssignor, which is meant to be a pluggable interface, 
sits in the internal package. It should be part of the public API, so we are 
deprecating the old consumer.internal.PartitionAssignor in favor of a new 
consumer.ConsumerPartitionAssignor.

We also want to take the opportunity to refactor the interface a bit, so as to 
achieve an easier-to-evolve API going forward.

Due to the way assignors are instantiated, moving to a new 
ConsumerPartitionAssignor interface will be fully compatible for most users 
except those who have implemented the internal.PartitionAssignor (see 
KAFKA-8704)

  was:
Currently the PartitionAssignor, which is meant to be a pluggable interface, 
sits in the internal package. It should be part of the public API, so we are 
deprecating the old consumer.internal.PartitionAssignor in favor of a new 
consumer.PartitionAssignor.

We also want to take the opportunity to refactor the interface a bit, so as to 
achieve
 # Better separation of user/assignor and consumer provided metadata
 # Easier to evolve API

Due to the way assignors are instantiated, moving to a new PartitionAssignor 
interface will be fully compatible for most users except those who have 
implemented the internal.PartitionAssignor (see KAFKA-8704)


> Move PartitionAssignor to public API
> 
>
> Key: KAFKA-8703
> URL: https://issues.apache.org/jira/browse/KAFKA-8703
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Major
>
> Currently the PartitionAssignor, which is meant to be a pluggable interface, 
> sits in the internal package. It should be part of the public API, so we are 
> deprecating the old consumer.internal.PartitionAssignor in favor of a new 
> consumer.ConsumerPartitionAssignor.
> We also want to take the opportunity to refactor the interface a bit, so as 
> to achieve an easier-to-evolve API going forward.
> Due to the way assignors are instantiated, moving to a new 
> ConsumerPartitionAssignor interface will be fully compatible for most users 
> except those who have implemented the internal.PartitionAssignor (see 
> KAFKA-8704)
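As a sketch of what implementing the new public interface looks like, based on the API as it eventually shipped in Kafka 2.4 (the deliberately trivial assignment strategy is illustrative only, not from this ticket):

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.TopicPartition;

public class ExampleAssignor implements ConsumerPartitionAssignor {

    @Override
    public String name() {
        return "example";  // protocol name sent to the group coordinator
    }

    @Override
    public GroupAssignment assign(final Cluster metadata, final GroupSubscription groupSubscription) {
        // Deliberately trivial strategy: hand partition 0 of each subscribed
        // topic to every member. A real assignor balances across members.
        final Map<String, Assignment> assignments = new HashMap<>();
        for (final Map.Entry<String, Subscription> member
                : groupSubscription.groupSubscription().entrySet()) {
            final List<TopicPartition> partitions = new ArrayList<>();
            for (final String topic : member.getValue().topics()) {
                partitions.add(new TopicPartition(topic, 0));
            }
            assignments.put(member.getKey(), new Assignment(partitions));
        }
        return new GroupAssignment(assignments);
    }
}
{code}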





[jira] [Assigned] (KAFKA-8704) Add PartitionAssignor adapter for backwards compatibility

2019-07-24 Thread Sophie Blee-Goldman (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sophie Blee-Goldman reassigned KAFKA-8704:
--

Assignee: Sophie Blee-Goldman

> Add PartitionAssignor adapter for backwards compatibility
> -
>
> Key: KAFKA-8704
> URL: https://issues.apache.org/jira/browse/KAFKA-8704
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Major
>
> As part of KIP-429, we are deprecating the old 
> consumer.internal.PartitionAssignor in favor of a [new 
> consumer.PartitionAssignor|https://issues.apache.org/jira/browse/KAFKA-8703] 
> interface that is part of the public API.
>  
> Although the old PartitionAssignor was technically part of the internal 
> package, some users may have implemented it and this change will break source 
> compatibility for them as they would need to modify their class to implement 
> the new interface. The number of users affected may be small, but nonetheless 
> we would like to add an adapter to maintain compatibility for these users.
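Schematically, such an adapter wraps a legacy implementation behind the new interface; the types below are illustrative placeholders, not the actual Kafka classes:

{code:java}
import java.util.List;
import java.util.Map;

// Illustrative placeholder interfaces standing in for the old internal
// PartitionAssignor and the new public one.
interface LegacyAssignor {
    String name();
    Map<String, List<Integer>> assign(Map<String, List<String>> subscriptionsByMember);
}

interface NewAssignor {
    String name();
    Map<String, List<Integer>> assign(Map<String, List<String>> subscriptionsByMember);
}

// The adapter satisfies the new interface by delegating to a legacy
// implementation, so existing user classes keep working unmodified.
final class AssignorAdapter implements NewAssignor {
    private final LegacyAssignor delegate;

    AssignorAdapter(final LegacyAssignor delegate) {
        this.delegate = delegate;
    }

    @Override
    public String name() {
        return delegate.name();
    }

    @Override
    public Map<String, List<Integer>> assign(final Map<String, List<String>> subscriptionsByMember) {
        // A real adapter converts between old and new request/response types
        // here; in this schematic the shapes happen to coincide.
        return delegate.assign(subscriptionsByMember);
    }
}
{code}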





[jira] [Commented] (KAFKA-8179) Incremental Rebalance Protocol for Kafka Consumer

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892294#comment-16892294 ]

ASF GitHub Bot commented on KAFKA-8179:
---

ableegoldman commented on pull request #7110:  KAFKA-8179: 
PartitionAssignorAdapter
URL: https://github.com/apache/kafka/pull/7110
 
 
   Follow up to [new PartitionAssignor 
interface](https://issues.apache.org/jira/browse/KAFKA-8703) -- should be 
rebased after [7108](https://github.com/apache/kafka/pull/7108) is merged
   
   Adds a PartitionAssignorAdapter class to [maintain backwards 
compatibility](https://issues.apache.org/jira/browse/KAFKA-8704)
   
 








[jira] [Updated] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Loic DIVAD (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Loic DIVAD updated KAFKA-8712:
--
Description: Since Scala 2.13 has been released and KAFKA-7197 brings its 
support, the library \{{kafka-streams-scala}} could be also compiled for Scala 
2.13.   (was: Since Scala 2.13 has been released and KAFKA-7197 brings its 
support, the library {{kafka-streams-scala }}could be also compiled for Scala 
2.13. )

> Build Kafka Streams against Scala 2.13
> --
>
> Key: KAFKA-8712
> URL: https://issues.apache.org/jira/browse/KAFKA-8712
> Project: Kafka
>  Issue Type: Wish
>  Components: streams
>Reporter: Loic DIVAD
>Priority: Minor
> Fix For: 2.4.0
>
>
> Since Scala 2.13 has been released and KAFKA-7197 brings its support, the 
> library \{{kafka-streams-scala}} could be also compiled for Scala 2.13. 





[jira] [Commented] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Ismael Juma (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892261#comment-16892261 ]

Ismael Juma commented on KAFKA-8712:


That's my expectation and hence why I was asking the question. There could be a 
bug somewhere.






[jira] [Commented] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Matthias J. Sax (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892260#comment-16892260 ]

Matthias J. Sax commented on KAFKA-8712:


[~ijuma] Are you saying that we don't need to do anything and this request is 
already covered by KAFKA-7197?






[jira] [Updated] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8712:
---
Fix Version/s: 2.4.0






[jira] [Updated] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8712:
---
Affects Version/s: (was: 2.3.0)

> Build Kafka Streams against Scala 2.13
> --
>
> Key: KAFKA-8712
> URL: https://issues.apache.org/jira/browse/KAFKA-8712
> Project: Kafka
>  Issue Type: Wish
>  Components: streams
>Reporter: Loic DIVAD
>Priority: Minor
>
> Since Scala 2.13 has been released and KAFKA-7197 brings its support, the 
> library {{kafka-streams-scala }}could be also compiled for Scala 2.13. 





[jira] [Commented] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Ismael Juma (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892257#comment-16892257 ]

Ismael Juma commented on KAFKA-8712:


It should be. Do you have reason to believe it's not?

> Build Kafka Streams against Scala 2.13
> --
>
> Key: KAFKA-8712
> URL: https://issues.apache.org/jira/browse/KAFKA-8712
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Loic DIVAD
>Priority: Minor
>
> Since Scala 2.13 has been released and KAFKA-7197 brings its support, the 
> library {{kafka-streams-scala }}could be also compiled for Scala 2.13. 





[jira] [Updated] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Loic DIVAD (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Loic DIVAD updated KAFKA-8712:
--
Issue Type: Wish  (was: New Feature)

> Build Kafka Streams against Scala 2.13
> --
>
> Key: KAFKA-8712
> URL: https://issues.apache.org/jira/browse/KAFKA-8712
> Project: Kafka
>  Issue Type: Wish
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Loic DIVAD
>Priority: Minor
>
> Since Scala 2.13 has been released and KAFKA-7197 brings its support, the 
> library {{kafka-streams-scala }}could be also compiled for Scala 2.13. 





[jira] [Created] (KAFKA-8712) Build Kafka Streams against Scala 2.13

2019-07-24 Thread Loic DIVAD (JIRA)
Loic DIVAD created KAFKA-8712:
-

 Summary: Build Kafka Streams against Scala 2.13
 Key: KAFKA-8712
 URL: https://issues.apache.org/jira/browse/KAFKA-8712
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Affects Versions: 2.3.0
Reporter: Loic DIVAD


Since Scala 2.13 has been released and KAFKA-7197 brings its support, the 
library {{kafka-streams-scala }}could be also compiled for Scala 2.13. 





[jira] [Commented] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have a common KeyChangingNode

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892228#comment-16892228 ]

ASF GitHub Bot commented on KAFKA-8705:
---

bbejeck commented on pull request #7109: KAFKA-8705: Remove parent node after 
leaving loop to prevent NPE
URL: https://github.com/apache/kafka/pull/7109
 
 
   
 




[jira] [Commented] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have a common KeyChangingNode

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892227#comment-16892227 ]

ASF GitHub Bot commented on KAFKA-8705:
---

bbejeck commented on pull request #7109: KAFKA-8705: Remove parent node after 
leaving loop to prevent NPE
URL: https://github.com/apache/kafka/pull/7109
 
 
   Fixes a case where multiple children merged from a key-changing node cause an 
NPE.
   
   I've updated the tests.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 




[jira] [Commented] (KAFKA-8179) Incremental Rebalance Protocol for Kafka Consumer

2019-07-24 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892184#comment-16892184 ]

ASF GitHub Bot commented on KAFKA-8179:
---

ableegoldman commented on pull request #7108: KAFKA-8179: add public 
ConsumerPartitionAssignor interface
URL: https://github.com/apache/kafka/pull/7108
 
 
   [Subtask JIRA](https://issues.apache.org/jira/browse/KAFKA-8703)
   Main changes of this PR
   * Deprecate old consumer.internal.PartitionAssignor and add public 
consumer.ConsumerPartitionAssignor with all OOTB assignors migrated to new 
interface
   * Refactor assignor's assignment/subscription related classes for easier to 
evolve API
   * Removed version number from classes as it is only needed for 
serialization/deserialization
   
   Other previously-discussed cleanup included in this PR:
   * Remove Assignment.error added in [pt 
1](https://github.com/apache/kafka/pull/6528/files)
   * Remove ConsumerCoordinator#adjustAssignment added in [pt 
2](https://github.com/apache/kafka/pull/6778/)
   
 








[jira] [Commented] (KAFKA-8673) Kafka stream threads stuck while sending offsets to transaction preventing join group from completing

2019-07-24 Thread Guozhang Wang (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892132#comment-16892132 ]

Guozhang Wang commented on KAFKA-8673:
--

Also cc [~bob-barrett] [~hachikuji], who have been looking into KIP-360 recently: I 
will dig deeper into the producer's transaction manager, but so far I think 
the infinite retry logic in `TxnRequestHandler` is sound, so it is a mystery 
why the request did not go through after the brokers became available.

> Kafka stream threads stuck while sending offsets to transaction preventing 
> join group from completing
> -
>
> Key: KAFKA-8673
> URL: https://issues.apache.org/jira/browse/KAFKA-8673
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.2.0
>Reporter: Varsha Abhinandan
>Priority: Major
> Attachments: Screen Shot 2019-07-11 at 12.08.09 PM.png
>
>
> We observed a deadlock-like situation in our Kafka Streams application 
> when we accidentally shut down all the brokers. The Kafka cluster was brought 
> back in about an hour. 
> Observations made:
>  # Normal Kafka producers and consumers started working fine after the 
> brokers were up again. 
>  # The Kafka streams applications were stuck in the "rebalancing" state.
>  # The Kafka streams apps have exactly-once semantics enabled.
>  # The stack trace showed most of the stream threads sending the join group 
> requests to the group co-ordinator.
>  # A few stream threads couldn't initiate the join group request since the 
> call to org.apache.kafka.clients.producer.KafkaProducer#sendOffsetsToTransaction 
> was stuck.
>  # It seems the join group requests were getting parked at the coordinator 
> since the expected members hadn't sent their own group join requests.
>  # After the timeout, the stream threads that were not stuck sent new 
> join group requests.  
>  # (6) and (7) may be repeating indefinitely.
>  # Sample values of the GroupMetadata object on the group co-ordinator: 
> [^Screen Shot 2019-07-11 at 12.08.09 PM.png]
>  # The list of notYetJoinedMembers client ids matched the threads 
> waiting for their offsets to be committed. 
> {code:java}
> [List(MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer-efa41349-3da1-43b6-9710-a662f68c63b1,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer-7cc8e41b-ad98-4006-a18a-b22abe6350f4,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer-9ffb96c1-3379-4cbd-bee1-5d4719fe6c9d,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer-5b8a1f1f-84dd-4a87-86c8-7542c0e50d1f,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer-3cb67ec9-c548-4386-962d-64d9772bf719,
>  
> clientId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer,
>  clientHost=/10.136.99.15, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ))]
> vabhinandan-mac:mp-jstack varsha.abhinandan$ cat jstack.* | grep 
> "metric-extractor-stream-c1-" | grep "StreamThread-" | grep "waiting on 
> condition"
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36"
>  #128 daemon prio=5 os_prio=0 tid=0x7fc53c047800 nid=0xac waiting on 
> condition [0x7fc4e68e7000]
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21"
>  #93 daemon prio=5 os_prio=0 tid=0x7fc53c2b5800 nid=0x9d waiting on 
> condition [0x7fc4e77f6000]
> 
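For reference, the blocking call from observation 5 sits on the exactly-once commit path. A minimal sketch of that path using the producer API of that era (topic, group, transactional id, and offset below are placeholders; error handling elided):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringSerializer;

public class EosCommitSketch {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            // ... produce records here ...
            // This call blocks until the transaction coordinator acknowledges
            // the offsets; per the report, stream threads were stuck here
            // after the full-cluster outage, so they never rejoined the group.
            producer.sendOffsetsToTransaction(
                    Collections.singletonMap(new TopicPartition("input-topic", 0),
                            new OffsetAndMetadata(42L)),
                    "example-group");
            producer.commitTransaction();
        }
    }
}
{code}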

[jira] [Commented] (KAFKA-8673) Kafka stream threads stuck while sending offsets to transaction preventing join group from completing

2019-07-24 Thread Guozhang Wang (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892129#comment-16892129 ]

Guozhang Wang commented on KAFKA-8673:
--

Yes, but my main point is that the `AddOffsetsToTxn` request should not be blocked 
forever once the brokers are back online, and therefore those threads would 
eventually rejoin the group and unblock the others.


[jira] [Commented] (KAFKA-8711) Kafka 2.3.0 Transient Unit Test Failures SocketServerTest. testControlPlaneRequest

2019-07-24 Thread Chandrasekhar (JIRA)


[ https://issues.apache.org/jira/browse/KAFKA-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892118#comment-16892118 ]

Chandrasekhar commented on KAFKA-8711:
--

Attached are the logs. Please let us know if we can comment out this test case 
for this build and move forward until there is a release available with this 
fix. If so, where is this code in the repo, and how should we run a subset of 
the test cases rather than waiting ~2 hours to find the failures?

Thanks much,

Chandra

[^KafkaAUTFailures07242019_PASS2.txt]!KafkaUTFailures07242019_PASS2.GIF!

> Kafka 2.3.0 Transient Unit Test Failures SocketServerTest. 
> testControlPlaneRequest
> --
>
> Key: KAFKA-8711
> URL: https://issues.apache.org/jira/browse/KAFKA-8711
> Project: Kafka
>  Issue Type: Test
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Critical
> Attachments: KafkaAUTFailures07242019_PASS2.txt, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Cloned Kafka 2.3.0 source to our git repo and compiled it using 'gradle 
> build'; we see the following error consistently:
> Gradle Version 4.7
>  
> testControlPlaneRequest
> java.net.BindException: Address already in use (Bind failed)
>     at java.net.PlainSocketImpl.socketBind(Native Method)
>     at 
> java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
>     at java.net.Socket.bind(Socket.java:644)
>     at java.net.Socket.<init>(Socket.java:433)
>     at java.net.Socket.<init>(Socket.java:286)
>     at kafka.network.SocketServerTest.connect(SocketServerTest.scala:140)
>     at 
> kafka.network.SocketServerTest.$anonfun$testControlPlaneRequest$1(SocketServerTest.scala:200)
>     at 
> kafka.network.SocketServerTest.$anonfun$testControlPlaneRequest$1$adapted(SocketServerTest.scala:199)
>     at 
> kafka.network.SocketServerTest.withTestableServer(SocketServerTest.scala:1141)
>     at 
> kafka.network.SocketServerTest.testControlPlaneRequest(SocketServerTest.scala:199)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>     at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>     at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>     at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
>     at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>     at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at 
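The failure itself is a transient port conflict when a test binds a fixed port. Not necessarily what the Kafka test does, but the usual mitigation for this class of flakiness is to bind port 0 and let the OS pick a free ephemeral port, as in this sketch:

{code:java}
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortSketch {
    public static void main(final String[] args) throws IOException {
        // Binding to port 0 asks the OS for a currently free ephemeral port,
        // avoiding "Address already in use" races between concurrent tests.
        try (ServerSocket socket = new ServerSocket(0)) {
            System.out.println("bound to free port " + socket.getLocalPort());
        }
    }
}
{code}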

[jira] [Updated] (KAFKA-8711) Kafka 2.3.0 Transient Unit Test Failures SocketServerTest. testControlPlaneRequest

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8711:
-
Attachment: KafkaUTFailures07242019_PASS2.GIF


[jira] [Created] (KAFKA-8711) Kafka 2.3.0 Transient Unit Test Failures SocketServerTest. testControlPlaneRequest

2019-07-24 Thread Chandrasekhar (JIRA)
Chandrasekhar created KAFKA-8711:


 Summary: Kafka 2.3.0 Transient Unit Test Failures 
SocketServerTest. testControlPlaneRequest
 Key: KAFKA-8711
 URL: https://issues.apache.org/jira/browse/KAFKA-8711
 Project: Kafka
  Issue Type: Test
  Components: core, unit tests
Affects Versions: 2.3.0
Reporter: Chandrasekhar


Cloned the Kafka 2.3.0 source to our git repo and compiled it using 'gradle 
build'; we see the following error consistently:

Gradle Version 4.7

 

testControlPlaneRequest
java.net.BindException: Address already in use (Bind failed)
    at java.net.PlainSocketImpl.socketBind(Native Method)
    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:387)
    at java.net.Socket.bind(Socket.java:644)
    at java.net.Socket.<init>(Socket.java:433)
    at java.net.Socket.<init>(Socket.java:286)
    at kafka.network.SocketServerTest.connect(SocketServerTest.scala:140)
    at kafka.network.SocketServerTest.$anonfun$testControlPlaneRequest$1(SocketServerTest.scala:200)
    at kafka.network.SocketServerTest.$anonfun$testControlPlaneRequest$1$adapted(SocketServerTest.scala:199)
    at kafka.network.SocketServerTest.withTestableServer(SocketServerTest.scala:1141)
    at kafka.network.SocketServerTest.testControlPlaneRequest(SocketServerTest.scala:199)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
    at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
    at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
    at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
    at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
    at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
    at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
    at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
    at com.sun.proxy.$Proxy1.processTestClass(Unknown Source)
    at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
    at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 

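The trace above shows an environment-level port collision: the test's connect 
helper binds a client socket on a port that some other process on the build 
host already holds. As an illustration only (plain Java, not the Kafka test 
code itself), the usual way to sidestep such collisions in tests is to bind to 
port 0 and let the OS assign a free ephemeral port:

{code:java}
import java.net.ServerSocket;

public class EphemeralPortSketch {
    public static void main(String[] args) throws Exception {
        // Binding to port 0 asks the OS for any free ephemeral port, which
        // avoids "Address already in use" when a fixed port is taken.
        try (ServerSocket socket = new ServerSocket(0)) {
            int port = socket.getLocalPort(); // the OS-assigned port
            System.out.println("Bound to ephemeral port " + port);
        }
    }
}
{code}
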
[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Transient Unit Test Failures on Oracle Linux - See attached for details

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Priority: Major  (was: Minor)
 Summary: Kafka 2.3.0 Transient Unit Test Failures on Oracle Linux - See 
attached for details  (was: Kafka 2.3.0 Unit Test Failures on Oracle Linux  - 
Need help debugging framework or issue.)

> Kafka 2.3.0 Transient Unit Test Failures on Oracle Linux - See attached for 
> details
> ---
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Major
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaAUTFailures07242019_PASS2.txt, KafkaUTFailures07242019.GIF, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Hi
> We have just imported the KAFKA 2.3.0 source code from the git repo and are 
> compiling it using Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The 
> failed tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Please find the failures attached.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  





[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Component/s: unit tests

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaAUTFailures07242019_PASS2.txt, KafkaUTFailures07242019.GIF, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Hi
> We have just imported the KAFKA 2.3.0 source code from the git repo and are 
> compiling it using Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The 
> failed tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Please find the failures attached.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  





[jira] [Commented] (KAFKA-8673) Kafka stream threads stuck while sending offsets to transaction preventing join group from completing

2019-07-24 Thread Boyang Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892075#comment-16892075
 ] 

Boyang Chen commented on KAFKA-8673:


[~guozhang] Just to clarify: the `AddOffsetsToTxn` request is sent to the txn 
coordinator, while `TxnOffsetCommit` is sent to the group coordinator, which 
could indeed block the join group if it hangs.

> Kafka stream threads stuck while sending offsets to transaction preventing 
> join group from completing
> -
>
> Key: KAFKA-8673
> URL: https://issues.apache.org/jira/browse/KAFKA-8673
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.2.0
>Reporter: Varsha Abhinandan
>Priority: Major
> Attachments: Screen Shot 2019-07-11 at 12.08.09 PM.png
>
>
> We observed a deadlock-like situation in our Kafka streams application when 
> we accidentally shut down all the brokers. The Kafka cluster was brought back 
> in about an hour. 
> Observations made:
>  # Normal Kafka producers and consumers started working fine after the 
> brokers were up again. 
>  # The Kafka streams applications were stuck in the "rebalancing" state.
>  # The Kafka streams apps have exactly-once semantics enabled.
>  # The stack traces showed most of the stream threads sending join group 
> requests to the group coordinator.
>  # A few stream threads couldn't initiate the join group request since the 
> call to org.apache.kafka.clients.producer.KafkaProducer#sendOffsetsToTransaction 
> was stuck.
>  # It seems the join group requests were parked at the coordinator since the 
> expected members hadn't sent their own join group requests.
>  # After the timeout, the stream threads that were not stuck sent new join 
> group requests.  
>  # Maybe (6) and (7) are happening infinitely.
>  # Sample values of the GroupMetadata object on the group coordinator - 
> [^Screen Shot 2019-07-11 at 12.08.09 PM.png]
>  # The list of notYetJoinedMembers client ids matched the threads waiting 
> for their offsets to be committed. 
> {code:java}
> [List(MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer-efa41349-3da1-43b6-9710-a662f68c63b1,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer-7cc8e41b-ad98-4006-a18a-b22abe6350f4,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer-9ffb96c1-3379-4cbd-bee1-5d4719fe6c9d,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer-5b8a1f1f-84dd-4a87-86c8-7542c0e50d1f,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer-3cb67ec9-c548-4386-962d-64d9772bf719,
>  
> clientId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer,
>  clientHost=/10.136.99.15, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ))]
> vabhinandan-mac:mp-jstack varsha.abhinandan$ cat jstack.* | grep 
> "metric-extractor-stream-c1-" | grep "StreamThread-" | grep "waiting on 
> condition"
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36"
>  #128 daemon prio=5 os_prio=0 tid=0x7fc53c047800 nid=0xac waiting on 
> condition [0x7fc4e68e7000]
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21"
>  #93 daemon prio=5 os_prio=0 tid=0x7fc53c2b5800 nid=0x9d waiting on 
> condition [0x7fc4e77f6000]
> "metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33"
>  #125 daemon prio=5 os_prio=0 

[jira] [Commented] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892055#comment-16892055
 ] 

Chandrasekhar commented on KAFKA-8706:
--

Second Pass: Failed Test Cases: 3, per your list above:

UserQuotaTest. testQuotaOverrideDelete  - 
https://issues.apache.org/jira/browse/KAFKA-8032
UserQuotaTest. testThrottledProducerConsumer - 
https://issues.apache.org/jira/browse/KAFKA-8073
SocketServerTest. testControlPlaneRequest : Need to write a bug - will do in a 
few mins...

 

Please let me know if you know the reason for these intermittent failures (3, 
4, 6), and let us know whether this release is stable enough.

 

!KafkaUTFailures07242019_PASS2.GIF! [^KafkaAUTFailures07242019_PASS2.txt]

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaAUTFailures07242019_PASS2.txt, KafkaUTFailures07242019.GIF, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Hi
> We have just imported the KAFKA 2.3.0 source code from the git repo and are 
> compiling it using Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The 
> failed tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Please find the failures attached.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  





[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Attachment: KafkaUTFailures07242019_PASS2.GIF

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaAUTFailures07242019_PASS2.txt, KafkaUTFailures07242019.GIF, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Hi
> We have just imported the KAFKA 2.3.0 source code from the git repo and are 
> compiling it using Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The 
> failed tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Please find the failures attached.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  





[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Attachment: KafkaAUTFailures07242019_PASS2.txt

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaAUTFailures07242019_PASS2.txt, KafkaUTFailures07242019.GIF, 
> KafkaUTFailures07242019_PASS2.GIF
>
>
> Hi
> We have just imported the KAFKA 2.3.0 source code from the git repo and are 
> compiling it using Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The 
> failed tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Please find the failures attached.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  





[jira] [Commented] (KAFKA-7190) Under low traffic conditions purging repartition topics cause WARN statements about UNKNOWN_PRODUCER_ID

2019-07-24 Thread Bob Barrett (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892046#comment-16892046
 ] 

Bob Barrett commented on KAFKA-7190:


The periodic producer expiration check expires producers if 1) there is no 
ongoing transaction for the producer, and 2) the max timestamp of the last 
batch written by that producer is more than `transactional.id.expiration.ms` 
ago. The default value for `transactional.id.expiration.ms` is 7 days. If the 
input topic only has messages older than 7 days, and the transformed records 
produced to it also have a timestamp older than 7 days, then the producer will 
be expired the first time that the expiration check runs without an ongoing 
transaction. KIP-360 will indirectly address this problem by allowing the 
producer to continue after receiving an UNKNOWN_PRODUCER_ID error, but a better 
solution would probably be to set the producer state timestamp based on the 
current time, not the batch timestamp, as [~rocketraman] suggested. 
[~hachikuji] what do you think?

In the meantime, another workaround would be to set 
`transactional.id.expiration.ms` to a larger number, which would allow the 
transformed records to retain the default Streams timestamp behavior.
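
The in-the-meantime workaround above is a broker-side config change. A minimal 
sketch of the override, with a purely illustrative 30-day value (the default 
is 7 days, i.e. 604800000 ms):
{code:java}
# server.properties -- illustrative value only: raise the producer-state
# expiration window from the 7-day default (604800000 ms) to 30 days.
transactional.id.expiration.ms=2592000000
{code}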

> Under low traffic conditions purging repartition topics cause WARN statements 
> about  UNKNOWN_PRODUCER_ID 
> -
>
> Key: KAFKA-7190
> URL: https://issues.apache.org/jira/browse/KAFKA-7190
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, streams
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Bill Bejeck
>Assignee: Guozhang Wang
>Priority: Major
>
> When a streams application has little traffic, it is possible that consumer 
> purging would delete even the last message sent by a producer (i.e., all the 
> messages sent by this producer have been consumed and committed), and as a 
> result, the broker would delete that producer's ID. The next time this 
> producer tries to send, it will get the UNKNOWN_PRODUCER_ID error code, but 
> in this case the error is retriable: the producer would just get a new 
> producer id and retry, and this time it will succeed. 
>  
> Possible fixes could be on the broker side, i.e., delaying the deletion of 
> the producerIDs for a more extended period, or on the streams side, 
> developing a more conservative approach to deleting offsets from repartition 
> topics.
>  
>  





[jira] [Commented] (KAFKA-8673) Kafka stream threads stuck while sending offsets to transaction preventing join group from completing

2019-07-24 Thread Varsha Abhinandan (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892014#comment-16892014
 ] 

Varsha Abhinandan commented on KAFKA-8673:
--

Hi [~guozhang], the threads were blocked on TransactionalRequestResult.await 
for about 4 days. The rebalance completed only after we restarted the processes 
which had the stream threads stuck on TransactionalRequestResult.await. 

 

 

> Kafka stream threads stuck while sending offsets to transaction preventing 
> join group from completing
> -
>
> Key: KAFKA-8673
> URL: https://issues.apache.org/jira/browse/KAFKA-8673
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.2.0
>Reporter: Varsha Abhinandan
>Priority: Major
> Attachments: Screen Shot 2019-07-11 at 12.08.09 PM.png
>
>
> We observed a deadlock-like situation in our Kafka streams application when 
> we accidentally shut down all the brokers. The Kafka cluster was brought back 
> in about an hour. 
> Observations made:
>  # Normal Kafka producers and consumers started working fine after the 
> brokers were up again. 
>  # The Kafka streams applications were stuck in the "rebalancing" state.
>  # The Kafka streams apps have exactly-once semantics enabled.
>  # The stack traces showed most of the stream threads sending join group 
> requests to the group coordinator.
>  # A few stream threads couldn't initiate the join group request since the 
> call to org.apache.kafka.clients.producer.KafkaProducer#sendOffsetsToTransaction 
> was stuck.
>  # It seems the join group requests were parked at the coordinator since the 
> expected members hadn't sent their own join group requests.
>  # After the timeout, the stream threads that were not stuck sent new join 
> group requests.  
>  # Maybe (6) and (7) are happening infinitely.
>  # Sample values of the GroupMetadata object on the group coordinator - 
> [^Screen Shot 2019-07-11 at 12.08.09 PM.png]
>  # The list of notYetJoinedMembers client ids matched the threads waiting 
> for their offsets to be committed. 
> {code:java}
> [List(MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer-efa41349-3da1-43b6-9710-a662f68c63b1,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-38-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer-7cc8e41b-ad98-4006-a18a-b22abe6350f4,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer-9ffb96c1-3379-4cbd-bee1-5d4719fe6c9d,
>  
> clientId=metric-extractor-stream-c1-d9ac8890-cd80-4b75-a85a-2ff39ea27961-StreamThread-27-consumer,
>  clientHost=/10.136.98.48, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer-5b8a1f1f-84dd-4a87-86c8-7542c0e50d1f,
>  
> clientId=metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21-consumer,
>  clientHost=/10.136.103.148, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ), 
> MemberMetadata(memberId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer-3cb67ec9-c548-4386-962d-64d9772bf719,
>  
> clientId=metric-extractor-stream-c1-994cee9b-b79b-483b-97cd-f89e8cbb015a-StreamThread-33-consumer,
>  clientHost=/10.136.99.15, sessionTimeoutMs=15000, 
> rebalanceTimeoutMs=2147483647, supportedProtocols=List(stream), ))]
> vabhinandan-mac:mp-jstack varsha.abhinandan$ cat jstack.* | grep 
> "metric-extractor-stream-c1-" | grep "StreamThread-" | grep "waiting on 
> condition"
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-36"
>  #128 daemon prio=5 os_prio=0 tid=0x7fc53c047800 nid=0xac waiting on 
> condition [0x7fc4e68e7000]
> "metric-extractor-stream-c1-4875282b-1f26-47cd-affd-23ba5f26787a-StreamThread-21"
>  #93 daemon prio=5 os_prio=0 tid=0x7fc53c2b5800 nid=0x9d waiting on 
> condition [0x7fc4e77f6000]
> 

[jira] [Commented] (KAFKA-7190) Under low traffic conditions purging repartition topics cause WARN statements about UNKNOWN_PRODUCER_ID

2019-07-24 Thread Raman Gupta (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891992#comment-16891992
 ] 

Raman Gupta commented on KAFKA-7190:


> Did you override message.timestamp.difference.max.ms?

No.

> Under low traffic conditions purging repartition topics cause WARN statements 
> about  UNKNOWN_PRODUCER_ID 
> -
>
> Key: KAFKA-7190
> URL: https://issues.apache.org/jira/browse/KAFKA-7190
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, streams
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Bill Bejeck
>Assignee: Guozhang Wang
>Priority: Major
>
> When a streams application has little traffic, it is possible that consumer 
> purging would delete even the last message sent by a producer (i.e., all the 
> messages sent by this producer have been consumed and committed), and as a 
> result, the broker would delete that producer's ID. The next time this 
> producer tries to send, it will get the UNKNOWN_PRODUCER_ID error code, but 
> in this case the error is retriable: the producer would just get a new 
> producer id and retry, and this time it will succeed. 
>  
> Possible fixes could be on the broker side, i.e., delaying the deletion of 
> the producerIDs for a more extended period, or on the streams side, 
> developing a more conservative approach to deleting offsets from repartition 
> topics.
>  
>  





[jira] [Updated] (KAFKA-8709) improve consumer offsets resiliency

2019-07-24 Thread Christian Becker (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Becker updated KAFKA-8709:

Description: 
We attempted an update from 2.2 to 2.3, and then a rollback was done after 
{{inter.broker.protocol}} was changed. (We know this shouldn't be done, but it 
happened.)

After downgrading to 2.2 again, some {{__consumer-offsets}} partitions fail to 
load with the message {{Unknown group metadata version 3}}. Subsequently the 
broker continues its startup and the consumer groups won't exist. So the 
consumers start at their configured OLDEST or NEWEST position and start 
committing their offsets.

However, on subsequent restarts of the brokers, the {{Unknown group metadata 
version}} exception remains, and so the offset resets happen over and over 
again.

 

In order to prevent this, I'd suggest an updated flow when loading the offsets 
(see the sketch after this list):
 - the loading should continue reading the __consumer-offsets partition to see 
if a subsequent offset is committed that is readable
 - if no "valid" offset can be found, throw the existing exception to let the 
operator know about the situation
 - if a valid offset can be found, continue as normal
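
A minimal sketch of that flow, with every name and record shape invented for 
illustration (this is not Kafka's actual offset-loading code):
{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class OffsetLoadSketch {
    static class CorruptRecordException extends RuntimeException {
        CorruptRecordException(String message) { super(message); }
    }

    // Stand-in for a raw __consumer-offsets record: a metadata version plus
    // the committed offset it carries.
    static class RawRecord {
        final int metadataVersion;
        final long offset;
        RawRecord(int metadataVersion, long offset) {
            this.metadataVersion = metadataVersion;
            this.offset = offset;
        }
    }

    static long parse(RawRecord record) {
        if (record.metadataVersion > 2) // stand-in for "Unknown group metadata version 3"
            throw new CorruptRecordException("Unknown group metadata version " + record.metadataVersion);
        return record.offset;
    }

    // Keep scanning past corrupt records; only surface the existing error if
    // nothing readable follows them.
    static List<Long> loadOffsets(Iterator<RawRecord> records) {
        List<Long> valid = new ArrayList<>();
        boolean sawCorrupt = false;
        while (records.hasNext()) {
            try {
                valid.add(parse(records.next()));
            } catch (CorruptRecordException e) {
                sawCorrupt = true; // remember, but keep reading the partition
            }
        }
        if (valid.isEmpty() && sawCorrupt)
            throw new CorruptRecordException("no readable offset after corrupt records");
        return valid; // later valid commits win, so offsets survive restarts
    }
}
{code}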

 

This would cause the following sequence of events:
 1. corrupted offsets are written
 2. broker restart
 2a. broker loads offset partition
 2b. {{KafkaException}} when loading the offset partition
 2c. no "valid" offset is found after the "corrupt" record
 2d. offsets reset
 3. consumergroups are recreated and "valid" offsets are appended
 4. broker restart
 4a. broker loads offset partition
 4b.  {{KafkaException}} when loading the offset partition
 4c. "valid" offset is found after the "corrupted" ones
 5. consumergroups still have their latest offset

It's a special case that this happened after some human error, but it also 
poses a danger for situations where the offsets might be corrupted for some 
unrelated reason. Losing the offsets is a very serious situation and there 
should be safeguards against it, especially when there might be recoverable 
offsets. With this improvement, the offsets would still be lost once, but the 
broker would be able to recover automatically over time, and after compaction 
the corrupted records will be gone. (In our case this caused serious 
confusion, as we lost the offsets multiple times: the error message {{Error 
loading offsets from}} implies that the corrupted data is deleted and 
therefore the situation is recovered, whereas in reality this continues to be 
an issue until the corrupt data is gone once and for all, which might take a 
long time.)

In our case we seem to have evicted the broken records by temporarily setting 
the segment time to a very low value and deactivating compaction:
{code:java}
# Force small segments and switch __consumer_offsets to time-based deletion
/opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=90 --topic __consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=delete --topic __consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --config retention.ms=180 --topic __consumer_offsets --zookeeper localhost:2181
# < wait for the cleaner to clean up >
# Restore compaction and remove the temporary overrides
/opt/kafka/bin/kafka-topics.sh --alter --delete-config segment.ms --topic __consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=compact --topic __consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --delete-config retention.ms --topic __consumer_offsets --zookeeper localhost:2181
# < hope all consumer groups committed their offset before a failover needs to happen >
{code}
 

  was:
We attempted to do an update from 2.2 to 2.3 and then a rollback was done after 
{{inter.broker.protocol}} was changed. (We know this shouldn't be done, but it 
happened).

After downgrading to 2.2 again, some {{__consumer-offsets}} partitions fail to 
load with the message {{Unknown group metadata version 3}}. Subsequently the 
broker continues it's startup and the consumer groups won't exist. So the 
consumers are starting at their configured OLDEST or NEWEST position and start 
committing their offsets.

However on subsequent restarts of the brokers, the {{Unknown group metadata 
version}} exception remains and so the restarts are happening over and over 
again.

 

In order to prevent this, I'd suggest a updated flow when loading the offsets:
- the loading should continue reading the __consumer-offsets partition to see 
if a subsequent offset is commited that is readable
- if no "valid" offset could be found, throw the existing exception to let the 
operator know about the situation
- if a valid offset can be found, continue as normal

 

This would cause the following sequence of events:
1. corrupted offsets are written
2. broker restart
2a. broker loads offset partition
2b. {{KafkaException}} when loading the offset partition
2c. no 

[jira] [Commented] (KAFKA-7940) Flaky Test CustomQuotaCallbackTest#testCustomQuotaCallback

2019-07-24 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891987#comment-16891987
 ] 

Matthias J. Sax commented on KAFKA-7940:


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3809/tests]

> Flaky Test CustomQuotaCallbackTest#testCustomQuotaCallback
> --
>
> Key: KAFKA-7940
> URL: https://issues.apache.org/jira/browse/KAFKA-7940
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Assignee: Stanislav Kozlovski
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.2.0
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> [https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/14/]
> {quote}java.lang.AssertionError: Too many quotaLimit calls Map(PRODUCE -> 1, 
> FETCH -> 1, REQUEST -> 4) at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.api.CustomQuotaCallbackTest.testCustomQuotaCallback(CustomQuotaCallbackTest.scala:105){quote}





[jira] [Updated] (KAFKA-8122) Flaky Test EosIntegrationTest#shouldNotViolateEosIfOneTaskFailsWithState

2019-07-24 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8122:
---
Affects Version/s: 2.4.0

> Flaky Test EosIntegrationTest#shouldNotViolateEosIfOneTaskFailsWithState
> 
>
> Key: KAFKA-8122
> URL: https://issues.apache.org/jira/browse/KAFKA-8122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3285/testReport/junit/org.apache.kafka.streams.integration/EosIntegrationTest/shouldNotViolateEosIfOneTaskFailsWithState/]
> {quote}java.lang.AssertionError: Expected: <[KeyValue(0, 0), KeyValue(0, 1), 
> KeyValue(0, 3), KeyValue(0, 6), KeyValue(0, 10), KeyValue(0, 15), KeyValue(0, 
> 21), KeyValue(0, 28), KeyValue(0, 36), KeyValue(0, 45)]> but: was 
> <[KeyValue(0, 0), KeyValue(0, 1), KeyValue(0, 3), KeyValue(0, 6), KeyValue(0, 
> 10), KeyValue(0, 15), KeyValue(0, 21), KeyValue(0, 28), KeyValue(0, 36), 
> KeyValue(0, 45), KeyValue(0, 55), KeyValue(0, 66), KeyValue(0, 78), 
> KeyValue(0, 91), KeyValue(0, 105)]> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at 
> org.apache.kafka.streams.integration.EosIntegrationTest.checkResultPerKey(EosIntegrationTest.java:212)
>  at 
> org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFailsWithState(EosIntegrationTest.java:414){quote}
> STDOUT
> {quote}[2019-03-17 01:19:51,971] INFO Created server with tickTime 800 
> minSessionTimeout 1600 maxSessionTimeout 16000 datadir 
> /tmp/kafka-10997967593034298484/version-2 snapdir 
> /tmp/kafka-5184295822696533708/version-2 
> (org.apache.zookeeper.server.ZooKeeperServer:174) [2019-03-17 01:19:51,971] 
> INFO binding to port /127.0.0.1:0 
> (org.apache.zookeeper.server.NIOServerCnxnFactory:89) [2019-03-17 
> 01:19:51,973] INFO KafkaConfig values: advertised.host.name = null 
> advertised.listeners = null advertised.port = null 
> alter.config.policy.class.name = null 
> alter.log.dirs.replication.quota.window.num = 11 
> alter.log.dirs.replication.quota.window.size.seconds = 1 
> authorizer.class.name = auto.create.topics.enable = false 
> auto.leader.rebalance.enable = true background.threads = 10 broker.id = 0 
> broker.id.generation.enable = true broker.rack = null 
> client.quota.callback.class = null compression.type = producer 
> connection.failed.authentication.delay.ms = 100 connections.max.idle.ms = 
> 60 connections.max.reauth.ms = 0 control.plane.listener.name = null 
> controlled.shutdown.enable = true controlled.shutdown.max.retries = 3 
> controlled.shutdown.retry.backoff.ms = 5000 controller.socket.timeout.ms = 
> 3 create.topic.policy.class.name = null default.replication.factor = 1 
> delegation.token.expiry.check.interval.ms = 360 
> delegation.token.expiry.time.ms = 8640 delegation.token.master.key = null 
> delegation.token.max.lifetime.ms = 60480 
> delete.records.purgatory.purge.interval.requests = 1 delete.topic.enable = 
> true fetch.purgatory.purge.interval.requests = 1000 
> group.initial.rebalance.delay.ms = 0 group.max.session.timeout.ms = 30 
> group.max.size = 2147483647 group.min.session.timeout.ms = 0 host.name = 
> localhost inter.broker.listener.name = null inter.broker.protocol.version = 
> 2.2-IV1 kafka.metrics.polling.interval.secs = 10 kafka.metrics.reporters = [] 
> leader.imbalance.check.interval.seconds = 300 
> leader.imbalance.per.broker.percentage = 10 listener.security.protocol.map = 
> PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL 
> listeners = null log.cleaner.backoff.ms = 15000 
> log.cleaner.dedupe.buffer.size = 2097152 log.cleaner.delete.retention.ms = 
> 8640 log.cleaner.enable = true log.cleaner.io.buffer.load.factor = 0.9 
> log.cleaner.io.buffer.size = 524288 log.cleaner.io.max.bytes.per.second = 
> 1.7976931348623157E308 log.cleaner.min.cleanable.ratio = 0.5 
> log.cleaner.min.compaction.lag.ms = 0 log.cleaner.threads = 1 
> log.cleanup.policy = [delete] log.dir = 
> /tmp/junit16020146621422955757/junit17406374597406011269 log.dirs = null 
> log.flush.interval.messages = 9223372036854775807 log.flush.interval.ms = 
> null log.flush.offset.checkpoint.interval.ms = 6 
> log.flush.scheduler.interval.ms = 9223372036854775807 
> log.flush.start.offset.checkpoint.interval.ms = 6 
> log.index.interval.bytes = 4096 log.index.size.max.bytes = 10485760 
> log.message.downconversion.enable = true log.message.format.version = 2.2-IV1 
> 

[jira] [Commented] (KAFKA-8122) Flaky Test EosIntegrationTest#shouldNotViolateEosIfOneTaskFailsWithState

2019-07-24 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891988#comment-16891988
 ] 

Matthias J. Sax commented on KAFKA-8122:


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3809/tests]

> Flaky Test EosIntegrationTest#shouldNotViolateEosIfOneTaskFailsWithState
> 
>
> Key: KAFKA-8122
> URL: https://issues.apache.org/jira/browse/KAFKA-8122
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0, 2.4.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3285/testReport/junit/org.apache.kafka.streams.integration/EosIntegrationTest/shouldNotViolateEosIfOneTaskFailsWithState/]
> {quote}java.lang.AssertionError: Expected: <[KeyValue(0, 0), KeyValue(0, 1), 
> KeyValue(0, 3), KeyValue(0, 6), KeyValue(0, 10), KeyValue(0, 15), KeyValue(0, 
> 21), KeyValue(0, 28), KeyValue(0, 36), KeyValue(0, 45)]> but: was 
> <[KeyValue(0, 0), KeyValue(0, 1), KeyValue(0, 3), KeyValue(0, 6), KeyValue(0, 
> 10), KeyValue(0, 15), KeyValue(0, 21), KeyValue(0, 28), KeyValue(0, 36), 
> KeyValue(0, 45), KeyValue(0, 55), KeyValue(0, 66), KeyValue(0, 78), 
> KeyValue(0, 91), KeyValue(0, 105)]> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at 
> org.apache.kafka.streams.integration.EosIntegrationTest.checkResultPerKey(EosIntegrationTest.java:212)
>  at 
> org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFailsWithState(EosIntegrationTest.java:414){quote}
> STDOUT
> {quote}[2019-03-17 01:19:51,971] INFO Created server with tickTime 800 
> minSessionTimeout 1600 maxSessionTimeout 16000 datadir 
> /tmp/kafka-10997967593034298484/version-2 snapdir 
> /tmp/kafka-5184295822696533708/version-2 
> (org.apache.zookeeper.server.ZooKeeperServer:174) [2019-03-17 01:19:51,971] 
> INFO binding to port /127.0.0.1:0 
> (org.apache.zookeeper.server.NIOServerCnxnFactory:89) [2019-03-17 
> 01:19:51,973] INFO KafkaConfig values: advertised.host.name = null 
> advertised.listeners = null advertised.port = null 
> alter.config.policy.class.name = null 
> alter.log.dirs.replication.quota.window.num = 11 
> alter.log.dirs.replication.quota.window.size.seconds = 1 
> authorizer.class.name = auto.create.topics.enable = false 
> auto.leader.rebalance.enable = true background.threads = 10 broker.id = 0 
> broker.id.generation.enable = true broker.rack = null 
> client.quota.callback.class = null compression.type = producer 
> connection.failed.authentication.delay.ms = 100 connections.max.idle.ms = 
> 60 connections.max.reauth.ms = 0 control.plane.listener.name = null 
> controlled.shutdown.enable = true controlled.shutdown.max.retries = 3 
> controlled.shutdown.retry.backoff.ms = 5000 controller.socket.timeout.ms = 
> 3 create.topic.policy.class.name = null default.replication.factor = 1 
> delegation.token.expiry.check.interval.ms = 360 
> delegation.token.expiry.time.ms = 8640 delegation.token.master.key = null 
> delegation.token.max.lifetime.ms = 60480 
> delete.records.purgatory.purge.interval.requests = 1 delete.topic.enable = 
> true fetch.purgatory.purge.interval.requests = 1000 
> group.initial.rebalance.delay.ms = 0 group.max.session.timeout.ms = 30 
> group.max.size = 2147483647 group.min.session.timeout.ms = 0 host.name = 
> localhost inter.broker.listener.name = null inter.broker.protocol.version = 
> 2.2-IV1 kafka.metrics.polling.interval.secs = 10 kafka.metrics.reporters = [] 
> leader.imbalance.check.interval.seconds = 300 
> leader.imbalance.per.broker.percentage = 10 listener.security.protocol.map = 
> PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL 
> listeners = null log.cleaner.backoff.ms = 15000 
> log.cleaner.dedupe.buffer.size = 2097152 log.cleaner.delete.retention.ms = 
> 8640 log.cleaner.enable = true log.cleaner.io.buffer.load.factor = 0.9 
> log.cleaner.io.buffer.size = 524288 log.cleaner.io.max.bytes.per.second = 
> 1.7976931348623157E308 log.cleaner.min.cleanable.ratio = 0.5 
> log.cleaner.min.compaction.lag.ms = 0 log.cleaner.threads = 1 
> log.cleanup.policy = [delete] log.dir = 
> /tmp/junit16020146621422955757/junit17406374597406011269 log.dirs = null 
> log.flush.interval.messages = 9223372036854775807 log.flush.interval.ms = 
> null log.flush.offset.checkpoint.interval.ms = 6 
> log.flush.scheduler.interval.ms = 9223372036854775807 
> log.flush.start.offset.checkpoint.interval.ms = 6 
> log.index.interval.bytes = 4096 log.index.size.max.bytes 

[jira] [Commented] (KAFKA-8589) Flakey test ResetConsumerGroupOffsetTest#testResetOffsetsExistingTopic

2019-07-24 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891985#comment-16891985
 ] 

Matthias J. Sax commented on KAFKA-8589:


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3809/tests]

> Flakey test ResetConsumerGroupOffsetTest#testResetOffsetsExistingTopic
> --
>
> Key: KAFKA-8589
> URL: https://issues.apache.org/jira/browse/KAFKA-8589
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, clients, unit tests
>Affects Versions: 2.4.0
>Reporter: Boyang Chen
>Priority: Major
>  Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5724/consoleFull]
> *20:25:15* 
> kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic 
> failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic.test.stdout*20:25:15*
>  *20:25:15* kafka.admin.ResetConsumerGroupOffsetTest > 
> testResetOffsetsExistingTopic FAILED*20:25:15* 
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
> coordinator is not available.*20:25:15* at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)*20:25:15*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)*20:25:15*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)*20:25:15*
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)*20:25:15*
>  at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService.$anonfun$resetOffsets$1(ConsumerGroupCommand.scala:379)*20:25:15*
>  at 
> scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:160)*20:25:15*
>  at 
> scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:160)*20:25:15*
>  at scala.collection.Iterator.foreach(Iterator.scala:941)*20:25:15*   
>   at scala.collection.Iterator.foreach$(Iterator.scala:941)*20:25:15* 
> at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1429)*20:25:15*  
>at scala.collection.IterableLike.foreach(IterableLike.scala:74)*20:25:15*  
>at 
> scala.collection.IterableLike.foreach$(IterableLike.scala:73)*20:25:15*   
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:56)*20:25:15*   
>   at 
> scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:160)*20:25:15*
>  at 
> scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:158)*20:25:15*
>  at 
> scala.collection.AbstractTraversable.foldLeft(Traversable.scala:108)*20:25:15*
>  at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:377)*20:25:15*
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.resetOffsets(ResetConsumerGroupOffsetTest.scala:507)*20:25:15*
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.resetAndAssertOffsets(ResetConsumerGroupOffsetTest.scala:477)*20:25:15*
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic(ResetConsumerGroupOffsetTest.scala:123)*20:25:15*
>  *20:25:15* Caused by:*20:25:15* 
> org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
> coordinator is not available.*20*





[jira] [Commented] (KAFKA-8671) NullPointerException occurs if topic associated with GlobalKTable changes

2019-07-24 Thread Alex Leung (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891984#comment-16891984
 ] 

Alex Leung commented on KAFKA-8671:
---

Thanks for taking a look, [~guozhang]. I also wanted to add a unit test for 
this but haven't had time yet. Will add that and submit a PR.

> NullPointerException occurs if topic associated with GlobalKTable changes
> -
>
> Key: KAFKA-8671
> URL: https://issues.apache.org/jira/browse/KAFKA-8671
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
>Reporter: Alex Leung
>Assignee: Alex Leung
>Priority: Critical
>
> The following NullPointerException occurs when the global/.checkpoint file 
> contains a line with a topic previously associated with (but no longer 
> configured for) a GlobalKTable:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.kafka.streams.processor.internals.GlobalStateUpdateTask.update(GlobalStateUpdateTask.java:85)
> at 
> org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:241)
> at 
> org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:290){code}
>  
> After line 84 
> ([https://github.com/apache/kafka/blob/2.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateUpdateTask.java#L84]),
>  `sourceNodeAndDeserializer` is null for the old, but still valid, topic. 
> This can be reproduced with the following sequence:
>  # create a GlobalKTable associated with topic, 'global-topic1'
>  # change the topic associated with the GlobalKTable to 'global-topic2' 
>  ##  at this point, the global/.checkpoint file will contain lines for both 
> topics
>  # produce messages to previous topic ('global-topic1')
>  # the consumer will attempt to consume from global-topic1, but no 
> deserializer associated with global-topic1 will be found and the NPE will 
> occur
> It looks like the following recent commit has included checkpoint validations 
> that may prevent this issue: 
> https://github.com/apache/kafka/commit/53b4ce5c00d61be87962f603682873665155cec4#diff-cc98a6c20f2a8483e1849aea6921c34dR425
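
The defensive shape of a fix can be sketched as follows; all names here are 
invented for illustration, and this is not the actual GlobalStateUpdateTask 
code. The idea is simply to look up the handler for the record's topic and 
skip the record with a warning when the topic is no longer registered, rather 
than dereferencing the null lookup result:
{code:java}
import java.util.Map;
import java.util.function.Consumer;

public class GlobalUpdateGuardSketch {
    // Hypothetical registry: topic -> handler that deserializes and applies a record.
    private final Map<String, Consumer<byte[]>> handlersByTopic;

    public GlobalUpdateGuardSketch(Map<String, Consumer<byte[]>> handlersByTopic) {
        this.handlersByTopic = handlersByTopic;
    }

    public void update(String topic, byte[] rawValue) {
        Consumer<byte[]> handler = handlersByTopic.get(topic);
        if (handler == null) {
            // The topic appears in the checkpoint file but is no longer
            // configured for any GlobalKTable: skip instead of throwing an NPE.
            System.err.println("Skipping record from unregistered global topic " + topic);
            return;
        }
        handler.accept(rawValue);
    }
}
{code}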





[jira] [Commented] (KAFKA-7190) Under low traffic conditions purging repartition topics cause WARN statements about UNKNOWN_PRODUCER_ID

2019-07-24 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891976#comment-16891976
 ] 

Guozhang Wang commented on KAFKA-7190:
--

Hmm interesting, in this case the topic is not a repartition topic so no 
purge-records requests would be sent (note that in this case even if the 
messages are not yet physically removed, the producer id would still be 
deleted). Did you override {{message.timestamp.difference.max.ms}}?

cc [~bob-barrett] who's working on KIP-360 now, maybe he can chime in with some 
insights.

> Under low traffic conditions purging repartition topics cause WARN statements 
> about  UNKNOWN_PRODUCER_ID 
> -
>
> Key: KAFKA-7190
> URL: https://issues.apache.org/jira/browse/KAFKA-7190
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, streams
>Affects Versions: 1.1.0, 1.1.1
>Reporter: Bill Bejeck
>Assignee: Guozhang Wang
>Priority: Major
>
> When a streams application has little traffic, it is possible that consumer 
> purging would delete even the last message sent by a producer (i.e., all the 
> messages sent by this producer have been consumed and committed), and as a 
> result, the broker would delete that producer's ID. The next time this 
> producer tries to send, it will get the UNKNOWN_PRODUCER_ID error code, but 
> in this case the error is retriable: the producer would just get a new 
> producer id and retry, and this time it will succeed. 
>  
> Possible fixes could be on the broker side, i.e., delaying the deletion of 
> the producerIDs for a more extended period, or on the streams side, 
> developing a more conservative approach to deleting offsets from repartition 
> topics.
>  
>  





[jira] [Created] (KAFKA-8710) InitProducerId changes for KIP-360

2019-07-24 Thread Bob Barrett (JIRA)
Bob Barrett created KAFKA-8710:
--

 Summary: InitProducerId changes for KIP-360
 Key: KAFKA-8710
 URL: https://issues.apache.org/jira/browse/KAFKA-8710
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 2.3.0
Reporter: Bob Barrett
Assignee: Bob Barrett


As part of KIP-360, InitProducerId needs to accept two additional parameters, 
the current producerId and the current producerEpoch, and it needs to allow 
producers to safely re-initialize a producer ID and continue processing as long 
as no other producer with the same transactional ID has started up.
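
Illustratively, the extended request would carry fields like the following; 
the names are assumptions taken from the ticket text, not the actual wire 
format:
{code:java}
// Sketch only: field names are assumptions, not the InitProducerId wire format.
public class InitProducerIdRequestSketch {
    final String transactionalId;
    final long currentProducerId;     // new in KIP-360: the producer's existing id
    final short currentProducerEpoch; // new in KIP-360: the producer's existing epoch

    InitProducerIdRequestSketch(String transactionalId,
                                long currentProducerId,
                                short currentProducerEpoch) {
        this.transactionalId = transactionalId;
        this.currentProducerId = currentProducerId;
        this.currentProducerEpoch = currentProducerEpoch;
    }
}
{code}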





[jira] [Commented] (KAFKA-8671) NullPointerException occurs if topic associated with GlobalKTable changes

2019-07-24 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891971#comment-16891971
 ] 

Guozhang Wang commented on KAFKA-8671:
--

I looked at the diff and it looks promising to me.

Could you file a PR following the contribution guidelines as well? 
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes#ContributingCodeChanges-PullRequest

Also cc [~vvcephei] for reviewing this change.

> NullPointerException occurs if topic associated with GlobalKTable changes
> -
>
> Key: KAFKA-8671
> URL: https://issues.apache.org/jira/browse/KAFKA-8671
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
>Reporter: Alex Leung
>Assignee: Alex Leung
>Priority: Critical
>
> The following NullPointerException occurs when the global/.checkpoint file 
> contains a line with a topic previously associated with (but no longer 
> configured for) a GlobalKTable:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.kafka.streams.processor.internals.GlobalStateUpdateTask.update(GlobalStateUpdateTask.java:85)
> at 
> org.apache.kafka.streams.processor.internals.GlobalStreamThread$StateConsumer.pollAndUpdate(GlobalStreamThread.java:241)
> at 
> org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:290){code}
>  
> After line 84 
> ([https://github.com/apache/kafka/blob/2.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateUpdateTask.java#L84]),
>  `sourceNodeAndDeserializer` is null for the old, but still valid, topic. 
> This can be reproduced with the following sequence:
>  # create a GlobalKTable associated with topic, 'global-topic1'
>  # change the topic associated with the GlobalKTable to 'global-topic2' 
>  ##  at this point, the global/.checkpoint file will contain lines for both 
> topics
>  # produce messages to previous topic ('global-topic1')
>  # the consumer will attempt to consume from global-topic1, but no 
> deserializer associated with global-topic1 will be found and the NPE will 
> occur
> It looks like the following recent commit has included checkpoint validations 
> that may prevent this issue: 
> https://github.com/apache/kafka/commit/53b4ce5c00d61be87962f603682873665155cec4#diff-cc98a6c20f2a8483e1849aea6921c34dR425





[jira] [Commented] (KAFKA-8396) Clean up Transformer API

2019-07-24 Thread John Roesler (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891913#comment-16891913
 ] 

John Roesler commented on KAFKA-8396:
-

From the discussion of KIP-478, Guozhang had the following comments relevant 
to this ticket:

{quote}Hi John,

Just a wild thought about Transformer: now with the new Processor#init(ProcessorContext), do we still need a
Transformer (and even ValueTransformer / ValueTransformerWithKey)?

What if:

* We just make KStream#transform to get a ProcessorSupplier as well, and
inside `process()` we check that at most one `context.forward()` is called,
and then take it as the return value.
* We would still use ValueTransformer for KStream#transformValue, or we can
also use a `ProcessorSupplier` where we allow at most one
`context.forward()` AND we ignore whatever is passed in as the key, but just use
the original key.


Guozhang{quote}
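
To make the first bullet concrete, a toy sketch (illustrative names, not the 
Streams API) of a wrapper that rejects more than one forward per record:

{code:java}
import java.util.function.Consumer;

// Toy model of "check that at most one context.forward() is called".
public class SingleForward<V> {
    private final Consumer<V> downstream;
    private int forwards = 0;

    public SingleForward(final Consumer<V> downstream) {
        this.downstream = downstream;
    }

    public void forward(final V value) {
        if (++forwards > 1) {
            throw new IllegalStateException("transform must forward at most once per record");
        }
        downstream.accept(value);
    }

    public static void main(String[] args) {
        final SingleForward<String> ctx = new SingleForward<>(System.out::println);
        ctx.forward("first");  // ok
        ctx.forward("second"); // throws
    }
}
{code}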

> Clean up Transformer API
> 
>
> Key: KAFKA-8396
> URL: https://issues.apache.org/jira/browse/KAFKA-8396
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Priority: Major
>  Labels: needs-kip, user-experience
>
> Currently, KStream operators transformValues and flatTransformValues disable 
> context forwarding, and force operators to just return the new values.
> The reason is that we wanted to prevent the key from changing, since the 
> whole point of a `xValues` transformation is that we _do not_ change the key, 
> and hence don't need to repartition.
> However, the chosen mechanism has some drawbacks: The Transform concept is 
> basically a way to plug in a custom Processor within the Streams DSL, but 
> these restrictions make it more like a MapValues with access to the context. 
> For example, even though you can still schedule punctuations, there's no way 
> to forward values as a result of them. So, as a user, it's hard to build a 
> mental model of how to use a TransformValues (because it's not quite a 
> Transformer and not quite a Mapper).
> Also, logically, a Transformer can call forward as much as it wants, so a 
> Transformer and a FlatTransformer are effectively the same thing. Then, we 
> also have TransformValues and FlatTransformValues that are also two more 
> versions of the same thing, just to implement the key restrictions. 
> Internally, some of these can send downstream by returning OR forwarding, and 
> others can only return. It's a lot for users to keep in mind.
> We can clean up this API significantly by just allowing all transformers to 
> call `forward`. In the `Values` case, we can wrap the ProcessorContext in one 
> that checks the key is `equal` to the one that got passed in (i.e., saves a 
> reference and enforces equality with that reference in any call to 
> `forward`). Then, we can actually deprecate the `*ValueTransformer*` 
> interfaces and remove the restriction about calling forward.
> We can consider a further cleanup (TBD) to deprecate the existing Transformer 
> interface entirely, and replace it with one with a `void` return type. Then, 
> the Transform and FlatTransform cases collapse together, and we just need 
> Transform and TransformValues.
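
A minimal, self-contained sketch of the key-equality idea (toy types, not the 
actual ProcessorContext): the wrapper saves a reference to the incoming key and 
rejects any forward that changes it.

{code:java}
import java.util.Objects;
import java.util.function.BiConsumer;

// Toy model of a context wrapper for the *Values case: any forward() must
// keep the key it was constructed with, so no repartitioning is ever needed.
public class KeyGuard<K, V> {
    private final K fixedKey;
    private final BiConsumer<K, V> downstream;

    public KeyGuard(final K fixedKey, final BiConsumer<K, V> downstream) {
        this.fixedKey = fixedKey;
        this.downstream = downstream;
    }

    public void forward(final K key, final V value) {
        if (!Objects.equals(fixedKey, key)) {
            throw new IllegalStateException("transformValues must not change the key");
        }
        downstream.accept(key, value);
    }

    public static void main(String[] args) {
        final KeyGuard<String, Integer> ctx =
            new KeyGuard<>("a", (k, v) -> System.out.println(k + "=" + v));
        ctx.forward("a", 1); // ok
        ctx.forward("b", 2); // throws: key changed
    }
}
{code}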



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891890#comment-16891890
 ] 

Chandrasekhar commented on KAFKA-8706:
--

Re-running just one more time to confirm. If we hit these failures 3 times, I will 
have to change the priority to critical to make sure there is no impending danger 
to actually running services. Kindly advise on severity, and let me know if you 
need any more information from this effort; we are just trying to get to the 
latest release so we are not left behind.

 

Appreciate the swift response again!

 

Best regards

 

Chandra Rao

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaUTFailures07242019.GIF
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Comment Edited] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have common KeyChaingingNode

2019-07-24 Thread Bill Bejeck (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891887#comment-16891887
 ] 

Bill Bejeck edited comment on KAFKA-8705 at 7/24/19 2:12 PM:
-

[~hiro116s] thanks for the detailed explanation and the comprehensive analysis! 
I'll take a look at fixing this.  Thanks again for taking the time to report 
this issue.

 

-Bill


was (Author: bbejeck):
[~hiro116s] thanks for reporting this, we'll take a look into this.

> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode
> -
>
> Key: KAFKA-8705
> URL: https://issues.apache.org/jira/browse/KAFKA-8705
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Hiroshi Nakahara
>Assignee: Bill Bejeck
>Priority: Major
>
> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode.
> Kafka Streams version: 2.3.0
> h3. Code
> {code:java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.StreamsBuilder;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.Consumed;
> import org.apache.kafka.streams.kstream.KStream;
> import java.util.Properties;
> public class Main {
>     public static void main(String[] args) {
>         final StreamsBuilder streamsBuilder = new StreamsBuilder();
>         final KStream<Integer, Integer> parentStream = streamsBuilder
>             .stream("parentTopic", Consumed.with(Serdes.Integer(), Serdes.Integer()))
>             .selectKey(Integer::sum);  // To make parentStream a key-changing point
>         final KStream<Integer, Integer> childStream1 = parentStream.mapValues(v -> v + 1);
>         final KStream<Integer, Integer> childStream2 = parentStream.mapValues(v -> v + 2);
>         final KStream<Integer, Integer> childStream3 = parentStream.mapValues(v -> v + 3);
>         childStream1
>             .merge(childStream2)
>             .merge(childStream3)
>             .to("outputTopic");
>         final Properties properties = new Properties();
>         properties.setProperty(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
>         streamsBuilder.build(properties);
>     }
> }
> {code}
> h3. Expected result
> streamsBuilder.build should create a Topology without throwing an exception.  The 
> expected topology is:
> {code:java}
> Topologies:
>Sub-topology: 0
> Source: KSTREAM-SOURCE-00 (topics: [parentTopic])
>   --> KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-KEY-SELECT-01 (stores: [])
>   --> KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03, 
> KSTREAM-MAPVALUES-04
>   <-- KSTREAM-SOURCE-00
> Processor: KSTREAM-MAPVALUES-02 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-03 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-04 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MERGE-05 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03
> Processor: KSTREAM-MERGE-06 (stores: [])
>   --> KSTREAM-SINK-07
>   <-- KSTREAM-MERGE-05, KSTREAM-MAPVALUES-04
> Sink: KSTREAM-SINK-07 (topic: outputTopic)
>   <-- KSTREAM-MERGE-06
> {code}
> h3. Actual result
> NullPointerException was thrown with the following stacktrace.
> {code:java}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeUpdateKeyChangingRepartitionNodeMap(InternalStreamsBuilder.java:397)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeOptimizeRepartitionOperations(InternalStreamsBuilder.java:315)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybePerformOptimizations(InternalStreamsBuilder.java:304)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.buildAndOptimizeTopology(InternalStreamsBuilder.java:275)
>   at 
> org.apache.kafka.streams.StreamsBuilder.build(StreamsBuilder.java:558)
>   at Main.main(Main.java:24){code}
> h3. Cause
> This exception occurs in 
> InternalStreamsBuilder#maybeUpdateKeyChangingRepartitionNodeMap.
> {code:java}
> private void maybeUpdateKeyChangingRepartitionNodeMap() {
>     final Map<StreamsGraphNode, Set<StreamsGraphNode>> 
> 

[jira] [Commented] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have common KeyChaingingNode

2019-07-24 Thread Bill Bejeck (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891887#comment-16891887
 ] 

Bill Bejeck commented on KAFKA-8705:


[~hiro116s] thanks for reporting this, we'll take a look into this.

> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode
> -
>
> Key: KAFKA-8705
> URL: https://issues.apache.org/jira/browse/KAFKA-8705
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Hiroshi Nakahara
>Assignee: Bill Bejeck
>Priority: Major
>
> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode.
> Kafka Streams version: 2.3.0
> h3. Code
> {code:java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.StreamsBuilder;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.Consumed;
> import org.apache.kafka.streams.kstream.KStream;
> import java.util.Properties;
> public class Main {
>     public static void main(String[] args) {
>         final StreamsBuilder streamsBuilder = new StreamsBuilder();
>         final KStream<Integer, Integer> parentStream = streamsBuilder
>             .stream("parentTopic", Consumed.with(Serdes.Integer(), Serdes.Integer()))
>             .selectKey(Integer::sum);  // To make parentStream a key-changing point
>         final KStream<Integer, Integer> childStream1 = parentStream.mapValues(v -> v + 1);
>         final KStream<Integer, Integer> childStream2 = parentStream.mapValues(v -> v + 2);
>         final KStream<Integer, Integer> childStream3 = parentStream.mapValues(v -> v + 3);
>         childStream1
>             .merge(childStream2)
>             .merge(childStream3)
>             .to("outputTopic");
>         final Properties properties = new Properties();
>         properties.setProperty(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
>         streamsBuilder.build(properties);
>     }
> }
> {code}
> h3. Expected result
> streamsBuilder.build should create a Topology without throwing an exception.  The 
> expected topology is:
> {code:java}
> Topologies:
>Sub-topology: 0
> Source: KSTREAM-SOURCE-00 (topics: [parentTopic])
>   --> KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-KEY-SELECT-01 (stores: [])
>   --> KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03, 
> KSTREAM-MAPVALUES-04
>   <-- KSTREAM-SOURCE-00
> Processor: KSTREAM-MAPVALUES-02 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-03 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-04 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MERGE-05 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03
> Processor: KSTREAM-MERGE-06 (stores: [])
>   --> KSTREAM-SINK-07
>   <-- KSTREAM-MERGE-05, KSTREAM-MAPVALUES-04
> Sink: KSTREAM-SINK-07 (topic: outputTopic)
>   <-- KSTREAM-MERGE-06
> {code}
> h3. Actual result
> NullPointerException was thrown with the following stacktrace.
> {code:java}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeUpdateKeyChangingRepartitionNodeMap(InternalStreamsBuilder.java:397)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeOptimizeRepartitionOperations(InternalStreamsBuilder.java:315)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybePerformOptimizations(InternalStreamsBuilder.java:304)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.buildAndOptimizeTopology(InternalStreamsBuilder.java:275)
>   at 
> org.apache.kafka.streams.StreamsBuilder.build(StreamsBuilder.java:558)
>   at Main.main(Main.java:24){code}
> h3. Cause
> This exception occurs in 
> InternalStreamsBuilder#maybeUpdateKeyChangingRepartitionNodeMap.
> {code:java}
> private void maybeUpdateKeyChangingRepartitionNodeMap() {
>     final Map<StreamsGraphNode, Set<StreamsGraphNode>> mergeNodesToKeyChangers = new HashMap<>();
>     for (final StreamsGraphNode mergeNode : mergeNodes) {
>         mergeNodesToKeyChangers.put(mergeNode, new LinkedHashSet<>());
>         final Collection<StreamsGraphNode> keys = 
> 

[jira] [Comment Edited] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891872#comment-16891872
 ] 

Chandrasekhar edited comment on KAFKA-8706 at 7/24/19 2:08 PM:
---

[~ckamal]: Thanks for the swift response and the references to known issues!

 

Reran the AUT again and found four failures. The following four are 
consistently failing. NOTE: Initially the build ran out of file descriptors, 
so I bumped up the open file descriptor limit to get around that issue before 
the first run.

Do these failures mean anything for actually running software? I.e., the failing tests:

testProduceConsumeViaAssign

testThrottledProducerConsumer

testMetricsDuringTopicCreateDelete

testControlPlaneRequest

 

[^KafkaAUTFailures07242019.txt]!KafkaUTFailures07242019.GIF!


was (Author: chandranc@oracle.com):
Reran the AUT again and found four failures. The following four are 
consistently failing. NOTE: Initially the build ran out of file descriptors, 
so I bumped up the open file descriptor limit to get around that issue before 
the first run.

Do these failures mean anything for actually running software? I.e., the failing tests:

testProduceConsumeViaAssign

testThrottledProducerConsumer

testMetricsDuringTopicCreateDelete

testControlPlaneRequest

 

[^KafkaAUTFailures07242019.txt]!KafkaUTFailures07242019.GIF!

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaUTFailures07242019.GIF
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (KAFKA-8705) NullPointerException was thrown by topology optimization when two MergeNodes have common KeyChaingingNode

2019-07-24 Thread Bill Bejeck (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck reassigned KAFKA-8705:
--

Assignee: Bill Bejeck

> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode
> -
>
> Key: KAFKA-8705
> URL: https://issues.apache.org/jira/browse/KAFKA-8705
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Hiroshi Nakahara
>Assignee: Bill Bejeck
>Priority: Major
>
> NullPointerException was thrown by topology optimization when two MergeNodes 
> have common KeyChaingingNode.
> Kafka Streams version: 2.3.0
> h3. Code
> {code:java}
> import org.apache.kafka.common.serialization.Serdes;
> import org.apache.kafka.streams.StreamsBuilder;
> import org.apache.kafka.streams.StreamsConfig;
> import org.apache.kafka.streams.kstream.Consumed;
> import org.apache.kafka.streams.kstream.KStream;
> import java.util.Properties;
> public class Main {
>     public static void main(String[] args) {
>         final StreamsBuilder streamsBuilder = new StreamsBuilder();
>         final KStream<Integer, Integer> parentStream = streamsBuilder
>             .stream("parentTopic", Consumed.with(Serdes.Integer(), Serdes.Integer()))
>             .selectKey(Integer::sum);  // To make parentStream a key-changing point
>         final KStream<Integer, Integer> childStream1 = parentStream.mapValues(v -> v + 1);
>         final KStream<Integer, Integer> childStream2 = parentStream.mapValues(v -> v + 2);
>         final KStream<Integer, Integer> childStream3 = parentStream.mapValues(v -> v + 3);
>         childStream1
>             .merge(childStream2)
>             .merge(childStream3)
>             .to("outputTopic");
>         final Properties properties = new Properties();
>         properties.setProperty(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
>         streamsBuilder.build(properties);
>     }
> }
> {code}
> h3. Expected result
> streamsBuilder.build should create a Topology without throwing an exception.  The 
> expected topology is:
> {code:java}
> Topologies:
>Sub-topology: 0
> Source: KSTREAM-SOURCE-00 (topics: [parentTopic])
>   --> KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-KEY-SELECT-01 (stores: [])
>   --> KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03, 
> KSTREAM-MAPVALUES-04
>   <-- KSTREAM-SOURCE-00
> Processor: KSTREAM-MAPVALUES-02 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-03 (stores: [])
>   --> KSTREAM-MERGE-05
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MAPVALUES-04 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-KEY-SELECT-01
> Processor: KSTREAM-MERGE-05 (stores: [])
>   --> KSTREAM-MERGE-06
>   <-- KSTREAM-MAPVALUES-02, KSTREAM-MAPVALUES-03
> Processor: KSTREAM-MERGE-06 (stores: [])
>   --> KSTREAM-SINK-07
>   <-- KSTREAM-MERGE-05, KSTREAM-MAPVALUES-04
> Sink: KSTREAM-SINK-07 (topic: outputTopic)
>   <-- KSTREAM-MERGE-06
> {code}
> h3. Actual result
> NullPointerException was thrown with the following stacktrace.
> {code:java}
> Exception in thread "main" java.lang.NullPointerException
>   at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeUpdateKeyChangingRepartitionNodeMap(InternalStreamsBuilder.java:397)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybeOptimizeRepartitionOperations(InternalStreamsBuilder.java:315)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.maybePerformOptimizations(InternalStreamsBuilder.java:304)
>   at 
> org.apache.kafka.streams.kstream.internals.InternalStreamsBuilder.buildAndOptimizeTopology(InternalStreamsBuilder.java:275)
>   at 
> org.apache.kafka.streams.StreamsBuilder.build(StreamsBuilder.java:558)
>   at Main.main(Main.java:24){code}
> h3. Cause
> This exception occurs in 
> InternalStreamsBuilder#maybeUpdateKeyChangingRepartitionNodeMap.
> {code:java}
> private void maybeUpdateKeyChangingRepartitionNodeMap() {
>     final Map<StreamsGraphNode, Set<StreamsGraphNode>> mergeNodesToKeyChangers = new HashMap<>();
>     for (final StreamsGraphNode mergeNode : mergeNodes) {
>         mergeNodesToKeyChangers.put(mergeNode, new LinkedHashSet<>());
>         final Collection<StreamsGraphNode> keys = keyChangingOperationsToOptimizableRepartitionNodes.keySet();
>         for (final StreamsGraphNode key : keys) 
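
The quoted method is cut off above, but the stack trace already pins down the 
mechanics: AbstractCollection.addAll iterates the collection it is handed, so 
passing it a null lookup result throws the NPE from inside addAll. A 
self-contained toy (toy types, not the Streams graph classes):

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// When two merge nodes share a key-changing parent, a map lookup can come
// back null; feeding that null into addAll() reproduces the reported frame
// java.util.AbstractCollection.addAll(AbstractCollection.java:343).
public class AddAllNpe {
    public static void main(String[] args) {
        final Map<String, Set<String>> keyChangersByMergeNode = new HashMap<>();
        keyChangersByMergeNode.put("merge-1", new HashSet<>());
        final Set<String> target = new HashSet<>();
        // "merge-2" was never registered, so get() returns null...
        target.addAll(keyChangersByMergeNode.get("merge-2")); // ...and addAll throws the NPE
    }
}
{code}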

[jira] [Commented] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891872#comment-16891872
 ] 

Chandrasekhar commented on KAFKA-8706:
--

Reran the AUT again and found four failures. The following four are 
consistently failing. NOTE: Initially the build ran out of file descriptors, 
so I bumped up the open file descriptor limit to get around that issue before 
the first run.

Do these failures mean anything for actually running software? I.e., the failing tests:

testProduceConsumeViaAssign

testThrottledProducerConsumer

testMetricsDuringTopicCreateDelete

testControlPlaneRequest

 

[^KafkaAUTFailures07242019.txt]!KafkaUTFailures07242019.GIF!

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaUTFailures07242019.GIF
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Attachment: KafkaUTFailures07242019.GIF

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt, 
> KafkaUTFailures07242019.GIF
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Chandrasekhar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar updated KAFKA-8706:
-
Attachment: KafkaAUTFailures07242019.txt

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt, KafkaAUTFailures07242019.txt
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KAFKA-8709) hard fail on "Unknown group metadata version"

2019-07-24 Thread Christian Becker (JIRA)
Christian Becker created KAFKA-8709:
---

 Summary: hard fail on "Unknown group metadata version"
 Key: KAFKA-8709
 URL: https://issues.apache.org/jira/browse/KAFKA-8709
 Project: Kafka
  Issue Type: Improvement
Reporter: Christian Becker


We attempted an update from 2.2 to 2.3, and then a rollback was done after 
{{inter.broker.protocol}} was changed. (We know this shouldn't be done, but it 
happened.)

After downgrading to 2.2 again, some {{__consumer-offsets}} partitions fail to 
load with the message {{Unknown group metadata version 3}}. Subsequently the 
broker continues its startup and the consumer groups won't exist, so the 
consumers start at their configured OLDEST or NEWEST position and begin 
committing their offsets.

However, on subsequent restarts of the brokers the {{Unknown group metadata 
version}} exception remains, and so the offset resets happen over and over 
again.

 

In order to prevent this, I'd suggest an updated flow when loading the offsets:
- the loading should continue reading the __consumer-offsets partition to see 
whether a subsequent, readable offset was committed
- if no "valid" offset can be found, throw the existing exception to let the 
operator know about the situation
- if a valid offset can be found, continue as normal

 

This would cause the following sequence of events:
1. corrupted offsets are written
2. broker restart
2a. broker loads offset partition
2b. {{KafkaException}} when loading the offset partition
2c. no "valid" offset is found after the "corrupt" record
2d. offsets reset
3. consumergroups are recreated and "valid" offsets are appended
4. broker restart
4a. broker loads offset partition
4b.  {{KafkaException}} when loading the offset partition
4c. "valid" offset is found after the "corrupted" ones
5. consumergroups still have their latest offset
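
Sketched as a self-contained toy (the real loader works on __consumer_offsets 
records; entries are simulated as strings here), the proposed flow would look 
like:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Keep scanning past unreadable entries; only give up if nothing is readable.
public class TolerantOffsetLoad {
    static int parse(final String entry) {
        if (entry.startsWith("corrupt")) {
            throw new RuntimeException("Unknown group metadata version 3");
        }
        return Integer.parseInt(entry.substring("offset=".length()));
    }

    public static void main(String[] args) {
        final List<String> partition = Arrays.asList("corrupt-v3", "corrupt-v3", "offset=42");
        final List<Integer> recovered = new ArrayList<>();
        int unreadable = 0;
        for (final String entry : partition) {
            try {
                recovered.add(parse(entry));
            } catch (final RuntimeException e) {
                unreadable++; // remember the failure but keep reading
            }
        }
        if (recovered.isEmpty() && unreadable > 0) {
            throw new IllegalStateException("no readable offsets found"); // operator must act
        }
        System.out.println("recovered offsets: " + recovered + " (skipped " + unreadable + ")");
    }
}
{code}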

It's a special case that this happened after human error, but it also poses a 
danger for situations where the offsets might be corrupted for some unrelated 
reason. Losing the offsets is a very serious situation and there should be 
safeguards against it, especially when some offsets might be recoverable. With 
this improvement, the offsets would still be lost once, but the broker would be 
able to recover automatically over time, and after compaction the corrupted 
records would be gone. (In our case this caused serious confusion, as we lost 
the offsets multiple times: the error message {{Error loading offsets from}} 
implies that the corrupted data is deleted and the situation therefore 
recovered, whereas in reality it continues to be an issue until the corrupt 
data is gone once and for all, which might take a long time.)

In our case we managed to evict the broken records by temporarily setting 
the segment time to a very low value and deactivating compaction:
{code:java}
/opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=90 --topic 
__consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=delete --topic 
__consumer_offsets --zookeeper localhost:2181
< wait for the cleaner to clean up >
/opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=60480 --topic 
__consumer_offsets --zookeeper localhost:2181
/opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=compact --topic 
__consumer_offsets --zookeeper localhost:2181{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KAFKA-8709) improve consumer offsets resiliency

2019-07-24 Thread Christian Becker (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Becker updated KAFKA-8709:

Summary: improve consumer offsets resiliency  (was: hard fail on "Unknown 
group metadata version")

> improve consumer offsets resiliency
> ---
>
> Key: KAFKA-8709
> URL: https://issues.apache.org/jira/browse/KAFKA-8709
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Christian Becker
>Priority: Major
>
> We attempted an update from 2.2 to 2.3, and then a rollback was done 
> after {{inter.broker.protocol}} was changed. (We know this shouldn't be done, 
> but it happened.)
> After downgrading to 2.2 again, some {{__consumer-offsets}} partitions fail 
> to load with the message {{Unknown group metadata version 3}}. Subsequently 
> the broker continues its startup and the consumer groups won't exist, so the 
> consumers start at their configured OLDEST or NEWEST position and 
> begin committing their offsets.
> However, on subsequent restarts of the brokers the {{Unknown group metadata 
> version}} exception remains, and so the offset resets happen over and over 
> again.
>  
> In order to prevent this, I'd suggest an updated flow when loading the offsets:
> - the loading should continue reading the __consumer-offsets partition to see 
> whether a subsequent, readable offset was committed
> - if no "valid" offset can be found, throw the existing exception to let 
> the operator know about the situation
> - if a valid offset can be found, continue as normal
>  
> This would cause the following sequence of events:
> 1. corrupted offsets are written
> 2. broker restart
> 2a. broker loads offset partition
> 2b. {{KafkaException}} when loading the offset partition
> 2c. no "valid" offset is found after the "corrupt" record
> 2d. offsets reset
> 3. consumergroups are recreated and "valid" offsets are appended
> 4. broker restart
> 4a. broker loads offset partition
> 4b.  {{KafkaException}} when loading the offset partition
> 4c. "valid" offset is found after the "corrupted" ones
> 5. consumergroups still have their latest offset
> It's a special case that this happened after human error, but it 
> also poses a danger for situations where the offsets might be corrupted for 
> some unrelated reason. Losing the offsets is a very serious situation and 
> there should be safeguards against it, especially when some offsets might be 
> recoverable. With this improvement, the offsets would still be lost once, but 
> the broker would be able to recover automatically over time, and after 
> compaction the corrupted records would be gone. (In our case this caused 
> serious confusion, as we lost the offsets multiple times: the error message 
> {{Error loading offsets from}} implies that the corrupted data is deleted 
> and the situation therefore recovered, whereas in reality it continues 
> to be an issue until the corrupt data is gone once and for all, which might 
> take a long time.)
> In our case we managed to evict the broken records by temporarily 
> setting the segment time to a very low value and deactivating compaction:
> {code:java}
> /opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=90 --topic 
> __consumer_offsets --zookeeper localhost:2181
> /opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=delete --topic 
> __consumer_offsets --zookeeper localhost:2181
> < wait for the cleaner to clean up >
> /opt/kafka/bin/kafka-topics.sh --alter --config segment.ms=60480 --topic 
> __consumer_offsets --zookeeper localhost:2181
> /opt/kafka/bin/kafka-topics.sh --alter --config cleanup.policy=compact 
> --topic __consumer_offsets --zookeeper localhost:2181{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (KAFKA-7213) NullPointerException during state restoration in kafka streams

2019-07-24 Thread Abhishek Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Agarwal reassigned KAFKA-7213:
---

Assignee: (was: Abhishek Agarwal)

> NullPointerException during state restoration in kafka streams
> --
>
> Key: KAFKA-7213
> URL: https://issues.apache.org/jira/browse/KAFKA-7213
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Abhishek Agarwal
>Priority: Major
>
> I had written a custom state store which has a batch restoration callback 
> registered. What I have observed is that when multiple consumer instances are 
> restarted, the application keeps failing with a NullPointerException. The stack 
> trace is: 
> {noformat}
> java.lang.NullPointerException: null
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.putAll(RocksDBStore.java:351)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSlotKeyValueBytesStore.putAll(RocksDBSlotKeyValueBytesStore.java:100)
>  ~[streams-core-1.0.0.297.jar:?]
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSlotKeyValueBytesStore$SlotKeyValueBatchRestoreCallback.restoreAll(RocksDBSlotKeyValueBytesStore.java:303)
>  ~[streams-core-1.0.0.297.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreAll(CompositeRestoreListener.java:89)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:75)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:277)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restorePartition(StoreChangelogReader.java:238)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:83)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:263)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:803)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
>  ~[kafka-streams-1.0.0.jar:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
>  ~[kafka-streams-1.0.0.jar:?]
> {noformat}
> The faulty line in question is 
> {noformat}
> db.write(wOptions, batch);
> {noformat}
> in RocksDBStore.java, which would mean the db variable is null. Probably the 
> store has been closed while restoration is still being done on it. After going 
> through the code, I think the problem occurs when the state transitions from 
> PARTITIONS_ASSIGNED to PARTITIONS_REVOKED while restoration is still in 
> progress. 
> In such a state transition, while the active tasks themselves are closed, the 
> changelog reader is not reset. It tries to restore tasks that have 
> already been closed; db is null, and this results in the NPE. 
> I will put in a fix to see if that fixes the issue. 
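
A toy sketch of the kind of open-check guard that would turn the bare NPE into 
a descriptive failure (simplified types, not the actual RocksDBStore):

{code:java}
// If the store was closed while restoration is still running, fail with a
// message that names the real problem instead of dereferencing a null handle.
public class GuardedStore {
    private Object db; // stands in for the RocksDB handle; null once closed

    void putAll(final byte[] batch) {
        if (db == null) {
            throw new IllegalStateException("Store is closed; restoration attempted on a closed task");
        }
        // db.write(wOptions, batch) would go here
    }

    public static void main(String[] args) {
        new GuardedStore().putAll(new byte[0]); // descriptive error instead of an NPE
    }
}
{code}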



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KAFKA-8608) Broker shows WARN on reassignment partitions on new brokers: Replica LEO, follower position & Cache truncation

2019-07-24 Thread Di Campo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891715#comment-16891715
 ] 

Di Campo commented on KAFKA-8608:
-

Hi Lillian,

I'm afraid I can't. If it happens again, what context do you need? Full broker 
logs for that time window? 

> Broker shows WARN on reassignment partitions on new brokers: Replica LEO, 
> follower position & Cache truncation
> --
>
> Key: KAFKA-8608
> URL: https://issues.apache.org/jira/browse/KAFKA-8608
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.1.1
> Environment: Kafka 2.1.1
>Reporter: Di Campo
>Priority: Minor
>  Labels: broker, reassign, repartition
>
> I added two brokers (brokerId 4,5) to a 3-node (brokerId 1,2,3) cluster with 
> 32 topics of 64 partitions each, replication factor 3.
> I then ran a partition reassignment. 
> On each run, I can see the following WARN messages, but when the partition 
> reassignment process finishes, it all seems OK. ISR is OK (count is 3 for 
> every partition).
> I get the following message types, one per partition:
>  
> {code:java}
> [2019-06-27 12:42:03,946] WARN [LeaderEpochCache visitors-0.0.1-10] New epoch 
> entry EpochEntry(epoch=24, startOffset=51540) caused truncation of 
> conflicting entries ListBuffer(EpochEntry(epoch=22, startOffset=51540)). 
> Cache now contains 5 entries. (kafka.server.epoch.LeaderEpochFileCache) {code}
> -> This relates to cache, so I suppose it's pretty safe.
> {code:java}
> [2019-06-27 12:42:04,250] WARN [ReplicaManager broker=1] Leader 1 failed to 
> record follower 3's position 47981 since the replica is not recognized to be 
> one of the assigned replicas 1,2,5 for partition visitors-0.0.1-28. Empty 
> records will be returned for this partition. 
> (kafka.server.ReplicaManager){code}
> -> This is scary. I'm not sure about the severity of this, but it looks like 
> it may be missing records? 
> {code:java}
> [2019-06-27 12:42:03,709] WARN [ReplicaManager broker=1] While recording the 
> replica LEO, the partition visitors-0.0.1-58 hasn't been created. 
> (kafka.server.ReplicaManager){code}
> -> Here, these partitions are created. 
> First of all: am I supposed to be losing data here? I assume not, and so far 
> I see no trace of losing anything.
> If not, I'm not sure what these messages are trying to say here. Should they 
> really be at WARN level? If so, should the messages clarify the 
> different risks involved? 
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KAFKA-8612) Broker removes consumers from CG, Streams app gets stuck

2019-07-24 Thread Di Campo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891712#comment-16891712
 ] 

Di Campo commented on KAFKA-8612:
-

I've upgraded to 2.3 in the testing environment, and for a week it hasn't happened. 
This wasn't something that happened often either; I'd say maybe once every 2 
weeks. Will update with results.

> Broker removes consumers from CG, Streams app gets stuck
> 
>
> Key: KAFKA-8612
> URL: https://issues.apache.org/jira/browse/KAFKA-8612
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, streams
>Affects Versions: 2.1.1
>Reporter: Di Campo
>Priority: Major
>  Labels: broker, streams, timeout
> Attachments: full-thread-dump-kafka-streams-stuck.log
>
>
> Cluster of 5 brokers, `Kafka 2.1.1`. m5.large (2 CPU, 8GB RAM) instances. 
> Kafka Streams application (`stream-processor`) cluster of 3 instances, 2 
> threads each. `2.1.0` 
> A store consumer group (ClickHouse Kafka Engine from `ClickHouse 
> 19.5.3.8`), with several tables, each consuming from a different topic.
> The `stream-processor` consumes from a source topic and runs a 
> topology of 26 topics (64 partitions each) with 5 state stores, 1 of them 
> a session store, 4 key-value.
> Infra running on docker on AWS ECS. 
> Consuming at a rate of 300-1000 events per second. Each event generates an 
> avg of ~20 extra messages.
> Application has uncaughtExceptionHandler set.
> Timestamps are kept for better analysis.
> `stream-processor` tasks at some point fail to produce to any partition due 
> to timeouts:
> 
> {noformat}
> [2019-06-28 10:04:21,113] ERROR task [1_48] Error sending record (...) to 
> topic (...) due to org.apache.kafka.common.errors.TimeoutException: Expiring 
> 44 record(s) for (...)-48:120002 ms has passed since batch creation; No more 
> records will be sent and no more offsets will be recorded for this task.
> {noformat}
> and "Offset commit failed" errors, in all partitions:
> {noformat}
> [2019-06-28 10:04:27,705] ERROR [Consumer 
> clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-1-consumer,
>  groupId=stream-processor-0.0.1] Offset commit failed on partition 
> events-raw-63 at offset 4858803: The request timed out. 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> {noformat}
> _At this point we begin seeing error messages in one of the brokers (see 
> below, Broker logs section)._
> More error messages are shown on the `stream-processor`: 
> {noformat}
> org.apache.kafka.common.errors.TimeoutException: Timeout of 6ms 
> expired before successfully committing offsets 
> {(topic)=OffsetAndMetadata{offset=4858803, leaderEpoch=null, metadata=''}}
> {noformat}
> then hundreds of messages of the following types (one per topic-partition), 
> intertwined: 
> {noformat}
> [2019-06-28 10:05:23,608] WARN [Producer 
> clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
>  Got error produce response with correlation id 39946 on topic-partition 
> (topic)-63, retrying (2 attempts left). Error: NETWORK_EXCEPTION 
> (org.apache.kafka.clients.producer.internals.Sender)
> {noformat}
> {noformat}
> [2019-06-28 10:05:23,609] WARN [Producer 
> clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
>  Received invalid metadata error in produce request on partition (topic)1-59 
> due to org.apache.kafka.common.errors.NetworkException: The server 
> disconnected before a response was received.. Going to request metadata 
> update now (org.apache.kafka.clients.producer.internals.Sender)
> {noformat}
> And then: 
> {noformat}
> [2019-06-28 10:05:47,986] ERROR stream-thread 
> [stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-4] 
> Failed to commit stream task 1_57 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
> 2019-06-28 10:05:47org.apache.kafka.streams.errors.StreamsException: task 
> [1_57] Abort sending since an error caught with a previous record (...) to 
> topic (...) due to org.apache.kafka.common.errors.NetworkException: The 
> server disconnected before a response was received.
> 2019-06-28 10:05:47You can increase producer parameter `retries` and 
> `retry.backoff.ms` to avoid this error.
> 2019-06-28 10:05:47 at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:133)
> {noformat}
> ...and eventually we get to the error messages: 
> {noformat}
> [2019-06-28 10:05:51,198] ERROR [Producer 
> clientId=stream-processor-0.0.1-084f2b82-849a-42b5-a787-f900bbfcb545-StreamThread-3-producer]
>  Uncaught error in kafka producer I/O thread: 
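
The log's own suggestion can be applied from a Streams app via the producer 
prefix; a minimal sketch (values illustrative, not recommendations):

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

// Override the embedded producer's retry behavior through StreamsConfig.
public class RetryConfig {
    public static void main(String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "stream-processor-0.0.1");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
        props.put(StreamsConfig.producerPrefix("retries"), 10);
        props.put(StreamsConfig.producerPrefix("retry.backoff.ms"), 1000);
        System.out.println(props); // pass these to the KafkaStreams constructor
    }
}
{code}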

[jira] [Created] (KAFKA-8707) Zookeeper Session expired either before or while waiting for connection

2019-07-24 Thread Chethan Bheemaiah (JIRA)
Chethan Bheemaiah created KAFKA-8707:


 Summary: Zookeeper Session expired either before or while waiting 
for connection
 Key: KAFKA-8707
 URL: https://issues.apache.org/jira/browse/KAFKA-8707
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Affects Versions: 2.0.1
Reporter: Chethan Bheemaiah


Recently we encountered an issue in one of our Kafka clusters. One of the 
nodes went down and was not rejoining the cluster on restart. We observed 
"Session expired" error messages in server.log.

Below is one such message:
ERROR kafka.common.ZkNodeChangeNotificationListener: Error while processing 
notification change for path = /config/changes
kafka.zookeeper.ZooKeeperClientExpiredException: Session expired either before 
or while waiting for connection
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:238)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at 
kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:220)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at 
kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:219)
at 
kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1510)
at 
kafka.zk.KafkaZkClient.kafka$zk$KafkaZkClient$$retryRequestUntilConnected(KafkaZkClient.scala:1486)
at kafka.zk.KafkaZkClient.getChildren(KafkaZkClient.scala:585)
at 
kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:82)
at 
kafka.common.ZkNodeChangeNotificationListener$ChangeNotification.process(ZkNodeChangeNotificationListener.scala:119)
at 
kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread.doWork(ZkNodeChangeNotificationListener.scala:145)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
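
This won't fix a genuinely dead session, but the usual first knob when sessions 
expire during GC pauses or slow restarts is the broker's ZooKeeper session 
timeout (values illustrative):

{code}
# server.properties
zookeeper.session.timeout.ms=18000
zookeeper.connection.timeout.ms=18000
{code}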



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KAFKA-8708) Zookeeper Session expired either before or while waiting for connection

2019-07-24 Thread Chethan Bheemaiah (JIRA)
Chethan Bheemaiah created KAFKA-8708:


 Summary: Zookeeper Session expired either before or while waiting 
for connection
 Key: KAFKA-8708
 URL: https://issues.apache.org/jira/browse/KAFKA-8708
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Affects Versions: 2.0.1
Reporter: Chethan Bheemaiah


Recently we encountered an issue in one of our Kafka clusters. One of the 
nodes went down and was not rejoining the cluster on restart. We observed 
"Session expired" error messages in server.log.

Below is one such message:
ERROR kafka.common.ZkNodeChangeNotificationListener: Error while processing 
notification change for path = /config/changes
kafka.zookeeper.ZooKeeperClientExpiredException: Session expired either before 
or while waiting for connection
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:238)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at 
kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:220)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
at 
kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at 
kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:219)
at 
kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1510)
at 
kafka.zk.KafkaZkClient.kafka$zk$KafkaZkClient$$retryRequestUntilConnected(KafkaZkClient.scala:1486)
at kafka.zk.KafkaZkClient.getChildren(KafkaZkClient.scala:585)
at 
kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:82)
at 
kafka.common.ZkNodeChangeNotificationListener$ChangeNotification.process(ZkNodeChangeNotificationListener.scala:119)
at 
kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread.doWork(ZkNodeChangeNotificationListener.scala:145)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KAFKA-8706) Kafka 2.3.0 Unit Test Failures on Oracle Linux - Need help debugging framework or issue.

2019-07-24 Thread Kamal Chandraprakash (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891648#comment-16891648
 ] 

Kamal Chandraprakash commented on KAFKA-8706:
-

Kafka has some transient test case failures. You can file JIRAs for the remaining test 
case failures:
 # DescribeConsumerGroupTest.testDescribeOffsetsOfExistingGroupWithNoMembers - 
https://issues.apache.org/jira/browse/KAFKA-7969
 # SaslSslAdminClientIntegrationTest. 
testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords - None Filed
 # UserQuotaTest.testQuotaOverrideDelete - 
https://issues.apache.org/jira/browse/KAFKA-8032
 # UserQuotaTest.testThrottledProducerConsumer - 
https://issues.apache.org/jira/browse/KAFKA-8073
 # MetricsDuringTopicCreationDeletionTest.testMetricsDuringTopicCreateDelete - 
None Filed
 # SocketServerTest.testControlPlaneRequest - None Filed
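
Assuming the standard Gradle test filtering the Kafka build supports, each 
suspect can be re-run in isolation to separate flaky tests from real breakage, 
e.g.:

{code}
./gradlew core:test --tests "kafka.network.SocketServerTest"
./gradlew core:test --tests "kafka.api.UserQuotaTest"
{code}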

> Kafka 2.3.0 Unit Test Failures on Oracle Linux  - Need help debugging 
> framework or issue.
> -
>
> Key: KAFKA-8706
> URL: https://issues.apache.org/jira/browse/KAFKA-8706
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Chandrasekhar
>Priority: Minor
> Attachments: KafkaAUTFailures.txt
>
>
> Hi
> We have just imported the Kafka 2.3.0 source code from the git repo and are compiling 
> it with Gradle 4.7 on an Oracle VM with the following info:
> [vagrant@localhost kafka-2.3.0]$ uname -a
>  Linux localhost 4.1.12-112.14.1.el7uek.x86_64 #2 SMP Fri Dec 8 18:37:23 PST 
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>  [vagrant@localhost kafka-2.3.0]$
>  
> Upon compiling (#gradle build), there are 6 test failures at the end. The failed 
> tests are reported as follows:
> DescribeConsumerGroupTest. testDescribeOffsetsOfExistingGroupWithNoMembers 
>  SaslSslAdminClientIntegrationTest. 
> testReplicaCanFetchFromLogStartOffsetAfterDeleteRecords 
>  UserQuotaTest. testQuotaOverrideDelete 
>  UserQuotaTest. testThrottledProducerConsumer 
>  MetricsDuringTopicCreationDeletionTest. testMetricsDuringTopicCreateDelete 
>  SocketServerTest. testControlPlaneRequest
> Attached are the failures.
>  
> [^KafkaAUTFailures.txt]
>  
>  
>  We would like to know if we are missing anything in our build environment or 
> if these are known test failures in Kafka 2.3.0.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KAFKA-8697) Kafka consumer group auto removal

2019-07-24 Thread Kamal Chandraprakash (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891642#comment-16891642
 ] 

Kamal Chandraprakash commented on KAFKA-8697:
-

[~progovoy]

[KIP-496|https://cwiki.apache.org/confluence/display/KAFKA/KIP-496%3A+Administrative+API+to+delete+consumer+offsets]
 was proposed today; it adds an option to the AdminClient to delete inactive 
consumer group offsets.
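
Until that lands, an already-empty group can be removed explicitly with the 
existing admin API; a minimal sketch (broker address assumed):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

// Deletes the idle group from the description so monitoring stops alerting.
public class DeleteGroup {
    public static void main(String[] args) throws Exception {
        final Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            admin.deleteConsumerGroups(Collections.singletonList("ABC")).all().get();
        }
    }
}
{code}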

> Kafka consumer group auto removal
> -
>
> Key: KAFKA-8697
> URL: https://issues.apache.org/jira/browse/KAFKA-8697
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Pavel Rogovoy
>Priority: Major
>
> Hello everyone,
> I'm new to Kafka so please be gentle with me :)
> Current issue:
> Let's say I have a consumer that consumes messages as part of a consumer group 
> named 'ABC' and decides to terminate. Consumer group 'ABC' will now stay there 
> hanging with zero consumers. This situation will cause monitoring tools like 
> Burrow to alert on lag for this consumer group even though my application 
> has finished its job, doesn't want to do anything, and thus is not actually lagging. 
>  
> I think it would be useful to add an option to create a consumer group 
> that is automatically removed when the last consumer has terminated 
> properly and did not crash.
>  
> Please tell me what you think?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)