[jira] [Created] (KAFKA-7566) Add sidecar job to leader (or a random single follower) only

2018-10-29 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7566:
--

 Summary: Add sidecar job to leader (or a random single follower) 
only
 Key: KAFKA-7566
 URL: https://issues.apache.org/jira/browse/KAFKA-7566
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Hey there,

recently we need to add an archive job to a streaming application. The caveat 
is that we need to make sure only one instance is doing this task to avoid 
potential race condition, and we also don't want to schedule it as a regular 
stream task so that we will be blocking normal streaming operation. 

Although we could do so by doing a zk lease, I'm raising the case here since 
this could be some potential use case for streaming job also. For example, 
there are some `leader specific` operation we could schedule in DSL instead of 
adhoc manner.

Let me know if you think this makes sense to you, thank you!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7641) Add `group.max.size` to cap group metadata on broker

2018-11-16 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7641:
--

 Summary: Add `group.max.size` to cap group metadata on broker
 Key: KAFKA-7641
 URL: https://issues.apache.org/jira/browse/KAFKA-7641
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7728) Add JoinReason to the join group request for better rebalance handling

2018-12-12 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7728:
--

 Summary: Add JoinReason to the join group request for better 
rebalance handling
 Key: KAFKA-7728
 URL: https://issues.apache.org/jira/browse/KAFKA-7728
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Recently [~mgharat] and I discussed about the current rebalance logic on leader 
join group request handling. So far we blindly trigger rebalance when the 
leader rejoins. The caveat is that KIP-345 is not covering this effort and if a 
consumer group is not using sticky assignment but using other strategy like 
round robin, the redundant rebalance could still shuffle the topic partitions 
around consumers. (for example mirror maker application)

I checked on broker side and here is what we currently do:

 
{code:java}
if (group.isLeader(memberId) || !member.matches(protocols))  
// force a rebalance if a member has changed metadata or if the leader sends 
JoinGroup. 
// The latter allows the leader to trigger rebalances for changes affecting 
assignment 
// which do not affect the member metadata (such as topic metadata changes for 
the consumer) {code}
Based on the broker logic, we only need to trigger rebalance for leader rejoin 
when the topic metadata change has happened. I also looked up the 
ConsumerCoordinator code on client side, and found out the metadata monitoring 
logic here:
{code:java}
public boolean rejoinNeededOrPending() {
...
// we need to rejoin if we performed the assignment and metadata has changed
if (assignmentSnapshot != null && !assignmentSnapshot.equals(metadataSnapshot))
  return true;
}{code}
 I guess instead of just returning true, we could introduce a new enum field 
called JoinReason which could indicate the purpose of the rejoin. Thus we don't 
need to do a full rebalance when the leader is just in rolling bounce.

We could utilize this information I guess. Just add another enum field into the 
join group request called JoinReason so that we know whether leader is 
rejoining due to topic metadata change. If yes, we trigger rebalance obviously; 
if no, we shouldn't trigger rebalance.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-4835) Allow users control over repartitioning

2019-01-04 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-4835.

Resolution: Fixed

> Allow users control over repartitioning
> ---
>
> Key: KAFKA-4835
> URL: https://issues.apache.org/jira/browse/KAFKA-4835
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Michal Borowiecki
>Priority: Major
>  Labels: needs-kip
>
> From 
> https://issues.apache.org/jira/browse/KAFKA-4601?focusedCommentId=15881030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15881030
> ...it would be good to provide users more control over the repartitioning. 
> My use case is as follows (unrelated bits omitted for brevity):
> {code}
>   KTable loggedInCustomers = builder
>   .stream("customerLogins")
>   .groupBy((key, activity) -> 
>   activity.getCustomerRef())
>   .reduce((first,second) -> second, loginStore());
>   
>   builder
>   .stream("balanceUpdates")
>   .map((key, activity) -> new KeyValue<>(
>   activity.getCustomerRef(),
>   activity))
>   .join(loggedInCustomers, (activity, session) -> ...
>   .to("sessions");
> {code}
> Both "groupBy" and "map" in the underlying implementation set the 
> repartitionRequired flag (since the key changes), and the aggregation/join 
> that follows will create the repartitioned topic.
> However, in our case I know that both input streams are already partitioned 
> by the customerRef value, which I'm mapping into the key (because it's 
> required by the join operation).
> So there are 2 unnecessary intermediate topics created with their associated 
> overhead, while the ultimate goal is simply to do a join on a value that we 
> already use to partition the original streams anyway.
> (Note, we don't have the option to re-implement the original input streams to 
> make customerRef the message key.)
> I think it would be better to allow the user to decide (from their knowledge 
> of the incoming streams) whether a repartition is mandatory on aggregation 
> and join operations (overloaded version of the methods with the 
> repartitionRequired flag exposed maybe?)
> An alternative would be to allow users to perform a join on a value other 
> than the key (a keyValueMapper parameter to join, like the one used for joins 
> with global tables), but I expect that to be more involved and error-prone to 
> use for people who don't understand the partitioning requirements well 
> (whereas it's safe for global tables).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (KAFKA-4835) Allow users control over repartitioning

2019-01-05 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reopened KAFKA-4835:


> Allow users control over repartitioning
> ---
>
> Key: KAFKA-4835
> URL: https://issues.apache.org/jira/browse/KAFKA-4835
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Michal Borowiecki
>Priority: Major
>  Labels: needs-kip
>
> From 
> https://issues.apache.org/jira/browse/KAFKA-4601?focusedCommentId=15881030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15881030
> ...it would be good to provide users more control over the repartitioning. 
> My use case is as follows (unrelated bits omitted for brevity):
> {code}
>   KTable loggedInCustomers = builder
>   .stream("customerLogins")
>   .groupBy((key, activity) -> 
>   activity.getCustomerRef())
>   .reduce((first,second) -> second, loginStore());
>   
>   builder
>   .stream("balanceUpdates")
>   .map((key, activity) -> new KeyValue<>(
>   activity.getCustomerRef(),
>   activity))
>   .join(loggedInCustomers, (activity, session) -> ...
>   .to("sessions");
> {code}
> Both "groupBy" and "map" in the underlying implementation set the 
> repartitionRequired flag (since the key changes), and the aggregation/join 
> that follows will create the repartitioned topic.
> However, in our case I know that both input streams are already partitioned 
> by the customerRef value, which I'm mapping into the key (because it's 
> required by the join operation).
> So there are 2 unnecessary intermediate topics created with their associated 
> overhead, while the ultimate goal is simply to do a join on a value that we 
> already use to partition the original streams anyway.
> (Note, we don't have the option to re-implement the original input streams to 
> make customerRef the message key.)
> I think it would be better to allow the user to decide (from their knowledge 
> of the incoming streams) whether a repartition is mandatory on aggregation 
> and join operations (overloaded version of the methods with the 
> repartitionRequired flag exposed maybe?)
> An alternative would be to allow users to perform a join on a value other 
> than the key (a keyValueMapper parameter to join, like the one used for joins 
> with global tables), but I expect that to be more involved and error-prone to 
> use for people who don't understand the partitioning requirements well 
> (whereas it's safe for global tables).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7816) Windowed topic should have window size as part of the metadata

2019-01-12 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7816:
--

 Summary: Windowed topic should have window size as part of the 
metadata
 Key: KAFKA-7816
 URL: https://issues.apache.org/jira/browse/KAFKA-7816
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Reporter: Boyang Chen
Assignee: Boyang Chen


Currently the Kafka window store topics require a windowed serde to properly 
deserialize the records. One of the required config is `window.size.ms`, which 
indicates the diff between (window.end - window.start). For space efficiency, 
KStream only stores the windowed record with window start time, because as long 
as the restore consumer knows size of the window, it would properly derive the 
window end time by adding window.size.ms to window start time.

However, this makes the reuse of window topic very hard because another user 
has to config the correct window size in order to deserialize the data. When we 
extract the customized consumer as a template, every time new user has to 
define their own window size. If we do wild-card matching consumer, things 
could be even worse to work because different topics may have different window 
size and user has to read through the application code to find that info.

To make the decoding of window topic easier, we are proposing to add a new 
config to TopicMetadata called `windowSize` which could be used for 
applications to properly deserialize the data without requirement to config a 
window size. This could also make client side serde API easier. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7824) Require member.id for initial join group request

2019-01-15 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7824:
--

 Summary: Require member.id for initial join group request
 Key: KAFKA-7824
 URL: https://issues.apache.org/jira/browse/KAFKA-7824
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Boyang Chen
Assignee: Boyang Chen


For request with unknown member id, broker will blindly accept the new join 
group request, store the member metadata and return a UUID to consumer. The 
edge case is that if initial join group request keeps failing due to connection 
timeout, or the consumer keeps restarting, or the max.poll.interval.ms 
configured on client is set to infinite (no rebalance timeout kicking in to 
clean up the member metadata map), there will be accumulated MemberMetadata 
info within group metadata cache which will eventually burst broker memory. The 
detection and fencing of invalid join group request is crucial for broker 
stability.

 

The proposed solution is to require one more bounce for the consumer to use a 
valid member.id to join the group. Details in this 
[KIP|https://cwiki.apache.org/confluence/display/KAFKA/KIP-394%3A+Require+member.id+for+initial+join+group+request]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7853) Refactor ConsumerCoordinator/AbstractCoordinator to reduce constructor parameter list

2019-01-22 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7853:
--

 Summary: Refactor ConsumerCoordinator/AbstractCoordinator to 
reduce constructor parameter list
 Key: KAFKA-7853
 URL: https://issues.apache.org/jira/browse/KAFKA-7853
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


The parameter lists for class ConsumerCoordinator/AbstractCoordinator are 
growing over time. We should think of reducing the parameter size by 
introducing some intermediate data structs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7857) Add AbstractCoordinatorConfig class to consolidate consumer coordinator configs between Consumer and Connect

2019-01-22 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7857:
--

 Summary: Add AbstractCoordinatorConfig class to consolidate 
consumer coordinator configs between Consumer and Connect
 Key: KAFKA-7857
 URL: https://issues.apache.org/jira/browse/KAFKA-7857
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen


Right now there are a lot of duplicate configuration concerning client 
coordinator shared across ConsumerConfig and DistributedConfig (connect 
config). It makes sense to extract all coordinator related configs into a 
separate config class to reduce code redundancy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7858) Replace JoinGroup request/response with automated protocol

2019-01-22 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7858:
--

 Summary: Replace JoinGroup request/response with automated protocol
 Key: KAFKA-7858
 URL: https://issues.apache.org/jira/browse/KAFKA-7858
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7859) Replace LeaveGroup request/response with automated protocol

2019-01-22 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7859:
--

 Summary: Replace LeaveGroup request/response with automated 
protocol
 Key: KAFKA-7859
 URL: https://issues.apache.org/jira/browse/KAFKA-7859
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7862) Modify JoinGroup logic to incorporate group.instance.id change

2019-01-23 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7862:
--

 Summary: Modify JoinGroup logic to incorporate group.instance.id 
change
 Key: KAFKA-7862
 URL: https://issues.apache.org/jira/browse/KAFKA-7862
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


The step one for KIP-345 join group logic change to corporate with static 
membership.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6995) Make config "internal.leave.group.on.close" public

2019-01-23 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-6995.

Resolution: Won't Fix

We are planning to deprecate this config with the introduction of KIP-345. 

> Make config "internal.leave.group.on.close" public
> --
>
> Key: KAFKA-6995
> URL: https://issues.apache.org/jira/browse/KAFKA-6995
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer, streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>  Labels: needs-kip
>
> We are proposing to make the config "internal.leave.group.on.close" public. 
> The reason is that for heavy state application the sticky assignment won't 
> work because each stream worker will leave group during rolling restart, and 
> there is a possibility that some members are left and rejoined while others 
> are still awaiting restart. This would then cause multiple rebalance because 
> after the ongoing rebalance is done, we are expecting late members to rejoin 
> and move state from `stable` to `prepareBalance`. To solve this problem, 
> heavy state application needs to use this config to avoid member list update, 
> so that at most one rebalance will be triggered at a proper time when all the 
> members are rejoined during rolling restart. This should just be one line 
> change.
> Code here:
> * internal.leave.group.on.close
>  * Whether or not the consumer should leave the group on close. If set to 
> false then a rebalance
>  * won't occur until session.timeout.ms expires.
>  *
>  * 
>  * Note: this is an internal configuration and could be changed in the future 
> in a backward incompatible way
>  *
>  */
>  static final String LEAVE_GROUP_ON_CLOSE_CONFIG = 
> "internal.leave.group.on.close";



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7859) Replace LeaveGroup request/response with automated protocol

2019-01-31 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7859.

Resolution: Fixed

> Replace LeaveGroup request/response with automated protocol
> ---
>
> Key: KAFKA-7859
> URL: https://issues.apache.org/jira/browse/KAFKA-7859
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7816) Windowed topic should have window size as part of the metadata

2019-01-31 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7816.

Resolution: Won't Fix

Not a real use case

> Windowed topic should have window size as part of the metadata
> --
>
> Key: KAFKA-7816
> URL: https://issues.apache.org/jira/browse/KAFKA-7816
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer, streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Currently the Kafka window store topics require a windowed serde to properly 
> deserialize the records. One of the required config is `window.size.ms`, 
> which indicates the diff between (window.end - window.start). For space 
> efficiency, KStream only stores the windowed record with window start time, 
> because as long as the restore consumer knows size of the window, it would 
> properly derive the window end time by adding window.size.ms to window start 
> time.
> However, this makes the reuse of window topic very hard because another user 
> has to config the correct window size in order to deserialize the data. When 
> we extract the customized consumer as a template, every time new user has to 
> define their own window size. If we do wild-card matching consumer, things 
> could be even worse to work because different topics may have different 
> window size and user has to read through the application code to find that 
> info.
> To make the decoding of window topic easier, we are proposing to add a new 
> config to TopicMetadata called `windowSize` which could be used for 
> applications to properly deserialize the data without requirement to config a 
> window size. This could also make client side serde API easier. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7899) Command line tool to invalidate group metadata for clean assignment

2019-02-05 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7899:
--

 Summary: Command line tool to invalidate group metadata for clean 
assignment
 Key: KAFKA-7899
 URL: https://issues.apache.org/jira/browse/KAFKA-7899
 Project: Kafka
  Issue Type: New Feature
  Components: consumer, streams
Affects Versions: 1.1.0
Reporter: Boyang Chen
Assignee: Boyang Chen


Right now the group metadata will affect consumers under sticky assignment, 
since it persists previous topic partition assignment which affects the 
judgement of consumer leader. Specifically for KStream applications (under 
1.1), if we are scaling up the cluster, it is hard to balance the traffic since 
most tasks would still go to "previous round active" assignments, even though 
we hope them to move towards other hosts.

It would be preferable to have a tool to invalidate the group metadata stored 
on broker, so that for sticky assignment we could have a clean start.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7899) Command line tool to invalidate group metadata for clean assignment

2019-02-05 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7899.

Resolution: Invalid

It appears that the metadata should be generated by the client instead of 
broker. So basically this is invalid issue.

> Command line tool to invalidate group metadata for clean assignment
> ---
>
> Key: KAFKA-7899
> URL: https://issues.apache.org/jira/browse/KAFKA-7899
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer, streams
>Affects Versions: 1.1.0
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Right now the group metadata will affect consumers under sticky assignment, 
> since it persists previous topic partition assignment which affects the 
> judgement of consumer leader. Specifically for KStream applications (under 
> 1.1), if we are scaling up the cluster, it is hard to balance the traffic 
> since most tasks would still go to "previous round active" assignments, even 
> though we hope them to move towards other hosts.
> It would be preferable to have a tool to invalidate the group metadata stored 
> on broker, so that for sticky assignment we could have a clean start.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7903) Replace OffsetCommit request/response with automated protocol

2019-02-06 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7903:
--

 Summary: Replace OffsetCommit request/response with automated 
protocol
 Key: KAFKA-7903
 URL: https://issues.apache.org/jira/browse/KAFKA-7903
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6723) Separate "max.poll.record" for restore consumer and common consumer

2018-03-27 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-6723:
--

 Summary: Separate "max.poll.record" for restore consumer and 
common consumer
 Key: KAFKA-6723
 URL: https://issues.apache.org/jira/browse/KAFKA-6723
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen


Currently, Kafka Streams use `max.poll.record` config for both restore consumer 
and normal stream consumer. In reality, they are doing different processing 
workloads, and in order to speed up the restore speed, restore consumer is 
supposed to have a higher throughput by setting `max.poll.record` higher. The 
change involved is trivial: 
[https://github.com/abbccdda/kafka/commit/cace25b74f31c8da79e93b514bcf1ed3ea9a7149]

However, this is still a public API change (introducing a new config name), so 
we need a KIP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6840) support windowing in ktable API

2018-04-30 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-6840:
--

 Summary: support windowing in ktable API
 Key: KAFKA-6840
 URL: https://issues.apache.org/jira/browse/KAFKA-6840
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 1.1.0
Reporter: Boyang Chen
Assignee: Boyang Chen


The StreamsBuilder provides table() API to materialize a changelog topic into a 
local key-value store (KTable), which is very convenient. However, current 
underlying implementation does not support materializing one topic to a 
windowed key-value store, which in certain cases would be very useful. 

To make up the gap, we proposed a new API in StreamsBuilder that could get a 
windowed Ktable.

The table() API in StreamsBuilder looks like this:

public synchronized  KTable table(final String topic,

  final Consumed consumed,

  final Materialized> materialized) {

    Objects.requireNonNull(topic, "topic can't be null");

    Objects.requireNonNull(consumed, "consumed can't be null");

    Objects.requireNonNull(materialized, "materialized can't be null");

    
materialized.withKeySerde(consumed.keySerde).withValueSerde(consumed.valueSerde);

    return internalStreamsBuilder.table(topic,

    new ConsumedInternal<>(consumed),

    new 
MaterializedInternal<>(materialized, internalStreamsBuilder, topic + "-"));

    }

 

Where we could see that the store type is given as KeyValueStore. There is no 
flexibility to change it to WindowStore.

 

To maintain compatibility of the existing API, we have two options to define a 
new API:

1.Overload existing KTable struct

public synchronized  KTable, V> windowedTable(final String 
topic,

  final Consumed consumed,

  final Materialized> materialized);

 

This could give developer an alternative to use windowed table instead. 
However, this implies that we need to make sure all the KTable logic still 
works as expected, such as join, aggregation, etc, so the challenge would be 
making sure all current KTable logics work.

 

2.Define a new type called WindowedKTable

public synchronized  WindowedKTable windowedTable(final String 
topic,

  final Consumed consumed,

  final Materialized> materialized);

The benefit of doing this is that we don’t need to worry about the existing 
functionality of KTable. However, the cost is to introduce redundancy of common 
operation logic. When upgrading common functionality, we need to take care of 
both types.

We could fill in more details in the KIP. Right now I would like to hear some 
feedbacks on the two approaches, thank you!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6657) Add StreamsConfig prefix for different consumers

2018-05-02 Thread Boyang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-6657.

Resolution: Delivered

The code is merged: https://github.com/apache/kafka/pull/4805

> Add StreamsConfig prefix for different consumers
> 
>
> Key: KAFKA-6657
> URL: https://issues.apache.org/jira/browse/KAFKA-6657
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.1.0
>Reporter: Matthias J. Sax
>Assignee: Boyang Chen
>Priority: Major
>  Labels: beginner, needs-kip, newbie, newbie++
>
> Kafka Streams allows to pass in different configs for different clients by 
> prefixing the corresponding parameter with `producer.` or `consumer.`.
> However, Kafka Streams internally uses multiple consumers, (1) the main 
> consumer (2) the restore consumer and (3) the global consumer (that is a 
> restore consumer as well atm).
> For some use cases, it's required to set different configs for different 
> consumers. Thus, we should add two new prefix for restore and global 
> consumer. We might also consider to extend `KafkaClientSupplier` and add a 
> `getGlobalConsumer()` method.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6896) add producer metrics exporting in KafkaStreams.java

2018-05-10 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-6896:
--

 Summary: add producer metrics exporting in KafkaStreams.java
 Key: KAFKA-6896
 URL: https://issues.apache.org/jira/browse/KAFKA-6896
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen


We would like to also export the producer metrics from {{StreamThread}} just 
like consumer metrics, so that we could gain more visibility of stream 
application. The approach is to pass in the {{threadProducer}}into the 
StreamThread so that we could export its metrics in dynamic.

Note that this is a pure internal change that doesn't require a KIP, and in the 
future we also want to export admin client metrics. A followup KIP for admin 
client will be created once this is merged.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6986) Export Admin Client metrics through Stream Threads

2018-06-02 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-6986:
--

 Summary: Export Admin Client metrics through Stream Threads
 Key: KAFKA-6986
 URL: https://issues.apache.org/jira/browse/KAFKA-6986
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
 Fix For: 2.0.0


We already exported producer and consumer metrics through KafkaStreams class:

[https://github.com/apache/kafka/pull/4998]

It makes sense to also export the Admin client metrics, however this might 
involve some public API change that needs a KIP.

I have allocated a KIP page here:

[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80453500]

If any new contributor wishes to take over this one, please let me know. I will 
revisit and close this ticket in one or two months later in case no one claims 
it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-6995) Make config "internal.leave.group.on.close" public

2018-06-05 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-6995:
--

 Summary: Make config "internal.leave.group.on.close" public
 Key: KAFKA-6995
 URL: https://issues.apache.org/jira/browse/KAFKA-6995
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Reporter: Boyang Chen
Assignee: Boyang Chen


We are proposing to make the config "internal.leave.group.on.close" public. The 
reason is that for heavy state application the sticky assignment won't work 
because each stream worker will leave group during rolling restart, and there 
is a possibility that some members are left and rejoined while others are still 
awaiting restart. This would then cause multiple rebalance because after the 
ongoing rebalance is done, we are expecting late members to rejoin and move 
state from `stable` to `prepareBalance`. To solve this problem, heavy state 
application needs to use this config to avoid member list update, so that at 
most one rebalance will be triggered at a proper time when all the members are 
rejoined during rolling restart. This should just be one line change.

Code here:

* internal.leave.group.on.close
 * Whether or not the consumer should leave the group on close. If set to 
false then a rebalance
 * won't occur until session.timeout.ms expires.
 *
 * 
 * Note: this is an internal configuration and could be changed in the future 
in a backward incompatible way
 *
 */
 static final String LEAVE_GROUP_ON_CLOSE_CONFIG = 
"internal.leave.group.on.close";



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7018) persist memberId for consumer restart

2018-06-07 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7018:
--

 Summary: persist memberId for consumer restart
 Key: KAFKA-7018
 URL: https://issues.apache.org/jira/browse/KAFKA-7018
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Reporter: Boyang Chen
Assignee: Boyang Chen


In group coordinator, there is a logic to neglect join group request from 
existing follower consumers:
{code:java}
case Empty | Stable =>
  if (memberId == JoinGroupRequest.UNKNOWN_MEMBER_ID) {
// if the member id is unknown, register the member to the group
addMemberAndRebalance(rebalanceTimeoutMs, sessionTimeoutMs, clientId, 
clientHost, protocolType, protocols, group, responseCallback)
  } else {
val member = group.get(memberId)
if (group.isLeader(memberId) || !member.matches(protocols)) {
  // force a rebalance if a member has changed metadata or if the leader 
sends JoinGroup.
  // The latter allows the leader to trigger rebalances for changes 
affecting assignment
  // which do not affect the member metadata (such as topic metadata 
changes for the consumer)
  updateMemberAndRebalance(group, member, protocols, responseCallback)
} else {
  // for followers with no actual change to their metadata, just return 
group information
  // for the current generation which will allow them to issue SyncGroup
  responseCallback(JoinGroupResult(
members = Map.empty,
memberId = memberId,
generationId = group.generationId,
subProtocol = group.protocolOrNull,
leaderId = group.leaderOrNull,
error = Errors.NONE))
}
{code}
While looking at the AbstractCoordinator, I found that the generation was 
hard-coded as 

NO_GENERATION on restart, which means we will send UNKNOWN_MEMBER_ID in the 
first join group request. This means we will treat the restarted consumer as a 
new member, so the rebalance will be triggered until session timeout.

I'm trying to clarify the following things before we extend the discussion:
 # Whether my understanding of the above logic is right (Hope [~mjsax] could 
help me double check)
 # Whether it makes sense to persist last round of memberId for consumers? We 
currently only need this feature in stream application, but will do no harm if 
we also use it for consumer in general. This would be a nice-to-have feature on 
consumer restart when we configured the loading-previous-memberId to true. If 
we failed, simply use the UNKNOWN_MEMBER_ID
 # The behavior could also be changed on the broker side, but I suspect it is 
very risky. So far client side change should be the least effort. The end goal 
is to avoid excessive rebalance from the same consumer restart, so if you feel 
server side change could also help, we could further discuss.

Thank you for helping out! [~mjsax] [~guozhang]

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7071) specify number of partitions when using repartition logic

2018-06-18 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7071:
--

 Summary: specify number of partitions when using repartition logic
 Key: KAFKA-7071
 URL: https://issues.apache.org/jira/browse/KAFKA-7071
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen


Hey there, I was wondering whether it makes sense to specify number of 
partitions of the output topic from a repartition operation like groupBy, 
flatMap, etc. The current DSL doesn't support adding a customized repartition 
topic which has customized number of partitions. For example, I want to reduce 
the input topic from 8 partitions to 1, there is no easy solution but to create 
a new topic instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7071) specify number of partitions when using repartition logic

2018-06-18 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7071.

Resolution: Duplicate

Dup with 6037

> specify number of partitions when using repartition logic
> -
>
> Key: KAFKA-7071
> URL: https://issues.apache.org/jira/browse/KAFKA-7071
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Hey there, I was wondering whether it makes sense to specify number of 
> partitions of the output topic from a repartition operation like groupBy, 
> flatMap, etc. The current DSL doesn't support adding a customized repartition 
> topic which has customized number of partitions. For example, I want to 
> reduce the input topic from 8 partitions to 1, there is no easy solution but 
> to create a new topic instead.
> cc [~guozhang] [~liquanpei]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7995) Augment singleton protocol type to list for Kafka Consumer

2019-02-24 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-7995:
--

 Summary: Augment singleton protocol type to list for Kafka 
Consumer  
 Key: KAFKA-7995
 URL: https://issues.apache.org/jira/browse/KAFKA-7995
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, core
Reporter: Boyang Chen


Right now Kafka consumer protocol uses a singleton marker to distinguish Kafka 
Connect worker and normal consumer. This is not upgrade-friendly approach since 
the protocol type could potential change over time. A better approach is to 
support multiple candidacies so that the no downtime protocol type switch could 
achieve.

For example, if we are trying to upgrade a Kafka Streams application towards a 
protocol type called "stream", right now there is no way to do this without 
downtime since broker will reject changing protocol type to a different one 
unless the group is back to empty. If we allow new member to provide a list of 
protocol type instead ("consumer", "stream"), there would be no compatibility 
issue.

Alternative approach is to invent an admin API to change group's protocol type 
on runtime. However, the burden introduced on administrator is not trivial, 
since we need to guarantee the operation series to be correct, otherwise we 
will see limp-upgrade experience in the midpoint, for example while we are 
changing protocol type there was unexpected rebalance that causes old members 
join failure.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8019) Better Scaling Experience for KStream

2019-02-28 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8019:
--

 Summary: Better Scaling Experience for KStream
 Key: KAFKA-8019
 URL: https://issues.apache.org/jira/browse/KAFKA-8019
 Project: Kafka
  Issue Type: New Feature
Reporter: Boyang Chen
Assignee: Boyang Chen


In our day-to-day work, we found it really hard to scale up a stateful stream 
application when its state store is very heavy. The caveat is that when the 
newly spinned hosts take ownership of some active tasks, so that they need to 
use non-trivial amount of time to restore the state store from changelog topic. 
The reassigned tasks would be available for unpredicted long time, which is not 
favorable. Secondly the current global rebalance stops the entire application 
process, which in a rolling host swap scenario would suggest an infinite 
resource shuffling without actual progress.

Following the community's [cooperative 
rebalancing|https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing%3A+Support+and+Policies]
 proposal, we need to build something similar for KStream to better handle the 
auto scaling experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8219) Add web documentation for static membership

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8219:
--

 Summary: Add web documentation for static membership
 Key: KAFKA-8219
 URL: https://issues.apache.org/jira/browse/KAFKA-8219
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer, streams
Reporter: Boyang Chen
Assignee: Boyang Chen


Need official documentation update.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8220) Avoid kicking out members through rebalance timeout

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8220:
--

 Summary: Avoid kicking out members through rebalance timeout
 Key: KAFKA-8220
 URL: https://issues.apache.org/jira/browse/KAFKA-8220
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


As stated in KIP-345, we will no longer evict unjoined members out of the 
group. We need to take care the edge case when the leader fails to rejoin and 
switch to a new leader in this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8221) Augment LeaveGroupRequest to batch operation

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8221:
--

 Summary: Augment LeaveGroupRequest to batch operation
 Key: KAFKA-8221
 URL: https://issues.apache.org/jira/browse/KAFKA-8221
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Boyang Chen
Assignee: Boyang Chen


Having a batch leave group request is a required protocol change to remove a 
set of static members all at once.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8222) Admin client changes to add ability to batch remove static members

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8222:
--

 Summary: Admin client changes to add ability to batch remove 
static members
 Key: KAFKA-8222
 URL: https://issues.apache.org/jira/browse/KAFKA-8222
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer, streams
Reporter: Boyang Chen
Assignee: Boyang Chen


After changes become effective in JIRA, we need to add admin support to use it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8223) Deprecate group.initial.rebalance.delay

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8223:
--

 Summary: Deprecate group.initial.rebalance.delay
 Key: KAFKA-8223
 URL: https://issues.apache.org/jira/browse/KAFKA-8223
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


This is a stated step in KIP-345, however we reprioritize it since it is not 
ready to be removed for dynamic members.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8224) Add static member id into Subscription Info for better rebalance behavior

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8224:
--

 Summary: Add static member id into Subscription Info for better 
rebalance behavior
 Key: KAFKA-8224
 URL: https://issues.apache.org/jira/browse/KAFKA-8224
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


Based on discussion in [https://github.com/apache/kafka/pull/6177] and KIP-345, 
we plan to better utilize static member info to make wise rebalance call, such 
as assignors like Range or Round Robin could become more sticky to rely on 
static ids instead of coordinator auto-generated ids.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8225) handle conflicting static member id

2019-04-11 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8225:
--

 Summary: handle conflicting static member id
 Key: KAFKA-8225
 URL: https://issues.apache.org/jira/browse/KAFKA-8225
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


We need an important fix for handling the user mis-configuration for duplicate 
group.instance.ids. Several approaches we have discussed so far:
 # Limit resetGeneration() call to only JoinGroupResponseHandler
 # Include InstanceId in the Heartbeat and OffsetCommit APIs. Then the 
coordinator can return the proper error code.
 # We can can use a convention to embed the instanceId into the generated 
memberId. At the moment, the current format is {{{clientId}-\{random uuid}}}. 
For static members, I think instanceId is more useful than clientId and we 
could probably use timestamp as a more concise alternative to uuid. So we could 
have {{{instanceId}-\{timestamp}}} as the memberId for static members. Then we 
would be able to extract this from any request and the coordinator could use 
the proper error code

Right now we are more inclined to option 2 or 3, however it requires 
non-trivial amount of code changes including protocol changes and fatal error 
handling on client side. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8230) Add static membership support in librd consumer client

2019-04-13 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8230:
--

 Summary: Add static membership support in librd consumer client 
 Key: KAFKA-8230
 URL: https://issues.apache.org/jira/browse/KAFKA-8230
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Once the effort in https://issues.apache.org/jira/browse/KAFKA-7018 is done, 
one of the low hanging fruit is to add this support for other language Kafka 
consumers, such as c consumer in librdKafka.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8236) Incorporate version control for Kafka Streams Application Reset

2019-04-15 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8236:
--

 Summary: Incorporate version control for Kafka Streams Application 
Reset
 Key: KAFKA-8236
 URL: https://issues.apache.org/jira/browse/KAFKA-8236
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Inspired by Spark mlflow which supports versioning log, we should be 
considering expose a special versioning tag for KStream applications to easy 
rollback bad code deploy. The naive approach is to store the versioning info in 
consumer offset topic so that when we perform rollback, we know where to read 
from the input, and where to cleanup the changelog topic. Essentially, this is 
an extension to our current application reset tool.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8246) refactor topic/group instance id validation condition

2019-04-16 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8246:
--

 Summary: refactor topic/group instance id validation condition
 Key: KAFKA-8246
 URL: https://issues.apache.org/jira/browse/KAFKA-8246
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8284) Enable static membership on KStream

2019-04-24 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8284:
--

 Summary: Enable static membership on KStream
 Key: KAFKA-8284
 URL: https://issues.apache.org/jira/browse/KAFKA-8284
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8285) Handle thread-id random switch on JVM for KStream

2019-04-24 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8285:
--

 Summary: Handle thread-id random switch on JVM for KStream
 Key: KAFKA-8285
 URL: https://issues.apache.org/jira/browse/KAFKA-8285
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8287) JVM global map to fence duplicate client id

2019-04-24 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8287:
--

 Summary: JVM global map to fence duplicate client id
 Key: KAFKA-8287
 URL: https://issues.apache.org/jira/browse/KAFKA-8287
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


After change in https://issues.apache.org/jira/browse/KAFKA-8285, the two 
stream instances scheduled on same JVM will be mutually affected if they 
accidentally assign same client.id, since the thread-id becomes local now. The 
solution is to build a global concurrent map for solving conflict if two 
threads happen to be having the same client.id.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8291) System test consumer_test.py failed on trunk

2019-04-25 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8291:
--

 Summary: System test consumer_test.py failed on trunk
 Key: KAFKA-8291
 URL: https://issues.apache.org/jira/browse/KAFKA-8291
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen
Assignee: Boyang Chen


Looks like trunk is failing as for now 
[https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2537/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8287) JVM global map to fence duplicate client id

2019-04-25 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8287.

Resolution: Invalid

We don't want to do it for now because there are existing use cases where 
`client.id` is expected to be duplicate across different stream instances for 
request throttling purpose.

> JVM global map to fence duplicate client id
> ---
>
> Key: KAFKA-8287
> URL: https://issues.apache.org/jira/browse/KAFKA-8287
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> After change in https://issues.apache.org/jira/browse/KAFKA-8285, the two 
> stream instances scheduled on same JVM will be mutually affected if they 
> accidentally assign same client.id, since the thread-id becomes local now. 
> The solution is to build a global concurrent map for solving conflict if two 
> threads happen to be having the same client.id.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7862) Modify JoinGroup logic to incorporate group.instance.id change

2019-04-26 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7862.

Resolution: Fixed

> Modify JoinGroup logic to incorporate group.instance.id change
> --
>
> Key: KAFKA-7862
> URL: https://issues.apache.org/jira/browse/KAFKA-7862
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> The step one for KIP-345 join group logic change to corporate with static 
> membership.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7903) Replace OffsetCommit request/response with automated protocol

2019-04-29 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7903.

Resolution: Fixed

> Replace OffsetCommit request/response with automated protocol
> -
>
> Key: KAFKA-7903
> URL: https://issues.apache.org/jira/browse/KAFKA-7903
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8311) Better consumer timeout exception handling

2019-05-01 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8311:
--

 Summary: Better consumer timeout exception handling 
 Key: KAFKA-8311
 URL: https://issues.apache.org/jira/browse/KAFKA-8311
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Reporter: Boyang Chen


When stream application crashed due to underlying consumer commit timeout, we 
have seen following gaps:

1. The current timeout exception doesn't provide meaningful tuning 
instructions. We should augment the error message to let user change 
`default.api.timeout.ms` in order to tolerate longer reaction time.
2. Currently we have 3 different types of consumers on KStream: 
thread-consumer, global-consumer and restore-consumer. Although we don't plan 
to explicitly handle this consumer timeout on stream level, we could wrap it 
with more meaningful message either on consumer or stream level to let user be 
aware which consumer is having trouble.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8331) Add system test for enabling static membership on KStream

2019-05-07 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8331:
--

 Summary: Add system test for enabling static membership on KStream
 Key: KAFKA-8331
 URL: https://issues.apache.org/jira/browse/KAFKA-8331
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7830) Convert Kafka RPCs to use automatically generated code

2019-05-08 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7830.

Resolution: Fixed

> Convert Kafka RPCs to use automatically generated code
> --
>
> Key: KAFKA-7830
> URL: https://issues.apache.org/jira/browse/KAFKA-7830
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
>
> KAFKA-7609 added a way of automatically generating code for reading and 
> writing Kafka RPC message types from JSON schemas.
> Automatically generated code is preferrable to manually written serialization 
> code. 
> * * It is less tedious and error-prone to use than hand-written code.
> * For developers writing Kafka clients in other languages, the JSON schemas 
> are useful in a way that the java serialization code is not.
> * It will eventually be possible to automatically validate aspects of 
> cross-version compatibility, when using JSON message schemas.
> * Once all of the RPCs are converted, we can drop using Structs in favor of 
> serializing directly to ByteBuffer, to reduce GC load.
> This JIRA tracks converting the current hand-written message serialization 
> code to automatically generated serialization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8196) Replace InitProducerId request/response with automated protocol

2019-05-08 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8196.

Resolution: Fixed

> Replace InitProducerId request/response with automated protocol
> ---
>
> Key: KAFKA-8196
> URL: https://issues.apache.org/jira/browse/KAFKA-8196
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Jason Gustafson
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (KAFKA-7830) Convert Kafka RPCs to use automatically generated code

2019-05-08 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reopened KAFKA-7830:


> Convert Kafka RPCs to use automatically generated code
> --
>
> Key: KAFKA-7830
> URL: https://issues.apache.org/jira/browse/KAFKA-7830
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
>
> KAFKA-7609 added a way of automatically generating code for reading and 
> writing Kafka RPC message types from JSON schemas.
> Automatically generated code is preferrable to manually written serialization 
> code. 
> * * It is less tedious and error-prone to use than hand-written code.
> * For developers writing Kafka clients in other languages, the JSON schemas 
> are useful in a way that the java serialization code is not.
> * It will eventually be possible to automatically validate aspects of 
> cross-version compatibility, when using JSON message schemas.
> * Once all of the RPCs are converted, we can drop using Structs in favor of 
> serializing directly to ByteBuffer, to reduce GC load.
> This JIRA tracks converting the current hand-written message serialization 
> code to automatically generated serialization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8342) Admin tool to setup Kafka Stream topology (internal) topics

2019-05-08 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8342:
--

 Summary: Admin tool to setup Kafka Stream topology (internal) 
topics
 Key: KAFKA-8342
 URL: https://issues.apache.org/jira/browse/KAFKA-8342
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


We have seen customers who need to deploy their application to production 
environment but don't have access to create changelog and repartition topics. 
They need to ask admin team to manually create those topics before proceeding 
to start the actual stream job. We could add an admin tool to help them go 
through the process quicker by providing a command that could
 # Read through current stream topology
 # Create corresponding topics as needed, even including output topics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8284) Enable static membership on KStream

2019-05-08 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8284.

Resolution: Fixed

> Enable static membership on KStream
> ---
>
> Key: KAFKA-8284
> URL: https://issues.apache.org/jira/browse/KAFKA-8284
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7858) Replace JoinGroup request/response with automated protocol

2019-05-10 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7858.

Resolution: Fixed

> Replace JoinGroup request/response with automated protocol
> --
>
> Key: KAFKA-7858
> URL: https://issues.apache.org/jira/browse/KAFKA-7858
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8354) Replace SyncGroup request/response with automated protocol

2019-05-10 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8354:
--

 Summary: Replace SyncGroup request/response with automated protocol
 Key: KAFKA-8354
 URL: https://issues.apache.org/jira/browse/KAFKA-8354
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8356) Add static membership to Round Robin assignor

2019-05-10 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8356:
--

 Summary: Add static membership to Round Robin assignor
 Key: KAFKA-8356
 URL: https://issues.apache.org/jira/browse/KAFKA-8356
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8355) Add static membership to Range assignor

2019-05-10 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8355:
--

 Summary: Add static membership to Range assignor
 Key: KAFKA-8355
 URL: https://issues.apache.org/jira/browse/KAFKA-8355
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8373) Add group.instance.id field into Sync/Heartbeat/OffsetCommit protocols

2019-05-15 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8373:
--

 Summary: Add group.instance.id field into 
Sync/Heartbeat/OffsetCommit protocols 
 Key: KAFKA-8373
 URL: https://issues.apache.org/jira/browse/KAFKA-8373
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8354) Replace SyncGroup request/response with automated protocol

2019-05-17 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8354.

Resolution: Fixed

> Replace SyncGroup request/response with automated protocol
> --
>
> Key: KAFKA-8354
> URL: https://issues.apache.org/jira/browse/KAFKA-8354
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8386) Use COORDINATOR_NOT_AVAILABLE to replace UNKNOWN_MEMBER_ID when the group is not available

2019-05-17 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8386:
--

 Summary: Use COORDINATOR_NOT_AVAILABLE to replace 
UNKNOWN_MEMBER_ID when the group is not available
 Key: KAFKA-8386
 URL: https://issues.apache.org/jira/browse/KAFKA-8386
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


When the group is dead or unavailable on the coordinator, current approach is 
to return `UNKNOWN_MEMBER_ID` to let the member reset generation and rejoin. It 
is not particularly safe for static members in this case, since resetting 
`member.id` could delay the detection for duplicate instance.id.

Also considering the fact that group unavailability could mostly be caused by 
migration, it is favorable to trigger a coordinator rediscovery immediately 
than one more bounce. Thus, we decide to use `COORDINATOR_NOT_AVAILABLE` as top 
line citizen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8387) Add `Fenced` state to AbstractCoordinator

2019-05-17 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8387:
--

 Summary: Add `Fenced` state to AbstractCoordinator
 Key: KAFKA-8387
 URL: https://issues.apache.org/jira/browse/KAFKA-8387
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen


Right now in some requests such as async commit or heartbeat could encounter 
fencing exception which should fail the consumer application entirely. It is 
better to track it within MemberState by adding a new `Fenced` stage so that 
the main thread could shutdown immediately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8373) Add group.instance.id field into Sync/Heartbeat/OffsetCommit protocols

2019-05-18 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8373.

Resolution: Fixed

Addressed in [https://github.com/apache/kafka/pull/6650/]

> Add group.instance.id field into Sync/Heartbeat/OffsetCommit protocols 
> ---
>
> Key: KAFKA-8373
> URL: https://issues.apache.org/jira/browse/KAFKA-8373
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8225) Handle conflicting static member id

2019-05-18 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8225.

Resolution: Fixed

> Handle conflicting static member id
> ---
>
> Key: KAFKA-8225
> URL: https://issues.apache.org/jira/browse/KAFKA-8225
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We need an important fix for handling the user mis-configuration for 
> duplicate group.instance.ids. Several approaches we have discussed so far:
>  # Limit resetGeneration() call to only JoinGroupResponseHandler
>  # Include InstanceId in the Heartbeat and OffsetCommit APIs. Then the 
> coordinator can return the proper error code.
>  # We can can use a convention to embed the instanceId into the generated 
> memberId. At the moment, the current format is {{{clientId}-\{random uuid}}}. 
> For static members, I think instanceId is more useful than clientId and we 
> could probably use timestamp as a more concise alternative to uuid. So we 
> could have {{{instanceId}-\{timestamp}}} as the memberId for static members. 
> Then we would be able to extract this from any request and the coordinator 
> could use the proper error code
> Right now we are more inclined to option 2 or 3, however it requires 
> non-trivial amount of code changes including protocol changes and fatal error 
> handling on client side. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8397) Add pre-registration feature for static membership

2019-05-20 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8397:
--

 Summary: Add pre-registration feature for static membership
 Key: KAFKA-8397
 URL: https://issues.apache.org/jira/browse/KAFKA-8397
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen


After 
[KIP-345|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances]],
 we want to further improve user experience for reducing rebalances by 
providing admin tooling to batch send join group requests for multiple 
consumers if we know their `group.instance.id`s beforehand.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8399) Add back `internal.leave.group.on.close` config for KStreams

2019-05-20 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8399:
--

 Summary: Add back `internal.leave.group.on.close` config for 
KStreams
 Key: KAFKA-8399
 URL: https://issues.apache.org/jira/browse/KAFKA-8399
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen


The behavior for KStream rebalance default has changed from no leave group to 
leave group. We should add it back for system test pass, reduce the risk of 
being detected not working in other public cases.

Reference: [https://github.com/apache/kafka/pull/6673]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8424) Replace ListGroups request/response with automated protocol

2019-05-23 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8424:
--

 Summary: Replace ListGroups request/response with automated 
protocol
 Key: KAFKA-8424
 URL: https://issues.apache.org/jira/browse/KAFKA-8424
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8430) Unit test to make sure `group.id` and `group.instance.id` won't affect each other

2019-05-24 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8430:
--

 Summary: Unit test to make sure `group.id` and `group.instance.id` 
won't affect each other
 Key: KAFKA-8430
 URL: https://issues.apache.org/jira/browse/KAFKA-8430
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8432) Add static membership to Sticky assignor

2019-05-26 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8432:
--

 Summary: Add static membership to Sticky assignor
 Key: KAFKA-8432
 URL: https://issues.apache.org/jira/browse/KAFKA-8432
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8246) refactor topic/group instance id validation condition

2019-05-26 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8246.

Resolution: Not A Problem

Since it's only one time validation, we don't need to refactor.

> refactor topic/group instance id validation condition
> -
>
> Key: KAFKA-8246
> URL: https://issues.apache.org/jira/browse/KAFKA-8246
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8435) Replace DeleteGroups request/response with automated protocol

2019-05-26 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8435:
--

 Summary: Replace DeleteGroups request/response with automated 
protocol
 Key: KAFKA-8435
 URL: https://issues.apache.org/jira/browse/KAFKA-8435
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8436) Replace AddOffsetsToTxn request/response with automated protocol

2019-05-26 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8436:
--

 Summary: Replace AddOffsetsToTxn request/response with automated 
protocol
 Key: KAFKA-8436
 URL: https://issues.apache.org/jira/browse/KAFKA-8436
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8219) Add web documentation for static membership

2019-05-27 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8219.

Resolution: Fixed

> Add web documentation for static membership
> ---
>
> Key: KAFKA-8219
> URL: https://issues.apache.org/jira/browse/KAFKA-8219
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer, streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Need official documentation update.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8445) Flaky Test UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride

2019-05-28 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8445:
--

 Summary: Flaky Test  
UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride
 Key: KAFKA-8445
 URL: https://issues.apache.org/jira/browse/KAFKA-8445
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/8/console]

Trace:

 
*11:30:48* kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionDisabledByTopicOverride STARTED*11:31:30* 
kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride
 failed, log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/core/build/reports/testOutput/kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride.test.stdout*11:31:30*
 *11:31:30* kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionDisabledByTopicOverride FAILED*11:31:30* 
org.scalatest.exceptions.TestFailedException: Timing out after 3 ms since 
expected new leader 1 was not elected for partition topic4252061157831715077-0, 
leader is Some(-1)*11:31:30* at 
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:530)*11:31:30*
 at 
org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)*11:31:30*
 at 
org.scalatest.Assertions$class.fail(Assertions.scala:1091)*11:31:30* at 
org.scalatest.Assertions$.fail(Assertions.scala:1389)*11:31:30* at 
kafka.utils.TestUtils$$anonfun$waitUntilLeaderIsElectedOrChanged$8.apply(TestUtils.scala:721)*11:31:30*
 at 
kafka.utils.TestUtils$$anonfun$waitUntilLeaderIsElectedOrChanged$8.apply(TestUtils.scala:711)*11:31:30*
 at scala.Option.getOrElse(Option.scala:121)*11:31:30* at 
kafka.utils.TestUtils$.waitUntilLeaderIsElectedOrChanged(TestUtils.scala:711)*11:31:30*
 at 
kafka.integration.UncleanLeaderElectionTest.verifyUncleanLeaderElectionDisabled(UncleanLeaderElectionTest.scala:258)*11:31:30*
 at 
kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride(UncleanLeaderElectionTest.scala:153)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8430) Unit test to make sure `group.id` and `group.instance.id` won't affect each other

2019-05-28 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8430.

Resolution: Fixed

> Unit test to make sure `group.id` and `group.instance.id` won't affect each 
> other
> -
>
> Key: KAFKA-8430
> URL: https://issues.apache.org/jira/browse/KAFKA-8430
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8446) KStream restoration will crash with NPE when the record value is null

2019-05-29 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8446:
--

 Summary: KStream restoration will crash with NPE when the record 
value is null
 Key: KAFKA-8446
 URL: https://issues.apache.org/jira/browse/KAFKA-8446
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8456) Flaky Test StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi

2019-05-31 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8456:
--

 Summary: Flaky Test  
StoreUpgradeIntegrationTest#shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
 Key: KAFKA-8456
 URL: https://issues.apache.org/jira/browse/KAFKA-8456
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22331/console]
*01:20:07* 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi
 failed, log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/streams/build/reports/testOutput/org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi.test.stdout*01:20:07*
 *01:20:07* org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi 
FAILED*01:20:07* java.lang.AssertionError: Condition not met within timeout 
15000. Could not get expected result in time.*01:20:07* at 
org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:375)*01:20:07*  
   at 
org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:335)*01:20:07*  
   at 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.verifyWindowedCountWithTimestamp(StoreUpgradeIntegrationTest.java:830)*01:20:07*
 at 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigrateWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:573)*01:20:07*
 at 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi(StoreUpgradeIntegrationTest.java:517)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8459) Flakey test BaseQuotaTest#testThrottledRequest

2019-05-31 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8459:
--

 Summary: Flakey test BaseQuotaTest#testThrottledRequest
 Key: KAFKA-8459
 URL: https://issues.apache.org/jira/browse/KAFKA-8459
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5151/consoleFull]
*12:14:29* kafka.api.UserQuotaTest > testThrottledRequest STARTED*12:14:57* 
kafka.api.UserQuotaTest.testThrottledRequest failed, log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.api.UserQuotaTest.testThrottledRequest.test.stdout*12:14:57*
 *12:14:57* kafka.api.UserQuotaTest > testThrottledRequest FAILED*12:14:57* 
org.scalatest.exceptions.TestFailedException: Consumer throttle metric not 
updated: avg=0.0 max=0.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8460) Flaky Test PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition

2019-05-31 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8460:
--

 Summary: Flaky Test  
PlaintextConsumerTest#testLowMaxFetchSizeForRequestAndPartition
 Key: KAFKA-8460
 URL: https://issues.apache.org/jira/browse/KAFKA-8460
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


 
*16:17:04* kafka.api.PlaintextConsumerTest > 
testLowMaxFetchSizeForRequestAndPartition FAILED*16:17:04* 
org.scalatest.exceptions.TestFailedException: Timed out before consuming 
expected 2700 records. The number consumed was 1980.*16:17:04* at 
org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)*16:17:04*
 at 
org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)*16:17:04*
 at 
org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)*16:17:04*
 at org.scalatest.Assertions.fail(Assertions.scala:1091)*16:17:04*  
   at org.scalatest.Assertions.fail$(Assertions.scala:1087)*16:17:04* 
at org.scalatest.Assertions$.fail(Assertions.scala:1389)*16:17:04* at 
kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:789)*16:17:04* at 
kafka.utils.TestUtils$.pollRecordsUntilTrue(TestUtils.scala:765)*16:17:04*  
   at 
kafka.api.AbstractConsumerTest.consumeRecords(AbstractConsumerTest.scala:156)*16:17:04*
 at 
kafka.api.PlaintextConsumerTest.testLowMaxFetchSizeForRequestAndPartition(PlaintextConsumerTest.scala:801)*16:17:04*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8461) Flakey test UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride

2019-05-31 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8461:
--

 Summary: Flakey test 
UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride
 Key: KAFKA-8461
 URL: https://issues.apache.org/jira/browse/KAFKA-8461
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5168/consoleFull]
*15:47:56* kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionDisabledByTopicOverride FAILED*15:47:56* 
org.scalatest.exceptions.TestFailedException: Timing out after 3 ms since 
expected new leader 1 was not elected for partition 
topic-9147891452427084986-0, leader is Some(-1)*15:47:56* at 
org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)*15:47:56*
 at 
org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)*15:47:56*
 at 
org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)*15:47:56*
 at org.scalatest.Assertions.fail(Assertions.scala:1091)*15:47:56*  
   at org.scalatest.Assertions.fail$(Assertions.scala:1087)*15:47:56* 
at org.scalatest.Assertions$.fail(Assertions.scala:1389)*15:47:56* at 
kafka.utils.TestUtils$.$anonfun$waitUntilLeaderIsElectedOrChanged$8(TestUtils.scala:722)*15:47:56*
 at scala.Option.getOrElse(Option.scala:138)*15:47:56* at 
kafka.utils.TestUtils$.waitUntilLeaderIsElectedOrChanged(TestUtils.scala:712)*15:47:56*
 at 
kafka.integration.UncleanLeaderElectionTest.verifyUncleanLeaderElectionDisabled(UncleanLeaderElectionTest.scala:258)*15:47:56*
 at 
kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride(UncleanLeaderElectionTest.scala:153)*15:47:56*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8462) flakey test RestServerTest#testCORSEnabled

2019-06-01 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8462:
--

 Summary: flakey test RestServerTest#testCORSEnabled
 Key: KAFKA-8462
 URL: https://issues.apache.org/jira/browse/KAFKA-8462
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


*https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22374/console*

*21:27:25* org.apache.kafka.connect.runtime.rest.RestServerTest > 
testCORSEnabled STARTED*21:27:26* 
org.apache.kafka.connect.runtime.rest.RestServerTest.testCORSEnabled failed, 
log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/connect/runtime/build/reports/testOutput/org.apache.kafka.connect.runtime.rest.RestServerTest.testCORSEnabled.test.stdout*21:27:26*
 *21:27:26* org.apache.kafka.connect.runtime.rest.RestServerTest > 
testCORSEnabled FAILED*21:27:26* 
org.apache.kafka.connect.errors.ConnectException: Unable to initialize REST 
server*21:27:26* at 
org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:180)*21:27:26*
 at 
org.apache.kafka.connect.runtime.rest.RestServerTest.checkCORSRequest(RestServerTest.java:217)*21:27:26*
 at 
org.apache.kafka.connect.runtime.rest.RestServerTest.testCORSEnabled(RestServerTest.java:87)*21:27:26*
 *21:27:26* Caused by:*21:27:26* java.io.IOException: Failed to 
bind to 0.0.0.0/0.0.0.0:8083*21:27:26* at 
org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:346)*21:27:26*
 at 
org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:308)*21:27:26*
 at 
org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)*21:27:26*
 at 
org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)*21:27:26*
 at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)*21:27:26*
 at 
org.eclipse.jetty.server.Server.doStart(Server.java:396)*21:27:26* 
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)*21:27:26*
 at 
org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:178)*21:27:26*
 ... 2 more*21:27:26* *21:27:26* Caused by:*21:27:26*   
  java.net.BindException: Address already in use*21:27:26*  
   at sun.nio.ch.Net.bind0(Native Method)*21:27:26* at 
sun.nio.ch.Net.bind(Net.java:433)*21:27:26* at 
sun.nio.ch.Net.bind(Net.java:425)*21:27:26* at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)*21:27:26*
 at 
sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)*21:27:26*  
   at 
org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:342)*21:27:26*
 ... 9 more*21:27:26*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8445) Flaky Test UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride

2019-06-01 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8445.

Resolution: Duplicate

> Flaky Test  
> UncleanLeaderElectionTest#testUncleanLeaderElectionDisabledByTopicOverride
> --
>
> Key: KAFKA-8445
> URL: https://issues.apache.org/jira/browse/KAFKA-8445
> Project: Kafka
>  Issue Type: Bug
>Reporter: Boyang Chen
>Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/8/console]
> Trace:
> *11:30:48* kafka.integration.UncleanLeaderElectionTest > 
> testUncleanLeaderElectionDisabledByTopicOverride STARTED*11:31:30* 
> kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/core/build/reports/testOutput/kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride.test.stdout*11:31:30*
>  *11:31:30* kafka.integration.UncleanLeaderElectionTest > 
> testUncleanLeaderElectionDisabledByTopicOverride FAILED*11:31:30* 
> org.scalatest.exceptions.TestFailedException: Timing out after 3 ms since 
> expected new leader 1 was not elected for partition 
> topic4252061157831715077-0, leader is Some(-1)*11:31:30* at 
> org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:530)*11:31:30*
>  at 
> org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)*11:31:30*
>  at org.scalatest.Assertions$class.fail(Assertions.scala:1091)*11:31:30* at 
> org.scalatest.Assertions$.fail(Assertions.scala:1389)*11:31:30* at 
> kafka.utils.TestUtils$$anonfun$waitUntilLeaderIsElectedOrChanged$8.apply(TestUtils.scala:721)*11:31:30*
>  at 
> kafka.utils.TestUtils$$anonfun$waitUntilLeaderIsElectedOrChanged$8.apply(TestUtils.scala:711)*11:31:30*
>  at scala.Option.getOrElse(Option.scala:121)*11:31:30* at 
> kafka.utils.TestUtils$.waitUntilLeaderIsElectedOrChanged(TestUtils.scala:711)*11:31:30*
>  at 
> kafka.integration.UncleanLeaderElectionTest.verifyUncleanLeaderElectionDisabled(UncleanLeaderElectionTest.scala:258)*11:31:30*
>  at 
> kafka.integration.UncleanLeaderElectionTest.testUncleanLeaderElectionDisabledByTopicOverride(UncleanLeaderElectionTest.scala:153)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8489) Remove group immediately when DeleteGroup request could be completed

2019-06-05 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8489:
--

 Summary: Remove group immediately when DeleteGroup request could 
be completed
 Key: KAFKA-8489
 URL: https://issues.apache.org/jira/browse/KAFKA-8489
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Inspired by discussion in [https://github.com/apache/kafka/pull/6762], we 
should attempt to shorten the time period from receiving the group deletion to 
actually remove it from cache. This saves the client unnecessary round trips to 
rebuild the group if needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8490) Use `Migrated` and `Deleted` state to replace consumer group `Dead` state

2019-06-05 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8490:
--

 Summary: Use `Migrated` and `Deleted` state to replace consumer 
group `Dead` state
 Key: KAFKA-8490
 URL: https://issues.apache.org/jira/browse/KAFKA-8490
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen


Inspired by [https://github.com/apache/kafka/pull/6762], right now the consumer 
group dead state is not clear to the user. It actually suggests 3 transient 
states:
 # a group is emigrated to another broker
 # an empty group is marked as dead by DeleteGroup request and will be deleted 
soon
 # a group is unloaded from cache due to last offset expiring

In case 1, the state name is better defined as `Migrated` to be consistent with 
what's actually going on in the background. for case 2 & 3, the state is better 
defined as `Deleted` which conveys a more accurate group status. By separating 
these two states, our error handling should also be more precise.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8500) member.id should always update upon static member rejoin despite of group state

2019-06-06 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8500:
--

 Summary: member.id should always update upon static member rejoin 
despite of group state
 Key: KAFKA-8500
 URL: https://issues.apache.org/jira/browse/KAFKA-8500
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, streams
Affects Versions: 2.3.0
Reporter: Boyang Chen
Assignee: Boyang Chen


A blocking bug was detected by [~guozhang] that the `member.id` wasn't get 
updated upon static member rejoining when the group is not in stable state. 
This could make duplicate member fencing harder and potentially yield incorrect 
processing outputs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8502) Flakey test AdminClientIntegrationTest#testElectUncleanLeadersForAllPartitions

2019-06-06 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8502:
--

 Summary: Flakey test 
AdminClientIntegrationTest#testElectUncleanLeadersForAllPartitions
 Key: KAFKA-8502
 URL: https://issues.apache.org/jira/browse/KAFKA-8502
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5355/consoleFull]

 
*18:06:01* *18:06:01* kafka.api.AdminClientIntegrationTest > 
testElectUncleanLeadersForAllPartitions FAILED*18:06:01* 
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.TimeoutException: Aborted due to 
timeout.*18:06:01* at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)*18:06:01*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)*18:06:01*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)*18:06:01*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)*18:06:01*
 at 
kafka.api.AdminClientIntegrationTest.testElectUncleanLeadersForAllPartitions(AdminClientIntegrationTest.scala:1496)*18:06:01*
 *18:06:01* Caused by:*18:06:01* 
org.apache.kafka.common.errors.TimeoutException: Aborted due to timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8515) Materialize KTable when upstream uses Windowed instead of

2019-06-09 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8515:
--

 Summary: Materialize KTable when upstream uses Windowed instead 
of 
 Key: KAFKA-8515
 URL: https://issues.apache.org/jira/browse/KAFKA-8515
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8529) Flakey test ConsumerBounceTest#testCloseDuringRebalance

2019-06-12 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8529:
--

 Summary: Flakey test ConsumerBounceTest#testCloseDuringRebalance
 Key: KAFKA-8529
 URL: https://issues.apache.org/jira/browse/KAFKA-8529
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5450/consoleFull]

 
*16:16:10* kafka.api.ConsumerBounceTest > testCloseDuringRebalance 
STARTED*16:16:22* kafka.api.ConsumerBounceTest.testCloseDuringRebalance failed, 
log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.testCloseDuringRebalance.test.stdout*16:16:22*
 *16:16:22* kafka.api.ConsumerBounceTest > testCloseDuringRebalance 
FAILED*16:16:22* java.lang.AssertionError: Rebalance did not complete in 
time*16:16:22* at org.junit.Assert.fail(Assert.java:89)*16:16:22*   
  at org.junit.Assert.assertTrue(Assert.java:42)*16:16:22* at 
kafka.api.ConsumerBounceTest.waitForRebalance$1(ConsumerBounceTest.scala:402)*16:16:22*
 at 
kafka.api.ConsumerBounceTest.checkCloseDuringRebalance(ConsumerBounceTest.scala:416)*16:16:22*
 at 
kafka.api.ConsumerBounceTest.testCloseDuringRebalance(ConsumerBounceTest.scala:379)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8223) Deprecate group.initial.rebalance.delay

2019-06-13 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8223.

Resolution: Later

> Deprecate group.initial.rebalance.delay
> ---
>
> Key: KAFKA-8223
> URL: https://issues.apache.org/jira/browse/KAFKA-8223
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Minor
>
> This is a stated step in KIP-345, however we reprioritize it since it is not 
> ready to be removed for dynamic members.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8538) Add `group.instance.id` to DescribeGroup for better visibility

2019-06-13 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8538:
--

 Summary: Add `group.instance.id` to DescribeGroup for better 
visibility
 Key: KAFKA-8538
 URL: https://issues.apache.org/jira/browse/KAFKA-8538
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8539) Add `getGroupInstanceId` call to Subscription class

2019-06-13 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8539:
--

 Summary: Add `getGroupInstanceId` call to Subscription class
 Key: KAFKA-8539
 URL: https://issues.apache.org/jira/browse/KAFKA-8539
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8541) Flakey test LeaderElectionCommandTest#testTopicPartition

2019-06-14 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8541:
--

 Summary: Flakey test LeaderElectionCommandTest#testTopicPartition
 Key: KAFKA-8541
 URL: https://issues.apache.org/jira/browse/KAFKA-8541
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5528/consoleFull]

 
*01:37:34* kafka.admin.LeaderElectionCommandTest > testTopicPartition 
STARTED*01:38:13* kafka.admin.LeaderElectionCommandTest.testTopicPartition 
failed, log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.admin.LeaderElectionCommandTest.testTopicPartition.test.stdout*01:38:13*
 *01:38:13* kafka.admin.LeaderElectionCommandTest > testTopicPartition 
FAILED*01:38:13* kafka.common.AdminCommandFailedException: Timeout waiting 
for election results*01:38:13* at 
kafka.admin.LeaderElectionCommand$.electLeaders(LeaderElectionCommand.scala:134)*01:38:13*
 at 
kafka.admin.LeaderElectionCommand$.run(LeaderElectionCommand.scala:89)*01:38:13*
 at 
kafka.admin.LeaderElectionCommand$.main(LeaderElectionCommand.scala:42)*01:38:13*
 at 
kafka.admin.LeaderElectionCommandTest.$anonfun$testTopicPartition$1(LeaderElectionCommandTest.scala:125)*01:38:13*
 at 
kafka.admin.LeaderElectionCommandTest.$anonfun$testTopicPartition$1$adapted(LeaderElectionCommandTest.scala:103)*01:38:13*
 at kafka.utils.TestUtils$.resource(TestUtils.scala:1528)*01:38:13* 
at 
kafka.admin.LeaderElectionCommandTest.testTopicPartition(LeaderElectionCommandTest.scala:103)*01:38:13*
 *01:38:13* Caused by:*01:38:13* 
org.apache.kafka.common.errors.TimeoutException: Aborted due to 
timeout.*01:38:13*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7853) Refactor ConsumerCoordinator/AbstractCoordinator to reduce constructor parameter list

2019-06-17 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7853.

Resolution: Fixed

> Refactor ConsumerCoordinator/AbstractCoordinator to reduce constructor 
> parameter list
> -
>
> Key: KAFKA-7853
> URL: https://issues.apache.org/jira/browse/KAFKA-7853
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> The parameter lists for class ConsumerCoordinator/AbstractCoordinator are 
> growing over time. We should think of reducing the parameter size by 
> introducing some intermediate data structs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8555) Flaky test ExampleConnectIntegrationTest#testSourceConnector

2019-06-18 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8555:
--

 Summary: Flaky test 
ExampleConnectIntegrationTest#testSourceConnector
 Key: KAFKA-8555
 URL: https://issues.apache.org/jira/browse/KAFKA-8555
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/22798/console]
*02:03:21* 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector
 failed, log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk8-scala2.11/connect/runtime/build/reports/testOutput/org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector.test.stdout*02:03:21*
 *02:03:21* org.apache.kafka.connect.integration.ExampleConnectIntegrationTest 
> testSourceConnector FAILED*02:03:21* 
org.apache.kafka.connect.errors.DataException: Insufficient records committed 
by connector simple-conn in 15000 millis. Records expected=2000, 
actual=1013*02:03:21* at 
org.apache.kafka.connect.integration.ConnectorHandle.awaitCommits(ConnectorHandle.java:188)*02:03:21*
 at 
org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector(ExampleConnectIntegrationTest.java:181)*02:03:21*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8556) Add system tests for assignment stickiness validation

2019-06-18 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8556:
--

 Summary: Add system tests for assignment stickiness validation
 Key: KAFKA-8556
 URL: https://issues.apache.org/jira/browse/KAFKA-8556
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (KAFKA-4222) Transient failure in QueryableStateIntegrationTest.queryOnRebalance

2019-06-18 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reopened KAFKA-4222:

  Assignee: (was: Damian Guy)

> Transient failure in QueryableStateIntegrationTest.queryOnRebalance
> ---
>
> Key: KAFKA-4222
> URL: https://issues.apache.org/jira/browse/KAFKA-4222
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Jason Gustafson
>Priority: Major
> Fix For: 0.11.0.0
>
>
> Seen here: https://builds.apache.org/job/kafka-trunk-jdk8/915/console
> {code}
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
> queryOnRebalance[1] FAILED
> java.lang.AssertionError: Condition not met within timeout 3. waiting 
> for metadata, store and value to be non null
> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.verifyAllKVKeys(QueryableStateIntegrationTest.java:263)
> at 
> org.apache.kafka.streams.integration.QueryableStateIntegrationTest.queryOnRebalance(QueryableStateIntegrationTest.java:342)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8500) member.id should always update upon static member rejoin despite of group state

2019-06-19 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-8500.

Resolution: Fixed

> member.id should always update upon static member rejoin despite of group 
> state
> ---
>
> Key: KAFKA-8500
> URL: https://issues.apache.org/jira/browse/KAFKA-8500
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3.0
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Blocker
>
> A blocking bug was detected by [~guozhang] that the `member.id` wasn't get 
> updated upon static member rejoining when the group is not in stable state. 
> This could make duplicate member fencing harder and potentially yield 
> incorrect processing outputs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8569) Integrate the poll timeout warning with leave group call

2019-06-19 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8569:
--

 Summary: Integrate the poll timeout warning with leave group call
 Key: KAFKA-8569
 URL: https://issues.apache.org/jira/browse/KAFKA-8569
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


Under static membership, we may be polluting our log by seeing a bunch of 
consecutive warning message upon poll timeout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7857) Add AbstractCoordinatorConfig class to consolidate consumer coordinator configs between Consumer and Connect

2019-06-19 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7857.

Resolution: Fixed

[https://github.com/apache/kafka/pull/6854] fixes it

> Add AbstractCoordinatorConfig class to consolidate consumer coordinator 
> configs between Consumer and Connect
> 
>
> Key: KAFKA-7857
> URL: https://issues.apache.org/jira/browse/KAFKA-7857
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Right now there are a lot of duplicate configuration concerning client 
> coordinator shared across ConsumerConfig and DistributedConfig (connect 
> config). It makes sense to extract all coordinator related configs into a 
> separate config class to reduce code redundancy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8587) EOS producer should understand consumer group assignment semantics

2019-06-22 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8587:
--

 Summary: EOS producer should understand consumer group assignment 
semantics
 Key: KAFKA-8587
 URL: https://issues.apache.org/jira/browse/KAFKA-8587
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen


Currently exactly-once producer is coupled with individual input partitions. 
This is not a well scaled solution, and the root cause is that EOS producer 
doesn't understand the topic partition move throughout consumer group 
rebalance. By covering this semantic gap, we could achieve much better EOS 
scalability.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8589) Flakey test ResetConsumerGroupOffsetTest#testResetOffsetsExistingTopic

2019-06-23 Thread Boyang Chen (JIRA)
Boyang Chen created KAFKA-8589:
--

 Summary: Flakey test 
ResetConsumerGroupOffsetTest#testResetOffsetsExistingTopic
 Key: KAFKA-8589
 URL: https://issues.apache.org/jira/browse/KAFKA-8589
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5724/consoleFull]
*20:25:15* 
kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic failed, 
log available in 
/home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic.test.stdout*20:25:15*
 *20:25:15* kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsExistingTopic FAILED*20:25:15* 
java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.*20:25:15* at 
org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)*20:25:15*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)*20:25:15*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)*20:25:15*
 at 
org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)*20:25:15*
 at 
kafka.admin.ConsumerGroupCommand$ConsumerGroupService.$anonfun$resetOffsets$1(ConsumerGroupCommand.scala:379)*20:25:15*
 at 
scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:160)*20:25:15*
 at 
scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:160)*20:25:15*
 at scala.collection.Iterator.foreach(Iterator.scala:941)*20:25:15* 
at scala.collection.Iterator.foreach$(Iterator.scala:941)*20:25:15* 
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)*20:25:15* 
at scala.collection.IterableLike.foreach(IterableLike.scala:74)*20:25:15*   
  at 
scala.collection.IterableLike.foreach$(IterableLike.scala:73)*20:25:15* 
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)*20:25:15*   
  at 
scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:160)*20:25:15*  
   at 
scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:158)*20:25:15* 
at 
scala.collection.AbstractTraversable.foldLeft(Traversable.scala:108)*20:25:15*  
   at 
kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:377)*20:25:15*
 at 
kafka.admin.ResetConsumerGroupOffsetTest.resetOffsets(ResetConsumerGroupOffsetTest.scala:507)*20:25:15*
 at 
kafka.admin.ResetConsumerGroupOffsetTest.resetAndAssertOffsets(ResetConsumerGroupOffsetTest.scala:477)*20:25:15*
 at 
kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsExistingTopic(ResetConsumerGroupOffsetTest.scala:123)*20:25:15*
 *20:25:15* Caused by:*20:25:15* 
org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
coordinator is not available.*20*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   3   4   >