[ 
https://issues.apache.org/jira/browse/KAFKA-15520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770635#comment-17770635
 ] 

A. Sophie Blee-Goldman commented on KAFKA-15520:
------------------------------------------------

{quote}Also added
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, 
CooperativeStickyAssignor.class.getName());

to enable CooperativeStickyAssignor
{quote}
To clarify, Streams always overrides and ignores this setting because it 
needs to plug in its own StreamsPartitionAssignor, so you don't need it, at 
least not for any Streams application. Streams uses cooperative rebalancing 
by default as of version 2.4.

I guess the one thing to watch out for is that cooperative rebalancing will not 
be enabled if you upgrade from a lower version and forget or skip the step to 
remove the StreamsConfig.UPGRADE_FROM_CONFIG ("upgrade.from") property. So just 
make sure that's not being set anywhere (at least, not set to a version that's 
lower than 2.4).
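
For what it's worth, here is a rough sketch of a startup sanity check for both points above. It is only an illustration, not anything that ships with Kafka: the class and method names (RebalanceConfigCheck, blocksCooperativeRebalancing, setsIgnoredAssignor) are made up, and it uses the literal config keys "upgrade.from" and "partition.assignment.strategy" instead of the StreamsConfig/ConsumerConfig constants so that it compiles without the Kafka jars. It also assumes the upgrade.from value is a plain dotted version such as "2.3" or "0.10.0".

```java
import java.util.Properties;

public class RebalanceConfigCheck {
    // Literal key behind StreamsConfig.UPGRADE_FROM_CONFIG.
    static final String UPGRADE_FROM = "upgrade.from";
    // Literal key behind ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG.
    static final String ASSIGNMENT_STRATEGY = "partition.assignment.strategy";

    /**
     * Returns true if upgrade.from is set to a version lower than 2.4,
     * which is the case described above where cooperative rebalancing
     * would not be enabled. Assumes a dotted "major.minor[.patch]" value.
     */
    static boolean blocksCooperativeRebalancing(Properties props) {
        String value = props.getProperty(UPGRADE_FROM);
        if (value == null || value.trim().isEmpty()) {
            return false; // not set: nothing holding Streams back
        }
        String[] parts = value.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major < 2 || (major == 2 && minor < 4);
    }

    /**
     * Returns true if the application sets an assignment strategy.
     * Streams overrides it, so the setting has no effect there.
     */
    static boolean setsIgnoredAssignor(Properties props) {
        return props.containsKey(ASSIGNMENT_STRATEGY);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(UPGRADE_FROM, "2.3");
        props.put(ASSIGNMENT_STRATEGY, "CooperativeStickyAssignor");

        if (blocksCooperativeRebalancing(props)) {
            System.out.println("upgrade.from is below 2.4: remove it once the upgrade is done");
        }
        if (setsIgnoredAssignor(props)) {
            System.out.println("partition.assignment.strategy is ignored by Streams");
        }
    }
}
```

You would run the two checks against the same Properties object that is passed to the KafkaStreams constructor, before starting the application.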

> Kafka Streams Stateful Aggregation Rebalancing causing processing to pause on 
> all partitions
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-15520
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15520
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.2
>            Reporter: Rohit Bobade
>            Priority: Major
>
> Kafka broker version: 2.8.0 Kafka Streams client version: 2.6.2
> I am running kafka streams stateful aggregations on K8s statefulset with 
> persistent volume attached to each pod. I have also specified
> props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, podName);
> which makes sure it gets the sticky partition assignment.
> Enabled standby replica - 
> props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
> and set props.put(StreamsConfig.ACCEPTABLE_RECOVERY_LAG_CONFIG, "0");
> However, I'm seeing that when pods restart, it triggers rebalances and 
> causes processing to be paused on all pods until the rebalance and state 
> restore are complete.
> My understanding is that even if there is a rebalance - only the partitions 
> that should be moved around will be restored in a cooperative way and not 
> pause all the processing. Also, it should failover to standby replica in this 
> case and avoid state restoring on other pods.
> I have increased session timeout to 480 seconds and max poll interval to 15 
> mins to minimize rebalances.
> Also added
> props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, 
> CooperativeStickyAssignor.class.getName());
> to enable CooperativeStickyAssignor
> Could someone please help if I'm missing something?
>  



