A. Sophie Blee-Goldman created KAFKA-12984:
----------------------------------------------

             Summary: Cooperative sticky assignor can get stuck with invalid SubscriptionState input metadata
                 Key: KAFKA-12984
                 URL: https://issues.apache.org/jira/browse/KAFKA-12984
             Project: Kafka
          Issue Type: Bug
          Components: consumer
            Reporter: A. Sophie Blee-Goldman
             Fix For: 3.0.0, 2.8.1


Some users have reported seeing their consumer group become stuck in the 
CompletingRebalance phase when using the cooperative-sticky assignor. Based on 
the request metadata, we were able to deduce that multiple consumers were 
reporting the same partition(s) in the "ownedPartitions" field of the 
consumer protocol. Because this is an invalid state, the cooperative-sticky 
assignor detects that something is wrong and throws an 
IllegalStateException. If the consumer application is set up to simply retry, 
the group will appear to hang in the rebalance state.
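
To illustrate the failure mode described above, here is a minimal sketch of a 
double-ownership check. It is not the actual assignor code: the class name 
OwnedPartitionsCheck, the method ownersByPartition, and the use of plain 
strings for member ids and partitions are all assumptions made for 
illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka: models "ownedPartitions" as
// memberId -> list of partition names such as "topic-0".
final class OwnedPartitionsCheck {

    // Builds a partition -> owner map and fails fast if two different
    // members claim the same partition.
    static Map<String, String> ownersByPartition(Map<String, List<String>> ownedByMember) {
        Map<String, String> owners = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : ownedByMember.entrySet()) {
            String member = entry.getKey();
            for (String partition : entry.getValue()) {
                String previous = owners.putIfAbsent(partition, member);
                if (previous != null && !previous.equals(member)) {
                    throw new IllegalStateException(
                        "Partition " + partition + " is claimed by both "
                            + previous + " and " + member);
                }
            }
        }
        return owners;
    }
}
```

If every member resubmits the same claims on retry, a check like this throws 
on every attempt, which would match the hang that users observed.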

The "ownedPartitions" field is encoded based on the ConsumerCoordinator's 
SubscriptionState, which was assumed to always be up to date. However, there may 
be cases where the consumer has dropped out of the group but fails to clear the 
SubscriptionState, allowing it to report some partitions as owned that have 
since been reassigned to another member.

We should (a) fix the sticky assignment algorithm to resolve cases of improper 
input conditions by invalidating the "ownedPartitions" in cases of double 
ownership, and (b) shore up the ConsumerCoordinator logic to better handle 
rejoining the group and keeping its internal state consistent. See KAFKA-12983 
for more details on (b).
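
One way to picture (a) is the sketch below, which treats any partition 
claimed by more than one member as unowned before assignment proceeds. This is 
only an assumption about what "invalidating" could look like, since the 
description above does not say whether the whole "ownedPartitions" list or only 
the conflicting entries should be dropped. The class OwnedPartitionsSanitizer 
and the method dropDoubleOwned are hypothetical names, not Kafka APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Kafka: models "ownedPartitions" as
// memberId -> list of partition names such as "topic-0".
final class OwnedPartitionsSanitizer {

    // Returns a copy of the input in which every partition claimed by more
    // than one member has been removed from all of its claimants.
    static Map<String, List<String>> dropDoubleOwned(Map<String, List<String>> ownedByMember) {
        // First pass: collect the distinct members claiming each partition.
        Map<String, Set<String>> claimants = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : ownedByMember.entrySet()) {
            for (String partition : entry.getValue()) {
                claimants.computeIfAbsent(partition, k -> new HashSet<>()).add(entry.getKey());
            }
        }
        // Second pass: keep only the partitions with exactly one claimant.
        Map<String, List<String>> sanitized = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : ownedByMember.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String partition : entry.getValue()) {
                if (claimants.get(partition).size() == 1) {
                    kept.add(partition);
                }
            }
            sanitized.put(entry.getKey(), kept);
        }
        return sanitized;
    }
}
```

With claims c1 = [t-0, t-1] and c2 = [t-1, t-2], this sketch leaves c1 with 
[t-0] and c2 with [t-2], and t-1 is free to be assigned like any unowned 
partition instead of triggering an exception.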



--
This message was sent by Atlassian Jira
(v8.3.4#803005)