Guozhang Wang created KAFKA-9801:
------------------------------------
Summary: Static member could get empty assignment unexpectedly
Key: KAFKA-9801
URL: https://issues.apache.org/jira/browse/KAFKA-9801
Project: Kafka
Issue Type: Bug
Components: consumer, streams
Affects Versions: 2.4.0
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Take the following example trace where static members are joining the group:
1. Static member with instance A joined the group with an empty member.id; the
coordinator generated member.id 1 for A and added it to the group. The group
state is PreparingRebalance.
2. The group is formed and now we move on to CompletingRebalance.
3. Another member joins the group, causing it to transition back to
PreparingRebalance, which could send a REBALANCE_IN_PROGRESS error to
member A as well.
4. Member A gets the REBALANCE_IN_PROGRESS error and tries to re-join (again
with an empty member.id).
5. The group is now advanced to CompletingRebalance again.
6. The group gets the second join-group from the known instance A with an empty
member.id, generates a new member.id 2, and replaces member.id 1 with it.
7. The group gets the assignment from leader which only includes member.id 1
and not member.id 2.
8. The assignment for member.id 1 is dropped on the broker side while the
assignment for member.id 2 is set to an empty byte array.
9. The empty byte array is sent back to instance A, causing the following
error:
{code}
[2020-03-27T21:13:01-05:00] (streams-soak-2-5_soak_i-054b83e98b7ed6285_streamslog) org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'version': java.nio.BufferUnderflowException
	at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:110)
{code}
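As a rough illustration of why an empty byte array fails at the 'version'
field: assuming the assignment payload begins with a 2-byte version, the first
read on an empty buffer has nothing to consume. The sketch below uses only
java.nio; the class and the readVersion helper are hypothetical stand-ins, not
Kafka's actual deserialization code.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Kafka.
class EmptyAssignmentDemo {
    // Stand-in for the first decoding step: read a 2-byte version field
    // from the head of the buffer (assumed layout).
    static short readVersion(ByteBuffer buffer) {
        return buffer.getShort();
    }

    public static void main(String[] args) {
        // The empty assignment that the broker returns in step 9.
        ByteBuffer empty = ByteBuffer.wrap(new byte[0]);
        try {
            readVersion(empty);
            System.out.println("version read");
        } catch (BufferUnderflowException e) {
            // getShort() needs two remaining bytes; an empty buffer has none.
            System.out.println("BufferUnderflowException");
        }
    }
}
```

In the real client this underflow surfaces wrapped in the SchemaException shown
in the log above.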
This error is only triggered when several conditions line up as in the trace
above, which is why it has not been observed very frequently.
Personally I think this error is also correlated with the behavior observed in
https://issues.apache.org/jira/browse/KAFKA-9659, which I will keep
investigating (will update in this ticket).
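To make the ordering in steps 1-9 concrete, here is a toy model of the
member.id bookkeeping. It is a sketch under stated assumptions, not the real
group coordinator: the class, its methods, and the two maps are hypothetical,
and group state transitions are left out entirely.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only; all names here are hypothetical.
class StaticMemberTrace {
    // group.instance.id -> current member.id
    private final Map<String, String> staticMembers = new HashMap<>();
    // member.id -> assignment bytes held by the group
    private final Map<String, byte[]> assignments = new LinkedHashMap<>();
    private int nextId = 1;

    // Join from a static member with an empty member.id: always generate a
    // fresh member.id, replacing the old one if the instance is already known.
    String joinWithEmptyMemberId(String instanceId) {
        String newId = String.valueOf(nextId++);
        String oldId = staticMembers.put(instanceId, newId);
        if (oldId != null) {
            assignments.remove(oldId);
        }
        assignments.put(newId, new byte[0]);
        return newId;
    }

    // Assignment from the leader: entries for unknown member.ids are dropped,
    // and current members missing from the leader's map get an empty array.
    void syncFromLeader(Map<String, byte[]> leaderAssignment) {
        assignments.replaceAll((id, old) -> leaderAssignment.getOrDefault(id, new byte[0]));
    }

    byte[] assignmentFor(String memberId) {
        return assignments.get(memberId);
    }

    public static void main(String[] args) {
        StaticMemberTrace group = new StaticMemberTrace();
        String first = group.joinWithEmptyMemberId("A");   // step 1

        // The leader computes its assignment while A is still member.id 1.
        Map<String, byte[]> leaderAssignment = new HashMap<>();
        leaderAssignment.put(first, new byte[] {0, 1});

        String second = group.joinWithEmptyMemberId("A");  // step 6
        group.syncFromLeader(leaderAssignment);            // steps 7-8

        System.out.println("member.id " + first + " still known: "
                + (group.assignmentFor(first) != null));
        System.out.println("member.id " + second + " assignment length: "
                + group.assignmentFor(second).length);
    }
}
```

In this model the leader's entry for member.id 1 is discarded and member.id 2
ends up with a zero-length assignment, which is the payload returned to
instance A in step 9.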
--
This message was sent by Atlassian Jira
(v8.3.4#803005)