mjsax commented on PR #21565:
URL: https://github.com/apache/kafka/pull/21565#issuecomment-4042902289
I did not review this PR, but I just had a question / comment about this. If
this PR does address this already, great, but based on the comments (which I
skimmed over) I believe the PR might not do this.
For static group membership, when a member re-joins, there is a couple of
corner case for which it it not enough to just return the previous assignment
(or just bump the epoch), but we need to do more:
- `application.server` config changed -- for this case, we **don't** need
to compute a new task assignment, but we need to ensure that the new
`application.server` config is broadcasted to all other members
- `client.tags` config changes -- if standby tasks are enabled (ie, set to
`> 1`), I think we need to re-compute the standby task assignment (not full
assignment)
- `rack.id` -- this one is a little tricky -- it's not used broker side
atm, but it might be used to compute an assignment -- for a static member, I
would not really expect it to change, but if it does change, I think we need to
re-compute the full assignment
- for `processId` I am not sure?
- are there any other thing that could change and also require specific
handling?
About Lucas' comment:
```
I think the difference is mainly in configs that are always fixed in Kafka
Streams clients
rebalance.timeout.ms
user.endpoint
client ID
client Host
```
For static member, when they are stopped and restarted, we cannot guarantee
that the configs do not change. So we need to consider this case IMHO -- as a
matter of fact, I know explicitly for `application.server` config (aka
`user.endpoint`) that it was an issue in the "classic" protocol reported by
users, that no rebalance was triggered if this config was update for static
members... It was not really possible to fix for "classic" protocol, but we
should ensure that we fix it for 1071.
Of course, bumping the epoch will be required in all these cases (not trying
to say otherwise), but what I am saying is that bumping the epoch does not
imply, that we need to re-compute the assignment for all cases, nor that it is
sufficient to only bump the epoch for all cases.
Btw: these things are very subtle, and we will need to update the
documentation accordingly to explain how this all works -- wondering if it
might even make sense to update the KIP with it? It's more than an
implementation detail IMHO.
Curious to hear what you think about this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]