[ https://issues.apache.org/jira/browse/KAFKA-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sophie Blee-Goldman reassigned KAFKA-10455:
-------------------------------------------

    Assignee: Leah Thomas

> Probing rebalances are not guaranteed to be triggered by non-leader members
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-10455
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10455
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.0
>            Reporter: Sophie Blee-Goldman
>            Assignee: Leah Thomas
>            Priority: Blocker
>             Fix For: 2.7.0, 2.6.1
>
>
> Apparently, if a consumer rejoins the group with the same subscription
> userdata that it previously sent, it will not trigger a rebalance. The one
> exception here is that the group leader will always trigger a rebalance when
> it rejoins the group.
> This has implications for KIP-441, where we rely on asking an arbitrary
> thread to enforce the followup probing rebalances. Technically we do ask a
> thread living on the same instance as the leader, so the odds that the leader
> will be chosen aren't completely abysmal, but for any multithreaded
> application they are still at best only 50%.
> Of course, in general the userdata will have changed within a span of 10
> minutes, so the actual likelihood of hitting this is much lower – it can
> only happen if the member's task offset sums remained unchanged.
> Realistically, this probably requires that the member only have
> fully-restored active tasks (encoded with the constant sentinel -2) and that
> no tasks be added or removed.
>
> One solution would be to make sure the leader is responsible for the probing
> rebalance. To do this, we would need to somehow expose the memberId of the
> thread's main consumer to the partition assignor. I'm actually not sure if
> that's currently possible to figure out or not. If not, we could just assign
> the probing rebalance to every thread on the leader's instance. This
> shouldn't result in multiple followup rebalances, as the rebalance schedule
> will be updated/reset on the first followup rebalance.
> Another solution would be to make sure the userdata is always different. We
> could encode an extra bit that flip-flops, but then we'd have to persist the
> latest value somewhere/somehow. Alternatively we could just encode the next
> probing rebalance time in the subscription userdata, since that is guaranteed
> to always be different from the previous rebalance. This might get tricky
> though, and certainly wastes space in the subscription userdata. Also, this
> would only solve the problem for KIP-441 probing rebalances, meaning we'd
> have to individually ensure the userdata has changed for every type of
> followup rebalance (see related issue below). So the first proposal,
> requiring that the leader trigger the rebalance, would be preferable.
> Note that, imho, we should just allow anyone to trigger a rebalance by
> rejoining the group. But this would presumably require a broker-side change,
> and thus we would still need a workaround for KIP-441 to work with brokers
> that lack it.
>
> Related issue:
> This also means the Streams workaround for [KAFKA-9821|http://example.com] is
> not airtight, as we encode the followup rebalance in the member who is
> supposed to _receive_ a revoked partition, rather than the member who is
> actually revoking said partition. While the member doing the revoking will be
> guaranteed to have different userdata, the member receiving the partition may
> not. Making it the responsibility of the leader to trigger _any_ type of
> followup rebalance would solve this issue as well.
> Note that other types of followup rebalance (version probing, static
> membership with host info change) are guaranteed to have a change in the
> subscription userdata, and will not hit this bug.
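
To make the rejoin rule described in the issue concrete, here is a minimal toy model in Java. It is only a sketch of the behavior as the ticket describes it, not the broker's or Streams' actual code: the class `RejoinModel`, its two methods, and the two-field userdata layout are all hypothetical names and assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Toy model only: these names are hypothetical and are not Kafka APIs.
final class RejoinModel {

    // Rule as described in the ticket: a rejoin forces a rebalance if the
    // member is the group leader, or if its subscription userdata differs
    // from what it sent previously.
    static boolean triggersRebalance(boolean isLeader,
                                     byte[] previousUserData,
                                     byte[] newUserData) {
        return isLeader || !Arrays.equals(previousUserData, newUserData);
    }

    // Assumed userdata layout for illustration: a single task offset sum
    // followed by the next probing rebalance time in milliseconds.
    static byte[] encode(long taskOffsetSum, long nextProbingRebalanceMs) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(taskOffsetSum)
                .putLong(nextProbingRebalanceMs)
                .array();
    }
}
```

Under this model, a non-leader whose only task is fully restored (offset sum encoded as the sentinel -2) sends identical bytes on both joins, so `triggersRebalance(false, encode(-2L, 0L), encode(-2L, 0L))` is false and the probing rebalance never happens. Passing `true` for `isLeader`, or encoding a different next probing rebalance time in the second subscription, makes the same call return true, which corresponds to the two solutions proposed in the issue.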