[GitHub] [kafka] ableegoldman edited a comment on pull request #8588: [WIP] KAFKA-6145: KIP-441: Validate balanced assignment

GitBox Thu, 30 Apr 2020 15:24:12 -0700


ableegoldman edited a comment on pull request #8588:
URL: https://github.com/apache/kafka/pull/8588#issuecomment-622086284



   Not sure I'm on the same page w.r.t. interpreting "balance" at the client 
level. Here's the proposal we discussed a while back:
   
   - In the "under-capacity" case, there are more tasks than the number of 
threads. We aim to give each thread an equal number N >= 1 of tasks, so all 
clients get tasks proportional to their capacity. Of course this means one 
client's task count can exceed another's by more than `balance.factor`, 
so this would violate "client-level" balance but satisfy "thread-level" 
balance.
   
   - In the "over-capacity" case, there are fewer tasks than the number of 
threads so some threads will necessarily be idle. This is the situation in 
KAFKA-9173. In this case, we can actually satisfy both thread-level and 
client-level balance: we get thread-level by default, so we just have to make 
an effort to spread the tasks evenly over clients as well. The relevant point 
here is we should **only verify client-level balance in the over-capacity 
case** (but always verify thread-level balance).
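To make the two cases concrete, here is a rough sketch of what such a validation could look like. It is not the code in this PR: `BalanceCheck`, `Client`, and the method names are hypothetical, it assumes each client spreads its tasks evenly over its own threads, and it treats `balanceFactor` as the maximum allowed difference in task counts.

```java
import java.util.Arrays;
import java.util.List;

public class BalanceCheck {
    /** Hypothetical snapshot of one client: its thread count and assigned task count. */
    static final class Client {
        final int threads;
        final int tasks;

        Client(int threads, int tasks) {
            if (threads < 1) throw new IllegalArgumentException("threads must be >= 1");
            this.threads = threads;
            this.tasks = tasks;
        }
    }

    /** Over-capacity: fewer tasks than threads in total, so some threads must idle. */
    static boolean isOverCapacity(List<Client> clients) {
        int threads = 0;
        int tasks = 0;
        for (Client c : clients) {
            threads += c.threads;
            tasks += c.tasks;
        }
        return tasks < threads;
    }

    /** Thread-level: the busiest and idlest threads differ by at most balanceFactor tasks. */
    static boolean isThreadBalanced(List<Client> clients, int balanceFactor) {
        if (clients.isEmpty()) return true;
        int max = 0;
        int min = Integer.MAX_VALUE;
        for (Client c : clients) {
            // Assumption: a client spreads its tasks evenly over its own threads.
            max = Math.max(max, (c.tasks + c.threads - 1) / c.threads); // busiest thread (ceil)
            min = Math.min(min, c.tasks / c.threads);                   // idlest thread (floor)
        }
        return max - min <= balanceFactor;
    }

    /** Client-level: total task counts of any two clients differ by at most balanceFactor. */
    static boolean isClientBalanced(List<Client> clients, int balanceFactor) {
        if (clients.isEmpty()) return true;
        int max = 0;
        int min = Integer.MAX_VALUE;
        for (Client c : clients) {
            max = Math.max(max, c.tasks);
            min = Math.min(min, c.tasks);
        }
        return max - min <= balanceFactor;
    }

    /** Always verify thread-level balance; verify client-level only when over-capacity. */
    static boolean isValid(List<Client> clients, int balanceFactor) {
        return isThreadBalanced(clients, balanceFactor)
            && (!isOverCapacity(clients) || isClientBalanced(clients, balanceFactor));
    }

    public static void main(String[] args) {
        // Under-capacity, skewed client totals: only thread-level balance is checked.
        System.out.println(isValid(Arrays.asList(new Client(10, 20), new Client(1, 2)), 1));
        // Over-capacity, all tasks on one client: the client-level check fails.
        System.out.println(isValid(Arrays.asList(new Client(4, 4), new Client(4, 0)), 1));
    }
}
```

With a balance factor of 1, the first assignment passes even though one client holds 18 more tasks than the other, while the second fails because the idle client could have taken half of the tasks.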
   
   Presumably most applications run instances with roughly similar capacity, in 
which case thread-level balance will collapse to give client-level balance as 
well. Since we get both from the over-capacity case as well, the only relevant 
edge case is when we are under-capacity with large per-machine capacity 
variation. Surely if you're running one machine with 10 threads and one machine 
with 1, and there are enough tasks to saturate both, you would expect the first 
machine to get 10x the task load of the second?
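As a quick illustration of the arithmetic in that edge case, the sketch below splits tasks in proportion to thread counts. `ProportionalShare` and `shares` are made-up names for this example only, and the way leftover tasks are handed out (one each, in client order) is an arbitrary assumption rather than anything the assignor is known to do.

```java
import java.util.Arrays;

public class ProportionalShare {
    /**
     * Splits totalTasks across clients in proportion to their thread counts.
     * Assumes at least one client and that every thread count is >= 1.
     */
    static int[] shares(int[] threads, int totalTasks) {
        int totalThreads = 0;
        for (int t : threads) {
            totalThreads += t;
        }
        int[] result = new int[threads.length];
        int assigned = 0;
        for (int i = 0; i < threads.length; i++) {
            // Integer division rounds each client's proportional share down.
            result[i] = totalTasks * threads[i] / totalThreads;
            assigned += result[i];
        }
        // Hand out any leftover tasks one at a time in client order (arbitrary tie-break).
        for (int i = 0; assigned < totalTasks; i = (i + 1) % threads.length) {
            result[i]++;
            assigned++;
        }
        return result;
    }

    public static void main(String[] args) {
        // One machine with 10 threads, one with 1 thread, 22 tasks in total.
        System.out.println(Arrays.toString(shares(new int[] {10, 1}, 22))); // prints [20, 2]
    }
}
```

Each thread ends up with two tasks, so the 10-thread machine carries 10x the load of the single-thread machine.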
   
   edit: fixed the description of over/under capacity cases


