squah-confluent commented on PR #20000:
URL: https://github.com/apache/kafka/pull/20000#issuecomment-3707344977

   @FrankYang0529 Thanks for collecting the new benchmark results. The 
non-rack-aware numbers look good. The rack-aware numbers are better but still a 
little slow. It's not ideal to block the group coordinator thread for 150 ms. 
Maybe this won't be too bad in practice, since
   1. most groups won't be as large
   2. I'm working on a KIP to reduce the impact of slow assignors
   
   If we really want to, I think it's possible to improve performance further 
by re-designing the `SubscribedTopicDescriber.racksForPartition` interface, but 
maybe it's best left to a separate PR. `jmh-benchmarks/README.md` has 
instructions for running the benchmarks with libasyncProfiler which will 
generate a flame graph of the assignor run.
   
   Separately, I have some concerns about stickiness when static members are 
replaced. The group coordinator assigns the new static member a new member id 
and keeps the previous assignment, so the order of member ids is not stable 
(I'm aware the existing range assignors also have this problem). How expensive 
would it be to track the previous owner of partitions in 
`maybeRevokePartitions` and maybe add a new pass in between 
`assignRackAwarenessRemainingPartitions` and `assignRemainingPartitions` to 
restore those partitions to their preferred sticky members?
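   
   To make the idea concrete, here is a rough sketch of what such a pass might 
look like. It is not taken from the PR: `StickyRestoreSketch`, 
`restoreStickyOwners`, and the plain `Integer`/`String` stand-ins for 
partitions and member ids are all hypothetical, and it assumes the previous 
owners were already recorded while revoking.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; none of these names exist in the PR or in Kafka.
final class StickyRestoreSketch {

    /**
     * Hands still-unassigned partitions back to the member that owned them
     * before revocation, as long as that member is still in the group and
     * below its quota. Partitions that cannot be restored stay in
     * `unassigned` for the later passes to handle.
     *
     * @param previousOwner partition -> member id recorded during revocation (assumed input)
     * @param unassigned    partitions not yet assigned; mutated in place
     * @param assignment    member id -> partitions assigned so far; mutated in place
     * @param quota         member id -> maximum number of partitions it may own
     */
    static void restoreStickyOwners(
            Map<Integer, String> previousOwner,
            List<Integer> unassigned,
            Map<String, Set<Integer>> assignment,
            Map<String, Integer> quota) {
        Iterator<Integer> it = unassigned.iterator();
        while (it.hasNext()) {
            Integer partition = it.next();
            String owner = previousOwner.get(partition);
            if (owner == null) {
                continue; // never owned, or owner not tracked
            }
            Set<Integer> owned = assignment.get(owner);
            Integer limit = quota.get(owner);
            if (owned == null || limit == null) {
                continue; // previous owner is no longer a member
            }
            if (owned.size() < limit) {
                owned.add(partition);
                it.remove(); // restored, so later passes skip it
            }
        }
    }
}
```

   The extra cost of the pass itself is one map lookup per unassigned 
partition. The part I'm less sure about is the bookkeeping needed to populate 
`previousOwner` during revocation.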

