lucasbru opened a new pull request, #20127:
URL: https://github.com/apache/kafka/pull/20127

   Task pairs is an optimization that is enabled in the current sticky task 
assignor.
   
   The basic idea is that every time we add a task A to a client that has tasks 
B, C, we add pairs (A, B) and (A, C) to a global collection of pairs. When 
adding a standby task, we then prioritize creating standby tasks that create 
new task pairs. If this does not work, we fall back to the original behavior.
   
   The complexity of this optimization is fairly significant, and its 
usefulness is questionable, the HighAvailabilityAssignor does not seem to have 
such an optimization, and the absence of this optimization does not seem to 
have caused any problems that I know of. I could not find any what this 
optimization is actually trying to achieve. 
   
   A side effect of it is that we will sometimes avoid “small loops”, such as 
        
        Node A: ActiveTask1, StandbyTask2
        Node B: ActiveTask2, StandbyTask1
        Node C: ActiveTask3, StandbyTask4
        Node D: ActiveTask4, StandbyTask3
   
   So a small loop like this, worst case losing two nodes will cause 2 tasks to 
go down, so the assignor is preferring
   
        Node A: ActiveTask1, StandbyTask4
        Node B: ActiveTask2, StandbyTask1
        Node C: ActiveTask3, StandbyTask2
        Node D: ActiveTask4, StandbyTask3
   
   Which is a “big loop” assignment, where worst-case losing two nodes will 
cause at most 1 task to be unavailable. However, this optimization seems fairly 
niche, and also the current implementation does not seem to implement it in a 
direct form, but a more relaxed constraint which usually does not always avoid 
small loops. So it remains unclear whether this is really the intention behind 
the optimization. The current unit tests of the StickyTasksAssignor pass even 
after removing the optimization.
   
   The pairs optimization has a worst-case quadratic space and time complexity 
in the number of tasks, and make a lot of other optimizations impossible, so 
I’d suggest we remove it. I don’t think, in its current form, it is suitable to 
be implemented in a broker-side assignor. Note, however, if we identify a 
useful effect of the code in the future, we can work on finding an efficient 
algorithm that can bring the optimization to our broker-side assignor. 
   
   This reduces the runtime of our worst case benchmark by 10x.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to