imply-cheddar commented on PR #11760:
URL: https://github.com/apache/druid/pull/11760#issuecomment-1162426852

   ZK segment loading is broken right now.  About 2 years ago, a PR was merged 
that broke the order of segment loading and dropping via ZK, such that 
assignment can deadlock when a cluster is mostly full.  This hasn't been a 
widespread issue (personally, I only learned about it ~6 months ago) because the 
largest clusters (at least that I'm aware of) have all been using HTTP segment 
assignment.
   
   https://github.com/apache/druid/pull/11717 has been merged.  While it is and 
was a bug, it was a corner case that we've only seen in development 
environments and never in a production environment.  Every 
cluster I touch, I move from ZK assignment to HTTP assignment because, in my 
experience, HTTP assignment is more stable.  I'm +1 on this 
directionally, but the PR does need the tests fixed as Kashif suggested before 
it can be approved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


