Hi dev,

Who's the best person to ask about the design of LeaderElector and ElectionContext?

I ask because I've found the election logic to be somewhat brittle in practice. During a rolling restart, it's not uncommon to end up in a state where there's no Overseer. I've even seen this locally with as few as two nodes.

When this happens, I've tried (for example) deleting all the children under /solr/overseer_elect/election. In theory, this should trigger the watches on every node, forcing each one to re-register and contend for leadership, but in practice I haven't found this to work.

I've been digging into the LeaderElection code, and it seems much more complicated than I would have expected. Can anyone give me the theory of operation, especially around the joinAtHead and replacement flags?

Thanks!
Scott
