kotman12 commented on PR #3330:
URL: https://github.com/apache/solr/pull/3330#issuecomment-2828824582

   > > That's a great analysis. Thanks @mlbiscoc and @kotman12.
   > > > Didn't think a test was necessary for this, but we're happy to add one 
if the community feels it's warranted
   > > 
   > > 
   > > It's very unfortunate that a broken feature was initially merged. The 
very complex shutdown sequence somehow led to this broken change. Adding a test 
would guard against future regressions. That said, I have no idea how hard this 
would be to test. Having a test is not a requirement, just a nice-to-have.
   > 
   > Thanks @psalagnac! I can look into what's possible for a test. I'll try to 
get it to fail on main and see whether it passes with this PR. We would probably 
need an integration test that calls cc.shutdown() and checks that the leader 
election znode in ZooKeeper goes away during shutdown. We see this bug get 
really bad during heavy ingestion: the container gets stuck shutting down until 
we hit:
   > 
   > ```
   > Timed out waiting for in-flight update requests to complete for core:
   > ```
   
   This might be challenging because I suspect that zkSys.close(), which 
happens later on, deletes the leader election node as well. So to truly trigger 
this condition you'd have to cut the shutdown process short, or suspend it, to 
make sure that the optimized leader node removal is the only thing that can 
delete the leader election node.
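
   To illustrate the "suspend it" idea, here is a rough, self-contained sketch. 
It does not use Solr or ZooKeeper at all: FakeZk, FakeContainer, the latch hooks 
and the znode path are all hypothetical stand-ins I made up, and the real 
shutdown sequence has many more steps. It only shows how pausing shutdown 
between the early removal and the later close lets a test tell the two apart.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class ShutdownPauseSketch {

    // Hypothetical stand-in for ZooKeeper: just a set of znode paths.
    static final class FakeZk {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();
        void create(String path) { nodes.add(path); }
        void delete(String path) { nodes.remove(path); }
        boolean exists(String path) { return nodes.contains(path); }
    }

    // Hypothetical stand-in for the container; NOT the real CoreContainer.
    static final class FakeContainer {
        final FakeZk zk;
        final String electionNode;
        final boolean earlyRemovalEnabled;
        final CountDownLatch earlyStepDone = new CountDownLatch(1);
        final CountDownLatch resume = new CountDownLatch(1);

        FakeContainer(FakeZk zk, String electionNode, boolean earlyRemovalEnabled) {
            this.zk = zk;
            this.electionNode = electionNode;
            this.earlyRemovalEnabled = earlyRemovalEnabled;
        }

        void shutdown() throws InterruptedException {
            // Step 1: the optimized early removal (the behavior under test).
            if (earlyRemovalEnabled) {
                zk.delete(electionNode);
            }
            earlyStepDone.countDown();
            // Test hook: shutdown stays suspended here until the test resumes it.
            resume.await();
            // Step 2: stand-in for the later close that also removes the node.
            zk.delete(electionNode);
        }
    }

    // Returns true if the election node was already gone while shutdown was paused.
    static boolean nodeGoneBeforeZkClose(boolean earlyRemovalEnabled) {
        FakeZk zk = new FakeZk();
        String node = "/election/n_0"; // made-up path, for illustration only
        zk.create(node);
        FakeContainer cc = new FakeContainer(zk, node, earlyRemovalEnabled);
        Thread shutdownThread = new Thread(() -> {
            try {
                cc.shutdown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            shutdownThread.start();
            cc.earlyStepDone.await();            // wait until step 1 has run
            boolean goneEarly = !zk.exists(node); // observe before step 2 can run
            cc.resume.countDown();                // let shutdown finish
            shutdownThread.join();
            if (zk.exists(node)) {
                throw new IllegalStateException("node should be gone after full shutdown");
            }
            return goneEarly;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("early removal on:  " + nodeGoneBeforeZkClose(true));  // prints true
        System.out.println("early removal off: " + nodeGoneBeforeZkClose(false)); // prints false
    }
}
```

   Without the pause, both variants end with the node deleted, so a check made 
only after shutdown completes could not distinguish them. Whether the real 
shutdown path offers a place to hook in like this is something I have not 
verified.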


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

