bradh352 commented on issue #11141:
URL: https://github.com/apache/cloudstack/issues/11141#issuecomment-3568380154

   @TadiosAbebe my environment matches yours: Ubuntu 24.04 with hyperconverged 
Ceph. The behavior you reported matches what I see as well.
   
   That said, using the force reconnect never worked for me as a resolution. 
When a host got into that state, just restarting the agent didn't work 
either, and I'm not exactly sure why; I never fully isolated the cause. 
Sometimes I had to restart the management servers, which was really 
concerning. I think I tried restarting libvirtd too.
   
   I set up a cron job that restarts the agent once a day on each host, with 
the hosts staggered 5 minutes apart: 
https://github.com/bradh352/ansible-role-service-cloudstack/commit/fdb14d92f115a5845d95880a05f467b42a84aff1
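
   For anyone who wants to try the same workaround without the Ansible role, 
here is a minimal sketch of the staggering idea. It is not what the linked 
commit does line for line. The host names, the 03:00 start time, and the 
`cloudstack-agent` service name are assumptions for illustration, so adjust 
them to your environment. It only generates one `/etc/cron.d`-style line per 
host, each offset by 5 minutes from the previous one.

```python
from datetime import datetime, timedelta


def staggered_cron_lines(hosts, start="03:00", step_minutes=5,
                         command="systemctl restart cloudstack-agent"):
    """Return {host: cron line}, one daily restart per host.

    Hosts are sorted so every run assigns the same slot to the same host.
    The start time, step, and command are illustrative assumptions.
    """
    base = datetime.strptime(start, "%H:%M")
    lines = {}
    for index, host in enumerate(sorted(hosts)):
        # Each host gets its own slot, step_minutes after the previous one.
        slot = base + timedelta(minutes=index * step_minutes)
        # /etc/cron.d format: minute hour day-of-month month day-of-week user command
        lines[host] = f"{slot.minute} {slot.hour} * * * root {command}"
    return lines


# Hypothetical host names, for illustration only.
for host, line in staggered_cron_lines(["kvm01", "kvm02", "kvm03"]).items():
    print(f"{host}: {line}")
```

   Staggering means only one agent is restarting at any given time, so the 
management servers never see every host disconnect at once.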
   
   This has helped considerably. That said, I just went through and tested 
everything to see whether all hosts were healthy, which entails trying to 
open the console on one VM per host, and I found a host in a bad state ... ugh. 
So restarting the agent daily is definitely not a resolution. Out of 
curiosity I decided to restart libvirtd, and that *did* fix it on the bad host.
   
   
   

