rajujith commented on issue #10477: URL: https://github.com/apache/cloudstack/issues/10477#issuecomment-2753247589
@alsko-icom With the latest nightly build and no configuration changes, VM HA starts about 5 minutes after a KVM host crashes. I simulated the host crash by powering off the nested KVM host I was using.

If you want to reduce that further, you can set the global configuration 'commands.wait' to a suitable value such as "CheckHealthCommand=5,CheckOnHostCommand=5". I saw VM HA triggered 2 minutes and 30 seconds after a host crash.

Once a host crashes, CloudStack first has to identify that the host is unreachable; that is determined by 'ping.interval * ping.timeout' [1]. Reducing these values (use appropriate values based on your testing) will make host crash detection faster. CloudStack then starts multiple investigations using commands such as 'CheckHealthCommand', 'CheckOnHostCommand' and a few more. You can view the sample log file in the issue description.

Updating ping.interval or ping.timeout requires a restart of the cloudstack-management service.

I didn't verify anything for host HA; I tried only VM HA.

[1] https://cwiki.apache.org/confluence/display/CLOUDSTACK/High+Availability+Developer's+Guide
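
To make the 'ping.interval * ping.timeout' arithmetic concrete, here is a minimal Python sketch. It is not CloudStack code: the function names are hypothetical, and the sample inputs (60 and 2.5) are illustrative assumptions, with ping.interval taken to be in seconds and ping.timeout treated as a multiplier. Check the values on your own installation. The second helper only builds a string in the same "Command=seconds" format as the 'commands.wait' example above.

```python
def host_unreachable_after(ping_interval_s, ping_timeout_multiplier):
    """Seconds before a silent host is considered unreachable.

    Assumes ping.interval is in seconds and ping.timeout is a multiplier,
    i.e. the 'ping.interval * ping.timeout' product from the comment.
    """
    return ping_interval_s * ping_timeout_multiplier


def format_command_waits(waits):
    """Build a comma-separated 'Command=seconds' string from a dict."""
    return ",".join(f"{name}={seconds}" for name, seconds in waits.items())


if __name__ == "__main__":
    # Illustrative values only, not confirmed by the comment.
    print(host_unreachable_after(60, 2.5))
    # Same value as the example given for 'commands.wait'.
    print(format_command_waits({"CheckHealthCommand": 5, "CheckOnHostCommand": 5}))
```

Lowering either input shortens the detection window proportionally, which is why the comment suggests tuning both settings based on testing.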
