I don't think there was any discussion around this. Kelven has made fixes around VMSync; for details, look into the FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/FS+-+VMSync+improvement
Regards,
Anshul

On 16-Sep-2015, at 3:32 PM, Daan Hoogland <daan.hoogl...@gmail.com<mailto:daan.hoogl...@gmail.com>> wrote:

On Wed, Sep 16, 2015 at 11:46 AM, Anshul Gangwar <anshul.gang...@citrix.com<mailto:anshul.gang...@citrix.com>> wrote:

It's not difficult to find a good grace period. It simply depends on how your hypervisor is configured for HA. From those settings you can easily figure out for how long there will be no VM on any host, and then set the grace period to 2-3 times that interval.

That seems kludgey.

It seems you have considered only one aspect of the change, i.e. HA for user VMs. Did you consider HA for system VMs? Did you consider that we have already explored the territory of handling PowerOff and PowerReportMissing separately?

For VMware, or for all hypervisors? Do you have a link to the discussion? These states are different; why was it decided to treat them the same?

And even if you are still considering this change, then add Marvin tests for it; unit tests will not tell us anything about the change.

Yes, that I definitely agree on.

Regards,
Anshul

On 16-Sep-2015, at 2:48 PM, Rene Moser <m...@renemoser.net<mailto:m...@renemoser.net>> wrote:

Hi René

On 09/16/2015 10:17 AM, Anshul Gangwar wrote:

Currently we report only powered-on VMs and do not report powered-off VMs; that is why we consider Missing and PowerOff to be the same. Most of the VM sync code is written with that assumption, and each hypervisor resource has the same understanding. Changing it will affect HA and many more, as yet unknown, places, so please do not even consider merging this change. Now, coming to the bug: we can fix it by changing the global setting pingInterval to a value appropriate for the hypervisor settings, which takes care of the transitional period of a missing report, or it can be handled by introducing a gracePeriod global setting.

This is interesting; I also wrote in the bug report that the gracePeriod calculation might be related.
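Anshul's suggestion of deriving the grace period from the hypervisor's HA settings, rather than guessing one, could be sketched roughly as follows. This is a minimal illustration only; the class and method names are hypothetical, not CloudStack API:

```java
// Hypothetical sketch: derive the power-report grace period from the
// hypervisor's own HA failover window, i.e. the worst-case time during
// which a VM may legitimately be absent from every host's power report.
public class GracePeriodEstimator {

    /**
     * @param haFailoverSeconds worst-case HA restart time taken from the
     *                          hypervisor's HA configuration
     * @param safetyFactor      multiplier; the thread suggests 2-3x
     */
    public static long estimateGracePeriodSeconds(long haFailoverSeconds, int safetyFactor) {
        if (haFailoverSeconds <= 0 || safetyFactor <= 0) {
            throw new IllegalArgumentException("values must be positive");
        }
        return haFailoverSeconds * safetyFactor;
    }

    public static void main(String[] args) {
        // e.g. if HA needs at most 120s to restart a VM, use a 3x margin
        System.out.println(estimateGracePeriodSeconds(120, 3)); // prints 360
    }
}
```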
https://github.com/apache/cloudstack/blob/4.5.2/engine/orchestration/src/com/cloud/vm/VirtualMachinePowerStateSyncImpl.java#L110

IMHO, making this value configurable might solve it, but it is hard to "guess" what a good grace period would be. For VMware it depends on the number of ESX hosts in the cluster, and clusters can differ. Another question is why we would make one _global_ grace period for every hypervisor; think about it, users can have mixed-hypervisor setups. So to me a global grace period setting might not be the best solution; instead we should take the hypervisor's own functionality into account, in this case VMware, which handles HA by itself. I know a VR in 4.5 would be broken after a VMware HA event, but there is another global setting which can be enabled, if you like, for router restarts on out-of-band migrations. So for 4.5 I am +1: the patch from Daan makes sense, if the hypervisor is VMware.

Yours
René

--
Daan
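The decision debated in this thread, keeping a VM's last known power state until it has gone unreported for longer than a grace period, combined with René's point that the period should depend on the hypervisor, might be sketched as below. This is an assumption-laden illustration, not CloudStack's actual implementation; all names and the sample grace values are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a power-state sync decision with a
// per-hypervisor grace period instead of a single global value.
public class PowerStateSyncSketch {

    enum PowerState { PowerOn, PowerOff, PowerReportMissing }

    // Illustrative per-hypervisor grace periods in milliseconds. VMware
    // gets a longer window because its own HA may keep a VM off every
    // host for the duration of an HA restart.
    private static final Map<String, Long> GRACE_MS = new HashMap<>();
    private static final long DEFAULT_GRACE_MS = 60_000L; // e.g. 2 x pingInterval
    static {
        GRACE_MS.put("VMware", 360_000L);
        GRACE_MS.put("KVM", 60_000L);
    }

    static long gracePeriodMs(String hypervisorType) {
        return GRACE_MS.getOrDefault(hypervisorType, DEFAULT_GRACE_MS);
    }

    /**
     * A VM absent from the latest power report is only flagged
     * PowerReportMissing once it has been unreported for longer than its
     * hypervisor's grace period; until then the last known state is kept.
     */
    static PowerState resolveUnreportedVm(String hypervisorType, PowerState lastKnown,
                                          long lastReportMs, long nowMs) {
        if (nowMs - lastReportMs > gracePeriodMs(hypervisorType)) {
            return PowerState.PowerReportMissing;
        }
        return lastKnown;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        // Unreported for 2 minutes: within VMware's window, outside KVM's.
        System.out.println(resolveUnreportedVm("VMware", PowerState.PowerOn, now - 120_000, now)); // PowerOn
        System.out.println(resolveUnreportedVm("KVM", PowerState.PowerOn, now - 120_000, now));    // PowerReportMissing
    }
}
```

Whether the per-hypervisor values come from global settings, cluster settings, or the hypervisor resource itself is exactly the design question left open in the thread.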