It’s not difficult to find a good grace period. It simply depends on how HA is 
configured in your hypervisor settings. From those settings you can easily work 
out the longest time during which a VM will not be present on any host, and 
then set the grace period to 2-3 times that interval.
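
As a rough sketch of that calculation (the function name and the 120-second 
figure are made up for illustration; they are not taken from CloudStack or from 
any hypervisor's defaults):

```python
# Hypothetical helper, not CloudStack code: derive a grace period from the
# longest time a VM can be missing from every host during a hypervisor HA event.

def suggested_grace_period(max_vm_absence_seconds, safety_factor=3):
    """Return a grace period in seconds: worst-case absence times a factor of 2-3."""
    if max_vm_absence_seconds <= 0:
        raise ValueError("max_vm_absence_seconds must be positive")
    if not 2 <= safety_factor <= 3:
        raise ValueError("safety_factor is expected to be between 2 and 3")
    return max_vm_absence_seconds * safety_factor


# Example with an assumed worst-case absence of 120 seconds.
print(suggested_grace_period(120))     # 360
print(suggested_grace_period(120, 2))  # 240
```

The worst-case absence itself has to be read off your own hypervisor HA 
configuration; the sketch only applies the 2-3x margin to it.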

It seems you have considered only one aspect of the change, i.e. user VM HA. 
Did you consider system VM HA? 
Did you consider that we have already explored the territory of handling 
PowerOff and PowerReportMissing separately?

And if you still want to go ahead with this change, then add Marvin tests for 
it. Unit tests will not tell us anything about the change.

Regards,
Anshul

> On 16-Sep-2015, at 2:48 PM, Rene Moser <m...@renemoser.net> wrote:
> 
> 
> Hi René
> 
> On 09/16/2015 10:17 AM, Anshul Gangwar wrote:
>> Currently we report only PowerOn VMs and do not report PowerOff VMs; that's 
>> why we consider Missing and PowerOff to be the same. That's how most of the 
>> VM sync code is written, and each hypervisor resource has the same 
>> understanding. This will affect HA and many more unknown places. So please do 
>> not even consider merging this change.
>> 
>> Now, coming to the bug: we can fix it by changing the global setting 
>> pingInterval to a value appropriate for the hypervisor settings, which takes 
>> care of this transitional period of missing reports, or it can be handled by 
>> introducing a gracePeriod global setting.
> 
> This is interesting; I also wrote in the bug report that the gracePeriod
> calculation might be related.
> https://github.com/apache/cloudstack/blob/4.5.2/engine/orchestration/src/com/cloud/vm/VirtualMachinePowerStateSyncImpl.java#L110.
> 
> IMHO making this value configurable might solve it, but it is hard
> to "guess" what a good grace period would be.
> 
> In the case of VMware it depends on the number of ESX hosts in the clusters,
> and that can differ from cluster to cluster.
> 
> But another question is, why make one _global_ grace period for every
> hypervisor? Think about it: users can have mixed hypervisor setups.
> 
> So to me, a global grace period setting might not be the best solution;
> instead we should take the hypervisor's functionality into account. In this
> case, VMware handles HA by itself.
> 
> I know a VR in 4.5 would be broken after a VMware HA event, but there
> is another global setting, which can be enabled if you like, for router
> restarts on out-of-band migrations.
> 
> So to me, for 4.5 the patch from Daan makes sense and I am +1 on it, if the
> hypervisor is VMware.
> 
> Yours
> René
> 
