On Wed, Sep 16, 2015 at 11:46 AM, Anshul Gangwar <anshul.gang...@citrix.com> wrote:
> It’s not difficult to find a good grace period. It simply depends on how
> HA is configured in your hypervisor settings. You can easily figure out
> from your settings for how long there will be no VM on any host, and
> simply use 2-3 times that period as the grace period.
>
That seems kludgey.

> It seems you have considered only one aspect of the change, i.e. user VM HA.
> Did you consider system VM HA?
> Did you consider that we have already explored that territory of separate
> handling of PowerOff and PowerReportMissing?
>
For VMware or for all hypervisors? Do you have a link to the discussion?
These states are different. Why was it decided to treat them the same?

> And even if you are still thinking of this change, then add Marvin tests
> for it. Unit tests will not tell anything about the change.
>
Yes, that I definitely agree on.

> Regards,
> Anshul
>
> On 16-Sep-2015, at 2:48 PM, Rene Moser <m...@renemoser.net> wrote:
> >
> > Hi René
> >
> > On 09/16/2015 10:17 AM, Anshul Gangwar wrote:
> >> Currently we report only PowerOn VMs and do not report PowerOff VMs;
> >> that's why we consider Missing and PowerOff the same. That is how most
> >> of the code is written for VM sync, and each hypervisor resource has
> >> the same understanding. This will affect HA and many more unknown
> >> places. So please do not even consider merging this change.
> >>
> >> Now, coming to the bug: we can fix it by changing the global setting
> >> pingInterval to an appropriate value according to the hypervisor
> >> settings, which takes care of the transitional period of missing
> >> reports, or it can be handled by introducing a gracePeriod global
> >> setting.
> >
> > This is interesting, I also wrote in the bug report that the gracePeriod
> > calculation might be related.
> >
> > https://github.com/apache/cloudstack/blob/4.5.2/engine/orchestration/src/com/cloud/vm/VirtualMachinePowerStateSyncImpl.java#L110
> >
> > IMHO making this value configurable might solve it, but it is hard
> > to "guess" what a good grace period would be.
> >
> > In terms of VMware it depends on the number of ESX hosts in the clusters,
> > and they can be different.
> >
> > But another question is, why make one _global_ grace period for every
> > hypervisor? Think about it: users can have mixed hypervisor setups.
> >
> > So to me, a global grace period setting might not be the best solution;
> > instead we should take the hypervisor's functionality into account. In
> > this case, VMware handles HA by itself.
> >
> > I know a VR in 4.5 would be broken after a VMware HA event, but there
> > is another global setting, which can be enabled if you like, for router
> > restarts on out-of-band migrations.
> >
> > So to me, in 4.5, Daan's patch makes sense (+1) if the hypervisor is
> > VMware.
> >
> > Yours
> > René
> >

-- 
Daan