[ https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209545#comment-14209545 ]
Joris van Lieshout commented on CLOUDSTACK-7857:
------------------------------------------------

Hi Rohit, I did some digging around in the XenCenter code and found a possible solution there. But there is a challenge, I think: the overhead is dynamic, based on the instances running on the host, while at the moment ACS calculates this overhead only at host thread startup. This is what I found in the XenCenter code:

https://github.com/xenserver/xenadmin/blob/a0d31920c5ac62eda9713228043a834ba7829986/XenModel/XenAPI-Extensions/Host.cs#L1071

======
public long xen_memory_calc
{
    get
    {
        if (!Helpers.MidnightRideOrGreater(Connection))
        {
            Host_metrics host_metrics = Connection.Resolve(this.metrics);
            if (host_metrics == null)
                return 0;

            long totalused = 0;
            foreach (VM vm in Connection.ResolveAll(resident_VMs))
            {
                VM_metrics vmMetrics = vm.Connection.Resolve(vm.metrics);
                if (vmMetrics != null)
                    totalused += vmMetrics.memory_actual;
            }
            return host_metrics.memory_total - totalused - host_metrics.memory_free;
        }

        long xen_mem = memory_overhead;
        foreach (VM vm in Connection.ResolveAll(resident_VMs))
        {
            xen_mem += vm.memory_overhead;
            if (vm.is_control_domain)
            {
                VM_metrics vmMetrics = vm.Connection.Resolve(vm.metrics);
                if (vmMetrics != null)
                    xen_mem += vmMetrics.memory_actual;
            }
        }
        return xen_mem;
    }
}
======

We can skip the first branch because, if I'm not mistaken, ACS only supports XS 5.6 and up, and XS 5.6 = MidnightRide. In short, the formula is:

xen_mem = host_memory_overhead + residentVMs_memory_overhead + dom0_memory_actual

Here is a list of xe commands that will get you the correct numbers to summarize.
host_mem_overhead:
xe host-list name-label=$HOSTNAME params=memory-overhead --minimal

residentVMs_memory_overhead:
xe vm-list resident-on=$(xe host-list name-label=$HOSTNAME --minimal) params=memory-overhead --minimal

dom0_memory_actual:
xe vm-list resident-on=$(xe host-list name-label=$HOSTNAME --minimal) is-control-domain=true params=memory-actual --minimal

> CitrixResourceBase wrongly calculates total memory on hosts with a lot of
> memory and large Dom0
> -----------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7857
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7857
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>    Affects Versions: Future, 4.3.0, 4.4.0, 4.5.0, 4.3.1, 4.4.1, 4.6.0
>            Reporter: Joris van Lieshout
>            Priority: Blocker
>
> We have hosts with 256GB memory and a 4GB dom0. During startup ACS calculates
> available memory using this formula:
>
> CitrixResourceBase.java, protected void fillHostInfo:
> ram = (long) ((ram - dom0Ram - _xs_memory_used) * _xs_virtualization_factor);
>
> In our situation:
> ram = 274841497600
> dom0Ram = 4269801472
> _xs_memory_used = 128 * 1024 * 1024L = 134217728
> _xs_virtualization_factor = 63.0/64.0 = 0.984375
> (274841497600 - 4269801472 - 134217728) * 0.984375 = 266211892800
>
> This is in fact not the actual amount of memory available for instances. The
> difference in our situation is a little less than 1GB. On this particular
> hypervisor, Dom0 + Xen use about 9GB.
>
> As the comment above the definition of XsMemoryUsed already states, it's time
> to review this logic:
> "// Hypervisor specific params with generic value, may need to be overridden for specific versions"
>
> The effect of this bug is that when you put a hypervisor in maintenance it
> might try to move instances (usually small instances, <1GB) to a host that
> in fact does not have enough free memory.
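The three xe queries above can be combined into a quick sanity check on the host itself. This is only a sketch: it assumes a shell on the XenServer host with the xe CLI available, and that HOSTNAME is set to the host's name-label. Note that `xe vm-list ... params=memory-overhead --minimal` returns a comma-separated list (one value per resident VM), so those values have to be summed.

```shell
# Sum a comma-separated list of integers, as returned by `xe ... --minimal`.
sum_csv() {
    echo "$1" | tr ',' '\n' | awk '{ s += $1 } END { print s + 0 }'
}

# Only attempt the xe queries when the CLI is actually present.
if command -v xe >/dev/null 2>&1; then
    host_uuid=$(xe host-list name-label="$HOSTNAME" --minimal)

    # Static per-host hypervisor overhead.
    host_overhead=$(xe host-list uuid="$host_uuid" params=memory-overhead --minimal)

    # Summed per-VM overheads of all VMs resident on this host.
    vm_overheads=$(sum_csv "$(xe vm-list resident-on="$host_uuid" params=memory-overhead --minimal)")

    # Actual memory used by the control domain (dom0).
    dom0_actual=$(xe vm-list resident-on="$host_uuid" is-control-domain=true params=memory-actual --minimal)

    # xen_mem = host_memory_overhead + residentVMs_memory_overhead + dom0_memory_actual
    echo "xen_mem = $(( host_overhead + vm_overheads + dom0_actual )) bytes"
fi
```

Subtracting that xen_mem figure from host memory-total would give the memory genuinely available for instances, rather than the static dom0Ram + _xs_memory_used estimate.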
> This exception is thrown:
>
> ERROR [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-09aca6e9 work-8981) Terminating HAWork[8981-Migration-4482-Running-Migrating]
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate due to Catch Exception com.cloud.utils.exception.CloudRuntimeException: Migration failed due to com.cloud.utils.exception.CloudRuntimeException: Unable to migrate VM(r-4482-VM) from host(6805d06c-4d5b-4438-a245-7915e93041d9) due to Task failed! Task record:
>                  uuid: 645b63c8-1426-b412-7b6a-13d61ee7ab2e
>             nameLabel: Async.VM.pool_migrate
>       nameDescription:
>     allowedOperations: []
>     currentOperations: {}
>               created: Thu Nov 06 13:44:14 CET 2014
>              finished: Thu Nov 06 13:44:14 CET 2014
>                status: failure
>            residentOn: com.xensource.xenapi.Host@b42882c6
>              progress: 1.0
>                  type: <none/>
>                result:
>             errorInfo: [HOST_NOT_ENOUGH_FREE_MEMORY, 272629760, 263131136]
>           otherConfig: {}
>             subtaskOf: com.xensource.xenapi.Task@aaf13f6f
>              subtasks: []
>         at com.cloud.vm.VirtualMachineManagerImpl.migrate(VirtualMachineManagerImpl.java:1840)
>         at com.cloud.vm.VirtualMachineManagerImpl.migrateAway(VirtualMachineManagerImpl.java:2214)
>         at com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.runWithContext(HighAvailabilityManagerImpl.java:865)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.access$000(HighAvailabilityManagerImpl.java:822)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread$1.run(HighAvailabilityManagerImpl.java:834)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)