BryanMLima commented on PR #8252: URL: https://github.com/apache/cloudstack/pull/8252#issuecomment-1823140456
> Thanks @BryanMLima , for picking this up and for the extensive explanation. I have two questions:
>
> * where you say `6 cores at 2 GHz, the shares value will be 12000` I would say `huh` as it would either be 12 or 12.000.000.000. What is the factorial used there? (I probably just need a link to a definition)
> * I see no considerations on live systems, i.e. upgrades. Please, expand on that. Will it have any consequence, or will it be seamless?
>
> regards,

@DaanHoogland, regarding the first question, ACS calculates the shares by multiplying the frequency by the number of cores, both specified in the compute offering; this is done in the method `LibvirtComputingResource#createCpuTuneDef`. Therefore, 6 cores * 2000 MHz (2 GHz) results in 12,000 shares. This is the current behaviour of ACS, and the PR does not change it for cgroups v1; this PR only changes the way ACS calculates the shares for hosts that use cgroups v2.

OpenStack[^openstack] solved this same problem by not setting the shares value for any VM, leaving it to advanced users to set it deliberately. TBH, I agree with this approach; setting the shares value like ACS does is misleading, as a more experienced user may question how ACS limits the CPU frequency of VMs. AFAIK, no hypervisor does this: the VM will display the host's CPU frequency, and hypervisors only limit the CPU access time (and burst limits) of a VM to “simulate” the specified frequency.

About the second question, I think I did not understand it fully; could you add more details? By live systems, do you mean hosts or VMs? I am assuming you mean upgrading a host from cgroupv1 to cgroupv2 (or even downgrading from cgroupv2 to cgroupv1). The core of this strategy happens in `LibvirtComputingResource#initialize()`, which is called when the `cloudstack-agent` service is (re)started. Changing the cgroup version requires rebooting the system; thus, when the `cloudstack-agent` service starts, ACS checks the cgroup version (`stat -fc %T /sys/fs/cgroup/`) and sets its maximum CPU shares capacity accordingly. With this, ACS will always know the cgroup version currently used by the host.

> Oh, and number 3
>
> * How will this work in mixed systems, with old hosts using cgroups v1 and newer hosts using cgroups v2

This PR already addresses the migration of VMs between hosts with different versions, as the shares value is calculated during this process considering the VM's destination host. Thus, two VMs with the exact same compute offering will have different shares values on cgroups v1 and v2. The shares value is only a proportional weight; as long as all VMs on the same host are on the same scale, the CPU time will be distributed accordingly. If the shares value is not set in the domain XML for libvirt (this never happens in ACS, it is always set), it will use the OS default value, which, for cgroupv2, is 100[^cgroup]; thus, the default behaviour for processes in the same cgroup is to have proportional CPU access time.

This PR, however, does not update the shares of VMs that are already running on hosts with cgroupv2; for those, the VM must be restarted, migrated or scaled.

[^cgroup]: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#weights
[^openstack]: https://review.opendev.org/c/openstack/nova/+/824048?tab=comments
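To make the arithmetic above concrete, here is a minimal Python sketch. It is not the ACS code (that lives in Java, in `LibvirtComputingResource`), and the function names `requested_shares`, `cpu_time_fractions` and `cgroup_version` are hypothetical, chosen only for illustration. It shows the cores * MHz calculation, why only the ratio between shares matters on a given host, and how the output of `stat -fc %T /sys/fs/cgroup/` could be interpreted, assuming the commonly documented values `cgroup2fs` (v2) and `tmpfs` (v1).

```python
def requested_shares(cores: int, mhz: int) -> int:
    """Shares as described in the comment: number of cores times frequency in MHz."""
    return cores * mhz


def cpu_time_fractions(shares: dict) -> dict:
    """Fraction of CPU time each VM gets under full contention.

    Shares are proportional weights, so only their ratio on a host matters.
    """
    total = sum(shares.values())
    return {vm: value / total for vm, value in shares.items()}


def cgroup_version(fs_type: str) -> int:
    """Interpret the output of `stat -fc %T /sys/fs/cgroup/`.

    Assumption: `cgroup2fs` means cgroup v2 and `tmpfs` means cgroup v1.
    """
    fs_type = fs_type.strip()
    if fs_type == "cgroup2fs":
        return 2
    if fs_type == "tmpfs":
        return 1
    raise ValueError(f"unexpected filesystem type: {fs_type!r}")


if __name__ == "__main__":
    # Two hypothetical VMs: 6 cores at 2000 MHz and 2 cores at 2000 MHz.
    big = requested_shares(6, 2000)
    small = requested_shares(2, 2000)
    print(big, small)
    # The same 3:1 ratio on a much smaller scale gives the same distribution.
    print(cpu_time_fractions({"big": big, "small": small}))
    print(cpu_time_fractions({"big": 300, "small": 100}))
```

The two calls to `cpu_time_fractions` return the same distribution, which is why a VM can carry different absolute shares values on a cgroups v1 host and a cgroups v2 host without changing how CPU time is divided among the VMs on each host.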
