Martin Husemann <mar...@duskware.de> writes:

> On Mon, Aug 21, 2017 at 12:50:32AM +0000, Cherry G. Mathew wrote:
>> In this case, the CPU is actually exported to the domU as a x86 cpu
>> as well as vcpu. This has nothing to do with baremetal cpu which is
>> invisible to HVM domain guests. So you can have upto the currently
>> supported max of these CPUs regardless of the baremetal number.
>
> Sorry, my Xen knowledge is minimal and I fail to parse the above - could
> you elaborate?
>

There are two modes that Xen operates under, broadly speaking: PV and HVM. There are "sub-modes" which are a 'mix' of these - PVHVM and PVH. These are well documented on the Xen Wiki. [1]

PV, aka "Paravirtualised", is where the hypervisor runs at the highest privilege level (PL), and the kernel runs either in an intermediate PL, or at the same PL as user programs (yes, security bugs have been found because of this). The advantage here is that the CPU hardware does not need to support any special virtualisation instructions. However, the OS kernel needs to pretty much explicitly call the hypervisor, using a mechanism analogous to how userland calls the OS itself. Since both OS and userland calls go directly to Xen, it mediates this by recognising whether the OS kernel or a user program under that OS made the call, and behaves accordingly. All this is pretty expensive, which is why, despite XenMP, our current PV based Xen system is comparatively slow.

One of the things that is virtualised in this case is the abstraction of the cpu itself - called a VCPU. VCPUs are managed by Xen, and you can do things like start them, stop them, or put them to sleep. These virtual CPUs are pretty much CPU state contexts that are scheduled onto the actual CPUs, much the same way that thread contexts are scheduled onto CPUs by an OS. From the OS point of view, the VCPUs are the actual CPUs, since these are the only things that can be used to schedule threads. So cpu_info_list is basically a list of these VCPUs, and *NOT* the underlying baremetal CPUs - which only the hypervisor can manage.

EXCEPT: dom0, which is the controlling domain that collaborates with Xen to provide things like device drivers and filesystem support, is allowed access to the MADT and ACPI tables (or a subset of them), which are firmware tables that enumerate baremetal CPUs.
In the interest of providing driver support for things like frequency scaling or temperature control (I'm not 100% sure if Xen allows access to this, but I'm trying to illustrate the situation), our dom0 does access these tables and bring up the respective drivers, seen as "xxx at cpu" in the xen/conf/files.xen config file. Remember that the cpus enumerated by these tables are *NOT* schedulable entities - therefore they don't get on the cpu_info_list, and they are purely used as a node on the config(9) chain. The actual schedulable cpus are still the VCPUs mentioned above, which attach via a completely unrelated boot path.

John's point was that these VCPUs, since they are scheduled contexts, can be, and often are, more numerous than the underlying baremetal CPUs (not on dom0 though - although it's technically possible). So you could bring up a domU running on a uniprocessor board, with 8 vcpus, if you like. dom0s have 1:1 correspondence with underlying physical CPUs.

Enter HVM. Here, *everything* is virtualised, including the CPUs. The guest OS cannot tell the virtualised container it is running under apart from native hardware (except via explicit interfaces like virtio). If we ran NetBSD under HVM, the "physical"/"native" cpus that we see are basically VCPUs from Xen's PoV. NetBSD uses the standard x86 native code (mpacpi/mpbios) to probe for these CPUs, and to detect and schedule them.

EXCEPT: PVHVM - "Paravirtualised HVM" - is a mode where *some* Xen functions are exported to the user domain. This makes the previously mentioned virtualised HVM cpus, which NetBSD thinks are "baremetal" CPUs, also accessible via a hypervisor-mediated API. In this case, both the fully virtualised acpi/mpbios API and the 'xvcpu' API access the same schedulable VCPUs. This means that 'xvcpu's are merely aliases for the acpi-probed cpus, thus 1:1. These CPUs have nothing to do with the underlying baremetal (host ACPI) CPUs.

EXCEPT: PVH mode, which is where the PVHVM cpus are also actually underlying baremetal CPUs.
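
To make the "8 vcpus on a uniprocessor board" point concrete, here is a toy sketch in Python. It is *not* Xen's real scheduler, and every name in it (run_vcpus, num_pcpus, guest_visible_cpus) is made up purely for illustration. It only shows the idea that VCPUs are contexts taking turns on whatever physical CPUs exist, while the guest still counts 8 CPUs.

```python
from collections import deque

def run_vcpus(num_pcpus, vcpu_ids, ticks):
    """Toy round-robin (hypothetical, not Xen's algorithm): on each tick,
    every physical CPU picks up the next waiting VCPU context."""
    runq = deque(vcpu_ids)
    timeline = []
    for _ in range(ticks):
        # At most one VCPU context runs per physical CPU per tick.
        running = [runq.popleft() for _ in range(min(num_pcpus, len(runq)))]
        timeline.append(running)
        # Contexts that just ran go to the back of the queue.
        runq.extend(running)
    return timeline

# A domU configured with 8 VCPUs on a uniprocessor host (assumed numbers).
guest_vcpus = list(range(8))
timeline = run_vcpus(num_pcpus=1, vcpu_ids=guest_vcpus, ticks=8)

# What the guest would put on its CPU list: the VCPUs, not the one real CPU.
guest_visible_cpus = len(guest_vcpus)
```

Each entry of timeline lists which VCPU contexts ran during that tick. With one physical CPU only one VCPU runs at a time, yet the guest schedules threads across all 8.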

Q.E.D.

I hope this was useful.

[1] https://wiki.xen.org/wiki/Virtualization_Spectrum

-- 
~cherry