Re: Is xl vcpu-set broken
On 2/28/23 2:24 AM, Anthony PERARD wrote:
> On Tue, Feb 28, 2023 at 10:37:00AM +0100, Jan Beulich wrote:
>> On 28.02.2023 07:44, Joe Jin wrote:
>>> We encountered a vcpu-set issue on an old Xen. When I tried to confirm
>>> whether upstream Xen has the same issue, I found that neither my
>>> upstream build nor Ubuntu 22.04 xen-hypervisor-4.16 works.
>>>
>>> I can add vcpus (8->16) to my guest but I cannot reduce the vcpu number:
>>>
>>> root@ubuntu2204:~/vm# xl list
>>> Name       ID     Mem VCPUs State Time(s)
>>> Domain-0    0  255424    32 r-      991.9
>>> testvm      1    4088    16 -b       94.6
>>> root@ubuntu2204:~/vm# xl vcpu-set testvm 8
>>> root@ubuntu2204:~/vm# xl list
>>> Name       ID     Mem VCPUs State Time(s)
>>> Domain-0    0  255424    32 r-      998.5
>>> testvm      1    4088    16 -b       97.3
>>>
>>> After xl vcpu-set, xenstore showed the number of online cpus reduced to 8:
> [...]
>>> But the guest did not receive any offline events at all.
>>>
>>> From the source code, my understanding is that for cpu online, libxl
>>> will send device_add to qemu to trigger vcpu add, while for cpu offline
>>> it still relies on xenstore. Is this correct?
>> Judging from the DSDT we provide, offlining looks to also be intended to
>> go the ACPI way. Whereas libxl only ever sends "device_add" commands to
>> qemu, afaics (or "cpu-add" for older qemu). Anthony - do you have any
>> insight into what the intentions here are?
> The intention is to one day implement cpu offline in QEMU upstream for
> HVM guests; I don't think that's ever been done so far.
>
> As we use device_add for cpu hotplug, we would probably do device_del
> for hot-unplug, so qemu would still have to send the appropriate event to
> the guest.

Sounds like this is still on the TODO list rather than a build or test
environment issue.

> Someone will have to figure out if "device_del" works with a Xen guest,
> doc here:
> https://www.qemu.org/docs/master/system/cpu-hotplug.html#vcpu-hot-unplug

I tried QMP and it looks like it does not work:

(QEMU) device_del id=cpu-5
{"error": {"class": "GenericError", "desc": "acpi: device unplug request for not supported device type: qemu32-i386-cpu"}}

Thanks so much for your input!

Cheers,
Joe
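For anyone repeating the QMP test from a script rather than an interactive shell, here is a minimal sketch of the same exchange in Python. It only builds the JSON for device_del and inspects a reply; it does not open a QMP socket. The helper names (build_device_del, unplug_error) are made up for illustration, and the reply string is the one pasted above.

```python
import json


def build_device_del(device_id: str) -> str:
    """Serialize a QMP device_del command for the given device id."""
    return json.dumps({"execute": "device_del", "arguments": {"id": device_id}})


def unplug_error(reply: str):
    """Return the error description if QEMU rejected the request, else None."""
    msg = json.loads(reply)
    err = msg.get("error")
    return err["desc"] if err else None


# Reply copied from the test above.
reply = (
    '{"error": {"class": "GenericError", "desc": "acpi: device unplug '
    'request for not supported device type: qemu32-i386-cpu"}}'
)

print(build_device_del("cpu-5"))
print(unplug_error(reply))
```

A successful QMP command returns a "return" member instead of "error", so unplug_error() gives None in that case.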
Re: Is xl vcpu-set broken
On 2/28/23 12:49 AM, Andrew Cooper wrote:
> On 28/02/2023 6:44 am, Joe Jin wrote:
>> Hi,
>>
>> We encountered a vcpu-set issue on an old Xen. When I tried to confirm
>> whether upstream Xen has the same issue, I found that neither my
>> upstream build nor Ubuntu 22.04 xen-hypervisor-4.16 works.
>>
>> I can add vcpus (8->16) to my guest but I cannot reduce the vcpu number:
>>
>> root@ubuntu2204:~/vm# xl list
>> Name       ID     Mem VCPUs State Time(s)
>> Domain-0    0  255424    32 r-      991.9
>> testvm      1    4088    16 -b       94.6
>> root@ubuntu2204:~/vm# xl vcpu-set testvm 8
>> root@ubuntu2204:~/vm# xl list
>> Name       ID     Mem VCPUs State Time(s)
>> Domain-0    0  255424    32 r-      998.5
>> testvm      1    4088    16 -b       97.3
>>
>> After xl vcpu-set, xenstore showed the number of online cpus reduced to 8:
>>
>> /local/domain/1/vm = "/vm/aa109ea0-2fde-4712-81d8-75f73bd73715"
>> /local/domain/1/name = "testvm"
>> /local/domain/1/cpu = ""
>> /local/domain/1/cpu/0 = ""
>> /local/domain/1/cpu/0/availability = "online"
>> /local/domain/1/cpu/1 = ""
>> /local/domain/1/cpu/1/availability = "online"
>> /local/domain/1/cpu/2 = ""
>> /local/domain/1/cpu/2/availability = "online"
>> /local/domain/1/cpu/3 = ""
>> /local/domain/1/cpu/3/availability = "online"
>> /local/domain/1/cpu/4 = ""
>> /local/domain/1/cpu/4/availability = "online"
>> /local/domain/1/cpu/5 = ""
>> /local/domain/1/cpu/5/availability = "online"
>> /local/domain/1/cpu/6 = ""
>> /local/domain/1/cpu/6/availability = "online"
>> /local/domain/1/cpu/7 = ""
>> /local/domain/1/cpu/7/availability = "online"
>> /local/domain/1/cpu/8 = ""
>> /local/domain/1/cpu/8/availability = "offline"
>> /local/domain/1/cpu/9 = ""
>> /local/domain/1/cpu/9/availability = "offline"
>> /local/domain/1/cpu/10 = ""
>> /local/domain/1/cpu/10/availability = "offline"
>> /local/domain/1/cpu/11 = ""
>> /local/domain/1/cpu/11/availability = "offline"
>> /local/domain/1/cpu/12 = ""
>> /local/domain/1/cpu/12/availability = "offline"
>> /local/domain/1/cpu/13 = ""
>> /local/domain/1/cpu/13/availability = "offline"
>> /local/domain/1/cpu/14 = ""
>> /local/domain/1/cpu/14/availability = "offline"
>> /local/domain/1/cpu/15 = ""
>> /local/domain/1/cpu/15/availability = "offline"
>> /local/domain/1/cpu/16 = ""
>> /local/domain/1/cpu/16/availability = "offline"
>>
>> But the guest did not receive any offline events at all.
>>
>> From the source code, my understanding is that for cpu online, libxl
>> will send device_add to qemu to trigger vcpu add, while for cpu offline
>> it still relies on xenstore. Is this correct? Can anything block
>> cpu scale down?
>>
>> I appreciate any suggestions!
> Your mention of Qemu suggests this is an HVM guest. Can you confirm?

Yes, it's an HVM guest.

Thanks,
Joe
Re: Is xl vcpu-set broken
On Tue, Feb 28, 2023 at 10:37:00AM +0100, Jan Beulich wrote:
> On 28.02.2023 07:44, Joe Jin wrote:
> > We encountered a vcpu-set issue on an old Xen. When I tried to confirm
> > whether upstream Xen has the same issue, I found that neither my
> > upstream build nor Ubuntu 22.04 xen-hypervisor-4.16 works.
> >
> > I can add vcpus (8->16) to my guest but I cannot reduce the vcpu number:
> >
> > root@ubuntu2204:~/vm# xl list
> > Name       ID     Mem VCPUs State Time(s)
> > Domain-0    0  255424    32 r-      991.9
> > testvm      1    4088    16 -b       94.6
> > root@ubuntu2204:~/vm# xl vcpu-set testvm 8
> > root@ubuntu2204:~/vm# xl list
> > Name       ID     Mem VCPUs State Time(s)
> > Domain-0    0  255424    32 r-      998.5
> > testvm      1    4088    16 -b       97.3
> >
> > After xl vcpu-set, xenstore showed the number of online cpus reduced to 8:
[...]
> >
> > But the guest did not receive any offline events at all.
> >
> > From the source code, my understanding is that for cpu online, libxl
> > will send device_add to qemu to trigger vcpu add, while for cpu offline
> > it still relies on xenstore. Is this correct?
>
> Judging from the DSDT we provide, offlining looks to also be intended to
> go the ACPI way. Whereas libxl only ever sends "device_add" commands to
> qemu, afaics (or "cpu-add" for older qemu). Anthony - do you have any
> insight into what the intentions here are?

The intention is to one day implement cpu offline in QEMU upstream for
HVM guests; I don't think that's ever been done so far.

As we use device_add for cpu hotplug, we would probably do device_del
for hot-unplug, so qemu would still have to send the appropriate event to
the guest.

Someone will have to figure out if "device_del" works with a Xen guest,
doc here:
https://www.qemu.org/docs/master/system/cpu-hotplug.html#vcpu-hot-unplug

Cheers,

--
Anthony PERARD
Re: Is xl vcpu-set broken
On 28.02.2023 07:44, Joe Jin wrote:
> Hi,
>
> We encountered a vcpu-set issue on an old Xen. When I tried to confirm
> whether upstream Xen has the same issue, I found that neither my
> upstream build nor Ubuntu 22.04 xen-hypervisor-4.16 works.
>
> I can add vcpus (8->16) to my guest but I cannot reduce the vcpu number:
>
> root@ubuntu2204:~/vm# xl list
> Name       ID     Mem VCPUs State Time(s)
> Domain-0    0  255424    32 r-      991.9
> testvm      1    4088    16 -b       94.6
> root@ubuntu2204:~/vm# xl vcpu-set testvm 8
> root@ubuntu2204:~/vm# xl list
> Name       ID     Mem VCPUs State Time(s)
> Domain-0    0  255424    32 r-      998.5
> testvm      1    4088    16 -b       97.3
>
> After xl vcpu-set, xenstore showed the number of online cpus reduced to 8:
>
> /local/domain/1/vm = "/vm/aa109ea0-2fde-4712-81d8-75f73bd73715"
> /local/domain/1/name = "testvm"
> /local/domain/1/cpu = ""
> /local/domain/1/cpu/0 = ""
> /local/domain/1/cpu/0/availability = "online"
> /local/domain/1/cpu/1 = ""
> /local/domain/1/cpu/1/availability = "online"
> /local/domain/1/cpu/2 = ""
> /local/domain/1/cpu/2/availability = "online"
> /local/domain/1/cpu/3 = ""
> /local/domain/1/cpu/3/availability = "online"
> /local/domain/1/cpu/4 = ""
> /local/domain/1/cpu/4/availability = "online"
> /local/domain/1/cpu/5 = ""
> /local/domain/1/cpu/5/availability = "online"
> /local/domain/1/cpu/6 = ""
> /local/domain/1/cpu/6/availability = "online"
> /local/domain/1/cpu/7 = ""
> /local/domain/1/cpu/7/availability = "online"
> /local/domain/1/cpu/8 = ""
> /local/domain/1/cpu/8/availability = "offline"
> /local/domain/1/cpu/9 = ""
> /local/domain/1/cpu/9/availability = "offline"
> /local/domain/1/cpu/10 = ""
> /local/domain/1/cpu/10/availability = "offline"
> /local/domain/1/cpu/11 = ""
> /local/domain/1/cpu/11/availability = "offline"
> /local/domain/1/cpu/12 = ""
> /local/domain/1/cpu/12/availability = "offline"
> /local/domain/1/cpu/13 = ""
> /local/domain/1/cpu/13/availability = "offline"
> /local/domain/1/cpu/14 = ""
> /local/domain/1/cpu/14/availability = "offline"
> /local/domain/1/cpu/15 = ""
> /local/domain/1/cpu/15/availability = "offline"
> /local/domain/1/cpu/16 = ""
> /local/domain/1/cpu/16/availability = "offline"
>
> But the guest did not receive any offline events at all.
>
> From the source code, my understanding is that for cpu online, libxl
> will send device_add to qemu to trigger vcpu add, while for cpu offline
> it still relies on xenstore. Is this correct?

Judging from the DSDT we provide, offlining looks to also be intended to
go the ACPI way. Whereas libxl only ever sends "device_add" commands to
qemu, afaics (or "cpu-add" for older qemu). Anthony - do you have any
insight into what the intentions here are?

> Can anything block cpu scale down?

Nothing as far as I'm aware. But if done the xenbus way, the guest needs
to watch the respective node. Yet on x86 that's done in upstream Linux
only for PV and PVH (see setup_vcpu_hotplug_event()); as per Andrew's
question the assumption here looks to be that you're asking about an HVM
guest, though. And of course it can't be blindly enabled there, as the
"online" operation should not be handled there in that case (or else
there'll be colliding onlining requests from ACPI and xenbus).

Jan
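To make the xenbus side concrete: the nodes a guest would have to watch are the per-vCPU availability entries in the dump quoted above. The following is a rough Python sketch that reads those entries out of such a dump and groups vCPUs by state. It only parses text in the format shown in the thread; it does not talk to xenstore and is not how Linux implements the watch. The function name and the shortened sample are illustrative assumptions.

```python
import re

# Matches lines such as: /local/domain/1/cpu/8/availability = "offline"
AVAIL_RE = re.compile(r'^/local/domain/\d+/cpu/(\d+)/availability = "(\w+)"$')


def vcpu_availability(dump: str) -> dict:
    """Map vCPU index -> availability string from a xenstore-style dump."""
    states = {}
    for line in dump.splitlines():
        match = AVAIL_RE.match(line.strip())
        if match:
            states[int(match.group(1))] = match.group(2)
    return states


# Shortened sample in the same format as the dump in the report.
sample = """\
/local/domain/1/name = "testvm"
/local/domain/1/cpu/7 = ""
/local/domain/1/cpu/7/availability = "online"
/local/domain/1/cpu/8 = ""
/local/domain/1/cpu/8/availability = "offline"
"""

states = vcpu_availability(sample)
offline = sorted(cpu for cpu, state in states.items() if state == "offline")
print(states, offline)
```

Comparing the result before and after xl vcpu-set shows which vCPUs the toolstack marked offline, independent of whether the guest ever reacted.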
Re: Is xl vcpu-set broken
On 28/02/2023 6:44 am, Joe Jin wrote:
> Hi,
>
> We encountered a vcpu-set issue on an old Xen. When I tried to confirm
> whether upstream Xen has the same issue, I found that neither my
> upstream build nor Ubuntu 22.04 xen-hypervisor-4.16 works.
>
> I can add vcpus (8->16) to my guest but I cannot reduce the vcpu number:
>
> root@ubuntu2204:~/vm# xl list
> Name       ID     Mem VCPUs State Time(s)
> Domain-0    0  255424    32 r-      991.9
> testvm      1    4088    16 -b       94.6
> root@ubuntu2204:~/vm# xl vcpu-set testvm 8
> root@ubuntu2204:~/vm# xl list
> Name       ID     Mem VCPUs State Time(s)
> Domain-0    0  255424    32 r-      998.5
> testvm      1    4088    16 -b       97.3
>
> After xl vcpu-set, xenstore showed the number of online cpus reduced to 8:
>
> /local/domain/1/vm = "/vm/aa109ea0-2fde-4712-81d8-75f73bd73715"
> /local/domain/1/name = "testvm"
> /local/domain/1/cpu = ""
> /local/domain/1/cpu/0 = ""
> /local/domain/1/cpu/0/availability = "online"
> /local/domain/1/cpu/1 = ""
> /local/domain/1/cpu/1/availability = "online"
> /local/domain/1/cpu/2 = ""
> /local/domain/1/cpu/2/availability = "online"
> /local/domain/1/cpu/3 = ""
> /local/domain/1/cpu/3/availability = "online"
> /local/domain/1/cpu/4 = ""
> /local/domain/1/cpu/4/availability = "online"
> /local/domain/1/cpu/5 = ""
> /local/domain/1/cpu/5/availability = "online"
> /local/domain/1/cpu/6 = ""
> /local/domain/1/cpu/6/availability = "online"
> /local/domain/1/cpu/7 = ""
> /local/domain/1/cpu/7/availability = "online"
> /local/domain/1/cpu/8 = ""
> /local/domain/1/cpu/8/availability = "offline"
> /local/domain/1/cpu/9 = ""
> /local/domain/1/cpu/9/availability = "offline"
> /local/domain/1/cpu/10 = ""
> /local/domain/1/cpu/10/availability = "offline"
> /local/domain/1/cpu/11 = ""
> /local/domain/1/cpu/11/availability = "offline"
> /local/domain/1/cpu/12 = ""
> /local/domain/1/cpu/12/availability = "offline"
> /local/domain/1/cpu/13 = ""
> /local/domain/1/cpu/13/availability = "offline"
> /local/domain/1/cpu/14 = ""
> /local/domain/1/cpu/14/availability = "offline"
> /local/domain/1/cpu/15 = ""
> /local/domain/1/cpu/15/availability = "offline"
> /local/domain/1/cpu/16 = ""
> /local/domain/1/cpu/16/availability = "offline"
>
> But the guest did not receive any offline events at all.
>
> From the source code, my understanding is that for cpu online, libxl
> will send device_add to qemu to trigger vcpu add, while for cpu offline
> it still relies on xenstore. Is this correct? Can anything block
> cpu scale down?
>
> I appreciate any suggestions!

Your mention of Qemu suggests this is an HVM guest. Can you confirm?

~Andrew
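As a side note for scripting the reproduction, the before/after check on the xl list output in the report can be automated. Below is a small Python sketch that parses text in the six-column layout shown in the report and compares the VCPUs column. It assumes domain names without spaces, and the helper name parse_xl_list is hypothetical; the sample rows are copied from the report.

```python
def parse_xl_list(text: str) -> dict:
    """Map domain name -> VCPU count from `xl list`-style output."""
    vcpus = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header; data rows have six columns.
        if len(fields) != 6 or fields[0] == "Name":
            continue
        name, _domid, _mem, ncpu, _state, _time = fields
        vcpus[name] = int(ncpu)
    return vcpus


before = """\
Name       ID     Mem VCPUs State Time(s)
Domain-0    0  255424    32 r-      991.9
testvm      1    4088    16 -b       94.6
"""

after = """\
Name       ID     Mem VCPUs State Time(s)
Domain-0    0  255424    32 r-      998.5
testvm      1    4088    16 -b       97.3
"""

# True when the VCPUs column for testvm did not change across vcpu-set.
unchanged = parse_xl_list(before)["testvm"] == parse_xl_list(after)["testvm"]
print(unchanged)
```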