> On 08-Dec-2023, at 6:27 PM, Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Fri, Dec 08, 2023 at 05:56:11PM +0530, Ani Sinha wrote:
>> Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow
>> up to 4096 vCPUs"), the Linux kernel can support up to 4096 vCPUs when
>> MAXSMP is enabled. So bump the max_cpus value for q35 machine versions 8.3
>> and newer to 4096. q35 machine versions 8.2 and older continue to support a
>> maximum of 1024 vCPUs, as before.
>>
>> If KVM is not able to support the specified number of vcpus, QEMU would
>> return the following error messages:
>>
>> $ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 4096
>> qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (4096)
>> exceeds the recommended cpus supported by KVM (12)
>> Number of SMP cpus requested (4096) exceeds the maximum cpus supported by
>> KVM (1024)
>>
>> Cc: Daniel P. Berrangé <berra...@redhat.com>
>> Cc: Igor Mammedov <imamm...@redhat.com>
>> Cc: Michael S. Tsirkin <m...@redhat.com>
>> Cc: Julia Suvorova <jus...@redhat.com>
>> Signed-off-by: Ani Sinha <anisi...@redhat.com>
>> ---
>> hw/i386/pc_q35.c | 15 ++++++++++++---
>> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> What testing has been done to confirm that QEMU is actually capable of
> booting a guest with this CPU count, with UEFI, SeaBIOS, or both?
I admit we did not test this with 4096 CPUs.
It was tested downstream with edk2, a modified kernel, and an increased QEMU
limit for
https://bugzilla.redhat.com/show_bug.cgi?id=1983086
> We validated a ~48TB, 1728 cores, and 32 socket vm using legacy
> bios from smbios 3.0, the latest qemu modified with higher vcpu limits, and
> modified kernel limits.
I am trying to get more clarity on the testing front and am checking the
highest max_cpus value we can actually test with.
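
For reference, what a given host can test is bounded by the two limits KVM
reports, which are the same two numbers shown in the warning and error in the
commit message. Below is a minimal sketch of querying them through
KVM_CHECK_EXTENSION with KVM_CAP_NR_VCPUS (recommended) and KVM_CAP_MAX_VCPUS
(maximum). The struct, enum, and function names are hypothetical and are not
QEMU code; the fallback values follow the KVM API documentation.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Hypothetical helper types for this sketch; not part of QEMU. */
struct kvm_vcpu_limits {
    int recommended;    /* value of KVM_CAP_NR_VCPUS */
    int maximum;        /* value of KVM_CAP_MAX_VCPUS */
};

enum vcpu_request {
    VCPU_REQ_OK,
    VCPU_REQ_OVER_RECOMMENDED,  /* would only trigger the warning */
    VCPU_REQ_OVER_MAXIMUM,      /* would trigger the hard error */
};

/* Returns 0 on success, -1 if /dev/kvm cannot be opened or queried. */
int kvm_query_vcpu_limits(struct kvm_vcpu_limits *out)
{
    int fd, soft, hard;

    assert(out != NULL);

    fd = open("/dev/kvm", O_RDWR);
    if (fd < 0) {
        return -1;
    }
    soft = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
    hard = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);
    close(fd);

    if (soft < 0 || hard < 0) {
        return -1;
    }
    /*
     * Documented fallbacks: assume 4 if KVM_CAP_NR_VCPUS is absent, and
     * assume the recommended value if KVM_CAP_MAX_VCPUS is absent.
     */
    out->recommended = soft ? soft : 4;
    out->maximum = hard ? hard : out->recommended;
    return 0;
}

/* Pure helper: reports the most severe limit that a request exceeds. */
enum vcpu_request classify_vcpu_request(int requested,
                                        const struct kvm_vcpu_limits *lim)
{
    assert(lim != NULL);

    if (requested > lim->maximum) {
        return VCPU_REQ_OVER_MAXIMUM;
    }
    if (requested > lim->recommended) {
        return VCPU_REQ_OVER_RECOMMENDED;
    }
    return VCPU_REQ_OK;
}
```

With the limits from the example output (recommended 12, maximum 1024), a
request for 4096 exceeds both, which matches the warning followed by the error.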
>
> Historically every time we wanted to raise max cpus we've seen limits
> or scalability problems that needed fixing first. The previous bump to
> 1024 had been implicitly proven via downstream testing we had done in
> RHEL, and had required the switch to SMBIOS v3 entrypoint.
>
>>
>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>> index 4f3e5412f6..2ed57814e1 100644
>> --- a/hw/i386/pc_q35.c
>> +++ b/hw/i386/pc_q35.c
>> @@ -375,7 +375,7 @@ static void pc_q35_machine_options(MachineClass *m)
>>      m->default_nic = "e1000e";
>>      m->default_kernel_irqchip_split = false;
>>      m->no_floppy = 1;
>> -    m->max_cpus = 1024;
>> +    m->max_cpus = 4096;
>>      m->no_parallel = !module_object_class_by_name(TYPE_ISA_PARALLEL);
>>      machine_class_allow_dynamic_sysbus_dev(m, TYPE_AMD_IOMMU_DEVICE);
>>      machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
>> @@ -383,12 +383,22 @@ static void pc_q35_machine_options(MachineClass *m)
>>      machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>>  }
>>
>> -static void pc_q35_8_2_machine_options(MachineClass *m)
>> +static void pc_q35_8_3_machine_options(MachineClass *m)
>>  {
>>      pc_q35_machine_options(m);
>>      m->alias = "q35";
>>  }
>>
>> +DEFINE_Q35_MACHINE(v8_3, "pc-q35-8.3", NULL,
>> +                   pc_q35_8_3_machine_options);
>> +
>> +static void pc_q35_8_2_machine_options(MachineClass *m)
>> +{
>> +    pc_q35_8_3_machine_options(m);
>> +    m->alias = NULL;
>> +    m->max_cpus = 1024;
>> +}
>> +
>>  DEFINE_Q35_MACHINE(v8_2, "pc-q35-8.2", NULL,
>>                     pc_q35_8_2_machine_options);
>>
>> @@ -396,7 +406,6 @@ static void pc_q35_8_1_machine_options(MachineClass *m)
>>  {
>>      PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>>      pc_q35_8_2_machine_options(m);
>> -    m->alias = NULL;
>>      pcmc->broken_32bit_mem_addr_check = true;
>>      compat_props_add(m->compat_props, hw_compat_8_1, hw_compat_8_1_len);
>>      compat_props_add(m->compat_props, pc_compat_8_1, pc_compat_8_1_len);
>> --
>> 2.42.0
>>
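
To make the compat chaining in the patch easier to follow: each older machine
version calls the next newer version's options function and then overrides only
what differs, so resetting alias and max_cpus in 8.2 also covers 8.1 and
earlier. The standalone model below is only a sketch of that pattern. The
struct and function names are hypothetical stand-ins, not QEMU's MachineClass
or the real pc_q35_*_machine_options() functions, and only the two fields
touched by the patch are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MachineClass with just the relevant fields. */
struct machine_model {
    const char *alias;
    int max_cpus;
};

/* Models the shared q35 defaults after the patch. */
void q35_base_options(struct machine_model *m)
{
    assert(m != NULL);
    m->alias = NULL;
    m->max_cpus = 4096;
}

/* Newest version: takes the defaults and owns the "q35" alias. */
void q35_8_3_options(struct machine_model *m)
{
    q35_base_options(m);
    m->alias = "q35";
}

/* 8.2 starts from 8.3, then drops the alias and restores the old limit. */
void q35_8_2_options(struct machine_model *m)
{
    q35_8_3_options(m);
    m->alias = NULL;
    m->max_cpus = 1024;
}

/*
 * 8.1 starts from 8.2, so it inherits alias == NULL and max_cpus == 1024
 * without repeating either assignment.
 */
void q35_8_1_options(struct machine_model *m)
{
    q35_8_2_options(m);
}
```

In this model, only the 8.3 machine ends up with the alias and the 4096 limit;
8.2 and 8.1 both end up with no alias and a limit of 1024.
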
>
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|