Hi Igor,

On 11/7/24 09:48, Igor Mammedov wrote:
Currently the maximum SMBIOS memory device chunk is capped at 16 GiB,
which is fine for most cases (QEMU uses it to describe initial
RAM (type 17 SMBIOS table entries)).
However, when starting a guest with terabytes of RAM, this leads to
too many memory device structures, which eventually upsets the Linux
kernel: it reserves only 64K for these entries, and once that
limit is crossed it runs out of reserved memory.
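
To put rough numbers on the entry count, here is a minimal sketch. The helper
name type17_count is hypothetical (it is not a QEMU function), and it assumes
one type 17 structure per chunk, with a final partial chunk counted as one
more entry.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/*
 * Hypothetical helper: number of type 17 structures needed to describe
 * ram_size bytes when each structure covers at most chunk_size bytes.
 */
static uint64_t type17_count(uint64_t ram_size, uint64_t chunk_size)
{
    assert(chunk_size > 0);
    /* Round up so a trailing partial chunk still gets its own entry. */
    return (ram_size + chunk_size - 1) / chunk_size;
}
```

With these assumptions, type17_count(8 * TiB, 16 * GiB) gives 512 entries,
and the count grows linearly with guest RAM.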

Instead of partitioning initial RAM into 16 GiB chunks, use the maximum
chunk size that the SMBIOS spec allows[1], which lets QEMU
encode RAM in MiB units in the 31-bit Extended Size field (up to 2047 TiB).
As a result, initial RAM will generate only one type 17 structure
until hosts/guests become able to use more RAM in the future.
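
For illustration, the Size / Extended Size split could be sketched as below.
The struct and function names are hypothetical and this is not the code in
hw/smbios/smbios.c; it only assumes the spec's rule that sizes of
32 GiB - 1 MiB and above set Size to 0x7FFF and store the MiB count in
bits 30:0 of Extended Size.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

#define T17_SIZE_USE_EXTENDED 0x7FFFu      /* Size value meaning "see Extended Size" */
#define T17_EXT_SIZE_LIMIT    0x80000000u  /* Extended Size uses bits 30:0 only */

/* Hypothetical container for the two size-related type 17 fields. */
typedef struct {
    uint16_t size;           /* MiB, or 0x7FFF when extended_size is used */
    uint32_t extended_size;  /* MiB, valid only when size == 0x7FFF */
} Type17SizeSketch;

/* Encode a device size given in bytes (assumed to be MiB-aligned). */
static Type17SizeSketch encode_type17_size(uint64_t bytes)
{
    Type17SizeSketch f = { 0, 0 };
    uint64_t mib = bytes / MiB;

    if (mib < T17_SIZE_USE_EXTENDED) {
        f.size = (uint16_t)mib;
    } else {
        /* Anything at or above 2^31 MiB cannot be represented. */
        assert(mib < T17_EXT_SIZE_LIMIT);
        f.size = T17_SIZE_USE_EXTENDED;
        f.extended_size = (uint32_t)mib;
    }
    return f;
}
```

Under these assumptions a 16 GiB device still fits in the 16-bit Size field,
while an 8 TiB device needs the Extended Size field.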

Compat changes:
We can't unconditionally change the chunk size as it would break
the QEMU<->guest ABI (and migration). Thus introduce a new machine class
field that lets older versioned machine types keep using 16 GiB chunks
while new machine types use the maximum possible chunk size.
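
The compat pattern can be sketched in isolation as follows. MachineClassSketch
and the function names are hypothetical stand-ins for QEMU's MachineClass and
per-version option hooks, and the 2047 TiB default is an assumption taken from
the limit quoted above, not from the hunks shown in this mail.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Hypothetical stand-in for MachineClass, reduced to the new field. */
typedef struct {
    uint64_t smbios_memory_device_size;
} MachineClassSketch;

/* Base class init: new machine types get the large chunk size (assumed). */
static void machine_class_defaults(MachineClassSketch *mc)
{
    mc->smbios_memory_device_size = 2047 * TiB;
}

/* Latest versioned machine: inherits the default unchanged. */
static void machine_9_1_options(MachineClassSketch *mc)
{
    assert(mc->smbios_memory_device_size != 0);
}

/* Older versioned machine: restore the previous 16 GiB chunking. */
static void machine_9_0_options(MachineClassSketch *mc)
{
    machine_9_1_options(mc);
    mc->smbios_memory_device_size = 16 * GiB;
}
```

Each older machine type chains to the newer one and then overrides the field,
so only the oldest affected version needs to set the legacy value explicitly.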

While it might seem risky to raise the max entry size this much
(far beyond what current physical RAM modules support),
I wouldn't expect it to cause many issues, modulo uncovering bugs
in software running within the guest. Those should be fixed
on the guest side to handle the SMBIOS spec properly, especially if
the guest is expected to support such huge RAM configs.
In the worst case, QEMU can reduce the chunk size later if we
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or by
giving users a CLI option to customize it.

1) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size

PS:
* tested on an 8 TiB host with a RHEL6 guest, which seems to parse
   type 17 SMBIOS table entries correctly (according to 'dmidecode').

Signed-off-by: Igor Mammedov <imamm...@redhat.com>
---
  include/hw/boards.h |  4 ++++
  hw/arm/virt.c       |  1 +
  hw/core/machine.c   |  1 +
  hw/i386/pc_piix.c   |  1 +
  hw/i386/pc_q35.c    |  1 +
  hw/smbios/smbios.c  | 11 ++++++-----
  6 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index ef6f18f2c1..48ff6d8b93 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -237,6 +237,9 @@ typedef struct {
   *    purposes only.
   *    Applies only to default memory backend, i.e., explicit memory backend
   *    wasn't used.
+ * @smbios_memory_device_size:
+ *    Default size of memory device,
+ *    SMBIOS 3.1.0 "7.18 Memory Device (Type 17)"
   */
  struct MachineClass {
      /*< private >*/
@@ -304,6 +307,7 @@ struct MachineClass {
      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
      ram_addr_t (*fixup_ram_size)(ram_addr_t size);
+    uint64_t smbios_memory_device_size;

Quick notes since I'm on holiday (not meant to block this patch):

- How will this machine class property evolve in the context of
  a heterogeneous machine (e.g. x86_64 cores and one riscv32 core)?

- Should this become a SmbiosProviderInterface later?

  };
/**
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b0c68d66a3..719e83e6a1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3308,6 +3308,7 @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
  static void virt_machine_9_0_options(MachineClass *mc)
  {
      virt_machine_9_1_options(mc);
+    mc->smbios_memory_device_size = 16 * GiB;
      compat_props_add(mc->compat_props, hw_compat_9_0, hw_compat_9_0_len);
  }
  DEFINE_VIRT_MACHINE(9, 0)

[...]
