Re: [Qemu-devel] [PATCH v2 for-2.1 2/2] pc: hack for migration compatibility from QEMU 2.0

2014-07-28 Thread Paolo Bonzini
On 28/07/2014 13:45, Michael S. Tsirkin wrote:
 +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
 + * -M pc-i440fx-2.0.
 
 Let's just say 2.0 and earlier?

This would give the idea that 1.6 is broken, but it isn't.

  Even if the actual amount of AML generated grows
 + * a little bit, there should be plenty of free space since the DSDT
 + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
 + */
 +#define ACPI_BUILD_CPU_AML_SIZE    97
 +#define ACPI_BUILD_BRIDGE_AML_SIZE 1875
 
 Let's put _LEGACY_ somewhere here?

Ok.

 +
 +#define ACPI_BUILD_TABLE_SIZE  0x10000
 +
  typedef struct AcpiCpuInfo {
  DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
  } AcpiCpuInfo;
 @@ -87,6 +99,8 @@ typedef struct AcpiBuildPciBusHotplugState {
  struct AcpiBuildPciBusHotplugState *parent;
  } AcpiBuildPciBusHotplugState;
  
 +unsigned bsel_alloc;
 +
 
 Patch will be better contained if instead of using a global
 bsel_alloc, we actually go and count the devices that
 have ACPI_PCIHP_PROP_BSEL.
 You can just scan all devices, or all pci devices, it
 should not matter.
 This way, this code will be local to the legacy path.

Ok.
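
The suggestion above (count the buses that carry the BSEL property instead of keeping a global counter) can be sketched in isolation. This is only a minimal model: `struct node` and `count_bsel` are hypothetical stand-ins, not QEMU types or functions, and the real code would walk QOM objects and test for ACPI_PCIHP_PROP_BSEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HYPOTHETICAL stand-in for a device/bus tree node; not a QEMU type. */
struct node {
    bool has_bsel;                 /* models "has ACPI_PCIHP_PROP_BSEL" */
    const struct node *children;   /* array of child nodes, may be NULL */
    size_t n_children;
};

/* Recursively count the nodes that carry the property, so that no
 * global counter has to be shared with the non-legacy path. */
static unsigned count_bsel(const struct node *n)
{
    assert(n != NULL);
    unsigned count = n->has_bsel ? 1 : 0;
    for (size_t i = 0; i < n->n_children; i++) {
        count += count_bsel(&n->children[i]);
    }
    return count;
}
```

With a count obtained this way, the legacy sizing code could compute the number of bridges locally, which is the containment Michael asks for.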

 
  static void acpi_get_dsdt(AcpiMiscInfo *info)
  {
  uint16_t *applesmc_sta;
 @@ -759,8 +773,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
  static void acpi_set_pci_info(void)
  {
  PCIBus *bus = find_i440fx(); /* TODO: Q35 support */
 -unsigned bsel_alloc = 0;
  
 +assert(bsel_alloc == 0);
  if (bus) {
  /* Scan all PCI buses. Set property to enable acpi based hotplug. */
  pci_for_each_bus_depth_first(bus, acpi_set_bsel, NULL, &bsel_alloc);
 @@ -1440,13 +1454,14 @@ static
  void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
  {
  GArray *table_offsets;
 -unsigned facs, dsdt, rsdt;
 +unsigned facs, ssdt, dsdt, rsdt;
  AcpiCpuInfo cpu;
  AcpiPmInfo pm;
  AcpiMiscInfo misc;
  AcpiMcfgInfo mcfg;
  PcPciInfo pci;
  uint8_t *u;
 +size_t aml_len = 0;
  
  acpi_get_cpu_info(&cpu);
  acpi_get_pm_info(&pm);
 @@ -1474,13 +1489,20 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
  dsdt = tables->table_data->len;
  build_dsdt(tables->table_data, tables->linker, &misc);
  
 +/* Count the size of the DSDT and SSDT, we will need it for legacy
 + * sizing of ACPI tables.
 + */
 +aml_len += tables->table_data->len - dsdt;
 +
  /* ACPI tables pointed to by RSDT */
  acpi_add_table(table_offsets, tables->table_data);
  build_fadt(tables->table_data, tables->linker, &pm, facs, dsdt);
  
 +ssdt = tables->table_data->len;
  acpi_add_table(table_offsets, tables->table_data);
  build_ssdt(tables->table_data, tables->linker, &cpu, &pm, &misc, &pci,
             guest_info);
 +aml_len += tables->table_data->len - ssdt;
  
  acpi_add_table(table_offsets, tables->table_data);
  build_madt(tables->table_data, tables->linker, &cpu, guest_info);
 @@ -1513,12 +1535,53 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
  /* RSDP is in FSEG memory, so allocate it separately */
  build_rsdp(tables->rsdp, tables->linker, rsdt);
  
 -/* We'll expose it all to Guest so align size to reduce
 +/* We'll expose it all to Guest so we want to reduce
   * chance of size changes.
   * RSDP is small so it's easy to keep it immutable, no need to
   * bother with alignment.
 + *
 + * We used to align the tables to 4k, but of course this would be
 + * too simple to be enough.  4k turned out to be too small an
 + * alignment very soon, and in fact it is almost impossible to
 + * keep the table size stable for all (max_cpus, max_memory_slots)
 + * combinations.  So the table size is always 64k for pc-i440fx-2.1
 + * and we give an error if the table grows beyond that limit.
 + *
 + * We still have the problem of migrating from -M pc-i440fx-2.0.  For
 + * that, we exploit the fact that QEMU 2.1 generates _smaller_ tables
 + * than 2.0 and we can always pad the smaller tables with zeros.  We can
 + * then use the exact size of the 2.0 tables.
 + *
 + * All this is for PIIX4, since QEMU 2.0 didn't support Q35 migration.
   */
 -acpi_align_size(tables->table_data, 0x1000);
 +if (guest_info->legacy_acpi_table_size) {
 +/* Subtracting aml_len gives the size of fixed tables.  Then add the
 + * size of the PIIX4 DSDT/SSDT in QEMU 2.0.
 + */
 +int legacy_aml_len =
 +guest_info->legacy_acpi_table_size +
 +ACPI_BUILD_CPU_AML_SIZE * max_cpus +
 +ACPI_BUILD_BRIDGE_AML_SIZE * (MAX(bsel_alloc, 1) - 1);
 +int legacy_table_size =
 +ROUND_UP(tables->table_data->len - aml_len + legacy_aml_len, 0x1000);
 +if (tables->table_data->len > legacy_table_size) {
 +/* -M pc-i440fx-2.0 doesn't support memory hotplug, so this should
 + * never happen.

[Qemu-devel] [PATCH v2 for-2.1 2/2] pc: hack for migration compatibility from QEMU 2.0

2014-07-24 Thread Paolo Bonzini
Changing the ACPI table size causes migration to break, and the memory
hotplug work opened our eyes to how horribly we were breaking things in
2.0 already.

The ACPI table size is rounded to the next 4k, which one would think
gives some headroom.  In practice this is not the case, because the user
can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
8 to the MADT) and so some -smp values will break the 4k boundary and
fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.
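
To illustrate why the 4k rounding gives no real headroom, here is a minimal sketch of the arithmetic. The per-CPU figures (97 bytes of SSDT, 8 bytes of MADT) come from the paragraph above; `base_len` is a hypothetical size for everything that does not depend on the CPU count, and both helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Per-CPU growth quoted in the commit message. */
#define CPU_SSDT_BYTES 97
#define CPU_MADT_BYTES 8

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up_pow2(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Size of the table blob after 4k alignment.  base_len is a HYPOTHETICAL
 * size of all content that does not scale with the number of CPUs. */
static size_t aligned_table_size(size_t base_len, unsigned cpus)
{
    size_t len = base_len + (size_t)cpus * (CPU_SSDT_BYTES + CPU_MADT_BYTES);
    return round_up_pow2(len, 0x1000);
}
```

With a made-up base of 3000 bytes, 10 CPUs still fit in one 4k page (4050 bytes) while 11 CPUs need two (4155 bytes), so two guests that differ only in -smp can end up with different table sizes.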

To fix this, hard-code 64k as the maximum ACPI table size, which
(despite being an order of magnitude smaller than 640k) should be enough
for everyone.
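
The fixed-size approach can be sketched as follows. This is an illustration only, not the code from the patch: `pad_table_blob` is a hypothetical helper that zero-pads a heap buffer to a fixed size and reports failure when the generated tables no longer fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_BLOB_SIZE 0x10000  /* 64k, the limit named in the commit message */

/* HYPOTHETICAL helper: grow *blob to exactly fixed_size bytes, filling the
 * tail with zeros.  Returns false (leaving the buffer untouched) if the
 * tables are already larger than fixed_size or if allocation fails. */
static bool pad_table_blob(unsigned char **blob, size_t *len, size_t fixed_size)
{
    assert(blob != NULL && len != NULL);
    if (*len > fixed_size) {
        return false;              /* caller should report an error */
    }
    unsigned char *p = realloc(*blob, fixed_size);
    if (p == NULL) {
        return false;
    }
    memset(p + *len, 0, fixed_size - *len);
    *blob = p;
    *len = fixed_size;
    return true;
}
```

Because the padded size no longer depends on the configuration, source and destination agree on it for every configuration that fits within the limit.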

To fix migration from QEMU 2.0, compute the payload size of QEMU 2.0
and always use that one.  The previous patch shrunk the ACPI tables
enough that the QEMU 2.0 size should always be enough.
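
The legacy computation can be sketched as a standalone function. The two constants and the formula follow the patch in this thread; the function name, its parameter list, and the sample numbers in the note below are assumptions made for illustration (in the patch the base value comes from guest_info->legacy_acpi_table_size).

```c
#include <assert.h>
#include <stddef.h>

#define ACPI_BUILD_CPU_AML_SIZE    97
#define ACPI_BUILD_BRIDGE_AML_SIZE 1875

static size_t round_up_4k(size_t n)
{
    return (n + 0xfff) & ~(size_t)0xfff;
}

/* table_len:   bytes of all tables generated by the current QEMU
 * aml_len:     the DSDT + SSDT part of table_len
 * legacy_base: per-machine-type AML size that QEMU 2.0 produced
 * bsel_alloc:  number of buses that were assigned a BSEL value
 * Returns the 4k-aligned size that QEMU 2.0 would have used. */
static size_t legacy_table_size(size_t table_len, size_t aml_len,
                                size_t legacy_base, unsigned max_cpus,
                                unsigned bsel_alloc)
{
    /* Mirrors MAX(bsel_alloc, 1) - 1 from the patch. */
    unsigned bridges = (bsel_alloc > 1 ? bsel_alloc : 1) - 1;
    size_t legacy_aml_len = legacy_base +
                            (size_t)ACPI_BUILD_CPU_AML_SIZE * max_cpus +
                            (size_t)ACPI_BUILD_BRIDGE_AML_SIZE * bridges;

    assert(aml_len <= table_len);
    /* Drop the current AML, keep the fixed tables, add the 2.0 AML size. */
    return round_up_4k(table_len - aml_len + legacy_aml_len);
}
```

For example, with made-up inputs table_len = 6000, aml_len = 5000, legacy_base = 6000 and 4 CPUs, the result is 0x2000 with no extra bus and 0x3000 once bsel_alloc is 2.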

Migration from QEMU 1.7 should work for guests that have a number of CPUs
other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
broken from QEMU 1.7 to QEMU 2.0 in the same way, though.

Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
-M pc-i440fx-2.0 when there are PCI bridges.  Igor sent a patch to
adopt the QEMU 1.7 definition.  I think distributions should apply
it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
version 2.0.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
replace magic constants with #defines [Igor]
remove stray line from comment [Laszlo]

 hw/i386/acpi-build.c | 71 +---
 hw/i386/pc_piix.c    | 19 ++
 hw/i386/pc_q35.c     |  5 
 include/hw/i386/pc.h |  1 +
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebc5f03..26d8dfa 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -25,7 +25,9 @@
 #include <glib.h>
 #include "qemu-common.h"
 #include "qemu/bitmap.h"
+#include "qemu/osdep.h"
 #include "qemu/range.h"
+#include "qemu/error-report.h"
 #include "hw/pci/pci.h"
 #include "qom/cpu.h"
 #include "hw/i386/pc.h"
@@ -52,6 +54,16 @@
 #include "qapi/qmp/qint.h"
 #include "qom/qom-qobject.h"
 
+/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
+ * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
+ * a little bit, there should be plenty of free space since the DSDT
+ * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1.
+ */
+#define ACPI_BUILD_CPU_AML_SIZE    97
+#define ACPI_BUILD_BRIDGE_AML_SIZE 1875
+
+#define ACPI_BUILD_TABLE_SIZE      0x10000
+
 typedef struct AcpiCpuInfo {
 DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT);
 } AcpiCpuInfo;
@@ -87,6 +99,8 @@ typedef struct AcpiBuildPciBusHotplugState {
 struct AcpiBuildPciBusHotplugState *parent;
 } AcpiBuildPciBusHotplugState;
 
+unsigned bsel_alloc;
+
 static void acpi_get_dsdt(AcpiMiscInfo *info)
 {
 uint16_t *applesmc_sta;
@@ -759,8 +773,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 static void acpi_set_pci_info(void)
 {
 PCIBus *bus = find_i440fx(); /* TODO: Q35 support */
-unsigned bsel_alloc = 0;
 
+assert(bsel_alloc == 0);
 if (bus) {
 /* Scan all PCI buses. Set property to enable acpi based hotplug. */
 pci_for_each_bus_depth_first(bus, acpi_set_bsel, NULL, &bsel_alloc);
@@ -1440,13 +1454,14 @@ static
 void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 {
 GArray *table_offsets;
-unsigned facs, dsdt, rsdt;
+unsigned facs, ssdt, dsdt, rsdt;
 AcpiCpuInfo cpu;
 AcpiPmInfo pm;
 AcpiMiscInfo misc;
 AcpiMcfgInfo mcfg;
 PcPciInfo pci;
 uint8_t *u;
+size_t aml_len = 0;
 
 acpi_get_cpu_info(&cpu);
 acpi_get_pm_info(&pm);
@@ -1474,13 +1489,20 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 dsdt = tables->table_data->len;
 build_dsdt(tables->table_data, tables->linker, &misc);
 
+/* Count the size of the DSDT and SSDT, we will need it for legacy
+ * sizing of ACPI tables.
+ */
+aml_len += tables->table_data->len - dsdt;
+
 /* ACPI tables pointed to by RSDT */
 acpi_add_table(table_offsets, tables->table_data);
 build_fadt(tables->table_data, tables->linker, &pm, facs, dsdt);
 
+ssdt = tables->table_data->len;
 acpi_add_table(table_offsets, tables->table_data);
 build_ssdt(tables->table_data, tables->linker, &cpu, &pm, &misc, &pci,
            guest_info);
+aml_len += tables->table_data->len - ssdt;
 
 acpi_add_table(table_offsets, tables->table_data);
 build_madt(tables->table_data, tables->linker, &cpu, guest_info);
@@ -1513,12 +1535,53 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables)
 /* RSDP is in FSEG memory, so allocate it separately */
 build_rsdp(tables->rsdp, tables->linker, rsdt);
 
-/* We'll expose it all to Guest so align size to reduce
+/* We'll expose it all to Guest so we want to reduce
  * chance of size changes.
  * RSDP is small so it's 

Re: [Qemu-devel] [PATCH v2 for-2.1 2/2] pc: hack for migration compatibility from QEMU 2.0

2014-07-24 Thread Laszlo Ersek
On 07/24/14 16:32, Paolo Bonzini wrote:
 Changing the ACPI table size causes migration to break, and the memory
 hotplug work opened our eyes to how horribly we were breaking things in
 2.0 already.
 
 The ACPI table size is rounded to the next 4k, which one would think
 gives some headroom.  In practice this is not the case, because the user
 can control the ACPI table size (each CPU adds 97 bytes to the SSDT and
 8 to the MADT) and so some -smp values will break the 4k boundary and
 fail to migrate.  Similarly, PCI bridges add ~1870 bytes to the SSDT.
 
 To fix this, hard-code 64k as the maximum ACPI table size, which
 (despite being an order of magnitude smaller than 640k) should be enough
 for everyone.
 
 To fix migration from QEMU 2.0, compute the payload size of QEMU 2.0
 and always use that one.  The previous patch shrunk the ACPI tables
 enough that the QEMU 2.0 size should always be enough.
 
 Migration from QEMU 1.7 should work for guests that have a number of CPUs
 other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140.  It was already
 broken from QEMU 1.7 to QEMU 2.0 in the same way, though.
 
 Even with this patch, QEMU 1.7 and 2.0 have two different ideas of
 -M pc-i440fx-2.0 when there are PCI bridges.  Igor sent a patch to
 adopt the QEMU 1.7 definition.  I think distributions should apply
 it if they move directly from QEMU 1.7 to 2.1+ without ever packaging
 version 2.0.
 
 Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
 ---
   replace magic constants with #defines [Igor]
   remove stray line from comment [Laszlo]

I compared this too with its v1 counterpart, and it looks good. I have
one question (just curiosity): the following paragraph was dropped from
the commit message -- why?

-Non-AML tables can change depending on the configuration (especially
-MADT, SRAT, HPET) but they remain the same between QEMU 2.0 and 2.1,
-so we only compute our padding based on the sizes of the SSDT and DSDT.

I think this remains true in v2 as well:
- aml_len and legacy_aml_len still only cover the DSDT and the
SSDT, and
- the non-AML tables (eg. the MADT, now spelled out in the commit
message), although they may grow with the number of CPUs, continue to
remain the same between 2.0 and 2.1.

IOW, I think you could have kept this paragraph if you wanted to. Was it
an oversight to drop it, or did the paragraph contain something
incorrect (in v1) that I'm unaware of? Or is it just redundant?

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks,
Laszlo



Re: [Qemu-devel] [PATCH v2 for-2.1 2/2] pc: hack for migration compatibility from QEMU 2.0

2014-07-24 Thread Paolo Bonzini
On 24/07/2014 18:29, Laszlo Ersek wrote:
 I compared this too with its v1 counterpart, and it looks good. I have
 one question (just curiosity): the following paragraph was dropped from
 the commit message -- why?
 
 -Non-AML tables can change depending on the configuration (especially
 -MADT, SRAT, HPET) but they remain the same between QEMU 2.0 and 2.1,
 -so we only compute our padding based on the sizes of the SSDT and DSDT.
 
 I think this remains true in v2 as well:
 - aml_len and legacy_aml_len still only cover the DSDT and the
 SSDT, and
 - the non-AML tables (eg. the MADT, now spelled out in the commit
 message), although they may grow with the number of CPUs, continue to
 remain the same between 2.0 and 2.1.
 
 IOW, I think you could have kept this paragraph if you wanted to. Was it
 an oversight to drop it, or did the paragraph contain something
 incorrect (in v1) that I'm unaware of? Or is it just redundant?

An oversight.  I had added it to the mail before sending it, not
directly in the commit message.  I'll add it back for the pull request
(tomorrow morning).

Paolo