Re: [Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes

2017-06-07 Thread Laszlo Ersek
On 06/07/17 07:52, Gerd Hoffmann wrote:
>   Hi,
> 
> Patch looks sane overall.
> 
>> Invent a new, QEMU-specific register in the config space of the DRAM
>> Controller, at offset 0x50, in order to allow guest firmware to query
>> the
>> TSEG (SMRAM) size.
> 
> Hmm, 0x50 appears to be the only unused config space register in the
> specs.

I did find more holes, in "Table 5-1. DRAM Controller Register Address
Map (D0:F0)", in "Document Number: 316966-002". The hole at 0x50 is just
the one with the lowest config space offset (after the standard PCI
device header). The others are:

- 0x58-0x59 (word)
- 0x9c (byte)
- 0x9f (byte)
- 0xb2-0xc7 (eight words)
- 0xce-0xdb (seven words)
- 0xeb-0xff (fifteen bytes, but this range appears to come after the
 capability list, so I felt it would be unsafe to use)

> I suspect in reality it isn't unused but undocumented.  I don't
> have a better idea though, and in practice it probably isn't much of a
> problem.

Thanks for confirming. Then I'll set out to write the OVMF-side patches
for this. Once I have it all working, I'm going to send out the QEMU
change as a real patch.

Thanks!
Laszlo



Re: [Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes

2017-06-06 Thread Gerd Hoffmann
  Hi,

Patch looks sane overall.

> Invent a new, QEMU-specific register in the config space of the DRAM
> Controller, at offset 0x50, in order to allow guest firmware to query
> the
> TSEG (SMRAM) size.

Hmm, 0x50 appears to be the only unused config space register in the
specs.  I suspect in reality it isn't unused but undocumented.  I don't
have a better idea though, and in practice it probably isn't much of a
problem.

cheers,
  Gerd



Re: [Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes

2017-06-06 Thread Laszlo Ersek
Gerd, can you please comment on this?

Thanks
Laszlo

On 05/30/17 23:26, Laszlo Ersek wrote:
> The q35 machine type currently lets the guest firmware select a 1MB, 2MB
> or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even
> that is not enough when a lot of VCPUs (more than approx. 224) are
> configured -- SMRAM footprint scales largely proportionally with VCPU
> count.
> 
> Introduce a new property for "mch" called "extended-tseg-mbytes", which
> expresses (in megabytes) the user's choice of TSEG (SMRAM) size.
> 
> Invent a new, QEMU-specific register in the config space of the DRAM
> Controller, at offset 0x50, in order to allow guest firmware to query the
> TSEG (SMRAM) size.
> 
> According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller
> Register Address Map (D0:F0)":
> 
> Warning: Address locations that are not listed are considered Intel
>  Reserved registers locations. Reads to Reserved registers may
>  return non-zero values. Writes to reserved locations may
>  cause system failures.
> 
>  All registers that are defined in the PCI 2.3 specification,
>  but are not necessary or implemented in this component are
>  simply not included in this document. The
>  reserved/unimplemented space in the PCI configuration header
>  space is not documented as such in this summary.
> 
> Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part
> of the standard PCI config space header. And they precede the capability
> list as well, which starts at 0xe0 for this device.
> 
> When the guest writes value 0x to this register, the value that can be
> read back is that of "mch.extended-tseg-mbytes" -- unless it remains
> 0x. The guest is required to write 0x first (as opposed to a
> read-only register) because PCI config space is generally not cleared on
> QEMU reset, and after S3 resume or reboot, new guest firmware running on
> old QEMU could read a guest OS-injected value from this register.
> 
> After reading the available "extended" TSEG size, the guest firmware may
> actually request that TSEG size by writing pattern 11b to the ESMRAMC
> register's TSEG_SZ bit-field. (The Intel spec referenced above defines
> only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.)
> 
> On the QEMU command line, the value can be set with
> 
>   -global mch.extended-tseg-mbytes=N
> 
> The default value for 2.10+ q35 machine types is 16. The value is limited
> to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be
> stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users
> are responsible for choosing sensible TSEG sizes.
> 
> On 2.9 and earlier q35 machine types, the default value is 0. This lets
> the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50,
> keep their original behavior.
> 
> When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is
> set to that value on reset, for completeness.
> 
> PCI config space is migrated automatically, so no VMSD changes are
> necessary.
> 
> Cc: "Michael S. Tsirkin" 
> Cc: Gerd Hoffmann 
> Cc: Paolo Bonzini 
> Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027
> Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html
> Signed-off-by: Laszlo Ersek 
> ---
> 
> Notes:
> I haven't yet written any OVMF code to interface with this; I figured
> I'd ask for comments on the approach first. Thanks.
> 
>  include/hw/i386/pc.h  |  5 +
>  include/hw/pci-host/q35.h |  6 ++
>  hw/pci-host/q35.c | 41 ++---
>  3 files changed, 49 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index e447f5d8f4d1..78cd630f3133 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -384,6 +384,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t 
> *);
>  
>  #define PC_COMPAT_2_9 \
>  HW_COMPAT_2_9 \
> +{\
> +.driver   = "mch",\
> +.property = "extended-tseg-mbytes",\
> +.value= stringify(0),\
> +},\
>  
>  #define PC_COMPAT_2_8 \
>  HW_COMPAT_2_8 \
> diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
> index 53b6760c16c5..58983c00b32d 100644
> --- a/include/hw/pci-host/q35.h
> +++ b/include/hw/pci-host/q35.h
> @@ -60,6 +60,7 @@ typedef struct MCHPCIState {
>  uint64_t above_4g_mem_size;
>  uint64_t pci_hole64_size;
>  uint32_t short_root_bus;
> +uint16_t ext_tseg_mbytes;
>  } MCHPCIState;
>  
>  typedef struct Q35PCIHost {
> @@ -91,6 +92,11 @@ typedef struct Q35PCIHost {
>  /* D0:F0 configuration space */
>  #define MCH_HOST_BRIDGE_REVISION_DEFAULT   0x0
>  
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES0x50
> +#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_SIZE   2
> +#define 

[Qemu-devel] [RFC] q35/mch: implement extended TSEG sizes

2017-05-30 Thread Laszlo Ersek
The q35 machine type currently lets the guest firmware select a 1MB, 2MB
or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even
that is not enough when a lot of VCPUs (more than approx. 224) are
configured -- SMRAM footprint scales largely proportionally with VCPU
count.

Introduce a new property for "mch" called "extended-tseg-mbytes", which
expresses (in megabytes) the user's choice of TSEG (SMRAM) size.

Invent a new, QEMU-specific register in the config space of the DRAM
Controller, at offset 0x50, in order to allow guest firmware to query the
TSEG (SMRAM) size.

According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller
Register Address Map (D0:F0)":

Warning: Address locations that are not listed are considered Intel
 Reserved registers locations. Reads to Reserved registers may
 return non-zero values. Writes to reserved locations may
 cause system failures.

 All registers that are defined in the PCI 2.3 specification,
 but are not necessary or implemented in this component are
 simply not included in this document. The
 reserved/unimplemented space in the PCI configuration header
 space is not documented as such in this summary.

Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part
of the standard PCI config space header. And they precede the capability
list as well, which starts at 0xe0 for this device.

When the guest writes value 0x to this register, the value that can be
read back is that of "mch.extended-tseg-mbytes" -- unless it remains
0x. The guest is required to write 0x first (as opposed to a
read-only register) because PCI config space is generally not cleared on
QEMU reset, and after S3 resume or reboot, new guest firmware running on
old QEMU could read a guest OS-injected value from this register.

After reading the available "extended" TSEG size, the guest firmware may
actually request that TSEG size by writing pattern 11b to the ESMRAMC
register's TSEG_SZ bit-field. (The Intel spec referenced above defines
only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.)

On the QEMU command line, the value can be set with

  -global mch.extended-tseg-mbytes=N

The default value for 2.10+ q35 machine types is 16. The value is limited
to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be
stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users
are responsible for choosing sensible TSEG sizes.

On 2.9 and earlier q35 machine types, the default value is 0. This lets
the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50,
keep their original behavior.

When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is
set to that value on reset, for completeness.

PCI config space is migrated automatically, so no VMSD changes are
necessary.

Cc: "Michael S. Tsirkin" 
Cc: Gerd Hoffmann 
Cc: Paolo Bonzini 
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027
Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html
Signed-off-by: Laszlo Ersek 
---

Notes:
I haven't yet written any OVMF code to interface with this; I figured
I'd ask for comments on the approach first. Thanks.

 include/hw/i386/pc.h  |  5 +
 include/hw/pci-host/q35.h |  6 ++
 hw/pci-host/q35.c | 41 ++---
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index e447f5d8f4d1..78cd630f3133 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -384,6 +384,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
 #define PC_COMPAT_2_9 \
 HW_COMPAT_2_9 \
+{\
+.driver   = "mch",\
+.property = "extended-tseg-mbytes",\
+.value= stringify(0),\
+},\
 
 #define PC_COMPAT_2_8 \
 HW_COMPAT_2_8 \
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index 53b6760c16c5..58983c00b32d 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -60,6 +60,7 @@ typedef struct MCHPCIState {
 uint64_t above_4g_mem_size;
 uint64_t pci_hole64_size;
 uint32_t short_root_bus;
+uint16_t ext_tseg_mbytes;
 } MCHPCIState;
 
 typedef struct Q35PCIHost {
@@ -91,6 +92,11 @@ typedef struct Q35PCIHost {
 /* D0:F0 configuration space */
 #define MCH_HOST_BRIDGE_REVISION_DEFAULT   0x0
 
+#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES0x50
+#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_SIZE   2
+#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_QUERY  0x
+#define MCH_HOST_BRIDGE_EXT_TSEG_MBYTES_MAX0xfff
+
 #define MCH_HOST_BRIDGE_PCIEXBAR   0x60/* 64bit register */
 #define MCH_HOST_BRIDGE_PCIEXBAR_SIZE  8   /* 64bit register */
 #define MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT   0xb000
diff --git