Re: [PATCH v1 00/23] Q35 support for Xen

2023-08-22 Thread Joel Upham
I was doing this for work and have been pulled off for now to work on some
things for our release. Most of these patches will stay as they are,
except that Xen wants to handle the PCIe PT code a bit differently. XCP-ng
is also working on patches for q35 and I have been sharing my code
with them, so I am hoping progress continues on the Xen side. I
wish I could work on this full time to get everything where it needs to be
soon.

-Joel

On Tue, Aug 22, 2023 at 10:18 AM Anthony PERARD 
wrote:

> Hi Joel,
>
> We had a design session about Q35 support during Xen Summit, and I think
> the result of it is that some more changes are going to be needed,
> right?
>
> So, is it worth it for me to spend some time reviewing this patch series
> in its current form, or should I wait until the next revision? And same
> question for the xen toolstack side.
>
> Cheers,
>
> --
> Anthony PERARD
>


Re: [PATCH v1 00/23] Q35 support for Xen

2023-07-05 Thread Joel Upham
I believe it might have been the master (unstable) branch. The last commit
before my patches was:

commit 19a720b74fde7e859d19f12c66a72e545947a657
Merge: c6a5fc2ac7 367189efae
Author: Richard Henderson 
Date:   Thu Jun 1 08:30:29 2023 -0700

-Joel

On Thu, Jun 22, 2023 at 1:11 PM Bernhard Beschow  wrote:

>
>
> Am 20. Juni 2023 17:24:33 UTC schrieb Joel Upham :
> >These are the Qemu changes needed to support the q35 chipset for xen.
> >I based them on the patches from 2017 found on the mailing list here:
> >
> https://lists.xenproject.org/archives/html/xen-devel/2018-03/msg01176.html
> >
> >I have been using a version of these patches on Xen 4.16 with Qemu
> >version 4.1 for over 6 months.  The guest VMs are very stable, and PCIe
> >PT is working as was designed (all of the PCIe devices are on the root
> >PCIe device).  I have successfully passed through GPUs, NICs, etc. I was
> >asked by those in the community to attempt to once again upstream the
> >patches.  I have them working with Seabios and OVMF (patches are needed
> >to OVMF which I will be sending to the mailing list). The Qemu patches
> >allow for the xenvbd to properly unplug the AHCI SATA device, and all
> >xen pv windows drivers work as intended.
> >
> >I relied on the work of the original author of the patches, Alexey
> >Gerasimenko, to get a majority of this to work.  I updated the patches to
> >be in line with the upstream Qemu and Xen versions.  Any original issues
> >may still exist; however, I am sure they can be improved in time. If the
> >code doesn't exist upstream, then the community can't actively look at it.
> >
> >I am not an expert on the Q35 chipset or PCIe technology.  This is my
> >first patch to this mailing list.
>
> Patchew was unable to apply this series onto master:
> https://patchew.org/QEMU/cover.1687278381.git.jupham...@gmail.com/ What
> revision is the series based on?
>
> Can you rebase? Rebasing this series will probably cause quite some work
> since it will simplify here and there, as indicated by Igor and by my
> comments in "version zero" of this series.
>
> Best regards,
> Bernhard
>
> >
> >
> >Joel Upham (23):
> >  pc/xen: Xen Q35 support: provide IRQ handling for PCI devices
> >  pc/q35: Apply PCI bus BSEL property for Xen PCI device hotplug
> >  q35/acpi/xen: Provide ACPI PCI hotplug interface for Xen on Q35
> >  q35/xen: Add Xen platform device support for Q35
> >  q35: Fix incorrect values for PCIEXBAR masks
> >  xen/pt: XenHostPCIDevice: provide functions for PCI Capabilities and
> >PCIe Extended Capabilities enumeration
> >  xen/pt: avoid reading PCIe device type and cap version multiple times
> >  xen/pt: determine the legacy/PCIe mode for a passed through device
> >  xen/pt: Xen PCIe passthrough support for Q35: bypass PCIe topology
> >check
> >  xen/pt: add support for PCIe Extended Capabilities and larger config
> >space
> >  xen/pt: handle PCIe Extended Capabilities Next register
> >  xen/pt: allow to hide PCIe Extended Capabilities
> >  xen/pt: add Vendor-specific PCIe Extended Capability descriptor and
> >sizing
> >  xen/pt: add fixed-size PCIe Extended Capabilities descriptors
> >  xen/pt: add AER PCIe Extended Capability descriptor and sizing
> >  xen/pt: add descriptors and size calculation for
> >RCLD/ACS/PMUX/DPA/MCAST/TPH/DPC PCIe Extended Capabilities
> >  xen/pt: add Resizable BAR PCIe Extended Capability descriptor and
> >sizing
> >  xen/pt: add VC/VC9/MFVC PCIe Extended Capabilities descriptors and
> >sizing
> >  xen/pt: Fake capability id
> >  xen platform: unplug ahci object
> >  pc/q35: setup q35 for xen
> >  qdev-monitor/pt: bypass root device check
> >  s3 support: enabling s3 with q35
> >
> > hw/acpi/ich9.c|   22 +-
> > hw/acpi/pcihp.c   |6 +-
> > hw/core/machine.c |   19 +
> > hw/i386/pc_piix.c |3 +-
> > hw/i386/pc_q35.c  |   39 +-
> > hw/i386/xen/xen-hvm.c |7 +-
> > hw/i386/xen/xen_platform.c|   19 +-
> > hw/isa/lpc_ich9.c |   53 +-
> > hw/isa/piix3.c|2 +-
> > hw/pci-host/q35.c |   28 +-
> > hw/pci/pci.c  |   17 +
> > hw/xen/xen-host-pci-device.c  |  106 +++-
> > hw/xen/xen-host-pci-device.h  |6 +-
> > hw/xen/xen_pt.c   |   49 +-
> > hw/xen/xen_pt.h   |   18 +-
> > hw/xen/xen_pt_config_init.c   | 1103 ++---
> > include/hw/acpi/pcihp.h   |2 +
> > include/hw/boards.h   |1 +
> > include/hw/i386/pc.h  |3 +
> > include/hw/pci-host/q35.h |4 +-
> > include/hw/pci/pci.h  |3 +
> > include/hw/southbridge/ich9.h |1 +
> > include/hw/xen/xen.h  |4 +-
> > qemu-options.hx   |1 +
> > softmmu/qdev-monitor.c|4 +-
> > stubs/xen-hw-stub.c   |4 +-
> > 26 files changed, 1394 insertions(+), 130 deletions(-)
> >
>


Re: [PATCH v1 23/23] s3 support: enabling s3 with q35

2023-06-21 Thread Joel Upham
On Wed, Jun 21, 2023 at 7:34 AM Igor Mammedov  wrote:

> On Tue, 20 Jun 2023 13:24:57 -0400
> Joel Upham  wrote:
>
> > Resetting pci devices after s3 causes guest freezes, as xen usually
> > likes to handle resetting devices.
>
> I'd prefer Xen side being fixed instead of hacking reset logic in qemu/q35.
>

Handling of ACPI and initialization of memory is done in hvmloader, from my
understanding. What I noticed was that when qemu attempted to reset devices,
they became unusable or froze the guest. It is very possible that I am
missing something that piix does to reset correctly, so any input I can get
to make this better is welcome.

>
> > Signed-off-by: Joel Upham 
> > ---
> >  hw/acpi/ich9.c| 12 
> >  hw/pci-host/q35.c |  3 ++-
> >  2 files changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> > index 1c236be1c7..234706a191 100644
> > --- a/hw/acpi/ich9.c
> > +++ b/hw/acpi/ich9.c
> > @@ -143,7 +143,8 @@ static int ich9_pm_post_load(void *opaque, int
> version_id)
> >  {
> >  ICH9LPCPMRegs *pm = opaque;
> >  uint32_t pm_io_base = pm->pm_io_base;
> > -pm->pm_io_base = 0;
> > +if (!xen_enabled())
> > +pm->pm_io_base = 0;
> >  ich9_pm_iospace_update(pm, pm_io_base);
> >  return 0;
> >  }
> > @@ -274,7 +275,10 @@ static void pm_reset(void *opaque)
> >  acpi_pm1_evt_reset(&pm->acpi_regs);
> >  acpi_pm1_cnt_reset(&pm->acpi_regs);
> >  acpi_pm_tmr_reset(&pm->acpi_regs);
> > -acpi_gpe_reset(&pm->acpi_regs);
> > +/* Noticed guest freezing in xen when this was reset after S3. */
> > +if (!xen_enabled()) {
> > +acpi_gpe_reset(&pm->acpi_regs);
> > +}
> >
> >  pm->smi_en = 0;
> >  if (!pm->smm_enabled) {
> > @@ -322,7 +326,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs
> *pm, qemu_irq sci_irq)
> >  acpi_pm_tco_init(&pm->tco_regs, &pm->io);
> >  }
> >
> > -if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge) {
> > +if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge || xen_enabled()) {
> >  acpi_pcihp_init(OBJECT(lpc_pci),
> >  &pm->acpi_pci_hotplug,
> >  pci_get_bus(lpc_pci),
> > @@ -345,7 +349,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs
> *pm, qemu_irq sci_irq)
> >  legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
> >  OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
> >
> > -if (pm->acpi_memory_hotplug.is_enabled) {
> > +if (pm->acpi_memory_hotplug.is_enabled || xen_enabled()) {
> >  acpi_memory_hotplug_init(pci_address_space_io(lpc_pci),
> OBJECT(lpc_pci),
> >   &pm->acpi_memory_hotplug,
> >   ACPI_MEMORY_HOTPLUG_BASE);
> > diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> > index 1fe4e5a5c9..5891839ce9 100644
> > --- a/hw/pci-host/q35.c
> > +++ b/hw/pci-host/q35.c
> > @@ -580,7 +580,8 @@ static void mch_reset(DeviceState *qdev)
> >  d->config[MCH_HOST_BRIDGE_F_SMBASE] = 0;
> >  d->wmask[MCH_HOST_BRIDGE_F_SMBASE] = 0xff;
> >
> > -mch_update(mch);
> > +if (!xen_enabled())
> > +mch_update(mch);
> >  }
> >
> >  static void mch_realize(PCIDevice *d, Error **errp)
>
>


Re: [PATCH v1 03/23] q35/acpi/xen: Provide ACPI PCI hotplug interface for Xen on Q35

2023-06-21 Thread Joel Upham
On Wed, Jun 21, 2023 at 7:28 AM Igor Mammedov  wrote:

> On Tue, 20 Jun 2023 13:24:37 -0400
> Joel Upham  wrote:
>
> > This patch allows to use ACPI PCI hotplug functionality for Xen on Q35.
> > All added code depends on xen_enabled(), so no functionality change for
> > non-Xen usage.
> >
> > We need to call the acpi_set_pci_info function from ich9_pm_init as well,
> > so it was made globally visible again (as it was before).
>
> this patch is also likely obsolete
>

Ok, I can attempt removing it.

> >
> > Signed-off-by: Alexey Gerasimenko 
> > Signed-off-by: Joel Upham 
> > ---
> >  hw/acpi/ich9.c  | 10 ++
> >  hw/acpi/pcihp.c |  2 +-
> >  include/hw/acpi/pcihp.h |  2 ++
> >  3 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> > index 25e2c7243e..1c236be1c7 100644
> > --- a/hw/acpi/ich9.c
> > +++ b/hw/acpi/ich9.c
> > @@ -39,6 +39,8 @@
> >  #include "hw/southbridge/ich9.h"
> >  #include "hw/mem/pc-dimm.h"
> >  #include "hw/mem/nvdimm.h"
> > +#include "hw/xen/xen.h"
> > +#include "sysemu/xen.h"
> >
> >  //#define DEBUG
> >
> > @@ -67,6 +69,10 @@ static void ich9_gpe_writeb(void *opaque, hwaddr
> addr, uint64_t val,
> >  ICH9LPCPMRegs *pm = opaque;
> >  acpi_gpe_ioport_writeb(&pm->acpi_regs, addr, val);
> >  acpi_update_sci(&pm->acpi_regs, pm->irq);
> > +
> > +if (xen_enabled()) {
> > +acpi_pcihp_reset(&pm->acpi_pci_hotplug);
> > +}
> >  }
> >
> >  static const MemoryRegionOps ich9_gpe_ops = {
> > @@ -332,6 +338,10 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs
> *pm, qemu_irq sci_irq)
> >  pm->powerdown_notifier.notify = pm_powerdown_req;
> >  qemu_register_powerdown_notifier(&pm->powerdown_notifier);
> >
> > +if (xen_enabled()) {
> > +acpi_set_pci_info(true);
> > +}
> > +
> >  legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
> >  OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
> >
> > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > index f4e39d7a9c..5b065d670c 100644
> > --- a/hw/acpi/pcihp.c
> > +++ b/hw/acpi/pcihp.c
> > @@ -99,7 +99,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
> >  return info;
> >  }
> >
> > -static void acpi_set_pci_info(bool has_bridge_hotplug)
> > +void acpi_set_pci_info(bool has_bridge_hotplug)
> >  {
> >  static bool bsel_is_set;
> >  Object *host = acpi_get_i386_pci_host();
> > diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h
> > index ef59810c17..d35a517c9e 100644
> > --- a/include/hw/acpi/pcihp.h
> > +++ b/include/hw/acpi/pcihp.h
> > @@ -72,6 +72,8 @@ void
> acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> >  /* Called on reset */
> >  void acpi_pcihp_reset(AcpiPciHpState *s);
> >
> > +void acpi_set_pci_info(bool has_bridge_hotplug);
> > +
> >  void build_append_pcihp_slots(Aml *parent_scope, PCIBus *bus);
> >
> >  extern const VMStateDescription vmstate_acpi_pcihp_pci_status;
>
>


Re: [PATCH v1 02/23] pc/q35: Apply PCI bus BSEL property for Xen PCI device hotplug

2023-06-21 Thread Joel Upham
On Wed, Jun 21, 2023 at 7:28 AM Igor Mammedov  wrote:

> On Tue, 20 Jun 2023 13:24:36 -0400
> Joel Upham  wrote:
>
> > On Q35 we still need to assign BSEL property to bus(es) for PCI device
> > add/hotplug to work.
> > Extend acpi_set_pci_info() function to support Q35 as well. This patch
> adds new (trivial)
> > function find_q35() which returns root PCIBus object on Q35, in a way
> > similar to what find_i440fx does.
>
> I think patch is mostly obsolete, q35 ACPI PCI hotplug is supported in
> upstream QEMU.
>
> Also see comment below.
>

I still use the find_q35() function in later patches, but I agree that
most of the rest of this patch differs from what upstream does now.

> >
> > Signed-off-by: Alexey Gerasimenko 
> > Signed-off-by: Joel Upham 
> > ---
> >  hw/acpi/pcihp.c  | 4 +++-
> >  hw/pci-host/q35.c| 9 +
> >  include/hw/i386/pc.h | 3 +++
> >  3 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > index cdd6f775a1..f4e39d7a9c 100644
> > --- a/hw/acpi/pcihp.c
> > +++ b/hw/acpi/pcihp.c
> > @@ -40,6 +40,7 @@
> >  #include "qapi/error.h"
> >  #include "qom/qom-qobject.h"
> >  #include "trace.h"
> > +#include "sysemu/xen.h"
> >
> >  #define ACPI_PCIHP_SIZE 0x0018
> >  #define PCI_UP_BASE 0x
> > @@ -84,7 +85,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
> >  bool is_bridge = IS_PCI_BRIDGE(br);
> >
> >  /* hotplugged bridges can't be described in ACPI ignore them */
> > -if (qbus_is_hotpluggable(BUS(bus))) {
>
> > +/* Xen requires hotplugging to the root device, even on the Q35
> chipset */
> pls explain what 'root device' is.
> Why can't you use root-ports for hotplug?
>

The wording may have been incorrect; root port is correct. This change may
not be needed anymore and may be left over from when I was debugging PCIe
hotplug problems. I will retest and fix the patch once I know more. Xen
expects the PCIe device to be on the root port.

I can move the function to a different patch that uses it.

> > +if (qbus_is_hotpluggable(BUS(bus)) || xen_enabled()) {
> >  if (!is_bridge || (!br->hotplugged &&
> info->has_bridge_hotplug)) {
> >  bus_bsel = g_malloc(sizeof *bus_bsel);
> >
> > diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> > index fd18920e7f..fe5fc0f47c 100644
> > --- a/hw/pci-host/q35.c
> > +++ b/hw/pci-host/q35.c
> > @@ -259,6 +259,15 @@ static void q35_host_initfn(Object *obj)
> >   qdev_prop_allow_set_link_before_realize,
> 0);
> >  }
> >
> > +PCIBus *find_q35(void)
> > +{
> > +PCIHostState *s = OBJECT_CHECK(PCIHostState,
> > +   object_resolve_path("/machine/q35",
> NULL),
> > +   TYPE_PCI_HOST_BRIDGE);
> > +return s ? s->bus : NULL;
> > +}
> > +
> > +
> >  static const TypeInfo q35_host_info = {
> >  .name   = TYPE_Q35_HOST_DEVICE,
> >  .parent = TYPE_PCIE_HOST_BRIDGE,
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index c661e9cc80..550f8fa221 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -196,6 +196,9 @@ void pc_madt_cpu_entry(int uid, const CPUArchIdList
> *apic_ids,
> >  /* sgx.c */
> >  void pc_machine_init_sgx_epc(PCMachineState *pcms);
> >
> > +/* q35.c */
> > +PCIBus *find_q35(void);
> > +
> >  extern GlobalProperty pc_compat_8_0[];
> >  extern const size_t pc_compat_8_0_len;
> >
>
>


Re: [PATCH v1 01/23] pc/xen: Xen Q35 support: provide IRQ handling for PCI devices

2023-06-21 Thread Joel Upham
Thank you, I was working off the xen-devel archives and didn’t find his
email address. I will update my qemu and xen patches for the next version.

-Joel

On Wed, Jun 21, 2023 at 3:17 AM Daniel P. Berrangé 
wrote:

> On Tue, Jun 20, 2023 at 01:24:34PM -0400, Joel Upham wrote:
> >
> > Signed-off-by: Alexey Gerasimenko 
>
> This isn't a valid email address for Alexey - I presume you grabbed
> these patches from the xen-devel mail archives, which have mangled
> the addresses for anti-spam reasons.
>
> Fortunately there are alternative archives which don't mangle the
> patches:
>
>
> https://lore.kernel.org/xen-devel/6067bc3c91c9ee629a35723dfb474ef168ff4ebf.1520867955.git.x19...@gmail.com/
>
>   Signed-off-by: Alexey Gerasimenko 
>
> This affects all patches in the series, but I won't repeat my
> comment on each one.
>
> > Signed-off-by: Joel Upham 
> > ---
> >  hw/i386/pc_piix.c |  3 +-
> >  hw/i386/xen/xen-hvm.c |  7 +++--
> >  hw/isa/lpc_ich9.c | 53 ---
> >  hw/isa/piix3.c|  2 +-
> >  include/hw/southbridge/ich9.h |  1 +
> >  include/hw/xen/xen.h  |  4 +--
> >  stubs/xen-hw-stub.c   |  4 +--
> >  7 files changed, 61 insertions(+), 13 deletions(-)
> >
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index d5b0dcd1fe..8c1b20f3bc 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -62,6 +62,7 @@
> >  #endif
> >  #include "hw/xen/xen-x86.h"
> >  #include "hw/xen/xen.h"
> > +#include "sysemu/xen.h"
> >  #include "migration/global_state.h"
> >  #include "migration/misc.h"
> >  #include "sysemu/numa.h"
> > @@ -233,7 +234,7 @@ static void pc_init1(MachineState *machine,
> >x86ms->above_4g_mem_size,
> >pci_memory, ram_memory);
> >  pci_bus_map_irqs(pci_bus,
> > - xen_enabled() ? xen_pci_slot_get_pirq
> > + xen_enabled() ? xen_cmn_pci_slot_get_pirq
> > : pc_pci_slot_get_pirq);
> >  pcms->bus = pci_bus;
> >
> > diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> > index 56641a550e..540ac46639 100644
> > --- a/hw/i386/xen/xen-hvm.c
> > +++ b/hw/i386/xen/xen-hvm.c
> > @@ -15,6 +15,7 @@
> >  #include "hw/pci/pci.h"
> >  #include "hw/pci/pci_host.h"
> >  #include "hw/i386/pc.h"
> > +#include "hw/southbridge/ich9.h"
> >  #include "hw/irq.h"
> >  #include "hw/hw.h"
> >  #include "hw/i386/apic-msidef.h"
> > @@ -136,14 +137,14 @@ typedef struct XenIOState {
> >  Notifier wakeup;
> >  } XenIOState;
> >
> > -/* Xen specific function for piix pci */
> > +/* Xen-specific functions for pci dev IRQ handling */
> >
> > -int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> > +int xen_cmn_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> >  {
> >  return irq_num + (PCI_SLOT(pci_dev->devfn) << 2);
> >  }
> >
> > -void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> > +void xen_cmn_set_irq(void *opaque, int irq_num, int level)
> >  {
> >  xen_set_pci_intx_level(xen_domid, 0, 0, irq_num >> 2,
> > irq_num & 3, level);
> > diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> > index 9c47a2f6c7..733a99d443 100644
> > --- a/hw/isa/lpc_ich9.c
> > +++ b/hw/isa/lpc_ich9.c
> > @@ -51,6 +51,9 @@
> >  #include "hw/core/cpu.h"
> >  #include "hw/nvram/fw_cfg.h"
> >  #include "qemu/cutils.h"
> > +#include "hw/xen/xen.h"
> > +#include "sysemu/xen.h"
> > +#include "hw/southbridge/piix.h"
> >  #include "hw/acpi/acpi_aml_interface.h"
> >  #include "trace.h"
> >
> > @@ -535,11 +538,49 @@ static int ich9_lpc_post_load(void *opaque, int
> version_id)
> >  return 0;
> >  }
> >
> > +static void ich9_lpc_config_write_xen(PCIDevice *d,
> > +  uint32_t addr, uint32_t val, int len)
> > +{
> > +static bool pirqe_f_warned = false;
> > +if (ranges_overlap(addr, len, ICH9_LPC_PIRQA_ROUT, 4)) {
> > +/* handle PIRQA..PIRQD routing */
> > +/* Scan for updates to PCI link routes (0x60-0x63). */
> > +int i;
> > + 

Re: [PATCH v1 1/1] Q35 Support

2023-06-21 Thread Joel Upham
Sorry, this was sent in error when I ran git send-email on the folder.
It dates from before I broke the work down into individual patches (after
looking at the Qemu submission guidance). This is my first time sending a
patch in this way, so thanks for your understanding. This patch can be
ignored, as all of its changes are covered by the other patches.

-Joel Upham

On Wed, Jun 21, 2023 at 7:10 AM David Hildenbrand  wrote:

> On 20.06.23 19:24, Joel Upham wrote:
>
> Inexpressive patch subject and non-existent patch description. I have no
> clue what this is supposed to do, except that it involves q35 and xen (I
> guess?).
>
> > ---
> >   hw/acpi/ich9.c|   22 +-
> >   hw/acpi/pcihp.c   |6 +-
> >   hw/core/machine.c |   19 +
> >   hw/i386/pc_piix.c |3 +-
> >   hw/i386/pc_q35.c  |   39 +-
> >   hw/i386/xen/xen-hvm.c |7 +-
> >   hw/i386/xen/xen_platform.c|   19 +-
> >   hw/isa/lpc_ich9.c |   53 +-
> >   hw/isa/piix3.c|2 +-
> >   hw/pci-host/q35.c |   28 +-
> >   hw/pci/pci.c  |   17 +
> >   hw/xen/xen-host-pci-device.c  |  106 +++-
> >   hw/xen/xen-host-pci-device.h  |6 +-
> >   hw/xen/xen_pt.c   |   49 +-
> >   hw/xen/xen_pt.h   |   19 +-
> >   hw/xen/xen_pt_config_init.c   | 1103 ++---
> >   include/hw/acpi/ich9.h|1 +
> >   include/hw/acpi/pcihp.h   |2 +
> >   include/hw/boards.h   |1 +
> >   include/hw/i386/pc.h  |3 +
> >   include/hw/pci-host/q35.h |4 +-
> >   include/hw/pci/pci.h  |3 +
> >   include/hw/southbridge/ich9.h |1 +
> >   include/hw/xen/xen.h  |4 +-
> >   qemu-options.hx   |1 +
> >   softmmu/datadir.c |1 -
> >   softmmu/qdev-monitor.c|3 +-
> >   stubs/xen-hw-stub.c   |4 +-
> >   28 files changed, 1395 insertions(+), 131 deletions(-)
> >
>
> Usually people refrain from reviewing such massive patches. Most
> probably this can be broken up into reviewable pieces.
>
> Was this supposed to be an RFC?
>
> --
> Cheers,
>
> David / dhildenb
>
>


[PATCH v1 21/23] pc/q35: setup q35 for xen

2023-06-20 Thread Joel Upham
Mirrored the init done for piix devices when xen is being used.
This is needed for xen memory to be initialized and used with q35.

Signed-off-by: Joel Upham 
---
 hw/i386/pc_q35.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 789a23ce6b..0b53a86dd2 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -145,6 +145,7 @@ static void pc_q35_init(MachineState *machine)
 MemoryRegion *system_io = get_system_io();
 MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
+MemoryRegion *ram_memory;
 GSIState *gsi_state;
 ISABus *isa_bus;
 int i;
@@ -196,8 +197,12 @@ static void pc_q35_init(MachineState *machine)
 }
 
 pc_machine_init_sgx_epc(pcms);
-x86_cpus_init(x86ms, pcmc->default_cpu_version);
 
+x86_cpus_init(x86ms, pcmc->default_cpu_version);
+if (xen_enabled()) {
+xen_hvm_init_pc(pcms, &ram_memory);
+machine->ram = ram_memory;
+}
 kvmclock_create(pcmc->kvmclock_create_always);
 
 /* pci enabled */
@@ -230,7 +235,15 @@ static void pc_q35_init(MachineState *machine)
 }
 
 /* allocate ram and load rom/bios */
-pc_memory_init(pcms, system_memory, rom_memory, pci_hole64_size);
+if (!xen_enabled()) {
+pc_memory_init(pcms, system_memory, rom_memory, pci_hole64_size);
+} else {
+pc_system_flash_cleanup_unused(pcms);
+if (machine->kernel_filename != NULL) {
+/* For xen HVM direct kernel boot, load linux here */
+xen_load_linux(pcms);
+}
+}
 
 object_property_add_child(OBJECT(machine), "q35", OBJECT(q35_host));
 object_property_set_link(OBJECT(q35_host), MCH_HOST_PROP_RAM_MEM,
@@ -307,7 +320,7 @@ static void pc_q35_init(MachineState *machine)
 
 assert(pcms->vmport != ON_OFF_AUTO__MAX);
 if (pcms->vmport == ON_OFF_AUTO_AUTO) {
-pcms->vmport = ON_OFF_AUTO_ON;
+pcms->vmport = xen_enabled() ? ON_OFF_AUTO_OFF : ON_OFF_AUTO_ON;
 }
 
 /* init basic PC hardware */
-- 
2.34.1




[PATCH v1 12/23] xen/pt: allow to hide PCIe Extended Capabilities

2023-06-20 Thread Joel Upham
We need to hide some unwanted PCI/PCIe capabilities for passed through
devices.
Normally we do this by marking the capability register group
as XEN_PT_GRP_TYPE_HARDWIRED, which excludes this capability from the
capability list and returns zeroes on attempts to read the capability body.
Skipping the capability in the linked list of capabilities can be done
by changing the Next Capability register to skip one or more unwanted
capabilities.

One difference between PCI and PCIe Extended capabilities is that we don't
have the list head field anymore. PCIe Extended capabilities always start
at offset 0x100 if they're present. Unfortunately, there are typically
only a few PCIe extended capabilities present, which means there is a chance
that some capability we want to hide will reside at offset 0x100 in PCIe
config space.
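
To make the "no list head" point concrete, here is a minimal standalone
sketch of walking the extended capability list in a config space snapshot.
It is not code from this series: find_ext_cap(), cfg_read32() and the
EXT_CAP_* macros are names made up for the example, and the header layout
(ID in bits 15:0, version in bits 19:16, next offset in bits 31:20) is
taken from the PCIe specification as I understand it.

```c
#include <stdint.h>

#define PCIE_CFG_SIZE      0x1000  /* PCIe config space is 4 KiB */
#define PCIE_EXT_CAP_START 0x100   /* first extended capability, if any */

/* Extended capability header fields (hypothetical local macro names). */
#define EXT_CAP_ID(h)   ((uint16_t)((h) & 0xffff))
#define EXT_CAP_VER(h)  (((h) >> 16) & 0xf)
#define EXT_CAP_NEXT(h) (((h) >> 20) & 0xffc)

/* Read a little-endian dword from a config space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, uint32_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/*
 * Return the offset of the extended capability with the given ID, or 0 if
 * it is absent. There is no list head pointer: the walk always starts at
 * offset 0x100.
 */
static uint32_t find_ext_cap(const uint8_t *cfg, uint16_t cap_id)
{
    uint32_t off = PCIE_EXT_CAP_START;
    /* Each capability is at least 8 bytes, which bounds the walk. */
    int budget = (PCIE_CFG_SIZE - PCIE_EXT_CAP_START) / 8;

    while (off >= PCIE_EXT_CAP_START && off <= PCIE_CFG_SIZE - 4 &&
           budget-- > 0) {
        uint32_t hdr = cfg_read32(cfg, off);

        if (hdr == 0 || hdr == 0xffffffff) {
            break;  /* no (more) capabilities */
        }
        if (EXT_CAP_ID(hdr) == cap_id) {
            return off;
        }
        off = EXT_CAP_NEXT(hdr);
    }
    return 0;
}
```

Because the guest starts its own walk at 0x100 in the same way, a capability
sitting at that offset cannot be skipped by patching a predecessor's Next
pointer; there is no predecessor.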

The simplest way to hide such capabilities from the guest OS or drivers
is to fake their capability ID value.

This patch adds the Capability ID register handler which checks
- if the capability to which this register belongs starts at offset 0x100
  in PCIe config space
- if this capability is marked as XEN_PT_GRP_TYPE_HARDWIRED

If both conditions hold, then a fake Capability ID value is returned.
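
The read-path decision made in xen_pt_pci_read_config() can be sketched in
isolation as follows. This is a simplified standalone model, not the patch
code: enum grp_type, read_needs_emulation() and ranges_overlap_sketch() are
hypothetical stand-ins (the last one for QEMU's ranges_overlap() helper),
and the sketch only covers the choice between "return zeroes" and "emulate",
not how the fake ID itself is produced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of legacy PCI config space; also the offset of the first
 * PCIe extended capability. */
#define PCI_CONFIG_SPACE_SIZE 0x100

/* Hypothetical stand-ins for the XEN_PT_GRP_TYPE_* values. */
enum grp_type { GRP_TYPE_HARDWIRED, GRP_TYPE_EMU };

/* Do the ranges [a, a + alen) and [b, b + blen) intersect? */
static bool ranges_overlap_sketch(uint64_t a, uint64_t alen,
                                  uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/*
 * Return true if a config read of len bytes at addr has to go through
 * register emulation, false if it can simply be answered with zeroes.
 */
static bool read_needs_emulation(enum grp_type type, uint32_t base_offset,
                                 uint32_t addr, int len)
{
    if (type != GRP_TYPE_HARDWIRED) {
        return true;
    }
    /*
     * A hardwired (hidden) group normally reads as zero. The exception is
     * the 4-byte header of a group located at 0x100, which must be
     * emulated so that a fake ID and a valid Next pointer can be returned.
     */
    return base_offset == PCI_CONFIG_SPACE_SIZE &&
           ranges_overlap_sketch(addr, (uint64_t)len,
                                 PCI_CONFIG_SPACE_SIZE, 4);
}
```

With this model, reading the header of a hidden capability at 0x100 is
emulated, while reading its body, or any part of a hidden capability
located elsewhere, still returns zeroes.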

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt.c | 11 ++-
 hw/xen/xen_pt.h |  4 
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index f757978800..2399fabb2b 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -164,7 +164,16 @@ static uint32_t xen_pt_pci_read_config(PCIDevice *d, 
uint32_t addr, int len)
 reg_grp_entry = xen_pt_find_reg_grp(s, addr);
 if (reg_grp_entry) {
 /* check 0-Hardwired register group */
-if (reg_grp_entry->reg_grp->grp_type == XEN_PT_GRP_TYPE_HARDWIRED) {
+if (reg_grp_entry->reg_grp->grp_type == XEN_PT_GRP_TYPE_HARDWIRED &&
+/*
+ * For PCIe Extended Capabilities we need to emulate
+ * CapabilityID and NextCapability/Version registers for a
+ * hardwired reg group located at the offset 0x100 in PCIe
+ * config space. This allows us to hide the first extended
+ * capability as well.
+ */
+!(reg_grp_entry->base_offset == PCI_CONFIG_SPACE_SIZE &&
+ranges_overlap(addr, len, 0x100, 4))) {
 /* no need to emulate, just return 0 */
 val = 0;
 goto exit;
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index eb062be3f4..9a191cbc8f 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -93,6 +93,10 @@ typedef int (*xen_pt_conf_byte_read)
 
 #define XEN_PCI_INTEL_OPREGION 0xfc
 
+#define XEN_PCIE_CAP_ID 0
+#define XEN_PCIE_CAP_LIST_NEXT 2
+#define XEN_PCIE_FAKE_CAP_ID_BASE 0xFE00
+
 #define XEN_PCI_IGD_DOMAIN 0
 #define XEN_PCI_IGD_BUS 0
 #define XEN_PCI_IGD_DEV 2
-- 
2.34.1




[PATCH v1 15/23] xen/pt: add AER PCIe Extended Capability descriptor and sizing

2023-06-20 Thread Joel Upham
This patch adds the descriptor structure for the Advanced Error Reporting
(AER) PCIe Extended Capability and the corresponding capability sizing
function.
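
The size selection performed by the sizing function can be summarized by
the following standalone sketch. It mirrors the three sizes used in the
patch (0x48, 0x38, 0x2C) but is not the patch code: aer_cap_size() is a
hypothetical helper that takes the already-decoded capability bits as
booleans, and the device type values are redefined locally on the
assumption that they match the usual PCI_EXP_TYPE_* constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device/port types from the PCI Express Capabilities register
 * (assumed to match the standard PCI_EXP_TYPE_* values). */
#define PCI_EXP_TYPE_ROOT_PORT 0x4  /* Root Port */
#define PCI_EXP_TYPE_RC_EC     0xa  /* Root Complex Event Collector */

/*
 * Pick the size of the AER extended capability register group.
 *   end_end_tlp_prefix: End-End TLP Prefix bit from Device Capabilities 2
 *   tlp_prefix_log:     TLP Prefix Log Present bit from the AER
 *                       capabilities register
 */
static uint32_t aer_cap_size(uint8_t dev_type, bool end_end_tlp_prefix,
                             bool tlp_prefix_log)
{
    if (end_end_tlp_prefix && tlp_prefix_log) {
        return 0x48;    /* includes the TLP Prefix Log registers */
    }
    if (dev_type == PCI_EXP_TYPE_ROOT_PORT ||
        dev_type == PCI_EXP_TYPE_RC_EC) {
        return 0x38;    /* includes the root error registers */
    }
    return 0x2C;        /* base AER register set */
}
```

Note that the TLP Prefix Log bit only matters when the device also
advertises End-End TLP Prefix support, which is why both flags are checked
together.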

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 72 +
 1 file changed, 72 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 69d8857c66..9fd0531bc4 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1861,6 +1861,70 @@ static int xen_pt_msix_size_init(XenPCIPassthroughState 
*s,
 }
 
 
+/* get Advanced Error Reporting Extended Capability register group size */
+#define PCI_ERR_CAP_TLP_PREFIX_LOG  (1U << 11)
+#define PCI_DEVCAP2_END_END_TLP_PREFIX  (1U << 21)
+static int xen_pt_ext_cap_aer_size_init(XenPCIPassthroughState *s,
+const XenPTRegGroupInfo *grp_reg,
+uint32_t base_offset,
+uint32_t *size)
+{
+uint8_t dev_type = get_pcie_device_type(s);
+uint32_t aer_caps = 0;
+uint32_t sz = 0;
+int pcie_cap_pos;
+uint32_t devcaps2 = 0;
+int ret = 0;
+
+pcie_cap_pos = xen_host_pci_find_next_cap(&s->real_device, 0,
+  PCI_CAP_ID_EXP);
+if (!pcie_cap_pos) {
+XEN_PT_ERR(&s->dev,
+   "Cannot find a required PCI Express Capability\n");
+return -1;
+}
+
+if (get_pcie_capability_version(s) > 1) {
+ret = xen_host_pci_get_long(&s->real_device,
+pcie_cap_pos + PCI_EXP_DEVCAP2,
+&devcaps2);
+if (ret) {
+XEN_PT_ERR(&s->dev, "Error while reading Device "
+   "Capabilities 2 Register \n");
+return -1;
+}
+}
+
+if (devcaps2 & PCI_DEVCAP2_END_END_TLP_PREFIX) {
+ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_ERR_CAP,
+&aer_caps);
+if (ret) {
+XEN_PT_ERR(&s->dev,
+   "Error while reading AER Extended Capability\n");
+return -1;
+}
+
+if (aer_caps & PCI_ERR_CAP_TLP_PREFIX_LOG) {
+sz = 0x48;
+}
+}
+
+if (!sz) {
+if (dev_type == PCI_EXP_TYPE_ROOT_PORT ||
+dev_type == PCI_EXP_TYPE_RC_EC) {
+sz = 0x38;
+} else {
+sz = 0x2C;
+}
+}
+
+*size = sz;
+
+log_pcie_extended_cap(s, "AER", base_offset, *size);
+return ret;
+}
+
 static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 /* Header Type0 reg group */
 {
@@ -2128,6 +2192,14 @@ static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 .size_init  = xen_pt_reg_grp_size_init,
 .emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
 },
+/* Advanced Error Reporting Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_ERR),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = 0xFF,
+.size_init  = xen_pt_ext_cap_aer_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
 {
 .grp_size = 0,
 },
-- 
2.34.1




[PATCH v1 01/23] pc/xen: Xen Q35 support: provide IRQ handling for PCI devices

2023-06-20 Thread Joel Upham
The primary difference in PCI device IRQ management between Xen HVM and
QEMU is that Xen PCI IRQs are "device-centric" while QEMU PCI IRQs are
"chipset-centric". Namely, Xen uses the PCI device BDF and INTx as
coordinates to assert an IRQ, while QEMU walks the hierarchy of PCI buses
to find out which chipset PIRQ the IRQ is routed to and manages IRQ
assertion on the chipset side (as PIRQ inputs).

Two callback functions are used for this purpose: .map_irq and .set_irq
(named after the corresponding structure fields). The corresponding
Xen-specific callback functions are xen_pci_slot_get_pirq() and
xen_piix3_set_irq(). In the Xen case these functions do not operate on PIRQ
pin numbers. Instead, they use a specific value to pass BDF/INTx
information between .map_irq and .set_irq: the PCI device slot number
(taken from devfn) and the INTx pin number are combined into a pseudo-PIRQ
in xen_pci_slot_get_pirq(), which xen_piix3_set_irq() later decodes back
into the device and INTx numbers to pass to the *set_pci_intx_level() call.
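
As a rough illustration of this encoding, here is a standalone sketch of
the round trip. It follows the arithmetic visible in the diff below
(irq_num + (PCI_SLOT(devfn) << 2) on the way in, irq_num >> 2 and
irq_num & 3 on the way out), but pseudo_pirq_encode() and
pseudo_pirq_decode() are hypothetical names used only for this example, and
PCI_SLOT() is redefined locally.

```c
#include <stdint.h>

/* Slot (device) number: bits 7:3 of devfn. Local copy of the usual macro. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)

/* .map_irq side: fold the slot and INTx pin (0..3) into one pseudo-PIRQ. */
static int pseudo_pirq_encode(uint8_t devfn, int intx)
{
    return intx + (PCI_SLOT(devfn) << 2);
}

/* .set_irq side: recover the slot and INTx pin from the pseudo-PIRQ. */
static void pseudo_pirq_decode(int irq_num, int *slot, int *intx)
{
    *slot = irq_num >> 2;
    *intx = irq_num & 3;
}
```

For example, INTC (pin 2) of the device in slot 3 encodes to
(3 << 2) + 2 = 14, and decoding 14 yields slot 3, pin 2 again. The function
number is not part of the value, so all functions of a slot share the same
four pseudo-PIRQs.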

For Xen on Q35 this scheme is still applicable, with the exception that
the function names are no longer descriptive and need to be renamed to show
their common i440/Q35 nature. Proposed new names are:

xen_pci_slot_get_pirq --> xen_cmn_pci_slot_get_pirq
xen_piix3_set_irq --> xen_cmn_set_irq

Another IRQ-related difference between i440 and Q35 is the number of PIRQ
inputs and PIRQ routers (PCI IRQ links in terms of ACPI) available. i440
has 4 PCI interrupt links, while Q35 has 8 (PIRQA...PIRQH).
Currently Xen has support for only 4 PCI links, so we describe only 4 of
the 8 PCI links in the ACPI tables. Also, hvmloader disables PIRQ routing
for PIRQE..PIRQH by writing 80h into the corresponding PIRQ[n]_ROUT
registers.

All this PCI interrupt routing stuff is largely an ancient legacy from the
PIC era. It is hardly worth extending the number of PCI links supported, as
we normally deal with APIC mode and/or MSI interrupts.

The only useful thing to do with PIRQE..PIRQH routing currently is to
check whether the guest actually attempts to use it for some reason
(despite the ACPI PCI routing information provided). In this case, a
warning is logged.
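
A minimal sketch of such a check, under stated assumptions: the
PIRQE..PIRQH routing registers are taken to live at config offsets
0x68..0x6B of the LPC bridge with bit 7 (80h) meaning "routing disabled",
which is my reading of the ICH9 datasheet rather than something shown in
this series. write_enables_pirqe_h() and the PIRQ_* macro names are
hypothetical and do not correspond to the handler in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PIRQE_ROUT          0x68  /* assumed offset of PIRQE_ROUT */
#define PIRQ_ROUT_COUNT     4     /* PIRQE, PIRQF, PIRQG, PIRQH */
#define PIRQ_ROUT_DISABLED  0x80  /* bit 7 set: routing disabled */

/*
 * Return true if a config write of len bytes (1..4) at addr would enable
 * routing for at least one of PIRQE..PIRQH. A caller could log a warning
 * (once) when this happens.
 */
static bool write_enables_pirqe_h(uint32_t addr, int len, uint32_t val)
{
    for (int i = 0; i < len; i++) {
        uint32_t reg = addr + i;
        uint8_t byte = (val >> (8 * i)) & 0xff;

        if (reg >= PIRQE_ROUT && reg < PIRQE_ROUT + PIRQ_ROUT_COUNT &&
            !(byte & PIRQ_ROUT_DISABLED)) {
            return true;
        }
    }
    return false;
}
```

The hvmloader write of 80h to each register leaves all four links disabled
and does not trigger the check; any guest write that clears bit 7 in one of
them does.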

Things have changed a bit in modern Qemu, and more changes to the IRQ
mapping had to be made inside lpc_ich9 to write the IRQs and set up
the mappings.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/i386/pc_piix.c |  3 +-
 hw/i386/xen/xen-hvm.c |  7 +++--
 hw/isa/lpc_ich9.c | 53 ---
 hw/isa/piix3.c|  2 +-
 include/hw/southbridge/ich9.h |  1 +
 include/hw/xen/xen.h  |  4 +--
 stubs/xen-hw-stub.c   |  4 +--
 7 files changed, 61 insertions(+), 13 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index d5b0dcd1fe..8c1b20f3bc 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -62,6 +62,7 @@
 #endif
 #include "hw/xen/xen-x86.h"
 #include "hw/xen/xen.h"
+#include "sysemu/xen.h"
 #include "migration/global_state.h"
 #include "migration/misc.h"
 #include "sysemu/numa.h"
@@ -233,7 +234,7 @@ static void pc_init1(MachineState *machine,
   x86ms->above_4g_mem_size,
   pci_memory, ram_memory);
 pci_bus_map_irqs(pci_bus,
- xen_enabled() ? xen_pci_slot_get_pirq
+ xen_enabled() ? xen_cmn_pci_slot_get_pirq
: pc_pci_slot_get_pirq);
 pcms->bus = pci_bus;
 
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 56641a550e..540ac46639 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -15,6 +15,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_host.h"
 #include "hw/i386/pc.h"
+#include "hw/southbridge/ich9.h"
 #include "hw/irq.h"
 #include "hw/hw.h"
 #include "hw/i386/apic-msidef.h"
@@ -136,14 +137,14 @@ typedef struct XenIOState {
 Notifier wakeup;
 } XenIOState;
 
-/* Xen specific function for piix pci */
+/* Xen-specific functions for pci dev IRQ handling */
 
-int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
+int xen_cmn_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
 {
 return irq_num + (PCI_SLOT(pci_dev->devfn) << 2);
 }
 
-void xen_piix3_set_irq(void *opaque, int irq_num, int level)
+void xen_cmn_set_irq(void *opaque, int irq_num, int level)
 {
 xen_set_pci_intx_level(xen_domid, 0, 0, irq_num >> 2,
irq_num & 3, level);
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 9c47a2f6c7..733a99d443 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -51,6 +51,9 @@
 #include "hw/core/cpu.h"
 #include "hw/nvram/fw_cfg.h"
 #include "qemu/cutils.h"
+#include "hw/xen/xen.h"
+#include "sysemu/xen.h"
+#include "hw/southbridge/piix.h"
 #include "hw/acpi/ac

[PATCH v1 1/1] Q35 Support

2023-06-20 Thread Joel Upham
---
 hw/acpi/ich9.c|   22 +-
 hw/acpi/pcihp.c   |6 +-
 hw/core/machine.c |   19 +
 hw/i386/pc_piix.c |3 +-
 hw/i386/pc_q35.c  |   39 +-
 hw/i386/xen/xen-hvm.c |7 +-
 hw/i386/xen/xen_platform.c|   19 +-
 hw/isa/lpc_ich9.c |   53 +-
 hw/isa/piix3.c|2 +-
 hw/pci-host/q35.c |   28 +-
 hw/pci/pci.c  |   17 +
 hw/xen/xen-host-pci-device.c  |  106 +++-
 hw/xen/xen-host-pci-device.h  |6 +-
 hw/xen/xen_pt.c   |   49 +-
 hw/xen/xen_pt.h   |   19 +-
 hw/xen/xen_pt_config_init.c   | 1103 ++---
 include/hw/acpi/ich9.h|1 +
 include/hw/acpi/pcihp.h   |2 +
 include/hw/boards.h   |1 +
 include/hw/i386/pc.h  |3 +
 include/hw/pci-host/q35.h |4 +-
 include/hw/pci/pci.h  |3 +
 include/hw/southbridge/ich9.h |1 +
 include/hw/xen/xen.h  |4 +-
 qemu-options.hx   |1 +
 softmmu/datadir.c |1 -
 softmmu/qdev-monitor.c|3 +-
 stubs/xen-hw-stub.c   |4 +-
 28 files changed, 1395 insertions(+), 131 deletions(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 25e2c7243e..234706a191 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -39,6 +39,8 @@
 #include "hw/southbridge/ich9.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/xen/xen.h"
+#include "sysemu/xen.h"
 
 //#define DEBUG
 
@@ -67,6 +69,10 @@ static void ich9_gpe_writeb(void *opaque, hwaddr addr, 
uint64_t val,
 ICH9LPCPMRegs *pm = opaque;
 acpi_gpe_ioport_writeb(&pm->acpi_regs, addr, val);
 acpi_update_sci(&pm->acpi_regs, pm->irq);
+
+if (xen_enabled()) {
+acpi_pcihp_reset(&pm->acpi_pci_hotplug);
+}
 }
 
 static const MemoryRegionOps ich9_gpe_ops = {
@@ -137,7 +143,8 @@ static int ich9_pm_post_load(void *opaque, int version_id)
 {
 ICH9LPCPMRegs *pm = opaque;
 uint32_t pm_io_base = pm->pm_io_base;
-pm->pm_io_base = 0;
+if (!xen_enabled())
+pm->pm_io_base = 0;
 ich9_pm_iospace_update(pm, pm_io_base);
 return 0;
 }
@@ -268,7 +275,10 @@ static void pm_reset(void *opaque)
 acpi_pm1_evt_reset(&pm->acpi_regs);
 acpi_pm1_cnt_reset(&pm->acpi_regs);
 acpi_pm_tmr_reset(&pm->acpi_regs);
-acpi_gpe_reset(&pm->acpi_regs);
+/* Noticed guest freezing in xen when this was reset after S3. */
+if (!xen_enabled()) {
+acpi_gpe_reset(&pm->acpi_regs);
+}
 
 pm->smi_en = 0;
 if (!pm->smm_enabled) {
@@ -316,7 +326,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, 
qemu_irq sci_irq)
 acpi_pm_tco_init(&pm->tco_regs, &pm->io);
 }
 
-if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge) {
+if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge || xen_enabled()) {
 acpi_pcihp_init(OBJECT(lpc_pci),
 &pm->acpi_pci_hotplug,
 pci_get_bus(lpc_pci),
@@ -332,10 +342,14 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, 
qemu_irq sci_irq)
 pm->powerdown_notifier.notify = pm_powerdown_req;
 qemu_register_powerdown_notifier(&pm->powerdown_notifier);
 
+if (xen_enabled()) {
+acpi_set_pci_info(true);
+}
+
 legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
 OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
 
-if (pm->acpi_memory_hotplug.is_enabled) {
+if (pm->acpi_memory_hotplug.is_enabled || xen_enabled()) {
 acpi_memory_hotplug_init(pci_address_space_io(lpc_pci), 
OBJECT(lpc_pci),
  &pm->acpi_memory_hotplug,
  ACPI_MEMORY_HOTPLUG_BASE);
diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index cdd6f775a1..5b065d670c 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -40,6 +40,7 @@
 #include "qapi/error.h"
 #include "qom/qom-qobject.h"
 #include "trace.h"
+#include "sysemu/xen.h"
 
 #define ACPI_PCIHP_SIZE 0x0018
 #define PCI_UP_BASE 0x
@@ -84,7 +85,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 bool is_bridge = IS_PCI_BRIDGE(br);
 
 /* hotplugged bridges can't be described in ACPI ignore them */
-if (qbus_is_hotpluggable(BUS(bus))) {
+/* Xen requires hotplugging to the root device, even on the Q35 chipset */
+if (qbus_is_hotpluggable(BUS(bus)) || xen_enabled()) {
 if (!is_bridge || (!br->hotplugged && info->has_bridge_hotplug)) {
 bus_bsel = g_malloc(sizeof *bus_bsel);
 
@@ -97,7 +99,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 return info;
 }
 
-static void acpi_set_pci_info(bool has_bridge_hotplug)
+void acpi_set_pci_info(bool has_bridge_hotplug)
 {
 static bool bsel_is_set;
 Object *host = acpi_get_i386_pci_host();
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1000406211..703138d2ec 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -455,6 +455,20 @@ static void machine_set_graphics(Object 

[PATCH v1 23/23] s3 support: enabling s3 with q35

2023-06-20 Thread Joel Upham
Resetting PCI devices after S3 causes guest freezes, because Xen
normally handles device resets itself.


Signed-off-by: Joel Upham 
---
 hw/acpi/ich9.c| 12 
 hw/pci-host/q35.c |  3 ++-
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 1c236be1c7..234706a191 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -143,7 +143,8 @@ static int ich9_pm_post_load(void *opaque, int version_id)
 {
 ICH9LPCPMRegs *pm = opaque;
 uint32_t pm_io_base = pm->pm_io_base;
-pm->pm_io_base = 0;
+if (!xen_enabled())
+pm->pm_io_base = 0;
 ich9_pm_iospace_update(pm, pm_io_base);
 return 0;
 }
@@ -274,7 +275,10 @@ static void pm_reset(void *opaque)
 acpi_pm1_evt_reset(&pm->acpi_regs);
 acpi_pm1_cnt_reset(&pm->acpi_regs);
 acpi_pm_tmr_reset(&pm->acpi_regs);
-acpi_gpe_reset(&pm->acpi_regs);
+/* Noticed guest freezing in xen when this was reset after S3. */
+if (!xen_enabled()) {
+acpi_gpe_reset(&pm->acpi_regs);
+}
 
 pm->smi_en = 0;
 if (!pm->smm_enabled) {
@@ -322,7 +326,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, 
qemu_irq sci_irq)
 acpi_pm_tco_init(&pm->tco_regs, &pm->io);
 }
 
-if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge) {
+if (pm->acpi_pci_hotplug.use_acpi_hotplug_bridge || xen_enabled()) {
 acpi_pcihp_init(OBJECT(lpc_pci),
 &pm->acpi_pci_hotplug,
 pci_get_bus(lpc_pci),
@@ -345,7 +349,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, 
qemu_irq sci_irq)
 legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
 OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
 
-if (pm->acpi_memory_hotplug.is_enabled) {
+if (pm->acpi_memory_hotplug.is_enabled || xen_enabled()) {
 acpi_memory_hotplug_init(pci_address_space_io(lpc_pci), 
OBJECT(lpc_pci),
  &pm->acpi_memory_hotplug,
  ACPI_MEMORY_HOTPLUG_BASE);
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 1fe4e5a5c9..5891839ce9 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -580,7 +580,8 @@ static void mch_reset(DeviceState *qdev)
 d->config[MCH_HOST_BRIDGE_F_SMBASE] = 0;
 d->wmask[MCH_HOST_BRIDGE_F_SMBASE] = 0xff;
 
-mch_update(mch);
+if (!xen_enabled())
+mch_update(mch);
 }
 
 static void mch_realize(PCIDevice *d, Error **errp)
-- 
2.34.1




[PATCH v1 20/23] xen platform: unplug ahci object

2023-06-20 Thread Joel Upham
This will unplug the AHCI device when the Xen driver calls for an unplug.
This has been tested to work in Linux and Windows guests.
When q35 is detected, we will remove the AHCI controller
with the hard disks.  In the libxl config, cdrom devices
are put on a separate AHCI controller. This allows for 6 cdrom
devices to be added, and 6 qemu hard disks.


Signed-off-by: Joel Upham 
---
 hw/i386/xen/xen_platform.c | 19 ++-
 hw/pci/pci.c   | 17 +
 include/hw/pci/pci.h   |  3 +++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index 57f1d742c1..0375337222 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -34,6 +34,7 @@
 #include "sysemu/block-backend.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "include/hw/i386/pc.h"
 #include "qom/object.h"
 
 #ifdef CONFIG_XEN
@@ -223,6 +224,12 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void 
*opaque)
 if (flags & UNPLUG_NVME_DISKS) {
 object_unparent(OBJECT(d));
 }
+break;
+
+case PCI_CLASS_STORAGE_SATA:
+   if (!aux) {
+object_unparent(OBJECT(d));
+}
 
 default:
 break;
@@ -231,7 +238,17 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void 
*opaque)
 
 static void pci_unplug_disks(PCIBus *bus, uint32_t flags)
 {
-pci_for_each_device(bus, 0, unplug_disks, &flags);
+PCIBus *q35 = find_q35();
+if (q35) {
+/* When q35 is detected, we will remove the ahci controller
+* with the hard disks.  In the libxl config, cdrom devices
+* are put on a separate ahci controller. This allows for 6 cdrom
+* devices to be added, and 6 qemu hard disks.
+*/
+pci_function_for_one_bus(bus, unplug_disks, &flags);
+} else {
+pci_for_each_device(bus, 0, unplug_disks, &flags);
+}
 }
 
 static void platform_fixed_ioport_writew(void *opaque, uint32_t addr, uint32_t 
val)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 1cc7c89036..8eac3d751a 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1815,6 +1815,23 @@ void pci_for_each_device_reverse(PCIBus *bus, int 
bus_num,
 }
 }
 
+void pci_function_for_one_bus(PCIBus *bus,
+  void (*fn)(PCIBus *b, PCIDevice *d, void *opaque),
+  void *opaque)
+{
+bus = pci_find_bus_nr(bus, 0);
+
+if (bus) {
+PCIDevice *d;
+
+d = bus->devices[PCI_DEVFN(4,0)];
+if (d) {
+fn(bus, d, opaque);
+return;
+}
+}
+}
+
 void pci_for_each_device_under_bus(PCIBus *bus,
pci_bus_dev_fn fn, void *opaque)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index e6d0574a29..c53e21082a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -343,6 +343,9 @@ void pci_for_each_device_under_bus(PCIBus *bus,
 void pci_for_each_device_under_bus_reverse(PCIBus *bus,
pci_bus_dev_fn fn,
void *opaque);
+void pci_function_for_one_bus(PCIBus *bus,
+ void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
+ void *opaque);
 void pci_for_each_bus_depth_first(PCIBus *bus, pci_bus_ret_fn begin,
   pci_bus_fn end, void *parent_state);
 PCIDevice *pci_get_function_0(PCIDevice *pci_dev);
-- 
2.34.1




[PATCH v1 09/23] xen/pt: Xen PCIe passthrough support for Q35: bypass PCIe topology check

2023-06-20 Thread Joel Upham
responding physical upstream PCIe
  Switch/RootPort. This will require some interaction with Dom0, hopefully
  extending xen-pciback will be enough.

3) The concept of I/O and MMIO ranges nesting, for tasks like sizing MMIO
  hole or PCI BAR allocation. This one should be pretty simple.

The actual implementation is still a matter for discussion, of course.

In the meantime, a very simple workaround can be used to bypass the
pci.sys PCIe topology check: there is one good exception to the "must
have upstream PCIe parent" rule of pci.sys, namely chipset-integrated
devices. How can pci.sys tell whether it deals with a chipset built-in
device? It checks one of the PCI Express Capability fields in the device
PCI conf space. For chipset built-in devices this field will state "root
complex integrated device", while in our case a normal passed-through
PCIe device will report the "PCIe endpoint" type. So that's what the
workaround does: it intercepts reads of this particular field for
passed-through devices and returns the "root complex integrated device"
value for PCIe endpoints. This makes pci.sys happy and allows Windows 7
and above to use a PT device on a PCIe-capable system normally. So far
no negative side effects have been encountered while using this
approach, so it's a good temporary solution until multiple PCI bus
support is added to Xen.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 47c8482f32..757a035aad 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -907,6 +907,55 @@ static int 
xen_pt_linkctrl2_reg_init(XenPCIPassthroughState *s,
 return 0;
 }
 
+/* initialize PCI Express Capabilities register */
+static int xen_pt_pcie_capabilities_reg_init(XenPCIPassthroughState *s,
+ XenPTRegInfo *reg,
+ uint32_t real_offset,
+ uint32_t *data)
+{
+uint8_t dev_type = get_pcie_device_type(s);
+uint16_t reg_field;
+
+if (xen_host_pci_get_word(&s->real_device,
+ real_offset - reg->offset + PCI_EXP_FLAGS,
+ &reg_field)) {
+XEN_PT_ERR(&s->dev, "Error reading PCIe Capabilities reg\n");
+*data = 0;
+return 0;
+}
+/*
+ * Q35 workaround for Win7+ pci.sys PCIe topology check.
+ * As our PT device currently located on a bus 0, fake the
+ * device/port type field to the "Root Complex integrated device"
+ * value to bypass the check
+ */
+switch (dev_type) {
+case PCI_EXP_TYPE_ENDPOINT:
+case PCI_EXP_TYPE_LEG_END:
+XEN_PT_LOG(&s->dev, "Original PCIe Capabilities reg is 0x%04X\n",
+reg_field);
+reg_field &= ~PCI_EXP_FLAGS_TYPE;
+reg_field |= ((PCI_EXP_TYPE_RC_END /*9*/ << 4) & PCI_EXP_FLAGS_TYPE);
+XEN_PT_LOG(&s->dev, "Q35 PCIe topology check workaround: "
+   "faking Capabilities reg to 0x%04X\n", reg_field);
+break;
+
+case PCI_EXP_TYPE_ROOT_PORT:
+case PCI_EXP_TYPE_UPSTREAM:
+case PCI_EXP_TYPE_DOWNSTREAM:
+case PCI_EXP_TYPE_PCI_BRIDGE:
+case PCI_EXP_TYPE_PCIE_BRIDGE:
+case PCI_EXP_TYPE_RC_END:
+case PCI_EXP_TYPE_RC_EC:
+default:
+/* do nothing, return as is */
+break;
+}
+
+*data = reg_field;
+return 0;
+}
+
 /* PCI Express Capability Structure reg static information table */
 static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
 /* Next Pointer reg */
-- 
2.34.1
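
For readers who want to see the field rewrite from the patch above in
isolation, here is a minimal standalone sketch. It assumes the standard
PCI Express Capabilities register layout (Device/Port Type in bits 7:4)
and uses hypothetical SK_* names in place of the PCI_EXP_* constants; it
is not the patch's code, just the same bit manipulation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local copies of the relevant register constants. */
#define SK_EXP_FLAGS_TYPE     0x00f0u /* Device/Port Type, bits 7:4 */
#define SK_EXP_TYPE_ENDPOINT  0x0u    /* PCI Express Endpoint */
#define SK_EXP_TYPE_LEG_END   0x1u    /* Legacy Endpoint */
#define SK_EXP_TYPE_RC_END    0x9u    /* Root Complex Integrated Endpoint */

/* Return the PCIe Capabilities value a guest would see: endpoints are
 * reported as Root Complex Integrated Endpoints, all other types and
 * all other bits are passed through unchanged. */
static uint16_t sketch_fake_rc_integrated(uint16_t flags)
{
    unsigned type = (flags & SK_EXP_FLAGS_TYPE) >> 4;

    assert(type <= 0xf);
    if (type == SK_EXP_TYPE_ENDPOINT || type == SK_EXP_TYPE_LEG_END) {
        flags = (uint16_t)(flags & ~SK_EXP_FLAGS_TYPE);
        flags = (uint16_t)(flags | (SK_EXP_TYPE_RC_END << 4));
    }
    return flags;
}
```

A version-2 endpoint (0x0002) becomes 0x0092, while a root port (0x0042)
is returned as is, matching the "do nothing" branch of the switch.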




[PATCH v1 11/23] xen/pt: handle PCIe Extended Capabilities Next register

2023-06-20 Thread Joel Upham
The patch adds a new xen_pt_ext_cap_ptr_reg_init function, which is used
to initialize the emulated PCIe Extended Capability Next pointer.

The primary purpose of this function is to provide a way to selectively
hide some extended capabilities from the capability linked list, skipping
them by altering the Next capability pointer value.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 87 +++--
 1 file changed, 55 insertions(+), 32 deletions(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 34ed9c25c5..ed36edbc4a 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -27,7 +27,10 @@
 
 static int xen_pt_ptr_reg_init(XenPCIPassthroughState *s, XenPTRegInfo *reg,
uint32_t real_offset, uint32_t *data);
-
+static int xen_pt_ext_cap_ptr_reg_init(XenPCIPassthroughState *s,
+   XenPTRegInfo *reg,
+   uint32_t real_offset,
+   uint32_t *data);
 
 /* helper */
 
@@ -1928,48 +1931,68 @@ out:
 return 0;
 }
 
+#define PCIE_EXT_CAP_NEXT_SHIFT 4
+#define PCIE_EXT_CAP_VER_MASK   0xF
 
-/*
- * Main
- */
-
-static uint8_t find_cap_offset(XenPCIPassthroughState *s, uint8_t cap)
+static int xen_pt_ext_cap_ptr_reg_init(XenPCIPassthroughState *s,
+   XenPTRegInfo *reg,
+   uint32_t real_offset,
+   uint32_t *data)
 {
-uint8_t id;
-unsigned max_cap = XEN_PCI_CAP_MAX;
-uint8_t pos = PCI_CAPABILITY_LIST;
-uint8_t status = 0;
+int i, rc;
+XenHostPCIDevice *d = &s->real_device;
+uint16_t reg_field;
+uint16_t cur_offset, version, cap_id;
+uint32_t header;
 
-if (xen_host_pci_get_byte(&s->real_device, PCI_STATUS, &status)) {
-return 0;
-}
-if ((status & PCI_STATUS_CAP_LIST) == 0) {
-return 0;
+if (real_offset < 0x0010) {
+XEN_PT_ERR(&s->dev, "Incorrect PCIe extended capability offset "
+   "encountered: 0x%04x\n", real_offset);
+return -EINVAL;
 }
 
-while (max_cap--) {
-if (xen_host_pci_get_byte(&s->real_device, pos, &pos)) {
-break;
-}
-if (pos < PCI_CONFIG_HEADER_SIZE) {
-break;
-}
+rc = xen_host_pci_get_word(d, real_offset, &reg_field);
+if (rc)
+return rc;
 
-pos &= ~3;
-if (xen_host_pci_get_byte(&s->real_device,
-  pos + PCI_CAP_LIST_ID, &id)) {
-break;
-}
+/* preserve version field */
+version= reg_field & PCIE_EXT_CAP_VER_MASK;
+cur_offset = reg_field >> PCIE_EXT_CAP_NEXT_SHIFT;
 
-if (id == 0xff) {
-break;
+while (cur_offset && cur_offset != 0xFFF) {
+rc = xen_host_pci_get_long(d, cur_offset, &header);
+if (rc) {
+XEN_PT_ERR(&s->dev, "Failed to read PCIe extended capability "
+   "@0x%x (rc:%d)\n", cur_offset, rc);
+return rc;
 }
-if (id == cap) {
-return pos;
+
+cap_id = PCI_EXT_CAP_ID(header);
+
+for (i = 0; xen_pt_emu_reg_grps[i].grp_size != 0; i++) {
+uint32_t cur_grp_id = xen_pt_emu_reg_grps[i].grp_id;
+
+if (!IS_PCIE_EXT_CAP_ID(cur_grp_id))
+continue;
+
+if (xen_pt_hide_dev_cap(d, cur_grp_id))
+continue;
+
+if (GET_PCIE_EXT_CAP_ID(cur_grp_id) == cap_id) {
+if (xen_pt_emu_reg_grps[i].grp_type == XEN_PT_GRP_TYPE_EMU)
+goto out;
+
+/* skip TYPE_HARDWIRED capability, move the ptr to next one */
+break;
+}
 }
 
-pos += PCI_CAP_LIST_NEXT;
+/* next capability */
+cur_offset = PCI_EXT_CAP_NEXT(header);
 }
+
+out:
+*data = (cur_offset << PCIE_EXT_CAP_NEXT_SHIFT) | version;
 return 0;
 }
 
-- 
2.34.1
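
The pointer-skipping idea from the patch above can be tried out on a
plain byte array standing in for the 4 KiB config space. This is only a
sketch under simplifying assumptions: capabilities to hide are given as
an explicit ID list (the patch instead consults xen_pt_emu_reg_grps[] and
xen_pt_hide_dev_cap), and every sk_* name is hypothetical. The header
layout used is the standard one: ID in bits 15:0, version in bits 19:16,
next offset in bits 31:20.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_CFG_SIZE       4096
#define SK_EXT_CAP_START  0x100
#define SK_MAX_EXT_CAPS   ((SK_CFG_SIZE - SK_EXT_CAP_START) / 8)

/* Read a little-endian 32-bit value from the fake config space. */
static uint32_t sk_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Test helper: store an extended capability header at 'off'. */
static void sk_write_ext_cap(uint8_t *cfg, uint16_t off, uint16_t id,
                             uint8_t ver, uint16_t next)
{
    uint32_t h = (uint32_t)id | ((uint32_t)(ver & 0xf) << 16) |
                 ((uint32_t)(next & 0xffc) << 20);

    cfg[off] = h & 0xff;
    cfg[off + 1] = (h >> 8) & 0xff;
    cfg[off + 2] = (h >> 16) & 0xff;
    cfg[off + 3] = (h >> 24) & 0xff;
}

static int sk_is_hidden(uint16_t id, const uint16_t *hidden, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (hidden[i] == id) {
            return 1;
        }
    }
    return 0;
}

/* Compute the emulated upper 16 bits (next offset << 4 | version) of the
 * header at cap_off so that hidden capabilities are skipped. */
static uint16_t sk_emulated_next_field(const uint8_t *cfg, uint16_t cap_off,
                                       const uint16_t *hidden, size_t n_hidden)
{
    assert(cap_off >= SK_EXT_CAP_START && cap_off <= SK_CFG_SIZE - 4);

    uint32_t header = sk_read32(cfg, cap_off);
    uint16_t version = (header >> 16) & 0xf;
    uint16_t next = (header >> 20) & 0xffc;
    int budget = SK_MAX_EXT_CAPS;   /* guards against looping lists */

    while (next >= SK_EXT_CAP_START && budget-- > 0) {
        uint32_t h = sk_read32(cfg, next);

        if (!sk_is_hidden(h & 0xffff, hidden, n_hidden)) {
            break;                  /* first visible capability */
        }
        next = (h >> 20) & 0xffc;   /* hidden: follow its own next */
    }
    if (next < SK_EXT_CAP_START || budget < 0) {
        next = 0;                   /* end of list */
    }
    return (uint16_t)((next << 4) | version);
}
```

With three capabilities chained 0x100 -> 0x140 -> 0x180 and the middle
one hidden, the emulated field at 0x100 points straight at 0x180 while
the version nibble is preserved.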




[PATCH v1 16/23] xen/pt: add descriptors and size calculation for RCLD/ACS/PMUX/DPA/MCAST/TPH/DPC PCIe Extended Capabilities

2023-06-20 Thread Joel Upham
Add a few more PCIe Extended Capabilities entries to the
xen_pt_emu_reg_grps[] array along with their corresponding *_size_init()
functions.

All these capabilities have a non-fixed size, but their size calculation
is very simple, hence they are added in a single batch.

For every capability register group, only 2 registers are emulated
currently: Capability ID (16 bit) and Next Capability Offset/Version (16
bit). Both are needed to implement selective capability hiding. All other
registers are passed through at the moment (unless they belong to
a capability marked as "hardwired", which is hidden).
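
To give a feel for how simple these size calculations are, here is a
standalone sketch of two of them (RCLD and ACS), following the arithmetic
in the diff below. The sk_* names are hypothetical, and the two SK_ACS_*
constants are assumed to match PCI_ACS_EC and PCI_ACS_EGRESS_CTL_V from
the standard PCI register definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SK_ACS_EC            0x0020u /* assumed: Egress Control capable bit */
#define SK_ACS_EGRESS_CTL_V  0x08u   /* assumed: Egress Control Vector offset */

/* RCLD: a 0x10-byte header plus one 0x10-byte link entry per entry
 * counted in bits 15:8 of the Element Self Description register. */
static uint32_t sk_rcld_size(uint32_t elem_self_descr)
{
    return 0x10 + ((elem_self_descr >> 8) & 0xff) * 0x10;
}

/* ACS: fixed part up to the Egress Control Vector, plus the vector
 * itself (rounded up to whole bytes) when Egress Control is supported.
 * A vector size field of 0 encodes 256 bits. */
static uint32_t sk_acs_size(uint16_t acs_caps)
{
    uint32_t bits;

    if (!(acs_caps & SK_ACS_EC)) {
        return SK_ACS_EGRESS_CTL_V;
    }
    bits = (acs_caps >> 8) & 0xff;
    if (bits == 0) {
        bits = 256;
    }
    assert(bits >= 1 && bits <= 256);
    return SK_ACS_EGRESS_CTL_V + (bits + 7) / 8;
}
```

For instance, an RCLD capability with 3 link entries is 0x40 bytes long,
and an ACS capability with a 9-bit egress vector is 8 + 2 = 10 bytes.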

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 224 
 1 file changed, 224 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 9fd0531bc4..1fba0b9d6c 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1925,6 +1925,174 @@ static int 
xen_pt_ext_cap_aer_size_init(XenPCIPassthroughState *s,
 return ret;
 }
 
+/* get Root Complex Link Declaration Extended Capability register group size */
+#define RCLD_GET_NUM_ENTRIES(x) (((x) >> 8) & 0xFF)
+static int xen_pt_ext_cap_rcld_size_init(XenPCIPassthroughState *s,
+ const XenPTRegGroupInfo *grp_reg,
+ uint32_t base_offset,
+ uint32_t *size)
+{
+uint32_t elem_self_descr = 0;
+
+int ret = xen_host_pci_get_long(&s->real_device,
+base_offset + 4,
+&elem_self_descr);
+
+*size = 0x10 + RCLD_GET_NUM_ENTRIES(elem_self_descr) * 0x10;
+
+log_pcie_extended_cap(s, "Root Complex Link Declaration",
+  base_offset, *size);
+return ret;
+}
+
+/* get Access Control Services Extended Capability register group size */
+#define ACS_VECTOR_SIZE_BITS(x)    ((((x) >> 8) & 0xFF) ?: 256)
+static int xen_pt_ext_cap_acs_size_init(XenPCIPassthroughState *s,
+const XenPTRegGroupInfo *grp_reg,
+uint32_t base_offset,
+uint32_t *size)
+{
+uint16_t acs_caps = 0;
+
+int ret = xen_host_pci_get_word(&s->real_device,
+base_offset + PCI_ACS_CAP,
+&acs_caps);
+
+if (acs_caps & PCI_ACS_EC) {
+uint32_t vector_sz = ACS_VECTOR_SIZE_BITS(acs_caps);
+
+*size = PCI_ACS_EGRESS_CTL_V + ((vector_sz + 7) & ~7) / 8;
+} else {
+*size = PCI_ACS_EGRESS_CTL_V;
+}
+
+log_pcie_extended_cap(s, "ACS", base_offset, *size);
+return ret;
+}
+
+/* get Multicast Extended Capability register group size */
+static int xen_pt_ext_cap_multicast_size_init(XenPCIPassthroughState *s,
+  const XenPTRegGroupInfo *grp_reg,
+  uint32_t base_offset,
+  uint32_t *size)
+{
+uint8_t dev_type = get_pcie_device_type(s);
+
+switch (dev_type) {
+case PCI_EXP_TYPE_ENDPOINT:
+case PCI_EXP_TYPE_LEG_END:
+case PCI_EXP_TYPE_RC_END:
+case PCI_EXP_TYPE_RC_EC:
+default:
+*size = PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF;
+break;
+
+case PCI_EXP_TYPE_ROOT_PORT:
+case PCI_EXP_TYPE_UPSTREAM:
+case PCI_EXP_TYPE_DOWNSTREAM:
+*size = 0x30;
+break;
+}
+
+log_pcie_extended_cap(s, "Multicast", base_offset, *size);
+return 0;
+}
+
+/* get Dynamic Power Allocation Extended Capability register group size */
+static int xen_pt_ext_cap_dpa_size_init(XenPCIPassthroughState *s,
+const XenPTRegGroupInfo *grp_reg,
+uint32_t base_offset,
+uint32_t *size)
+{
+uint32_t dpa_caps = 0;
+uint32_t num_entries;
+
+int ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_DPA_CAP,
+&dpa_caps);
+
+num_entries = (dpa_caps & PCI_DPA_CAP_SUBSTATE_MASK) + 1;
+
+*size = PCI_DPA_BASE_SIZEOF + num_entries /*byte-size registers*/;
+
+log_pcie_extended_cap(s, "Dynamic Power Allocation", base_offset, *size);
+return ret;
+}
+
+/* get TPH Requester Extended Capability register group size */
+static int xen_pt_ext_cap_tph_size_init(XenPCIPassthroughState *s,
+const XenPTRegGroupInfo *grp_reg,
+uint32_t base_offset,
+uint32_t *size)
+{
+uint32_t tph_caps = 0;
+uint32_t num_entries;
+
+int ret = xen_host_pci_get_long(&s->real_device,
+   

[PATCH v1 06/23] xen/pt: XenHostPCIDevice: provide functions for PCI Capabilities and PCIe Extended Capabilities enumeration

2023-06-20 Thread Joel Upham
This patch introduces 2 new functions:
- xen_host_pci_find_next_ext_cap (actually a reworked
  xen_host_pci_find_ext_cap_offset function, which is unused)
- xen_host_pci_find_next_cap

These functions allow searching for PCI/PCIe capabilities in a uniform
way. Both functions can search either for a specific capability or for
the next one encountered (by specifying CAP_ID_ANY as the capability ID);
the latter is useful when we merely need to traverse the capability list
one by one. In both functions the 'pos' argument allows the search to
continue from the last position (0 means start from the beginning).

To avoid probing for the existence of PCIe Extended Capabilities every
time, xen_host_pci_find_next_ext_cap makes use of the new
'has_pcie_ext_caps' field in the XenHostPCIDevice structure, which is
filled in only once (in xen_host_pci_device_get).
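
The 'pos' / CAP_ID_ANY calling convention described above can be sketched
against a 256-byte array in place of real config space. This is an
illustration of the interface only, not the patch's implementation: the
sk_* names and the value chosen for SK_CAP_ID_ANY are assumptions, and
here a non-zero 'pos' is taken to be the offset of the previously found
capability.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PCI_STATUS           0x06
#define SK_PCI_STATUS_CAP_LIST  0x10
#define SK_PCI_CAPABILITY_LIST  0x34
#define SK_CAP_MAX              48
#define SK_CAP_ID_ANY           0xffffffffu /* hypothetical wildcard value */

/* Return the offset of the next capability matching 'cap' (or any
 * capability for SK_CAP_ID_ANY), or 0 if there is none.
 * pos == 0 starts from the head of the list; otherwise the search
 * resumes after the capability located at 'pos'. */
static int sk_find_next_cap(const uint8_t *cfg, int pos, uint32_t cap)
{
    uint8_t cur;
    int budget = SK_CAP_MAX;        /* guards against looping lists */

    assert(pos >= 0 && pos < 0xff);
    if (!(cfg[SK_PCI_STATUS] & SK_PCI_STATUS_CAP_LIST)) {
        return 0;                   /* device has no capability list */
    }

    cur = pos ? cfg[pos + 1] : cfg[SK_PCI_CAPABILITY_LIST];

    while (budget-- > 0 && cur >= 0x40) {
        uint8_t id;

        cur = (uint8_t)(cur & ~3u); /* pointers are dword aligned */
        id = cfg[cur];
        if (id == 0xff) {
            break;                  /* unreadable config space */
        }
        if (cap == SK_CAP_ID_ANY || id == cap) {
            return cur;
        }
        cur = cfg[cur + 1];         /* follow the Next pointer */
    }
    return 0;
}
```

Walking the whole list is then a loop of the form
`for (pos = sk_find_next_cap(cfg, 0, SK_CAP_ID_ANY); pos; pos = sk_find_next_cap(cfg, pos, SK_CAP_ID_ANY))`.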

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen-host-pci-device.c | 91 
 hw/xen/xen-host-pci-device.h |  5 +-
 2 files changed, 85 insertions(+), 11 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 8c6e9a1716..a7021a5d56 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -32,6 +32,7 @@
 
 #define IORESOURCE_PREFETCH 0x1000  /* No side effects */
 #define IORESOURCE_MEM_64   0x0010
+#define XEN_HOST_PCI_CAP_MAX48
 
 static void xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
 const char *name, char *buf, ssize_t size)
@@ -198,6 +199,19 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
 return !stat(path, &buf);
 }
 
+static bool xen_host_pci_dev_has_pcie_ext_caps(XenHostPCIDevice *d)
+{
+uint32_t header;
+
+if (xen_host_pci_get_long(d, PCI_CONFIG_SPACE_SIZE, &header))
+return false;
+
+if (header == 0 || header == ~0U)
+return false;
+
+return true;
+}
+
 static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
 {
 char path[PATH_MAX];
@@ -296,37 +310,93 @@ int xen_host_pci_set_block(XenHostPCIDevice *d, int pos, 
uint8_t *buf, int len)
 return xen_host_pci_config_write(d, pos, buf, len);
 }
 
-int xen_host_pci_find_ext_cap_offset(XenHostPCIDevice *d, uint32_t cap)
+int xen_host_pci_find_next_ext_cap(XenHostPCIDevice *d, int pos, uint32_t cap)
 {
 uint32_t header = 0;
 int max_cap = XEN_HOST_PCI_MAX_EXT_CAP;
-int pos = PCI_CONFIG_SPACE_SIZE;
+
+if (!d->has_pcie_ext_caps)
+return 0;
+
+if (!pos) {
+pos = PCI_CONFIG_SPACE_SIZE;
+} else {
+if (xen_host_pci_get_long(d, pos, &header))
+return 0;
+
+pos = PCI_EXT_CAP_NEXT(header);
+}
 
 do {
+if (!pos || pos < PCI_CONFIG_SPACE_SIZE) {
+break;
+}
+
 if (xen_host_pci_get_long(d, pos, &header)) {
 break;
 }
 /*
  * If we have no capabilities, this is indicated by cap ID,
  * cap version and next pointer all being 0.
+* Also check for all F's returned (which means PCIe ext conf space
+* is unreadable for some reason)
  */
-if (header == 0) {
+   if (header == 0 || header == ~0U) {
 break;
 }
 
-if (PCI_EXT_CAP_ID(header) == cap) {
+if (cap == CAP_ID_ANY) {
+return pos;
+} else if (PCI_EXT_CAP_ID(header) == cap) {
 return pos;
 }
 
 pos = PCI_EXT_CAP_NEXT(header);
-if (pos < PCI_CONFIG_SPACE_SIZE) {
+} while (--max_cap);
+
+return 0;
+}
+
+int xen_host_pci_find_next_cap(XenHostPCIDevice *d, int pos, uint32_t cap)
+{
+uint8_t id;
+unsigned max_cap = XEN_HOST_PCI_CAP_MAX;
+uint8_t status = 0;
+uint8_t curpos;
+
+if (xen_host_pci_get_byte(d, PCI_STATUS, &status))
+return 0;
+
+if ((status & PCI_STATUS_CAP_LIST) == 0)
+return 0;
+
+if (pos < PCI_CAPABILITY_LIST) {
+curpos = PCI_CAPABILITY_LIST;
+} else {
+curpos = (uint8_t) pos;
+}
+
+while (max_cap--) {
+if (xen_host_pci_get_byte(d, curpos, &curpos))
+ break;
+if (!curpos)
+ break;
+
+if (cap == CAP_ID_ANY)
+return curpos;
+
+if (xen_host_pci_get_byte(d, curpos + PCI_CAP_LIST_ID, &id))
 break;
-}
 
-max_cap--;
-} while (max_cap > 0);
+if (id == 0xff)
+break;
+else if (id == cap)
+return curpos;
+
+curpos += PCI_CAP_LIST_NEXT;
+}
 
-return -1;
+return 0;
 }
 
 void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
@@ -376,7 +446,8 @@ void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t 
domain,
 }
 d->class_code = v;
 
-d->is_virtfn = xen_host_pci_dev_is_virtfn(d);
+d->is_virtfn = xen_host_pci_dev_is_virtfn(d);
+d->has_pcie_ext_caps = xen_host_pci_dev_has_pcie_ext_caps(d);
 
 return;

[PATCH v1 22/23] qdev-monitor/pt: bypass root device check

2023-06-20 Thread Joel Upham
On Xen we need to be able to have hotpluggable root devices,
even on Q35 at the moment. Having this check disables PT of
devices, so let's turn it off for now.

Signed-off-by: Joel Upham 
---
 softmmu/qdev-monitor.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index b8d2c4dadd..f57dfa1964 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -43,6 +43,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/clock.h"
 #include "hw/boards.h"
+#include "sysemu/xen.h"
 
 /*
  * Aliases were a bad idea from the start.  Let's keep them
@@ -663,7 +664,8 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
 return NULL;
 }
 
-if (phase_check(PHASE_MACHINE_READY) && bus && !qbus_is_hotpluggable(bus)) 
{
+if (phase_check(PHASE_MACHINE_READY) && bus && !qbus_is_hotpluggable(bus)
+&& !xen_enabled()) {
 error_setg(errp, QERR_BUS_NO_HOTPLUG, bus->name);
 return NULL;
 }
-- 
2.34.1




[PATCH v1 05/23] q35: Fix incorrect values for PCIEXBAR masks

2023-06-20 Thread Joel Upham
There are two small issues in PCIEXBAR address mask handling:
- wrong bit positions for address mask bits (see PCIEXBAR description
  in Q35 datasheet)
- incorrect usage of 64ADR_MASK

Due to this, attempting to write a valid PCIEXBAR address may cause it to
shift to another address, causing memory layout corruption where emulated
MMIO regions may overlap real (passed through) MMIO ranges. Fix this
by providing correct values.

I included the xen_enabled() check as I did not want to impact current
use cases that are not xen related (if they are not seeing a problem).

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/pci-host/q35.c | 16 +---
 include/hw/pci-host/q35.h |  4 ++--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index fe5fc0f47c..1fe4e5a5c9 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -37,6 +37,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "qemu/module.h"
+#include "sysemu/xen.h"
 
 /
  * Q35 host
@@ -324,12 +325,21 @@ static void mch_update_pciexbar(MCHPCIState *mch)
 break;
 case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_128M:
 length = 128 * 1024 * 1024;
-addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK |
-MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK;
+   if (!xen_enabled()) {
+addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK |
+MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK;
+   } else {
+addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK;
+}
 break;
 case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_64M:
 length = 64 * 1024 * 1024;
-addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK;
+   if (!xen_enabled()) {
+addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK;
+   } else {
+addr_mask |= MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK |
+MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK;
+}
 break;
 case MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD:
 qemu_log_mask(LOG_GUEST_ERROR, "Q35: Reserved PCIEXBAR LENGTH\n");
diff --git a/include/hw/pci-host/q35.h b/include/hw/pci-host/q35.h
index e89329c51e..441cce6ccd 100644
--- a/include/hw/pci-host/q35.h
+++ b/include/hw/pci-host/q35.h
@@ -105,8 +105,8 @@ struct Q35PCIHost {
 #define MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT   0xb000
 #define MCH_HOST_BRIDGE_PCIEXBAR_MAX   (0x1000) /* 256M */
 #define MCH_HOST_BRIDGE_PCIEXBAR_ADMSK Q35_MASK(64, 35, 28)
-#define MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK  ((uint64_t)(1 << 26))
-#define MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK   ((uint64_t)(1 << 25))
+#define MCH_HOST_BRIDGE_PCIEXBAR_128ADMSK  ((uint64_t)(1 << 27))
+#define MCH_HOST_BRIDGE_PCIEXBAR_64ADMSK   ((uint64_t)(1 << 26))
 #define MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_MASK   ((uint64_t)(0x3 << 1))
 #define MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_256M   ((uint64_t)(0x0 << 1))
 #define MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_128M   ((uint64_t)(0x1 << 1))
-- 
2.34.1
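
To see why the mask bit positions matter, the decode can be sketched as a
small standalone function. It assumes the PCIEXBAR layout referenced in
the commit message (base address in bits 35:28, length select in bits
2:1, with bit 27 and bit 26 becoming address bits for the 128M and 64M
sizes) and mirrors the Xen branch of the patch above; the sk_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PCIEXBAR_ADMSK     (0xffULL << 28) /* bits 35:28 */
#define SK_PCIEXBAR_128ADMSK  (1ULL << 27)
#define SK_PCIEXBAR_64ADMSK   (1ULL << 26)

/* Decode base address and window length from a PCIEXBAR value.
 * Returns 0 on success, -1 for the reserved length encoding. */
static int sk_decode_pciexbar(uint64_t reg, uint64_t *base, uint64_t *len)
{
    uint64_t mask = SK_PCIEXBAR_ADMSK;

    assert(base && len);
    switch ((reg >> 1) & 0x3) {
    case 0:                         /* 256M: only bits 35:28 are address */
        *len = 256ULL << 20;
        break;
    case 1:                         /* 128M: bit 27 is also address */
        *len = 128ULL << 20;
        mask |= SK_PCIEXBAR_128ADMSK;
        break;
    case 2:                         /* 64M: bits 27 and 26 are address */
        *len = 64ULL << 20;
        mask |= SK_PCIEXBAR_128ADMSK | SK_PCIEXBAR_64ADMSK;
        break;
    default:                        /* reserved */
        return -1;
    }
    *base = reg & mask;
    return 0;
}
```

With a 64M window programmed at 0xE4000000, this decode returns
0xE4000000; a mask that omits bit 26 would instead yield 0xE0000000,
which is the kind of silent address shift the commit message describes.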




[PATCH v1 07/23] xen/pt: avoid reading PCIe device type and cap version multiple times

2023-06-20 Thread Joel Upham
xen_pt_config_init.c reads Device/Port Type and Capability version fields
in many places. Two functions are used for this purpose:
get_capability_version and get_device_type. These functions read PCI
conf space every time they're called. Another drawback is that
these functions know nothing about where the PCI Express Capability is
located, so its offset must be provided explicitly in function arguments.
Their typical usage is like this:
uint8_t cap_ver = get_capability_version(s, real_offset - reg->offset);
uint8_t dev_type = get_device_type(s, real_offset - reg->offset);

To avoid this, the PCI Express Capabilities register is now read only
once and stored in the XenHostPCIDevice structure (pcie_flags field). The
capability offset parameter is no longer needed, simplifying function
usage. Also, get_device_type and get_capability_version were renamed
to the more descriptive get_pcie_device_type and
get_pcie_capability_version.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen-host-pci-device.c | 15 +++
 hw/xen/xen-host-pci-device.h |  1 +
 hw/xen/xen_pt_config_init.c  | 34 ++
 3 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index a7021a5d56..63481a859e 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -405,6 +405,7 @@ void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t 
domain,
 {
 ERRP_GUARD();
 unsigned int v;
+int pcie_cap_pos;
 
 d->config_fd = -1;
 d->domain = domain;
@@ -449,6 +450,20 @@ void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t 
domain,
 d->is_virtfn = xen_host_pci_dev_is_virtfn(d);
 d->has_pcie_ext_caps = xen_host_pci_dev_has_pcie_ext_caps(d);
 
+/* read and store PCIe Capabilities field for later use */
+pcie_cap_pos = xen_host_pci_find_next_cap(d, 0, PCI_CAP_ID_EXP);
+
+if (pcie_cap_pos) {
+if (xen_host_pci_get_word(d, pcie_cap_pos + PCI_EXP_FLAGS,
+  &d->pcie_flags)) {
+error_setg(errp, "Unable to read from PCI Express capability "
+   "structure at 0x%x", pcie_cap_pos);
+goto error;
+}
+} else {
+d->pcie_flags = 0x;
+}
+
 return;
 
 error:
diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h
index 37c5614a24..2884c4b4b9 100644
--- a/hw/xen/xen-host-pci-device.h
+++ b/hw/xen/xen-host-pci-device.h
@@ -27,6 +27,7 @@ typedef struct XenHostPCIDevice {
 uint16_t device_id;
 uint32_t class_code;
 int irq;
+uint16_t pcie_flags;
 
 XenHostPCIIORegion io_regions[PCI_NUM_REGIONS - 1];
 XenHostPCIIORegion rom;
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 2b8680b112..47c8482f32 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -832,24 +832,18 @@ static XenPTRegInfo xen_pt_emu_reg_vendor[] = {
  * PCI Express Capability
  */
 
-static inline uint8_t get_capability_version(XenPCIPassthroughState *s,
- uint32_t offset)
+static inline uint8_t get_pcie_capability_version(XenPCIPassthroughState *s)
 {
-uint8_t flag;
-if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, &flag)) {
-return 0;
-}
-return flag & PCI_EXP_FLAGS_VERS;
+assert(s->real_device.pcie_flags != 0x);
+
+return (uint8_t) (s->real_device.pcie_flags & PCI_EXP_FLAGS_VERS);
 }
 
-static inline uint8_t get_device_type(XenPCIPassthroughState *s,
-  uint32_t offset)
+static inline uint8_t get_pcie_device_type(XenPCIPassthroughState *s)
 {
-uint8_t flag;
-if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, &flag)) {
-return 0;
-}
-return (flag & PCI_EXP_FLAGS_TYPE) >> 4;
+assert(s->real_device.pcie_flags != 0x);
+
+return (uint8_t) ((s->real_device.pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4);
 }
 
 /* initialize Link Control register */
@@ -857,8 +851,8 @@ static int xen_pt_linkctrl_reg_init(XenPCIPassthroughState *s,
 XenPTRegInfo *reg, uint32_t real_offset,
 uint32_t *data)
 {
-uint8_t cap_ver = get_capability_version(s, real_offset - reg->offset);
-uint8_t dev_type = get_device_type(s, real_offset - reg->offset);
+uint8_t cap_ver  = get_pcie_capability_version(s);
+uint8_t dev_type = get_pcie_device_type(s);
 
 /* no need to initialize in case of Root Complex Integrated Endpoint
  * with cap_ver 1.x
@@ -875,7 +869,7 @@ static int xen_pt_devctrl2_reg_init(XenPCIPassthroughState *s,
 XenPTRegInfo *reg, uint32_t real_offset,
 uint32_t *data)
 {
-
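
As a rough illustration of the caching idea in the patch above: once the
16-bit PCI Express Capabilities register has been read and stored, both the
capability version and the device type can be derived from the cached value
without touching config space again. This is only a standalone sketch. The
mask values are assumed to match PCI_EXP_FLAGS_VERS (0x000f) and
PCI_EXP_FLAGS_TYPE (0x00f0) from the Linux pci_regs.h header, and the
DemoHostDev type and demo_* functions are hypothetical names, not part of
the patch.

```c
#include <stdint.h>

/* Assumed to match PCI_EXP_FLAGS_VERS / PCI_EXP_FLAGS_TYPE in pci_regs.h;
 * redefined with a DEMO_ prefix so the sketch is standalone. */
#define DEMO_PCI_EXP_FLAGS_VERS 0x000f  /* capability version, bits 3:0 */
#define DEMO_PCI_EXP_FLAGS_TYPE 0x00f0  /* device/port type, bits 7:4 */

/* Hypothetical stand-in for the cached field added to XenHostPCIDevice. */
typedef struct DemoHostDev {
    uint16_t pcie_flags;
} DemoHostDev;

/* Derive the capability version from the cached register value. */
static inline uint8_t demo_pcie_capability_version(const DemoHostDev *d)
{
    return (uint8_t)(d->pcie_flags & DEMO_PCI_EXP_FLAGS_VERS);
}

/* Derive the device/port type from the cached register value. */
static inline uint8_t demo_pcie_device_type(const DemoHostDev *d)
{
    return (uint8_t)((d->pcie_flags & DEMO_PCI_EXP_FLAGS_TYPE) >> 4);
}
```

For a cached value of 0x0092, the two helpers yield version 2 and device
type 9; no offset argument is involved in either call.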

[PATCH v1 17/23] xen/pt: add Resizable BAR PCIe Extended Capability descriptor and sizing

2023-06-20 Thread Joel Upham
Unlike other PCIe Extended Capabilities, the Resizable BAR Capability
cannot currently be exposed to the guest. Without explicit handling of BAR
resizing, we are likely to end up with a corrupted MMIO hole layout if the
guest OS attempts to use this feature. Recent Windows versions do
understand and use the Resizable BAR Capability (see [1]).

For now, we need to hide the Resizable BAR Capability from the guest OS
until BAR resizing emulation is implemented in Xen. This support is pretty
much a mandatory to-do item, as writing to the Resizable BAR control
registers has an effect similar to reprogramming normal BAR registers,
i.e. it needs to be handled explicitly, resulting in remapping of the
corresponding MMIO BAR range(s). Until then, we mark the Resizable BAR
Capability as XEN_PT_GRP_TYPE_HARDWIRED.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 1fba0b9d6c..c5157ee3ee 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -2093,6 +2093,27 @@ static int xen_pt_ext_cap_pmux_size_init(XenPCIPassthroughState *s,
 return ret;
 }
 
+/* get Resizable BAR Extended Capability register group size */
+static int xen_pt_ext_cap_rebar_size_init(XenPCIPassthroughState *s,
+  const XenPTRegGroupInfo *grp_reg,
+  uint32_t base_offset,
+  uint32_t *size)
+{
+uint32_t rebar_ctl = 0;
+uint32_t num_entries;
+
+int ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_REBAR_CTRL,
+&rebar_ctl);
+num_entries =
+(rebar_ctl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+
+*size = num_entries*8 + 4;
+
+log_pcie_extended_cap(s, "Resizable BAR", base_offset, *size);
+return ret;
+}
+
 static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 /* Header Type0 reg group */
 {
@@ -2424,6 +2445,13 @@ static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 .size_init  = xen_pt_ext_cap_dpc_size_init,
 .emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
 },
+/* Resizable BAR Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_REBAR),
+.grp_type   = XEN_PT_GRP_TYPE_HARDWIRED,
+.grp_size   = 0xFF,
+.size_init  = xen_pt_ext_cap_rebar_size_init,
+},
 {
 .grp_size = 0,
 },
-- 
2.34.1
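
To make the sizing rule in xen_pt_ext_cap_rebar_size_init() concrete: the
capability consists of a 4-byte header followed by one 8-byte
(capability + control register) pair per resizable BAR, and the number of
pairs is taken from the NBAR field of the first control register. The
sketch below is standalone; the mask and shift are assumed to match
PCI_REBAR_CTRL_NBAR_MASK (0x000000E0) and PCI_REBAR_CTRL_NBAR_SHIFT (5) from
the Linux pci_regs.h header, and demo_rebar_cap_size is a hypothetical name.

```c
#include <stdint.h>

/* Assumed to match PCI_REBAR_CTRL_NBAR_MASK/_SHIFT in pci_regs.h. */
#define DEMO_REBAR_CTRL_NBAR_MASK  0x000000E0u
#define DEMO_REBAR_CTRL_NBAR_SHIFT 5

/* Size in bytes of the Resizable BAR capability, computed the same way as
 * in the patch: 4-byte header plus 8 bytes per resizable BAR entry. */
static inline uint32_t demo_rebar_cap_size(uint32_t rebar_ctl)
{
    uint32_t num_entries =
        (rebar_ctl & DEMO_REBAR_CTRL_NBAR_MASK) >> DEMO_REBAR_CTRL_NBAR_SHIFT;

    return num_entries * 8 + 4;
}
```

With NBAR = 1 the group covers 12 bytes; with NBAR = 3 it covers 28 bytes.
If the control register read fails and rebar_ctl stays 0, the computed size
falls back to the bare 4-byte header.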




[PATCH v1 18/23] xen/pt: add VC/VC9/MFVC PCIe Extended Capabilities descriptors and sizing

2023-06-20 Thread Joel Upham
Virtual Channel/MFVC capabilities are relatively useless for emulation
(passing through accesses to them should be enough in most cases), yet they
have the most complex format of all PCIe Extended Capabilities, mostly
because the VC capability format allows a sparse config space layout with
gaps between the parts that make up the VC capability.

We have the main capability body followed by a variable number of entries,
where each entry may additionally reference an arbitration table outside
the main capability body. There are no constraints on these arbitration
table offsets: in theory, they may reside outside the VC capability range,
anywhere in the PCIe extended config space. Also, the size of each
arbitration table is not fixed; it depends on the current value of the
VC/Port Arbitration Select field.

To simplify things, this patch assumes that changing the VC/Port
Arbitration Select value (i.e. resizing arbitration tables) does not cause
arbitration table offsets to change. Normally the device must place
arbitration tables according to their maximum size, not the current one.
The maximum arbitration table size depends on the VC/Port Arbitration
Capability bitmask, and this is what is actually used to calculate the
arbitration table size.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 191 
 1 file changed, 191 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index c5157ee3ee..4e14adf2b2 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -2114,6 +2114,173 @@ static int xen_pt_ext_cap_rebar_size_init(XenPCIPassthroughState *s,
 return ret;
 }
 
+/* get VC/VC9/MFVC Extended Capability register group size */
+static uint32_t get_arb_table_len_max(XenPCIPassthroughState *s,
+  uint32_t max_bit_supported,
+  uint32_t arb_cap)
+{
+int n_bit;
+uint32_t table_max_size = 0;
+
+if (!arb_cap) {
+return 0;
+}
+
+for (n_bit = 7; n_bit >= 0 && !(arb_cap & (1 << n_bit)); n_bit--);
+
+if (n_bit > max_bit_supported) {
+XEN_PT_ERR(&s->dev, "Warning: encountered unknown VC arbitration "
+   "capability supported: 0x%02x\n", (uint8_t) arb_cap);
+}
+
+switch (n_bit) {
+case 0: break;
+case 1: return 32;
+case 2: return 64;
+case 3: /*128 too*/
+case 4: return 128;
+default:
+table_max_size = 8 << n_bit;
+}
+
+return table_max_size;
+}
+
+#define GET_ARB_TABLE_OFFSET(x)   (((x) >> 24) * 0x10)
+#define GET_VC_ARB_CAPABILITY(x)  ((x) & 0xFF)
+#define ARB_TABLE_ENTRY_SIZE_BITS(x)  (1 << (((x) & PCI_VC_CAP1_ARB_SIZE)\
+  >> 10))
+static int xen_pt_ext_cap_vchan_size_init(XenPCIPassthroughState *s,
+  const XenPTRegGroupInfo *grp_reg,
+  uint32_t base_offset,
+  uint32_t *size)
+{
+uint32_t header;
+uint32_t vc_cap_max_size = PCIE_CONFIG_SPACE_SIZE - base_offset;
+uint32_t next_ptr;
+uint32_t arb_table_start_max = 0, arb_table_end_max = 0;
+uint32_t port_vc_cap1, port_vc_cap2, vc_rsrc_cap;
+uint32_t ext_vc_count = 0;
+uint32_t arb_table_entry_size;  /* in bits */
+const char *cap_name;
+int ret;
+int i;
+
+ret = xen_host_pci_get_long(&s->real_device, base_offset, &header);
+if (ret) {
+goto err_read;
+}
+
+next_ptr = PCI_EXT_CAP_NEXT(header);
+
+switch (PCI_EXT_CAP_ID(header)) {
+case PCI_EXT_CAP_ID_VC:
+case PCI_EXT_CAP_ID_VC9:
+cap_name = "Virtual Channel";
+break;
+case PCI_EXT_CAP_ID_MFVC:
+cap_name = "Multi-Function VC";
+break;
+default:
+XEN_PT_ERR(&s->dev, "Unknown VC Extended Capability ID "
+   "encountered: 0x%04x\n", PCI_EXT_CAP_ID(header));
+return -1;
+}
+
+if (next_ptr && next_ptr > base_offset) {
+vc_cap_max_size = next_ptr - base_offset;
+}
+
+ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_VC_PORT_CAP1,
+&port_vc_cap1);
+if (ret) {
+goto err_read;
+}
+
+ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_VC_PORT_CAP2,
+&port_vc_cap2);
+if (ret) {
+goto err_read;
+}
+
+ext_vc_count = port_vc_cap1 & PCI_VC_CAP1_EVCC;
+
+arb_table_start_max = GET_ARB_TABLE_OFFSET(port_vc_cap2);
+
+/* check arbitration table offset for validity */
+if (arb_table_start_max >= vc_cap_max_size) {
+XEN_PT_ERR(&s->dev, "Warning: VC arbitration table offset points "
+   "
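
The diff above is cut off in this archive, so the following is only a
standalone sketch of the two building blocks that are visible in it:
decoding an arbitration table offset (bits 31:24, in units of 16 bytes, as
in GET_ARB_TABLE_OFFSET) and deriving an upper bound on the number of
arbitration phases from the highest set bit of the 8-bit arbitration
capability bitmask (as in get_arb_table_len_max). The demo_* names are
hypothetical, and how the remainder of the patch combines these values is
not shown here.

```c
#include <stdint.h>

/* Arbitration table offset: bits 31:24 of the register, in units of
 * 16 bytes (same arithmetic as GET_ARB_TABLE_OFFSET in the patch). */
static inline uint32_t demo_arb_table_offset(uint32_t cap_reg)
{
    return (cap_reg >> 24) * 0x10;
}

/* Upper bound on the number of arbitration phases, mirroring the switch
 * statement in get_arb_table_len_max(). Only the low 8 bits are used. */
static inline uint32_t demo_arb_table_len_max(uint32_t arb_cap)
{
    int n_bit;

    arb_cap &= 0xFF;
    if (!arb_cap) {
        return 0;
    }

    /* find the highest set bit; arb_cap is non-zero, so n_bit stays >= 0 */
    for (n_bit = 7; !(arb_cap & (1u << n_bit)); n_bit--) {
    }

    switch (n_bit) {
    case 0:
        return 0;            /* the patch returns 0 when only bit 0 is set */
    case 1:
        return 32;
    case 2:
        return 64;
    case 3:
    case 4:
        return 128;
    default:
        return 8u << n_bit;  /* e.g. bit 5 gives 256 */
    }
}
```

For example, a register value of 0x02000000 places the table 0x20 bytes
from the capability base, and a capability bitmask of 0x06 (bits 1 and 2
set) gives a maximum of 64 phases.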

[PATCH v1 04/23] q35/xen: Add Xen platform device support for Q35

2023-06-20 Thread Joel Upham
The current Xen/QEMU method of controlling the Xen Platform device on i440
is a bit odd: enabling or disabling the Xen Platform device actually changes
the QEMU emulated machine type, namely xenfv <--> pc.

To avoid multiplying machine types, use a new way to control the Xen
Platform device for QEMU: the "xen-platform-dev" machine property (bool).
To maintain backward compatibility with existing Xen/QEMU setups, this
currently applies only to the q35 machine. i440 emulation still uses the
old method (i.e. xenfv/pc machine selection) to control the Xen Platform
device; this may be changed later to the xen-platform-dev property as well.

This way we can use a single machine type (q35) and simply set
xen-platform-dev to on or off to control the Xen Platform device.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/core/machine.c   | 19 +++
 hw/i386/pc_q35.c| 20 +++-
 include/hw/boards.h |  1 +
 qemu-options.hx |  1 +
 4 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1000406211..703138d2ec 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -455,6 +455,20 @@ static void machine_set_graphics(Object *obj, bool value, Error **errp)
 ms->enable_graphics = value;
 }
 
+static bool machine_get_xen_platform_dev(Object *obj, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+return ms->xen_platform_dev;
+}
+
+static void machine_set_xen_platform_dev(Object *obj, bool value, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+ms->xen_platform_dev = value;
+}
+
 static char *machine_get_firmware(Object *obj, Error **errp)
 {
 MachineState *ms = MACHINE(obj);
@@ -1004,6 +1018,11 @@ static void machine_class_init(ObjectClass *oc, void *data)
 object_class_property_set_description(oc, "graphics",
 "Set on/off to enable/disable graphics emulation");
 
+object_class_property_add_bool(oc, "xen-platform-dev",
+machine_get_xen_platform_dev, machine_set_xen_platform_dev);
+object_class_property_set_description(oc, "xen-platform-dev",
+"Set on/off to enable/disable Xen Platform device");
+
 object_class_property_add_str(oc, "firmware",
 machine_get_firmware, machine_set_firmware);
 object_class_property_set_description(oc, "firmware",
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 6155427e48..789a23ce6b 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -57,10 +57,24 @@
 #include "hw/hyperv/vmbus-bridge.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/i386/acpi-build.h"
+#include "hw/xen/xen-x86.h"
+#include "sysemu/xen.h"
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS 6
 
+static void q35_xen_hvm_init(MachineState *machine)
+{
+PCMachineState *pcms = PC_MACHINE(machine);
+
+if (xen_enabled()) {
+/* check if Xen Platform device is enabled */
+if (machine->xen_platform_dev) {
+pci_create_simple(pcms->bus, -1, "xen-platform");
+}
+}
+}
+
 struct ehci_companions {
 const char *name;
 int func;
@@ -273,8 +287,12 @@ static void pc_q35_init(MachineState *machine)
 for (i = 0; i < IOAPIC_NUM_PINS; i++) {
 qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, x86ms->gsi[i]);
 }
-isa_bus = ISA_BUS(qdev_get_child_bus(lpc_dev, "isa.0"));
 
+if (xen_enabled()) {
+q35_xen_hvm_init(machine);
+}
+
+isa_bus = ISA_BUS(qdev_get_child_bus(lpc_dev, "isa.0"));
 if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
 pc_i8259_create(isa_bus, gsi_state->i8259_irq);
 }
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a385010909..0b021f0764 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -339,6 +339,7 @@ struct MachineState {
 bool mem_merge;
 bool usb;
 bool usb_disabled;
+bool xen_platform_dev;
 char *firmware;
 bool iommu;
 bool suppress_vmdesc;
diff --git a/qemu-options.hx b/qemu-options.hx
index b37eb9662b..ea018257da 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -30,6 +30,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "vmport=on|off|auto controls emulation of vmport (default: auto)\n"
 "dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
 "mem-merge=on|off controls memory merge support (default: on)\n"
+"xen-platform-dev=on|off controls Xen Platform device (default=off)\n"
 "aes-key-wrap=on|off controls support for AES key wrapping (default=on)\n"
 "dea-key-wrap=on|off controls support for DEA key wrapping (default=on)\n"
 "suppress-vmdesc=on|off disables self-describing migration (default=off)\n"
-- 
2.34.1




[PATCH v1 08/23] xen/pt: determine the legacy/PCIe mode for a passed through device

2023-06-20 Thread Joel Upham
Even if a real PCIe device is being passed through to a guest, there are
situations when we cannot use its PCIe features, primarily access to the
extended (>256 bytes) config space.

Basically, we can allow reading the PCIe extended config space only if both
the device and the emulated system are PCIe-capable. So it's a combination
of checks:
- PCI Express capability presence
- pci_is_express(device)
- pci_bus_is_express(device bus)

The logical AND of these checks is stored in the pcie_enabled_dev flag
in XenPCIPassthroughState for later use in functions like
xen_pt_pci_config_access_check.

This way we get consistent behavior when the same PCIe device is passed
through to either an i440 domain or a Q35 one.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt.c | 28 ++--
 hw/xen/xen_pt.h |  1 +
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index a540149639..65c5516ef4 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -701,6 +701,21 @@ static const MemoryListener xen_pt_io_listener = {
 .priority = 10,
 };
 
+static inline bool xen_pt_dev_is_pcie_mode(PCIDevice *d)
+{
+XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
+PCIBus *bus = pci_get_bus(d);
+
+if (bus != NULL) {
+if (pci_is_express(d) && pci_bus_is_express(bus) &&
+xen_host_pci_find_next_cap(&s->real_device, 0, PCI_CAP_ID_EXP)) {
+return true;
+}
+}
+
+return false;
+}
+
 /* destroy. */
 static void xen_pt_destroy(PCIDevice *d) {
 
@@ -787,8 +802,17 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
s->real_device.dev, s->real_device.func);
 }
 
-/* Initialize virtualized PCI configuration (Extended 256 Bytes) */
-memset(d->config, 0, PCI_CONFIG_SPACE_SIZE);
+s->pcie_enabled_dev = xen_pt_dev_is_pcie_mode(d);
+if (s->pcie_enabled_dev) {
+XEN_PT_LOG(d, "Host device %04x:%02x:%02x.%d passed thru "
+   "in PCIe mode\n", s->real_device.domain,
+s->real_device.bus, s->real_device.dev,
+s->real_device.func);
+}
+
+/* Initialize virtualized PCI configuration space (256/4K bytes) */
+memset(d->config, 0, pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE
+   : PCI_CONFIG_SPACE_SIZE);
 
 s->memory_listener = xen_pt_memory_listener;
 s->io_listener = xen_pt_io_listener;
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index b20744f7c7..1c9cd6b615 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -234,6 +234,7 @@ struct XenPCIPassthroughState {
 
 PCIHostDeviceAddress hostaddr;
 bool is_virtfn;
+bool pcie_enabled_dev;
 bool permissive;
 bool permissive_warned;
 XenHostPCIDevice real_device;
-- 
2.34.1
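
The decision made by xen_pt_dev_is_pcie_mode() and the way the resulting
flag later bounds config space accesses can be sketched in isolation as
follows. This is an illustrative model only: the DemoPtDev structure and
the demo_* functions are hypothetical stand-ins for the QEMU objects and
helpers, and the two size constants are assumed to match QEMU's
PCI_CONFIG_SPACE_SIZE (0x100) and PCIE_CONFIG_SPACE_SIZE (0x1000).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match QEMU's PCI_CONFIG_SPACE_SIZE / PCIE_CONFIG_SPACE_SIZE. */
#define DEMO_PCI_CONFIG_SPACE_SIZE  0x100u
#define DEMO_PCIE_CONFIG_SPACE_SIZE 0x1000u

/* Hypothetical model of the inputs to the three checks. */
typedef struct DemoPtDev {
    bool has_bus;            /* pci_get_bus(d) returned non-NULL */
    bool dev_is_express;     /* stand-in for pci_is_express(d) */
    bool bus_is_express;     /* stand-in for pci_bus_is_express(bus) */
    bool host_has_pcie_cap;  /* host device exposes PCI_CAP_ID_EXP */
} DemoPtDev;

/* PCIe mode requires every check to pass (logical AND). */
static inline bool demo_dev_is_pcie_mode(const DemoPtDev *d)
{
    return d->has_bus && d->dev_is_express &&
           d->bus_is_express && d->host_has_pcie_cap;
}

/* An access is in range only below the limit selected by the flag. */
static inline bool demo_config_access_in_range(bool pcie_enabled_dev,
                                               uint32_t addr)
{
    uint32_t limit = pcie_enabled_dev ? DEMO_PCIE_CONFIG_SPACE_SIZE
                                      : DEMO_PCI_CONFIG_SPACE_SIZE;

    return addr < limit;
}
```

In this model, the same host device attached to a non-Express bus (the
i440 case) is treated as a legacy device, so an access at offset 0x100 is
rejected, while on an Express bus (the Q35 case) it is accepted.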




[PATCH v1 13/23] xen/pt: add Vendor-specific PCIe Extended Capability descriptor and sizing

2023-06-20 Thread Joel Upham
This patch provides the Vendor-specific PCIe Extended Capability description
structure and the corresponding sizing function. In this particular case the
size of the vendor capability is available in the VSEC Length field.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 71 -
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index ed36edbc4a..20b5561d25 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -124,6 +124,17 @@ static uint32_t get_throughable_mask(const XenPCIPassthroughState *s,
 return throughable_mask & valid_mask;
 }
 
+static void log_pcie_extended_cap(XenPCIPassthroughState *s,
+  const char *cap_name,
+  uint32_t base_offset, uint32_t size)
+{
+if (size) {
+XEN_PT_LOG(&s->dev, "Found PCIe Extended Capability: %s at 0x%04x, "
+"size 0x%x bytes\n", cap_name,
+(uint16_t) base_offset, size);
+}
+}
+
 /*
  * general register functions
  */
@@ -1622,6 +1633,42 @@ static XenPTRegInfo xen_pt_emu_reg_igd_opregion[] = {
 },
 };
 
+/* Vendor-specific Ext Capability Structure reg static information table */
+static XenPTRegInfo xen_pt_ext_cap_emu_reg_vendor[] = {
+{
+.offset = XEN_PCIE_CAP_ID,
+.size   = 2,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_ext_cap_capid_reg_init,
+.u.w.read   = xen_pt_word_reg_read,
+.u.w.write  = xen_pt_word_reg_write,
+},
+{
+.offset = XEN_PCIE_CAP_LIST_NEXT,
+.size   = 2,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_ext_cap_ptr_reg_init,
+.u.w.read   = xen_pt_word_reg_read,
+.u.w.write  = xen_pt_word_reg_write,
+},
+{
+.offset = PCI_VNDR_HEADER,
+.size   = 4,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_common_reg_init,
+.u.dw.read  = xen_pt_long_reg_read,
+.u.dw.write = xen_pt_long_reg_write,
+},
+{
+.size = 0,
+},
+};
 /*
  * Capabilities
  */
@@ -1647,9 +1694,23 @@ static int xen_pt_vendor_size_init(XenPCIPassthroughState *s,
 return ret;
 }
 
+static int xen_pt_ext_cap_vendor_size_init(XenPCIPassthroughState *s,
+   const XenPTRegGroupInfo *grp_reg,
+   uint32_t base_offset,
+   uint32_t *size)
 {
-return xen_host_pci_get_byte(&s->real_device, base_offset + 0x02, size);
+uint32_t vsec_hdr = 0;
+int ret = xen_host_pci_get_long(&s->real_device,
+base_offset + PCI_VNDR_HEADER,
+&vsec_hdr);
+
+*size = PCI_VNDR_HEADER_LEN(vsec_hdr);
+
+log_pcie_extended_cap(s, "Vendor-specific", base_offset, *size);
+
+return ret;
 }
+
 /* get PCI Express Capability Structure register group size */
 static int xen_pt_pcie_size_init(XenPCIPassthroughState *s,
  const XenPTRegGroupInfo *grp_reg,
@@ -1876,6 +1937,14 @@ static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 .size_init   = xen_pt_reg_grp_size_init,
 .emu_regs= xen_pt_emu_reg_igd_opregion,
 },
+/* Vendor-specific Extended Capability reg group */
+{
+.grp_id  = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_VNDR),
+.grp_type= XEN_PT_GRP_TYPE_EMU,
+.grp_size= 0xFF,
+.size_init   = xen_pt_ext_cap_vendor_size_init,
+.emu_regs= xen_pt_ext_cap_emu_reg_vendor,
+},
 {
 .grp_size = 0,
 },
-- 
2.34.1
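
For reference, the VSEC Length extraction used by
xen_pt_ext_cap_vendor_size_init() amounts to taking bits 31:20 of the
Vendor-Specific Header dword. The standalone sketch below assumes the field
layout used by the PCI_VNDR_HEADER_ID/REV/LEN macros in the Linux
pci_regs.h header (ID in bits 15:0, revision in bits 19:16, length in bits
31:20); the demo_* names are hypothetical.

```c
#include <stdint.h>

/* VSEC ID: bits 15:0 of the Vendor-Specific Header. */
static inline uint32_t demo_vsec_id(uint32_t vsec_hdr)
{
    return vsec_hdr & 0xffff;
}

/* VSEC revision: bits 19:16. */
static inline uint32_t demo_vsec_rev(uint32_t vsec_hdr)
{
    return (vsec_hdr >> 16) & 0xf;
}

/* VSEC length in bytes: bits 31:20. This is the value the patch uses as
 * the size of the whole capability register group. */
static inline uint32_t demo_vsec_length(uint32_t vsec_hdr)
{
    return (vsec_hdr >> 20) & 0xfff;
}
```

A header value of 0x01810023, for instance, decodes to VSEC ID 0x23,
revision 1 and a length of 0x18 (24) bytes.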




[PATCH v1 03/23] q35/acpi/xen: Provide ACPI PCI hotplug interface for Xen on Q35

2023-06-20 Thread Joel Upham
This patch allows ACPI PCI hotplug functionality to be used for Xen on Q35.
All added code depends on xen_enabled(), so there is no functional change
for non-Xen usage.

We need to call the acpi_set_pci_info function from ich9_pm_init as well,
so it was made globally visible again (as it was before).

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/acpi/ich9.c  | 10 ++
 hw/acpi/pcihp.c |  2 +-
 include/hw/acpi/pcihp.h |  2 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 25e2c7243e..1c236be1c7 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -39,6 +39,8 @@
 #include "hw/southbridge/ich9.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/xen/xen.h"
+#include "sysemu/xen.h"
 
 //#define DEBUG
 
@@ -67,6 +69,10 @@ static void ich9_gpe_writeb(void *opaque, hwaddr addr, uint64_t val,
 ICH9LPCPMRegs *pm = opaque;
 acpi_gpe_ioport_writeb(&pm->acpi_regs, addr, val);
 acpi_update_sci(&pm->acpi_regs, pm->irq);
+
+if (xen_enabled()) {
+acpi_pcihp_reset(&pm->acpi_pci_hotplug);
+}
 }
 
 static const MemoryRegionOps ich9_gpe_ops = {
@@ -332,6 +338,10 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm, qemu_irq sci_irq)
 pm->powerdown_notifier.notify = pm_powerdown_req;
 qemu_register_powerdown_notifier(&pm->powerdown_notifier);
 
+if (xen_enabled()) {
+acpi_set_pci_info(true);
+}
+
 legacy_acpi_cpu_hotplug_init(pci_address_space_io(lpc_pci),
 OBJECT(lpc_pci), &pm->gpe_cpu, ICH9_CPU_HOTPLUG_IO_BASE);
 
diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index f4e39d7a9c..5b065d670c 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -99,7 +99,7 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 return info;
 }
 
-static void acpi_set_pci_info(bool has_bridge_hotplug)
+void acpi_set_pci_info(bool has_bridge_hotplug)
 {
 static bool bsel_is_set;
 Object *host = acpi_get_i386_pci_host();
diff --git a/include/hw/acpi/pcihp.h b/include/hw/acpi/pcihp.h
index ef59810c17..d35a517c9e 100644
--- a/include/hw/acpi/pcihp.h
+++ b/include/hw/acpi/pcihp.h
@@ -72,6 +72,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
 /* Called on reset */
 void acpi_pcihp_reset(AcpiPciHpState *s);
 
+void acpi_set_pci_info(bool has_bridge_hotplug);
+
 void build_append_pcihp_slots(Aml *parent_scope, PCIBus *bus);
 
 extern const VMStateDescription vmstate_acpi_pcihp_pci_status;
-- 
2.34.1




[PATCH v1 00/23] Q35 support for Xen

2023-06-20 Thread Joel Upham
These are the QEMU changes needed to support the q35 chipset for Xen.
I based them on the patches from 2017 found on the mailing list here:
https://lists.xenproject.org/archives/html/xen-devel/2018-03/msg01176.html

I have been using a version of these patches on Xen 4.16 with QEMU
version 4.1 for over 6 months.  The guest VMs are very stable, and PCIe
PT is working as designed (all of the PCIe devices are on the root
PCIe device).  I have successfully passed through GPUs, NICs, etc. I was
asked by those in the community to attempt once again to upstream the
patches.  I have them working with SeaBIOS and OVMF (patches to OVMF are
needed, which I will be sending to the mailing list). The QEMU patches
allow xenvbd to properly unplug the AHCI SATA device, and all
Xen PV Windows drivers work as intended.

I relied on the work of the original author of the patches, Alexey
Gerasimenko, to get the majority of this to work.  I updated the patches
to be in line with the upstream QEMU and Xen versions.  Any original issues
may still exist; however, I am sure they can be improved in time. If the
code isn't posted, the community can't actively look at it.

I am not an expert on the Q35 chipset or PCIe technology.  This is my
first patch to this mailing list.


Joel Upham (23):
  pc/xen: Xen Q35 support: provide IRQ handling for PCI devices
  pc/q35: Apply PCI bus BSEL property for Xen PCI device hotplug
  q35/acpi/xen: Provide ACPI PCI hotplug interface for Xen on Q35
  q35/xen: Add Xen platform device support for Q35
  q35: Fix incorrect values for PCIEXBAR masks
  xen/pt: XenHostPCIDevice: provide functions for PCI Capabilities and
PCIe Extended Capabilities enumeration
  xen/pt: avoid reading PCIe device type and cap version multiple times
  xen/pt: determine the legacy/PCIe mode for a passed through device
  xen/pt: Xen PCIe passthrough support for Q35: bypass PCIe topology
check
  xen/pt: add support for PCIe Extended Capabilities and larger config
space
  xen/pt: handle PCIe Extended Capabilities Next register
  xen/pt: allow to hide PCIe Extended Capabilities
  xen/pt: add Vendor-specific PCIe Extended Capability descriptor and
sizing
  xen/pt: add fixed-size PCIe Extended Capabilities descriptors
  xen/pt: add AER PCIe Extended Capability descriptor and sizing
  xen/pt: add descriptors and size calculation for
RCLD/ACS/PMUX/DPA/MCAST/TPH/DPC PCIe Extended Capabilities
  xen/pt: add Resizable BAR PCIe Extended Capability descriptor and
sizing
  xen/pt: add VC/VC9/MFVC PCIe Extended Capabilities descriptors and
sizing
  xen/pt: Fake capability id
  xen platform: unplug ahci object
  pc/q35: setup q35 for xen
  qdev-monitor/pt: bypass root device check
  s3 support: enabling s3 with q35

 hw/acpi/ich9.c|   22 +-
 hw/acpi/pcihp.c   |6 +-
 hw/core/machine.c |   19 +
 hw/i386/pc_piix.c |3 +-
 hw/i386/pc_q35.c  |   39 +-
 hw/i386/xen/xen-hvm.c |7 +-
 hw/i386/xen/xen_platform.c|   19 +-
 hw/isa/lpc_ich9.c |   53 +-
 hw/isa/piix3.c|2 +-
 hw/pci-host/q35.c |   28 +-
 hw/pci/pci.c  |   17 +
 hw/xen/xen-host-pci-device.c  |  106 +++-
 hw/xen/xen-host-pci-device.h  |6 +-
 hw/xen/xen_pt.c   |   49 +-
 hw/xen/xen_pt.h   |   18 +-
 hw/xen/xen_pt_config_init.c   | 1103 ++---
 include/hw/acpi/pcihp.h   |2 +
 include/hw/boards.h   |1 +
 include/hw/i386/pc.h  |3 +
 include/hw/pci-host/q35.h |4 +-
 include/hw/pci/pci.h  |3 +
 include/hw/southbridge/ich9.h |1 +
 include/hw/xen/xen.h  |4 +-
 qemu-options.hx   |1 +
 softmmu/qdev-monitor.c|4 +-
 stubs/xen-hw-stub.c   |4 +-
 26 files changed, 1394 insertions(+), 130 deletions(-)

-- 
2.34.1




[PATCH v1 10/23] xen/pt: add support for PCIe Extended Capabilities and larger config space

2023-06-20 Thread Joel Upham
This patch provides basic facilities for PCIe Extended Capabilities and
support for controlled (via the s->pcie_enabled_dev flag) access to the
PCIe extended config space (>256 bytes).

PCIe Extended Capabilities use a 16-bit capability ID. Also, a capability
size might exceed 8 bits. So, as the very first step, we need to widen the
types of grp_id, grp_size, etc., which were limited to 8 bits.

The only troublesome issue with PCIe Extended Capability IDs is that their
value range overlaps that of basic PCI capabilities. E.g. capability ID 3
means the VPD Capability for PCI and, at the same time, the Device Serial
Number Capability for PCIe Extended caps. This adds a bit of inconvenience.

To distinguish between the two sets of identical capability IDs, the patch
introduces a set of macros to mark a capability ID as a PCIe Extended one
(or to check whether it is basic or extended and get the raw ID value):
- PCIE_EXT_CAP_ID(cap_id)
- IS_PCIE_EXT_CAP_ID(grp_id)
- GET_PCIE_EXT_CAP_ID(grp_id)

Here is how it's used:

    /* Intel IGD Opregion group */
    {
        .grp_id      = XEN_PCI_INTEL_OPREGION,  /* no change */
        .grp_type    = XEN_PT_GRP_TYPE_EMU,
        .grp_size    = 0x4,
        .size_init   = xen_pt_reg_grp_size_init,
        .emu_regs    = xen_pt_emu_reg_igd_opregion,
    },
    /* Vendor-specific Extended Capability reg group */
    {
        .grp_id      = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_VNDR),
        .grp_type    = XEN_PT_GRP_TYPE_EMU,
        .grp_size    = 0xFF,
        .size_init   = xen_pt_ext_cap_vendor_size_init,
        .emu_regs    = xen_pt_ext_cap_emu_reg_vendor,
    },

By using the PCIE_EXT_CAP_ID() macro it is possible to reuse existing
header files with already defined PCIe Extended Capability ID values.

find_cap_offset() receives a capability ID and checks whether it is an
Extended one using the IS_PCIE_EXT_CAP_ID(cap) macro, passing the real
capability ID value to either xen_host_pci_find_next_ext_cap
or xen_host_pci_find_next_cap.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt.c | 10 -
 hw/xen/xen_pt.h | 13 --
 hw/xen/xen_pt_config_init.c | 90 ++---
 3 files changed, 83 insertions(+), 30 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 65c5516ef4..f757978800 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -96,8 +96,16 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...)
 
 static int xen_pt_pci_config_access_check(PCIDevice *d, uint32_t addr, int len)
 {
+XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
 /* check offset range */
-if (addr > 0xFF) {
+if (s->pcie_enabled_dev) {
+if (addr >= PCIE_CONFIG_SPACE_SIZE) {
+XEN_PT_ERR(d, "Failed to access register with offset "
+  "exceeding 0xFFF. (addr: 0x%02x, len: %d)\n",
+  addr, len);
+return -1;
+}
+} else if (addr >= PCI_CONFIG_SPACE_SIZE) {
 XEN_PT_ERR(d, "Failed to access register with offset exceeding 0xFF. "
"(addr: 0x%02x, len: %d)\n", addr, len);
 return -1;
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 1c9cd6b615..eb062be3f4 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -33,6 +33,11 @@ void xen_pt_log(const PCIDevice *d, const char *f, ...) G_GNUC_PRINTF(2, 3);
 /* Helper */
 #define XEN_PFN(x) ((x) >> XC_PAGE_SHIFT)
 
+/* Macros for PCIe Extended Capabilities */
+#define PCIE_EXT_CAP_ID(cap_id) ((cap_id) | (1U << 16))
+#define IS_PCIE_EXT_CAP_ID(grp_id)  ((grp_id) & (1U << 16))
+#define GET_PCIE_EXT_CAP_ID(grp_id) ((grp_id) & 0x)
+
 typedef const struct XenPTRegInfo XenPTRegInfo;
 typedef struct XenPTReg XenPTReg;
 
@@ -174,13 +179,13 @@ typedef const struct XenPTRegGroupInfo XenPTRegGroupInfo;
 /* emul reg group size initialize method */
 typedef int (*xen_pt_reg_size_init_fn)
 (XenPCIPassthroughState *, XenPTRegGroupInfo *,
- uint32_t base_offset, uint8_t *size);
+ uint32_t base_offset, uint32_t *size);
 
 /* emulated register group information */
 struct XenPTRegGroupInfo {
-uint8_t grp_id;
+uint32_t grp_id;
 XenPTRegisterGroupType grp_type;
-uint8_t grp_size;
+uint32_t grp_size;
 xen_pt_reg_size_init_fn size_init;
 XenPTRegInfo *emu_regs;
 };
@@ -190,7 +195,7 @@ typedef struct XenPTRegGroup {
 QLIST_ENTRY(XenPTRegGroup) entries;
 XenPTRegGroupInfo *reg_grp;
 uint32_t base_offset;
-uint8_t size;
+uint32_t size;
 QLIST_HEAD(, XenPTReg) reg_tbl_list;
 } XenPTRegGroup;
 
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 757a035aad..34ed9c25c5 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -32,28 +32,40 @@ static int xen_pt_ptr_reg_init(XenPCIPassthroughState *s, XenPTRegInfo *reg,
 /* helper */
 
 /* A return value of 1 means the capability should NOT be exp
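
The ID-tagging scheme described in the commit message can be tried out in
isolation. In this sketch the three macros are re-declared with a DEMO_
prefix; bit 16 marks an ID as a PCIe Extended Capability ID. The mask in the
last macro is garbled in the archived diff, so 0xFFFF is an assumption here,
based on Extended Capability IDs being 16 bits wide. The two ID constants
are assumed to equal PCI_CAP_ID_VPD and PCI_EXT_CAP_ID_DSN (both 0x03) from
the Linux pci_regs.h header.

```c
#include <stdint.h>

/* Re-declared for illustration; bit 16 tags an ID as "extended". */
#define DEMO_PCIE_EXT_CAP_ID(cap_id)     ((cap_id) | (1U << 16))
#define DEMO_IS_PCIE_EXT_CAP_ID(grp_id)  ((grp_id) & (1U << 16))
/* Mask value is an assumption (garbled in the archive). */
#define DEMO_GET_PCIE_EXT_CAP_ID(grp_id) ((grp_id) & 0xFFFFU)

/* Assumed to equal PCI_CAP_ID_VPD and PCI_EXT_CAP_ID_DSN in pci_regs.h. */
#define DEMO_PCI_CAP_ID_VPD     0x03
#define DEMO_PCI_EXT_CAP_ID_DSN 0x03

/* Returns 1 when two group IDs name the same capability in the same
 * namespace (basic vs. extended), 0 otherwise. */
static inline int demo_same_capability(uint32_t grp_a, uint32_t grp_b)
{
    return grp_a == grp_b;
}
```

Although VPD and Device Serial Number share the raw value 3, the tagged
group ID DEMO_PCIE_EXT_CAP_ID(DEMO_PCI_EXT_CAP_ID_DSN) is 0x10003 and no
longer collides with the basic ID 0x03, which is why grp_id had to grow
beyond 8 bits.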

[PATCH v1 19/23] xen/pt: Fake capability id

2023-06-20 Thread Joel Upham
Some PCIe capabilities need to be faked for the Xen implementation to work.

This covers the situation where we were asked to hide (aka
"hardwire to 0") some PCIe extended capability, but it is located
at offset 0x100 in PCIe config space. In this case we can't
simply exclude it from the linked list of capabilities
(as it is the first entry in the list), so we must fake its
Capability ID in the PCIe Extended Capability header, leaving
the Next Ptr field intact while returning zeroes on attempts
to read the capability body (writes are ignored).

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 72 -
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 4e14adf2b2..41b43b9445 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -16,6 +16,7 @@
 #include "qapi/error.h"
 #include "qemu/timer.h"
 #include "xen_pt.h"
+#include "xen-host-pci-device.h"
 #include "hw/xen/xen-legacy-backend.h"
 
 #define XEN_PT_MERGE_VALUE(value, data, val_mask) \
@@ -31,6 +32,10 @@ static int xen_pt_ext_cap_ptr_reg_init(XenPCIPassthroughState *s,
XenPTRegInfo *reg,
uint32_t real_offset,
uint32_t *data);
+static int xen_pt_ext_cap_capid_reg_init(XenPCIPassthroughState *s,
+ XenPTRegInfo *reg,
+ uint32_t real_offset,
+ uint32_t *data);
 
 /* helper */
 
@@ -995,6 +1000,17 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
 .u.b.read   = xen_pt_byte_reg_read,
 .u.b.write  = xen_pt_byte_reg_write,
 },
+/* PCI Express Capabilities Register */
+{
+.offset = PCI_EXP_FLAGS,
+.size   = 2,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_pcie_capabilities_reg_init,
+.u.w.read   = xen_pt_word_reg_read,
+.u.w.write  = xen_pt_word_reg_write,
+},
 /* Device Capabilities reg */
 {
 .offset = PCI_EXP_DEVCAP,
@@ -1633,6 +1649,54 @@ static XenPTRegInfo xen_pt_emu_reg_igd_opregion[] = {
 },
 };
 
+/*
+ * Emulated registers for
+ * PCIe Extended Capabilities
+ */
+
+static uint16_t fake_cap_id = XEN_PCIE_FAKE_CAP_ID_BASE;
+
+/* PCIe Extended Capability ID reg */
+static int xen_pt_ext_cap_capid_reg_init(XenPCIPassthroughState *s,
+ XenPTRegInfo *reg,
+ uint32_t real_offset,
+ uint32_t *data)
+{
+uint16_t reg_field;
+int rc;
+XenPTRegGroup *reg_grp_entry = NULL;
+
+/* use real device register's value as initial value */
+rc = xen_host_pci_get_word(&s->real_device, real_offset, &reg_field);
+if (rc) {
+return rc;
+}
+
+reg_grp_entry = xen_pt_find_reg_grp(s, real_offset);
+
+if (reg_grp_entry) {
+if (reg_grp_entry->reg_grp->grp_type == XEN_PT_GRP_TYPE_HARDWIRED &&
+reg_grp_entry->base_offset == PCI_CONFIG_SPACE_SIZE) {
+/*
+ * This is the situation when we were asked to hide (aka
+ * "hardwire to 0") some PCIe ext capability, but it was located
+ * at offset 0x100 in PCIe config space. In this case we can't
+ * simply exclude it from the linked list of capabilities
+ * (as it is the first entry in the list), so we must fake its
+ * Capability ID in PCIe Extended Capability header, leaving
+ * the Next Ptr field intact while returning zeroes on attempts
+ * to read capability body (writes are ignored).
+ */
+reg_field = fake_cap_id;
+/* increment the value in order to have unique Capability IDs */
+fake_cap_id++;
+}
+}
+
+*data = reg_field;
+return 0;
+}
+
 /* Vendor-specific Ext Capability Structure reg static information table */
 static XenPTRegInfo xen_pt_ext_cap_emu_reg_vendor[] = {
 {
@@ -2938,7 +3002,13 @@ void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp)
 }
 }
 
-if (xen_pt_emu_reg_grps[i].grp_type == XEN_PT_GRP_TYPE_EMU) {
+if (xen_pt_emu_reg_grps[i].grp_type == XEN_PT_GRP_TYPE_EMU ||
+/*
+ * We need to always emulate the PCIe Extended Capability
+ * header for a hidden capability which starts at offset 0x100
+ */
+(xen_pt_emu_reg_grps[i].grp_type == XEN_PT_GRP_TYPE_HARDWIRED &&
+reg_grp_offset == 0x100)) {
 if (xen_pt_emu_reg_grps[i].emu_regs) {
 int j = 0;
 XenPTRegInfo *regs = xen_pt_emu_reg_grps[i].emu_regs;
-- 
2.34.1




[PATCH v1 14/23] xen/pt: add fixed-size PCIe Extended Capabilities descriptors

2023-06-20 Thread Joel Upham
This adds description structures for all fixed-size PCIe Extended
Capabilities.

For every capability register group, only two registers are currently
emulated: Capability ID (16 bit) and Next Capability Offset/Version (16
bit). Both are needed to implement selective capability hiding. All other
registers are passed through for now (unless they belong to a "hardwired"
capability, which is hidden).

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/xen/xen_pt_config_init.c | 183 
 1 file changed, 183 insertions(+)
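
Note (illustration only, not part of the patch): one way to see why both
header fields matter for hiding is the toy list walk below. It is a sketch
under assumptions, not the code in this series: it works on a plain byte
array standing in for config space, and get_le32(), put_le32() and
unlink_ext_cap() are hypothetical names. Hiding an entry means pointing its
predecessor's Next Capability Offset past it, which requires control over
that 16-bit field; the entry at 0x100 has no predecessor and cannot be
unlinked this way.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SIZE      4096u   /* PCIe extended config space size */
#define EXT_CAP_START 0x100u  /* offset of the first extended capability */

/* Read a little-endian dword from the toy config space. */
static uint32_t get_le32(const uint8_t *cfg, uint32_t off)
{
    assert(off % 4 == 0 && off + 4 <= CFG_SIZE);
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Write a little-endian dword to the toy config space. */
static void put_le32(uint8_t *cfg, uint32_t off, uint32_t val)
{
    assert(off % 4 == 0 && off + 4 <= CFG_SIZE);
    cfg[off]     = (uint8_t)(val & 0xFFu);
    cfg[off + 1] = (uint8_t)((val >> 8) & 0xFFu);
    cfg[off + 2] = (uint8_t)((val >> 16) & 0xFFu);
    cfg[off + 3] = (uint8_t)((val >> 24) & 0xFFu);
}

/*
 * Unlink the first capability whose ID equals hide_id by rewriting the
 * predecessor's Next Capability Offset (bits 31:20 of its header).
 * Returns 1 if an entry was unlinked, 0 if it was not found or if it is
 * the list head at 0x100 (which has no predecessor).
 */
static int unlink_ext_cap(uint8_t *cfg, uint16_t hide_id)
{
    uint32_t prev = 0;
    uint32_t off = EXT_CAP_START;
    int budget = (int)((CFG_SIZE - EXT_CAP_START) / 8); /* loop guard */

    while (off >= EXT_CAP_START && budget-- > 0) {
        uint32_t hdr = get_le32(cfg, off);
        uint32_t next = (hdr >> 20) & 0xFFCu;

        if ((hdr & 0xFFFFu) == hide_id) {
            if (prev == 0) {
                return 0; /* list head: needs a different approach */
            }
            uint32_t prev_hdr = get_le32(cfg, prev);
            put_le32(cfg, prev, (prev_hdr & 0x000FFFFFu) | (next << 20));
            return 1;
        }
        prev = off;
        off = next;
    }
    return 0;
}
```

With entries at 0x100, 0x140 and 0x150, unlinking the middle one leaves
0x100 pointing at 0x150, while a request to unlink the entry at 0x100 is
refused.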

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 20b5561d25..69d8857c66 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1669,6 +1669,37 @@ static XenPTRegInfo xen_pt_ext_cap_emu_reg_vendor[] = {
 .size = 0,
 },
 };
+
+/* Common reg static information table for all passthru-type
+ * PCIe Extended Capabilities. Only Extended Cap ID and
+ * Next pointer are handled (to support capability hiding).
+ */
+static XenPTRegInfo xen_pt_ext_cap_emu_reg_dummy[] = {
+{
+.offset = XEN_PCIE_CAP_ID,
+.size   = 2,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_ext_cap_capid_reg_init,
+.u.w.read   = xen_pt_word_reg_read,
+.u.w.write  = xen_pt_word_reg_write,
+},
+{
+.offset = XEN_PCIE_CAP_LIST_NEXT,
+.size   = 2,
+.init_val   = 0x,
+.ro_mask= 0x,
+.emu_mask   = 0x,
+.init   = xen_pt_ext_cap_ptr_reg_init,
+.u.w.read   = xen_pt_word_reg_read,
+.u.w.write  = xen_pt_word_reg_write,
+},
+{
+.size = 0,
+},
+};
+
 /*
  * Capabilities
  */
@@ -1945,6 +1976,158 @@ static const XenPTRegGroupInfo xen_pt_emu_reg_grps[] = {
 .size_init   = xen_pt_ext_cap_vendor_size_init,
 .emu_regs= xen_pt_ext_cap_emu_reg_vendor,
 },
+/* Device Serial Number Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_DSN),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = PCI_EXT_CAP_DSN_SIZEOF,   /*0x0C*/
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Power Budgeting Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_PWR),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = PCI_EXT_CAP_PWR_SIZEOF,   /*0x10*/
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Root Complex Internal Link Control Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_RCILC),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = 0x0C,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Root Complex Event Collector Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_RCEC),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = 0x08,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Root Complex Register Block Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_RCRB),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = 0x14,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Configuration Access Correlation Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_CAC),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = 0x08,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Alternate Routing ID Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_ARI),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = PCI_EXT_CAP_ARI_SIZEOF,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Address Translation Services Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_ATS),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = PCI_EXT_CAP_ATS_SIZEOF,
+.size_init  = xen_pt_reg_grp_size_init,
+.emu_regs   = xen_pt_ext_cap_emu_reg_dummy,
+},
+/* Single Root I/O Virtualization Extended Capability reg group */
+{
+.grp_id = PCIE_EXT_CAP_ID(PCI_EXT_CAP_ID_SRIOV),
+.grp_type   = XEN_PT_GRP_TYPE_EMU,
+.grp_size   = PCI_EXT_CAP_SRIOV_SIZEOF,
+.size_init  = xen_pt_reg_grp_size_init,
+   

[PATCH v1 02/23] pc/q35: Apply PCI bus BSEL property for Xen PCI device hotplug

2023-06-20 Thread Joel Upham
On Q35 we still need to assign the BSEL property to the bus(es) for PCI
device add/hotplug to work. Extend the acpi_set_pci_info() function to
support Q35 as well. This patch adds a new (trivial) function, find_q35(),
which returns the root PCIBus object on Q35, in a way similar to what
find_i440fx() does.

Signed-off-by: Alexey Gerasimenko 
Signed-off-by: Joel Upham 
---
 hw/acpi/pcihp.c  | 4 +++-
 hw/pci-host/q35.c| 9 +
 include/hw/i386/pc.h | 3 +++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index cdd6f775a1..f4e39d7a9c 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -40,6 +40,7 @@
 #include "qapi/error.h"
 #include "qom/qom-qobject.h"
 #include "trace.h"
+#include "sysemu/xen.h"
 
 #define ACPI_PCIHP_SIZE 0x0018
 #define PCI_UP_BASE 0x
@@ -84,7 +85,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque)
 bool is_bridge = IS_PCI_BRIDGE(br);
 
 /* hotplugged bridges can't be described in ACPI ignore them */
-if (qbus_is_hotpluggable(BUS(bus))) {
+/* Xen requires hotplugging to the root device, even on the Q35 chipset */
+if (qbus_is_hotpluggable(BUS(bus)) || xen_enabled()) {
 if (!is_bridge || (!br->hotplugged && info->has_bridge_hotplug)) {
 bus_bsel = g_malloc(sizeof *bus_bsel);
 
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index fd18920e7f..fe5fc0f47c 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -259,6 +259,15 @@ static void q35_host_initfn(Object *obj)
  qdev_prop_allow_set_link_before_realize, 0);
 }
 
+PCIBus *find_q35(void)
+{
+PCIHostState *s = OBJECT_CHECK(PCIHostState,
+   object_resolve_path("/machine/q35", NULL),
+   TYPE_PCI_HOST_BRIDGE);
+return s ? s->bus : NULL;
+}
+
+
 static const TypeInfo q35_host_info = {
 .name   = TYPE_Q35_HOST_DEVICE,
 .parent = TYPE_PCIE_HOST_BRIDGE,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index c661e9cc80..550f8fa221 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -196,6 +196,9 @@ void pc_madt_cpu_entry(int uid, const CPUArchIdList *apic_ids,
 /* sgx.c */
 void pc_machine_init_sgx_epc(PCMachineState *pcms);
 
+/* q35.c */
+PCIBus *find_q35(void);
+
 extern GlobalProperty pc_compat_8_0[];
 extern const size_t pc_compat_8_0_len;
 
-- 
2.34.1