Re: [SeaBIOS] Marvell 88SE9230 passthrough in KVM takes long time to boot
On Wed, 8 Aug 2018 14:11:16 +0200 Gerd Hoffmann wrote: > On Sun, Jul 29, 2018 at 01:49:10PM +0200, Konrad Eisele wrote: > > I'm passing through a Marvell 88SE9230 card to a KVM guest under > > Ubuntu 18.04. The card is a Sata controller with 4 ports. > > The option rom of the Marvell 88SE9230 card shows on a normal boot a > > bios screen. When pressing CTRL-m quick enough, you can interrupt the > > bootprocess and enter a menue wherer you can define raid > > arrays. > > > > When booting seabios inside KVM the bootprocess is very slow. > > There is a 1 min holdtime where the cpu is about 30%. The screen is > > black with only the seabios version string shown. I suspect that > > the passed-through Marvell 88SE9230 cards option roms causes this > > behaviour. > > Maybe the scanning for option rom cause the slow bootprocess? > > > > In the seabios boot case no bios menue is shown, after > > around 1 min the boot continues. > > > > Is it possible to disable the options rom processing? Is there some > > documentation about this (How can I configure it for Ubuntu) ? > > Set the romfile option to the empty string (for vfio-pci, on the qemu > command line) should do that (qemu will not expose the rom to the guest > then). Typically rombar=0 on the QEMU command line is how to disable the option ROM for an assigned device, or in libvirt. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH v5] fw/pci: Add support for mapping Intel IGD via QEMU
On Wed, 01 Jun 2016 11:26:54 +0200 Gerd Hoffmann <kra...@redhat.com> wrote: > On Di, 2016-05-17 at 14:44 -0600, Alex Williamson wrote: > > QEMU provides two fw_cfg files to support IGD. The first holds the > > OpRegion data which holds the Video BIOS Table (VBT). This needs to > > be copied into reserved memory and the address stored in the ASL > > Storage register of the device at 0xFC offset in PCI config space. > > The OpRegion is generally 8KB. This file is named "etc/igd-opregion". > > > > The second file tells us the required size of the stolen memory space > > for the device. This space requires 1MB alignment and is generally > > either 1MB to 8MB depending on hardware config, but may be hundreds of > > MB for user specified stolen memory. The base address of the reserved > > memory allocated for this is written back to the Base Data of Stolen > > Memory register (BDSM) at PCI config offset 0x5C on the device. This > > file is named "etc/igd-bdsm-size". > > > > QEMU documents these fw_cfg entries in docs/igd-assign.txt. > > > > Signed-off-by: Alex Williamson <alex.william...@redhat.com> > > --- > > > > v6: fw_cfg BDSM entry now holds an 8-byte size integer as suggested > > by Gerd. Also renamed to etc/igd-bdsm-size. Filter based on bdf > > to only make use of this for the Intel VGA device at address > > 00:02.0, not that QEMU should attach this to anything else. > > > > As always, comments appreciated. I expect this will be on hold > > pending the QEMU support: > > > > http://thread.gmane.org/gmane.comp.emulators.kvm.devel/152123 > > qemu patches are merged meanwhile. > > Patch applied to master and cherry-picked into 1.9-stable. Thanks! ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH v5] fw/pci: Add support for mapping Intel IGD via QEMU
QEMU provides two fw_cfg files to support IGD. The first holds the OpRegion data which holds the Video BIOS Table (VBT). This needs to be copied into reserved memory and the address stored in the ASL Storage register of the device at 0xFC offset in PCI config space. The OpRegion is generally 8KB. This file is named "etc/igd-opregion". The second file tells us the required size of the stolen memory space for the device. This space requires 1MB alignment and is generally either 1MB to 8MB depending on hardware config, but may be hundreds of MB for user specified stolen memory. The base address of the reserved memory allocated for this is written back to the Base Data of Stolen Memory register (BDSM) at PCI config offset 0x5C on the device. This file is named "etc/igd-bdsm-size". QEMU documents these fw_cfg entries in docs/igd-assign.txt. Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- v6: fw_cfg BDSM entry now holds an 8-byte size integer as suggested by Gerd. Also renamed to etc/igd-bdsm-size. Filter based on bdf to only make use of this for the Intel VGA device at address 00:02.0, not that QEMU should attach this to anything else. As always, comments appreciated. I expect this will be on hold pending the QEMU support: http://thread.gmane.org/gmane.comp.emulators.kvm.devel/152123 If there's a better way to sync these projects, please let me know. Thanks, Alex src/fw/pciinit.c | 48 1 file changed, 48 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 35d9902..08221e6 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -287,6 +287,50 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) ich9_smbus_enable(dev->bdf); } +static void intel_igd_setup(struct pci_device *dev, void *arg) +{ +struct romfile_s *opregion = romfile_find("etc/igd-opregion"); +u64 bdsm_size = le64_to_cpu(romfile_loadint("etc/igd-bdsm-size", 0)); +void *addr; +u16 bdf = dev->bdf; + +/* Apply OpRegion to any Intel VGA device, more than one is undefined */ +if (opregion && opregion->size) { +addr = memalign_high(PAGE_SIZE, opregion->size); +if (!addr) { +warn_noalloc(); +return; +} + +if (opregion->copy(opregion, addr, opregion->size) < 0) { +free(addr); +return; +} + +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)addr)); + +dprintf(1, "Intel IGD OpRegion enabled at 0x%08x, size %dKB, dev " +"%02x:%02x.%x\n", (u32)addr, opregion->size >> 10, +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); +} + +/* Apply BDSM only to Intel VGA at 00:02.0 */ +if (bdsm_size && (bdf == pci_to_bdf(0, 2, 0))) { +addr = memalign_tmphigh(1024 * 1024, bdsm_size); +if (!addr) { +warn_noalloc(); +return; +} + +e820_add((u32)addr, bdsm_size, E820_RESERVED); + +pci_config_writel(bdf, 0x5C, cpu_to_le32((u32)addr)); + +dprintf(1, "Intel IGD BDSM enabled at 0x%08x, size %lldMB, dev " +"00:02.0\n", (u32)addr, bdsm_size >> 20); +} +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -320,6 +364,10 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0017, 0xff00, apple_macio_setup), PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0022, 0xff00, apple_macio_setup), +/* Intel IGD OpRegion setup */ +PCI_DEVICE_CLASS(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, + intel_igd_setup), + PCI_DEVICE_END, }; ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] RFC: Proposed vfio IGD assignment fw_cfg ABI
On Fri, 13 May 2016 10:21:00 +0200 Gerd Hoffmannwrote: > > #1: "etc/igd-opregion" > > > > the IGD OpRegion is an area of memory which contains among other > > things, the Video BIOS Table which is integral in allowing an assigned > > IGD to configure and make use of the physical display outputs of the > > system. "etc/igd-opregion" is an opaque fw_cfg file which the BIOS > > will use to allocate an appropriately sized reserved memory region, > > copy the contents of the fw_cfg file into the allocated memory region, > > and write the base address of the allocated memory region to the dword > > registers at 0xFC in PCI config space on the IGD device itself. The > > BIOS will look for this fw_cfg file any time a PCI class VGA device is > > found with Intel vendor ID. Multiple IGD devices per VM, such as might > > potentially be possible with Intel vGPU, is not within the scope of > > this proposal. The expected size of this fw_cfg file is on the order > > of a few pages, 8KB is typical. > > Looks good to me. > > > #2: "etc/igd-bdsm" > > > possibility of multiple BDSM per VM. The expected size of this fw_cfg > > file is from 1MB to multiple hundreds of MB with user specified stolen > > video memory. > > Having a big fw_cfg file without using the content looks a bit odd to > me. I'd suggest to create a fw_cfg file with the size in it instead. > Usual way to do this is to stick a 64bit value in little endian byte > order into fw_cfg, then use romfile_loadint() in seabios to read it. I think following this approach I'd rename the fw_cfg file to "etc/igd-bdsm-size", otherwise looks reasonable, I'll make the change. > > I'd appreciate any comments on these entries and I'd be happy to > > describe them further. Perhaps we should create a docs/api/ > > directory with these sorts of descriptions where we define how a > > given API is intended to work. > > There is docs/specs/ already where this would fit, and it even has a > fw_cfg.txt file. It might make sense to have a separate igd-assign.txt > though. Yeah, even beyond these fw_cfg options there's probably enough oddity with IGD assignment to have it's own docs file. I'll cook something up for the next posting. Thanks for the review and comments. Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
[SeaBIOS] RFC: Proposed vfio IGD assignment fw_cfg ABI
Hey folks, I'm planning to add a couple fw_cfg files for vfio IGD (Intel Graphics Device) assignment, but since this does represent a QEMU-BIOS ABI and since most of the vfio code is committed with only my own sign-off and review, I'd like to pull this out for discussion separate from the patches themselves. #1: "etc/igd-opregion" the IGD OpRegion is an area of memory which contains among other things, the Video BIOS Table which is integral in allowing an assigned IGD to configure and make use of the physical display outputs of the system. "etc/igd-opregion" is an opaque fw_cfg file which the BIOS will use to allocate an appropriately sized reserved memory region, copy the contents of the fw_cfg file into the allocated memory region, and write the base address of the allocated memory region to the dword registers at 0xFC in PCI config space on the IGD device itself. The BIOS will look for this fw_cfg file any time a PCI class VGA device is found with Intel vendor ID. Multiple IGD devices per VM, such as might potentially be possible with Intel vGPU, is not within the scope of this proposal. The expected size of this fw_cfg file is on the order of a few pages, 8KB is typical. #2: "etc/igd-bdsm" The BDSM is the register on IGD storing the base address of stolen memory (Base Data of Stolen Memory). This is simply an area of reserved RAM which the IGD uses for both GTT and stolen video memory. The semantics are much the same as for "etc/igd-opregion" with the exception that the fw_cfg file is only used to request a reserved memory allocation for this purpose and to indicate the size of the reserved memory. There is no content to this fw_cfg file and it should not be read or written. As above, the BIOS should look for this file upon discovering a PCI class VGA device with Intel vendor ID, allocate the necessary size and write the address to the IGD device. The BDSM register is a dword register located at 0x5C in PCI config space of the IGD device. This proposal does not intend to address the vague possibility of multiple BDSM per VM. The expected size of this fw_cfg file is from 1MB to multiple hundreds of MB with user specified stolen video memory. 8MB would be the typical maximum as QEMU currently does not allocate stolen video memory itself. I'd appreciate any comments on these entries and I'd be happy to describe them further. Perhaps we should create a docs/api/ directory with these sorts of descriptions where we define how a given API is intended to work. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v4] fw/pci: Add support for mapping Intel IGD via QEMU
On Wed, 11 May 2016 11:19:25 -0400 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Tue, May 10, 2016 at 09:19:52AM -0600, Alex Williamson wrote: > > On Tue, 23 Feb 2016 10:22:33 -0500 > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > On Tue, Feb 16, 2016 at 02:39:27PM -0700, Alex Williamson wrote: > > > > QEMU provides two fw_cfg files to support IGD. The first holds the > > > > OpRegion data which holds the Video BIOS Table (VBT). This needs to > > > > be copied into reserved memory and the address stored in the ASL > > > > Storage register of the device at 0xFC offset in PCI config space. > > > > The OpRegion is generally 8KB. This file is named "etc/igd-opregion". > > > > > > > > The second file tells us the required size of the stolen memory space > > > > for the device. This is a dummy file, it has no backing so we only > > > > allocate the space without copying anything into it. This space > > > > requires 1MB alignment and is generally either 1MB or 2MB, depending > > > > on the hardware config. If the user has opted in QEMU to expose > > > > additional stolen memory beyond the GTT (GGMS), the GMS may add an > > > > additional 32MB to 512MB. The base address of the reserved memory > > > > allocated for this is written back to the Base Data of Stolen Memory > > > > register (BDSM) at PCI config offset 0x5C on the device. This file is > > > > named "etc/igd-bdsm". > > > > > > Thanks. I'd be interested in seeing how the discussion with Michael > > > and Igor works out (wrt using a standardized interface for > > > communicating addresses between qemu and firmware) before adopting > > > this interface in SeaBIOS. > > > > Apologies if I missed it, but I haven't seen any work in this > > direction, please point me to it if I'm wrong. Now that QEMU 2.6 is > > wrapping up, I'd like to get IGD assignment support in for 2.7, but we > > can't do that without BIOS support. How shall we proceed? Thanks, > > If there's no consensus on standardizing the qemu/firmware interface > for addresses then I'm okay with the custom solution you last > proposed. So, I'd say submit your patches and if they are accepted in > QEMU then I will follow suit and apply them to SeaBIOS. Thanks Kevin, I'll post the proposed fw_cfg files as a separate RFC to QEMU to make sure they get eyes on them rather than just having them buried in a patch and we'll go from there. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v4] fw/pci: Add support for mapping Intel IGD via QEMU
On Tue, 23 Feb 2016 10:22:33 -0500 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Tue, Feb 16, 2016 at 02:39:27PM -0700, Alex Williamson wrote: > > QEMU provides two fw_cfg files to support IGD. The first holds the > > OpRegion data which holds the Video BIOS Table (VBT). This needs to > > be copied into reserved memory and the address stored in the ASL > > Storage register of the device at 0xFC offset in PCI config space. > > The OpRegion is generally 8KB. This file is named "etc/igd-opregion". > > > > The second file tells us the required size of the stolen memory space > > for the device. This is a dummy file, it has no backing so we only > > allocate the space without copying anything into it. This space > > requires 1MB alignment and is generally either 1MB or 2MB, depending > > on the hardware config. If the user has opted in QEMU to expose > > additional stolen memory beyond the GTT (GGMS), the GMS may add an > > additional 32MB to 512MB. The base address of the reserved memory > > allocated for this is written back to the Base Data of Stolen Memory > > register (BDSM) at PCI config offset 0x5C on the device. This file is > > named "etc/igd-bdsm". > > Thanks. I'd be interested in seeing how the discussion with Michael > and Igor works out (wrt using a standardized interface for > communicating addresses between qemu and firmware) before adopting > this interface in SeaBIOS. Apologies if I missed it, but I haven't seen any work in this direction, please point me to it if I'm wrong. Now that QEMU 2.6 is wrapping up, I'd like to get IGD assignment support in for 2.7, but we can't do that without BIOS support. How shall we proceed? Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v4] fw/pci: Add support for mapping Intel IGD via QEMU
On Wed, 17 Feb 2016 11:09:49 + "Tian, Kevin" <kevin.t...@intel.com> wrote: > > From: Alex Williamson > > Sent: Wednesday, February 17, 2016 5:39 AM > > > > QEMU provides two fw_cfg files to support IGD. The first holds the > > OpRegion data which holds the Video BIOS Table (VBT). This needs to > > be copied into reserved memory and the address stored in the ASL > > Storage register of the device at 0xFC offset in PCI config space. > > The OpRegion is generally 8KB. This file is named "etc/igd-opregion". > > > > The second file tells us the required size of the stolen memory space > > for the device. This is a dummy file, it has no backing so we only > > allocate the space without copying anything into it. This space > > requires 1MB alignment and is generally either 1MB or 2MB, depending > > on the hardware config. If the user has opted in QEMU to expose > > additional stolen memory beyond the GTT (GGMS), the GMS may add an > > additional 32MB to 512MB. The base address of the reserved memory > > allocated for this is written back to the Base Data of Stolen Memory > > register (BDSM) at PCI config offset 0x5C on the device. This file is > > named "etc/igd-bdsm". > > What would happen if guest tries to access this range while there is > no actual memory behind? Isn't it more clear to hide stolen memory > at all instead of reporting a dummy range? It's a fw_cfg file, it's not exposed to the guest, the purpose is to convey the size of stolen memory, which the BIOS then allocates as reserved memory and writes back to the BDSM register. It would be more clean to ignore stolen memory, but in cases where we need the vBIOS, such as laptops, where my LCD panel won't turn on without it, we don't have that luxury. The vBIOS programs the device to use stolen memory, at least 1MB, I assume for GTT purposes, and makes use of that for VESA modes. So, we need the vBIOS to support laptop panels, the vBIOS needs stolen memory for GTT space, therefore we need to provide some stolen memory. This support is enabled in QEMU by using a vBIOS, I assume vGPUs won't expose a ROM BAR and therefore won't enable this. Additionally for BDW/SKL+ (gen8+) GPUs, if the user specifies rombar=0 then all of this is disabled, including the hostbridge and ISA bridge manipulation in order to support UPT mode. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Mon, 15 Feb 2016 21:29:26 +0200 "Michael S. Tsirkin" <m...@redhat.com> wrote: > On Mon, Feb 15, 2016 at 12:20:23PM -0700, Alex Williamson wrote: > > On Mon, 15 Feb 2016 10:54:51 +0100 > > Igor Mammedov <imamm...@redhat.com> wrote: > > > > > On Sat, 13 Feb 2016 18:03:31 -0700 > > > Alex Williamson <alex.william...@redhat.com> wrote: > > > > > > > On Sat, 13 Feb 2016 19:20:32 -0500 > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > > > On Sat, Feb 13, 2016 at 01:57:09PM -0700, Alex Williamson wrote: > > > > > > On Sat, 13 Feb 2016 15:05:09 -0500 > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > On Sat, Feb 13, 2016 at 11:51:51AM -0700, Alex Williamson wrote: > > > > > > > > > > > > > > > On Sat, 13 Feb 2016 13:18:39 -0500 > > > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > > > On Sat, Feb 13, 2016 at 08:12:09AM -0700, Alex Williamson > > > > > > > > > wrote: > > > > > > > > > > On Fri, 12 Feb 2016 21:49:04 -0500 > > > > > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > > > > > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson > > > > > > > > > > > wrote: > > > > > > > > > > > > Intel IGD makes use of memory allocated and marked > > > > > > > > > > > > reserved by the > > > > > > > > > > > > BIOS as a stolen memory range. For the most part, > > > > > > > > > > > > guest drivers don't > > > > > > > > > > > > make use of this, but our achilles heel is the vBIOS. > > > > > > > > > > > > The vBIOS > > > > > > > > > > > > programs the device to use the host stolen memory range > > > > > > > > > > > > and it's used > > > > > > > > > > > > in the pre-boot environment. Generally the guest won't > > > > > > > > > > > > have access to > > > > > > > > > > > > the host stolen memory area, so these accesses either > > > > > > > > > > > > land in VM > > > > > > > > > > > > memory or unassigned space and generate IOMMU faults. > > > > > > > > > > > > By allocating > > > > > > > > > > > > this range in SeaBIOS and programming it into the > > > > > > > > > > > > device, QEMU (via > > > > > > > > > > > > vfio) can make sure this guest allocated stolen memory > > > > > > > > > > > > range is used > > > > > > > > > > > > instead. > > > > > > > > > > > > > > > > > > > > > > What does "vBIOS" mean in this context? Is it the video > > > > > > > > > > > device option > > > > > > > > > > > rom or something else? > > > > > > > > > > > > > > > > > > > > vBIOS = video BIOS, you're correct, it's just the video > > > > > > > > > > device option > > > > > > > > > > ROM. > > > > > > > > > > > > > > > > > > Is the problem from when the host runs the video option rom, > > > > > > > > > or is the > > > > > > > > > problem from the guest (via SeaBIOS) running the video option > > > > > > > > > rom? If > > > > > > > > > the guest is running the option rom, is it the first time the > > > > > > > > > option > > > > > > > > > rom has been run for the machine (ie, was the option rom not > > > > > > > > > executed > > > > > > > > > on the host when the host machine first booted)? > > > > > > > > > &g
Re: [SeaBIOS] [Qemu-devel] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Mon, 15 Feb 2016 10:54:51 +0100 Igor Mammedov <imamm...@redhat.com> wrote: > On Sat, 13 Feb 2016 18:03:31 -0700 > Alex Williamson <alex.william...@redhat.com> wrote: > > > On Sat, 13 Feb 2016 19:20:32 -0500 > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > On Sat, Feb 13, 2016 at 01:57:09PM -0700, Alex Williamson wrote: > > > > On Sat, 13 Feb 2016 15:05:09 -0500 > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > On Sat, Feb 13, 2016 at 11:51:51AM -0700, Alex Williamson wrote: > > > > > > On Sat, 13 Feb 2016 13:18:39 -0500 > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > On Sat, Feb 13, 2016 at 08:12:09AM -0700, Alex Williamson wrote: > > > > > > > > > > > > > > > On Fri, 12 Feb 2016 21:49:04 -0500 > > > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > > > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson > > > > > > > > > wrote: > > > > > > > > > > Intel IGD makes use of memory allocated and marked reserved > > > > > > > > > > by the > > > > > > > > > > BIOS as a stolen memory range. For the most part, guest > > > > > > > > > > drivers don't > > > > > > > > > > make use of this, but our achilles heel is the vBIOS. The > > > > > > > > > > vBIOS > > > > > > > > > > programs the device to use the host stolen memory range and > > > > > > > > > > it's used > > > > > > > > > > in the pre-boot environment. Generally the guest won't > > > > > > > > > > have access to > > > > > > > > > > the host stolen memory area, so these accesses either land > > > > > > > > > > in VM > > > > > > > > > > memory or unassigned space and generate IOMMU faults. By > > > > > > > > > > allocating > > > > > > > > > > this range in SeaBIOS and programming it into the device, > > > > > > > > > > QEMU (via > > > > > > > > > > vfio) can make sure this guest allocated stolen memory > > > > > > > > > > range is used > > > > > > > > > > instead. > > > > > > > > > > > > > > > > > > What does "vBIOS" mean in this context? Is it the video > > > > > > > > > device option > > > > > > > > > rom or something else? > > > > > > > > > > > > > > > > vBIOS = video BIOS, you're correct, it's just the video device > > > > > > > > option > > > > > > > > ROM. > > > > > > > > > > > > > > Is the problem from when the host runs the video option rom, or > > > > > > > is the > > > > > > > problem from the guest (via SeaBIOS) running the video option > > > > > > > rom? If > > > > > > > the guest is running the option rom, is it the first time the > > > > > > > option > > > > > > > rom has been run for the machine (ie, was the option rom not > > > > > > > executed > > > > > > > on the host when the host machine first booted)? > > > > > > > > > > > > > > FWIW, many of the chromebooks use coreboot with Intel graphics > > > > > > > and a > > > > > > > number of users have installed SeaBIOS (running natively) on their > > > > > > > machines. Running the intel video option rom more than once has > > > > > > > been > > > > > > > known to cause issues. > > > > > > > > > > > > The issue is in the VM and it occurs every time the option ROM is > > > > > > executed. Standard VGA text mode displays fine (ex. SeaBIOS version > > > > > > string and boot menu), but any sort of extended graphics mode (ex. > > > > > > live > > > > >
Re: [SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Mon, 15 Feb 2016 11:31:41 +0100 Gerd Hoffmannwrote: > Hi, > > > The issue is in the VM and it occurs every time the option ROM is > > executed. Standard VGA text mode displays fine (ex. SeaBIOS version > > string and boot menu), but any sort of extended graphics mode (ex. live > > CD boot menu) tries to make use of the host memory area which > > corresponds to the stolen memory area of the physical device. > > I'm wondering whenever we can just use seavgabios (and support standard > vga modes only). The reason we need the vBIOS is to turn on the LCD panel, which in turn has this dependency on the stolen memory area. I'd be pretty surprised if a generic VGA BIOS affected the panel in any way. Maybe seavgabios could be used for desktop IGD assignment, though I don't really know how to detect an attached panel. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
Hi Kevin, On Fri, 12 Feb 2016 21:49:04 -0500 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson wrote: > > Intel IGD makes use of memory allocated and marked reserved by the > > BIOS as a stolen memory range. For the most part, guest drivers don't > > make use of this, but our achilles heel is the vBIOS. The vBIOS > > programs the device to use the host stolen memory range and it's used > > in the pre-boot environment. Generally the guest won't have access to > > the host stolen memory area, so these accesses either land in VM > > memory or unassigned space and generate IOMMU faults. By allocating > > this range in SeaBIOS and programming it into the device, QEMU (via > > vfio) can make sure this guest allocated stolen memory range is used > > instead. > > What does "vBIOS" mean in this context? Is it the video device option > rom or something else? vBIOS = video BIOS, you're correct, it's just the video device option ROM. > > > > Signed-off-by: Alex Williamson <alex.william...@redhat.com> > > --- > > src/fw/pciinit.c | 13 - > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c > > index 92170d5..c1ad5d4 100644 > > --- a/src/fw/pciinit.c > > +++ b/src/fw/pciinit.c > > @@ -260,7 +260,7 @@ static void ich9_smbus_setup(struct pci_device *dev, > > void *arg) > > static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) > > { > > struct romfile_s *file = romfile_find("etc/igd-opregion"); > > -void *opregion; > > +void *opregion, *bdsm; > > u16 bdf = dev->bdf; > > > > if (!file || !file->size) > > @@ -281,6 +281,17 @@ static void intel_igd_opregion_setup(struct pci_device > > *dev, void *arg) > > > > dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", > > pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); > > + > > +bdsm = memalign_high(1024 * 1024, 1024 * 1024); > > +if (!bdsm) { > > +warn_noalloc(); > > +return; > > +} > > The "high" memory pool is not a good fit for such a large allocation. > For so much space, I'd use memalign_tmphigh() followed by > e820_add(addr, size, E820_RESERVED). Ok, as you saw on IRC I was looking for alternatives, I can easily switch to this. > > +pci_config_writel(bdf, 0x5C, cpu_to_le32((u32)bdsm)); > > + > > +dprintf(1, "Allocated 1MB reserved memory for Intel IGD stolen memory > > at " > > +"0x%08x\n", (u32)bdsm); > > } > > Does this make sense to do unconditionally in the firmware whenever > the device is present? The SeaBIOS release schedule is not in sync > with QEMU, so changes in the firmware tend to take longer to deploy if > something needs to be done differently in the future. > > What happens if this register is not set? Does anything go wrong if > register 0xFC is set, but 0x5C is not (eg, due to allocation failure). It's not entirely unconditional, it's tagged onto the end of the OpRegion setup, so we know that we're at least dealing with a QEMU that has that support. We could link it to the PCI ROM BAR for IGD being present since this is necessary to support the code in that ROM, but that doesn't give us any idea if the QEMU support is there. Maybe we could do both, OpRegion support and ROM BAR present. The write to 0x5C is used by QEMU code that traps writes to the device I/O port BAR and replaces the host stolen memory address with the new guest address generated here. 0x5C is initialized to 0x0 by kernel vfio code, so we can detect whether it has been written. If not written, QEMU has no place to redirect to for stolen memory and it will either overlap VM memory or an unassigned area. The former may corrupt VM memory, the latter throws host errors. We could in QEMU halt with a hardware error if 0x5C hasn't been programmed. One alternative I was considering was to simply use the unused BAR5 slot on IGD to expose a fake 1MB, 32bit MMIO BAR. SeaBIOS would program this normally, QEMU would back it with its own allocation and we could easily find the address of it when we need it. The downside being that the PCI BAR can be disabled and remapped, but for the small window where we need it, I'm not sure that matters. It would certainly be more transparent to the VM BIOS. It would also make it much easier to change the size if we found some devices need more/less/none space. I initially thought the fake BAR wasn't a great fix, but maybe its advantages outweigh its downsides. I'll give it a shot. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Sat, 13 Feb 2016 13:18:39 -0500 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Sat, Feb 13, 2016 at 08:12:09AM -0700, Alex Williamson wrote: > > On Fri, 12 Feb 2016 21:49:04 -0500 > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson wrote: > > > > Intel IGD makes use of memory allocated and marked reserved by the > > > > BIOS as a stolen memory range. For the most part, guest drivers don't > > > > make use of this, but our achilles heel is the vBIOS. The vBIOS > > > > programs the device to use the host stolen memory range and it's used > > > > in the pre-boot environment. Generally the guest won't have access to > > > > the host stolen memory area, so these accesses either land in VM > > > > memory or unassigned space and generate IOMMU faults. By allocating > > > > this range in SeaBIOS and programming it into the device, QEMU (via > > > > vfio) can make sure this guest allocated stolen memory range is used > > > > instead. > > > > > > What does "vBIOS" mean in this context? Is it the video device option > > > rom or something else? > > > > vBIOS = video BIOS, you're correct, it's just the video device option > > ROM. > > Is the problem from when the host runs the video option rom, or is the > problem from the guest (via SeaBIOS) running the video option rom? If > the guest is running the option rom, is it the first time the option > rom has been run for the machine (ie, was the option rom not executed > on the host when the host machine first booted)? > > FWIW, many of the chromebooks use coreboot with Intel graphics and a > number of users have installed SeaBIOS (running natively) on their > machines. Running the intel video option rom more than once has been > known to cause issues. The issue is in the VM and it occurs every time the option ROM is executed. Standard VGA text mode displays fine (ex. SeaBIOS version string and boot menu), but any sort of extended graphics mode (ex. live CD boot menu) tries to make use of the host memory area which corresponds to the stolen memory area of the physical device. We're not really sure how the ROM execution arrives at these addresses (not from the device according to access traces), but we can see when the ROM is writing these addresses to the device and modify they addresses to point at a VM memory range which we've allocated. That's what this code attempts to do, allocate a buffer and tell QEMU about it via the BDSM (Base Data of Stolen Memory) register. > [...] > > The write to 0x5C is used by QEMU code that traps writes to the > > device I/O port BAR and replaces the host stolen memory address > > with the new guest address generated here. 0x5C is initialized to > > 0x0 by kernel vfio code, so we can detect whether it has been > > written. If not written, QEMU has no place to redirect to for > > stolen memory and it will either overlap VM memory or an unassigned > > area. The former may corrupt VM memory, the latter throws host > > errors. We could in QEMU halt with a hardware error if 0x5C hasn't > > been programmed. > > So, if I understand correctly, 0x5C is not a "real" register on the > hardware, but is instead just a mechanism to give QEMU the address of > some guest visible ram? It is a real register, BDSM that is virtualized by vfio turning it effectively into a scratch register. On physical hardware the register is read-only. > BTW, is 0xFC a "real" register in the hardware? How does the guest > find the location of the "OpRegion" if SeaBIOS allocates it? 0xFC is the ASL Storage register, the guest finds the location of the OpRegion using this register. This is another register that is read-only on real hardware but virtualized through vfio so we can relocate the OpRegion into the VM address space. I've found that allocating a dummy MMIO BAR does work as an alternative for mapping space for this stolen memory into the VM address space. For a Linux guest it works to allocate BAR5 on the IGD device. Windows10 is not so happy with this, but does work if I allocate the BAR on something like the ISA bridge device. My guess is that the IGD driver in Windows freaks out at finding this strange new BAR on its device. So I'll need to come up with an algorithm for either creating a dummy PCI device to host this BAR or trying to add it to other existing devices. It's certainly a more self-contained solution this way, so I expect we'll only need patch 1/3 from this series. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Sat, 13 Feb 2016 15:05:09 -0500 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Sat, Feb 13, 2016 at 11:51:51AM -0700, Alex Williamson wrote: > > On Sat, 13 Feb 2016 13:18:39 -0500 > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > On Sat, Feb 13, 2016 at 08:12:09AM -0700, Alex Williamson wrote: > > > > On Fri, 12 Feb 2016 21:49:04 -0500 > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson wrote: > > > > > > Intel IGD makes use of memory allocated and marked reserved by the > > > > > > BIOS as a stolen memory range. For the most part, guest drivers > > > > > > don't > > > > > > make use of this, but our achilles heel is the vBIOS. The vBIOS > > > > > > programs the device to use the host stolen memory range and it's > > > > > > used > > > > > > in the pre-boot environment. Generally the guest won't have access > > > > > > to > > > > > > the host stolen memory area, so these accesses either land in VM > > > > > > memory or unassigned space and generate IOMMU faults. By allocating > > > > > > this range in SeaBIOS and programming it into the device, QEMU (via > > > > > > vfio) can make sure this guest allocated stolen memory range is used > > > > > > instead. > > > > > > > > > > What does "vBIOS" mean in this context? Is it the video device option > > > > > rom or something else? > > > > > > > > vBIOS = video BIOS, you're correct, it's just the video device option > > > > ROM. > > > > > > Is the problem from when the host runs the video option rom, or is the > > > problem from the guest (via SeaBIOS) running the video option rom? If > > > the guest is running the option rom, is it the first time the option > > > rom has been run for the machine (ie, was the option rom not executed > > > on the host when the host machine first booted)? > > > > > > FWIW, many of the chromebooks use coreboot with Intel graphics and a > > > number of users have installed SeaBIOS (running natively) on their > > > machines. Running the intel video option rom more than once has been > > > known to cause issues. > > > > The issue is in the VM and it occurs every time the option ROM is > > executed. Standard VGA text mode displays fine (ex. SeaBIOS version > > string and boot menu), but any sort of extended graphics mode (ex. live > > CD boot menu) tries to make use of the host memory area which > > corresponds to the stolen memory area of the physical device. We're > > not really sure how the ROM execution arrives at these addresses (not > > from the device according to access traces), but we can see when the > > ROM is writing these addresses to the device and modify they addresses > > to point at a VM memory range which we've allocated. That's what this > > code attempts to do, allocate a buffer and tell QEMU about it via the > > BDSM (Base Data of Stolen Memory) register. > > Forgive me if I'm not fully understanding this. If I read what you're > saying then the sequence is something like: > > 1 - the host system bios (or vgabios) programs the GTT/stolen memory > base register at host system bootup time and reserves it in the > host e820 map. > > 2 - upon running qemu, the guest reruns the vga bios option rom which > seems to work (ie, text mode works) > > 3 - in the guest, upon running a bootloader that uses graphics mode, > the bootloader calls the vgabios to switch to graphics mode, and > the vgabios sends commands to the graphics hardware that somehow > reference the host stolen memory What exactly happens here isn't clear to me, but this is a plausible explanation. What we see in tracing access to the hardware is that a bunch of addresses are written to the device that fall within the host e820 reserved area and then the device starts generating IOMMU faults trying to access those addresses. > 4 - your patch causes QEMU to catch these commands with references to > the host stolen memory and replace them with references to the > guest stolen memory (which seabios creates) > > Am I understanding the above correctly? Yes. > Is the only reason to run the intel option rom in the guest for > bootloader graphic mode support? Last time I looked, the intel vga > hardware could fully emulate a l
Re: [SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
On Sat, 13 Feb 2016 19:20:32 -0500 "Kevin O'Connor" <ke...@koconnor.net> wrote: > On Sat, Feb 13, 2016 at 01:57:09PM -0700, Alex Williamson wrote: > > On Sat, 13 Feb 2016 15:05:09 -0500 > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > On Sat, Feb 13, 2016 at 11:51:51AM -0700, Alex Williamson wrote: > > > > On Sat, 13 Feb 2016 13:18:39 -0500 > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > On Sat, Feb 13, 2016 at 08:12:09AM -0700, Alex Williamson wrote: > > > > > > On Fri, 12 Feb 2016 21:49:04 -0500 > > > > > > "Kevin O'Connor" <ke...@koconnor.net> wrote: > > > > > > > On Fri, Feb 12, 2016 at 05:23:18PM -0700, Alex Williamson wrote: > > > > > > > > > > > > > > > Intel IGD makes use of memory allocated and marked reserved by > > > > > > > > the > > > > > > > > BIOS as a stolen memory range. For the most part, guest > > > > > > > > drivers don't > > > > > > > > make use of this, but our achilles heel is the vBIOS. The vBIOS > > > > > > > > programs the device to use the host stolen memory range and > > > > > > > > it's used > > > > > > > > in the pre-boot environment. Generally the guest won't have > > > > > > > > access to > > > > > > > > the host stolen memory area, so these accesses either land in VM > > > > > > > > memory or unassigned space and generate IOMMU faults. By > > > > > > > > allocating > > > > > > > > this range in SeaBIOS and programming it into the device, QEMU > > > > > > > > (via > > > > > > > > vfio) can make sure this guest allocated stolen memory range is > > > > > > > > used > > > > > > > > instead. > > > > > > > > > > > > > > What does "vBIOS" mean in this context? Is it the video device > > > > > > > option > > > > > > > rom or something else? > > > > > > > > > > > > vBIOS = video BIOS, you're correct, it's just the video device > > > > > > option > > > > > > ROM. > > > > > > > > > > Is the problem from when the host runs the video option rom, or is the > > > > > problem from the guest (via SeaBIOS) running the video option rom? If > > > > > the guest is running the option rom, is it the first time the option > > > > > rom has been run for the machine (ie, was the option rom not executed > > > > > on the host when the host machine first booted)? > > > > > > > > > > FWIW, many of the chromebooks use coreboot with Intel graphics and a > > > > > number of users have installed SeaBIOS (running natively) on their > > > > > machines. Running the intel video option rom more than once has been > > > > > known to cause issues. > > > > > > > > The issue is in the VM and it occurs every time the option ROM is > > > > executed. Standard VGA text mode displays fine (ex. SeaBIOS version > > > > string and boot menu), but any sort of extended graphics mode (ex. live > > > > CD boot menu) tries to make use of the host memory area which > > > > corresponds to the stolen memory area of the physical device. We're > > > > not really sure how the ROM execution arrives at these addresses (not > > > > from the device according to access traces), but we can see when the > > > > ROM is writing these addresses to the device and modify they addresses > > > > to point at a VM memory range which we've allocated. That's what this > > > > code attempts to do, allocate a buffer and tell QEMU about it via the > > > > BDSM (Base Data of Stolen Memory) register. > > > > > > Forgive me if I'm not fully understanding this. If I read what you're > > > saying then the sequence is something like: > > > > > > 1 - the host system bios (or vgabios) programs the GTT/stolen memory > > > base register at host system bootup time and reserves it in the > > > host e820 map. > > > > > > 2 - upon running qemu, the guest reruns the vga bios option rom which > >
[SeaBIOS] [RFC PATCH v3 3/3] fw/pci: Allocate IGD stolen memory
Intel IGD makes use of memory allocated and marked reserved by the BIOS as a stolen memory range. For the most part, guest drivers don't make use of this, but our achilles heel is the vBIOS. The vBIOS programs the device to use the host stolen memory range and it's used in the pre-boot environment. Generally the guest won't have access to the host stolen memory area, so these accesses either land in VM memory or unassigned space and generate IOMMU faults. By allocating this range in SeaBIOS and programming it into the device, QEMU (via vfio) can make sure this guest allocated stolen memory range is used instead. Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- src/fw/pciinit.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 92170d5..c1ad5d4 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -260,7 +260,7 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) { struct romfile_s *file = romfile_find("etc/igd-opregion"); -void *opregion; +void *opregion, *bdsm; u16 bdf = dev->bdf; if (!file || !file->size) @@ -281,6 +281,17 @@ static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); + +bdsm = memalign_high(1024 * 1024, 1024 * 1024); +if (!bdsm) { +warn_noalloc(); +return; +} + +pci_config_writel(bdf, 0x5C, cpu_to_le32((u32)bdsm)); + +dprintf(1, "Allocated 1MB reserved memory for Intel IGD stolen memory at " +"0x%08x\n", (u32)bdsm); } static const struct pci_device_id pci_device_tbl[] = { ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH v3 1/3] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
When assigning Intel IGD graphics via QEMU/vfio, the OpRegion for the device may be exposed as a fw_cfg file. Allocate space for this, copy the contents and write the ASL Storage register (0xFC) to point to this buffer. NB, it's possible for QEMU to use the write to the ASL Storage register to map access to the host OpRegion overlapping the allocated buffer, but we shouldn't care if it does. Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- src/fw/pciinit.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c31c2fa..92170d5 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -257,6 +257,32 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) +{ +struct romfile_s *file = romfile_find("etc/igd-opregion"); +void *opregion; +u16 bdf = dev->bdf; + +if (!file || !file->size) +return; + +opregion = memalign_high(PAGE_SIZE, file->size); +if (!opregion) { +warn_noalloc(); +return; +} + +if (file->copy(file, opregion, file->size) < 0) { +free(opregion); +return; +} + +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion)); + +dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -290,6 +316,10 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0017, 0xff00, apple_macio_setup), PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0022, 0xff00, apple_macio_setup), +/* Intel IGD OpRegion setup */ +PCI_DEVICE_CLASS(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, + intel_igd_opregion_setup), + PCI_DEVICE_END, }; ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH v3 0/3] IGD assignment support for QEMU/vfio
Turns out that maybe we weren't done with IGD by simply reading a new fw_cfg, copying it into a new reserved memory buffer and writing back to the IGD device. We need another reserved memory buffer for stolen memory. The hardware minimum size is 1MB, naturally aligned, which means we need to be able to actually be able to allocate that. We write the address of the buffer back to the BDSM register on IGD so that QEMU can make use of it when doing fixups on vBIOS config of the device. Previous comments on updating the size of the high memory area suggest that any remaining free memory is returned, so it appears there's no affect of increasing this limit for systems without IGD. Thanks, Alex --- Alex Williamson (3): fw/pci: Add support for mapping Intel IGD OpRegion via QEMU Further increase maximum size of permanent high memory area. fw/pci: Allocate IGD stolen memory src/config.h |2 +- src/fw/pciinit.c | 41 + 2 files changed, 42 insertions(+), 1 deletion(-) ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH v3 2/3] Further increase maximum size of permanent high memory area.
Intel graphics (IGD) makes use of a stolen memory area reserved by the BIOS. The minimum size is 1MB, naturally aligned. Increase the space we have available to make that possible. Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- src/config.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/config.h b/src/config.h index 6c47f16..755fe90 100644 --- a/src/config.h +++ b/src/config.h @@ -17,7 +17,7 @@ // Maximum number of map entries in the e820 map #define BUILD_MAX_E820 32 // Space to reserve in high-memory for tables -#define BUILD_MAX_HIGHTABLE (256*1024) +#define BUILD_MAX_HIGHTABLE (2*1024*1024) // Largest supported externaly facing drive id #define BUILD_MAX_EXTDRIVE 16 // Number of bytes the smbios may be and still live in the f-segment ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v2] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
On Thu, 4 Feb 2016 17:58:23 +0100 Igor Mammedov <imamm...@redhat.com> wrote: > On Tue, 02 Feb 2016 13:10:37 -0700 > Alex Williamson <alex.william...@redhat.com> wrote: > > > When assigning Intel IGD graphics via QEMU/vfio, the OpRegion for > > the device may be exposed as a fw_cfg file. Allocate space for > > this, copy the contents and write the ASL Storage register (0xFC) > > to point to this buffer. NB, it's possible for QEMU to use the > > write to the ASL Storage register to map access to the host > > OpRegion overlapping the allocated buffer, but we shouldn't care if > > it does. > > > > References: > > kernel vfio opregion support: > > https://lkml.org/lkml/2016/2/1/884 > > QEMU vfio opregion support (revised v2 of 7/7 adds fw_cfg): > > https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg00202.html > > Gerd's IGD assignment series: > > https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00244.html > > > > Signed-off-by: Alex Williamson <alex.william...@redhat.com> > > --- > > src/fw/pciinit.c | 30 ++ > > 1 file changed, 30 insertions(+) > > > > diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c > > index c31c2fa..92170d5 100644 > > --- a/src/fw/pciinit.c > > +++ b/src/fw/pciinit.c > > @@ -257,6 +257,32 @@ static void ich9_smbus_setup(struct pci_device > > *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, > > ICH9_SMB_HOSTC_HST_EN); } > > > > +static void intel_igd_opregion_setup(struct pci_device *dev, void > > *arg) +{ > > +struct romfile_s *file = romfile_find("etc/igd-opregion"); > > +void *opregion; > > +u16 bdf = dev->bdf; > > + > > +if (!file || !file->size) > > +return; > > + > > +opregion = memalign_high(PAGE_SIZE, file->size); > > +if (!opregion) { > > +warn_noalloc(); > > +return; > > +} > > + > > +if (file->copy(file, opregion, file->size) < 0) { > Is opregion content on host immutable? > if not then copying it probably wrong and it should be passed-through. The content is not immutable, but for the first round of things that we're interested in, it probably is. It's not clear that we'll ever move beyond that first level though. Part of the benefit of this approach is that SeaBIOS allocates the correct size, copies a static version of the OpRegion data into place, then effectively tells QEMU that it has done this by writing to the ASL Storage register. At that point QEMU can simply virtualize the register for the guest or it can map a live version of the OpRegion over top of the SeaBIOS copy. So we certainly have the option to go beyond an immutable copy with no further change to SeaBIOS. Thanks, Alex > > +free(opregion); > > +return; > > +} > > + > > +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion)); > > + > > +dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", > > +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), > > pci_bdf_to_fn(bdf)); +} > > + > > static const struct pci_device_id pci_device_tbl[] = { > > /* PIIX3/PIIX4 PCI to ISA bridge */ > > PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, > > @@ -290,6 +316,10 @@ static const struct pci_device_id > > pci_device_tbl[] = { PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0017, > > 0xff00, apple_macio_setup), PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, > > 0x0022, 0xff00, apple_macio_setup), > > +/* Intel IGD OpRegion setup */ > > +PCI_DEVICE_CLASS(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, > > PCI_CLASS_DISPLAY_VGA, > > + intel_igd_opregion_setup), > > + > > PCI_DEVICE_END, > > }; > > > > > > > > ___ > > SeaBIOS mailing list > > SeaBIOS@seabios.org > > http://www.seabios.org/mailman/listinfo/seabios > ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v2] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
On Wed, 2016-02-03 at 12:43 -0700, Alex Williamson wrote: > On Wed, 2016-02-03 at 10:04 +0100, Gerd Hoffmann wrote: > >  Hi, > >  > > > +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) > > > +{ > > > +struct romfile_s *file = romfile_find("etc/igd-opregion"); > >  > > Is it possible to have multiple igd devices in a single machine? > > So, should we include the pci address in the file name? > >  > > Guess not needed, it's chipset graphics after all ... > > Hmm, I think that's probably a pretty good observation, we don't want to > revisit this if vGPUs need/want an OpRegion or if Intel decides to start > allowing more than one per system.  Either could pretty easily introduce > multiple into a VM. Naming is always more complicated than it seems.  For anything other than a root bus devices, the PCI address doesn't exist until SeaBIOS enumerates devices and assigns bus numbers for bridges.  So unless we want to provide a path to the device like ACPI defines, maybe we should just stick with "etc/igd-opregion".  It seems easily extensible to add more specific files later and default to this one if those aren't found. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH v2] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
On Wed, 2016-02-03 at 17:09 -0500, Kevin O'Connor wrote: > On Wed, Feb 03, 2016 at 02:38:47PM -0700, Alex Williamson wrote: > > On Wed, 2016-02-03 at 12:43 -0700, Alex Williamson wrote: > > > On Wed, 2016-02-03 at 10:04 +0100, Gerd Hoffmann wrote: > > > >  Hi, > > > >  > > > > > +static void intel_igd_opregion_setup(struct pci_device *dev, void > > > > > *arg) > > > > > +{ > > > > > +struct romfile_s *file = romfile_find("etc/igd-opregion"); > > > >  > > > > Is it possible to have multiple igd devices in a single machine? > > > > So, should we include the pci address in the file name? > > > >  > > > > Guess not needed, it's chipset graphics after all ... > > >  > > > Hmm, I think that's probably a pretty good observation, we don't want to > > > revisit this if vGPUs need/want an OpRegion or if Intel decides to start > > > allowing more than one per system.  Either could pretty easily introduce > > > multiple into a VM. > > > > Naming is always more complicated than it seems.  For anything other > > than a root bus devices, the PCI address doesn't exist until SeaBIOS > > enumerates devices and assigns bus numbers for bridges.  So unless we > > want to provide a path to the device like ACPI defines, maybe we should > > just stick with "etc/igd-opregion".  It seems easily extensible to add > > more specific files later and default to this one if those aren't found. > > Perhaps a simpler solution is to just make sure "etc/igd-opregion" is > only deployed for the "active" VGA device (ie, the device that > is_pci_vga() returns true for).  Looks like pci_enable_default_vga() > already has code for something similar. Comments we've seen from Intel suggest they prefer assigning IGD as a secondary VGA device though, so the "active" VGA device might not be IGD at all.  I'd also be surprised if QEMU ever sets an active VGA device itself versus relying on SeaBIOS to do it via pci_enable_default_vga(). Then it seems like we get into the realm of QEMU needing to know in advance how SeaBIOS' search algorithms work to know which is the "next" device.  One thing in our favor, maybe, is that there can't be multiple "etc/igd-opregion" fw_cfg files, QEMU will assert if that happens.  So if we get to SeaBIOS and there are multiple Intel VGA devices, there can only be at most one OpRegion for them and we can either give them all the same or just use it on the first one we encounter.  We're probably putting way too much thought into this scenario that Intel likely has no expectation of supporting though ;)  Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH v2] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
When assigning Intel IGD graphics via QEMU/vfio, the OpRegion for the device may be exposed as a fw_cfg file. Allocate space for this, copy the contents and write the ASL Storage register (0xFC) to point to this buffer. NB, it's possible for QEMU to use the write to the ASL Storage register to map access to the host OpRegion overlapping the allocated buffer, but we shouldn't care if it does. References: kernel vfio opregion support: https://lkml.org/lkml/2016/2/1/884 QEMU vfio opregion support (revised v2 of 7/7 adds fw_cfg): https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg00202.html Gerd's IGD assignment series: https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00244.html Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- src/fw/pciinit.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c31c2fa..92170d5 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -257,6 +257,32 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) +{ +struct romfile_s *file = romfile_find("etc/igd-opregion"); +void *opregion; +u16 bdf = dev->bdf; + +if (!file || !file->size) +return; + +opregion = memalign_high(PAGE_SIZE, file->size); +if (!opregion) { +warn_noalloc(); +return; +} + +if (file->copy(file, opregion, file->size) < 0) { +free(opregion); +return; +} + +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion)); + +dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -290,6 +316,10 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0017, 0xff00, apple_macio_setup), PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0022, 0xff00, apple_macio_setup), +/* Intel IGD OpRegion setup */ +PCI_DEVICE_CLASS(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, + intel_igd_opregion_setup), + PCI_DEVICE_END, }; ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RFC PATCH] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
On Tue, 2016-02-02 at 12:43 +0100, Laszlo Ersek wrote: > On 02/02/16 05:15, Alex Williamson wrote: > > The proposed IGD OpRegion support in QEMU via vfio maps the host > > OpRegion into VM system memory at the address written to the ASL > > Storage register (0xFC).  The OpRegion contains a 16-byte signature > > followed by a 4-byte size field.  Therefore SeaBIOS can allocate a > > buffer of the typical size (8KB), this results in a matching e820 > > reserved entry for the range, then write the buffer address to the ASL > > Storage register, blinking the OpRegion into the VM address space.  At > > that point SeaBIOS can validate the signature and size and remap if > > necessary.  If the OpRegion is larger than 8KB, this would result in > > any conflicting ranges being temporarily mapped with the OpRegion, > > which probably needs further discussion on what that could break. > > Other options might be to use the same algorithm with an obscenely > > sized initial buffer to make sure we do not overlap, always > > re-allocating the proper sized buffer, or perhaps we could pass the > > OpRegion itself or information about the OpRegion through fw_cfg. > > > > With the posted kernel series[1] and QEMU series[2] (on top of Gerd's > > IGD patches[3]), this makes the OpRegion available to the VM and > > tracing shows that the guest driver does access it. > > > > [1] https://lkml.org/lkml/2016/2/1/884 > > [2] https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg00202.html > > [3] https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00244.html > > > > Signed-off-by: Alex Williamson <alex.william...@redhat.com> > > --- > >  src/fw/pciinit.c |   50 ++ > >  1 file changed, 50 insertions(+) > > > > diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c > > index c31c2fa..4f3251e 100644 > > --- a/src/fw/pciinit.c > > +++ b/src/fw/pciinit.c > > @@ -257,6 +257,52 @@ static void ich9_smbus_setup(struct pci_device *dev, > > void *arg) > >  pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); > >  } > >  > > +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) > > +{ > > +u16 bdf = dev->bdf; > > +u32 orig; > > +void *opregion; > > +int size = 8; > > + > > +if (!CONFIG_QEMU) > > +return; > > + > > +orig = pci_config_readl(bdf, 0xFC); > > + > > +realloc: > > +opregion = malloc_high(size * 1024); > > +if (!opregion) { > > +warn_noalloc(); > > +return; > > +} > > + > > +/* > > + * QEMU maps the OpRegion into system memory at the address written > > here, > > + * this overlaps our malloc, which marks the range e820 reserved. > > + */ > > +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion)); > > + > > +if (memcmp(opregion, "IntelGraphicsMem", 16)) { > > +pci_config_writel(bdf, 0xFC, orig); > > +free(opregion); > > +return; /* the opregion didn't magically appear, not supported */ > > +} > > + > > +if (size == le32_to_cpu(*(u32 *)(opregion + 16))) { > > +dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", > > +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), > > pci_bdf_to_fn(bdf)); > > +return; /* success! */ > > +} > > + > > +pci_config_writel(bdf, 0xFC, orig); > > +free(opregion); > > + > > +if (size == 8) { /* try once more with a new size */ > > +size = le32_to_cpu(*(u32 *)(opregion + 16)); > > +goto realloc; > > Is this idiomatic SeaBIOS coding style? > > How about "for (;;)" -- the code has many instances -- and reworking > this last branch accordingly? > > (Apologies, not a very important comment.) I did check that I wasn't the only one using a goto in SeaBIOS and I don't see any sort of CodingStyle doc precluding it, but I apologize if there have been previous discussions that I've missed.  I don't have the same aversion to gotos as many people do, but of course the loop can be reworked in any number of ways if there's a preference to avoid them. Thanks, Alex > > +} > > +} > > + > >  static const struct pci_device_id pci_device_tbl[] = { > >  /* PIIX3/PIIX4 PCI to ISA bridge */ > >  PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, > > @@ -290,6 +336,10 @@ static const struct p
[SeaBIOS] [RFC PATCH] fw/pci: Add support for mapping Intel IGD OpRegion via QEMU
The proposed IGD OpRegion support in QEMU via vfio maps the host OpRegion into VM system memory at the address written to the ASL Storage register (0xFC). The OpRegion contains a 16-byte signature followed by a 4-byte size field. Therefore SeaBIOS can allocate a buffer of the typical size (8KB), this results in a matching e820 reserved entry for the range, then write the buffer address to the ASL Storage register, blinking the OpRegion into the VM address space. At that point SeaBIOS can validate the signature and size and remap if necessary. If the OpRegion is larger than 8KB, this would result in any conflicting ranges being temporarily mapped with the OpRegion, which probably needs further discussion on what that could break. Other options might be to use the same algorithm with an obscenely sized initial buffer to make sure we do not overlap, always re-allocating the proper sized buffer, or perhaps we could pass the OpRegion itself or information about the OpRegion through fw_cfg. With the posted kernel series[1] and QEMU series[2] (on top of Gerd's IGD patches[3]), this makes the OpRegion available to the VM and tracing shows that the guest driver does access it. [1] https://lkml.org/lkml/2016/2/1/884 [2] https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg00202.html [3] https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00244.html Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- src/fw/pciinit.c | 50 ++ 1 file changed, 50 insertions(+) diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index c31c2fa..4f3251e 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -257,6 +257,52 @@ static void ich9_smbus_setup(struct pci_device *dev, void *arg) pci_config_writeb(bdf, ICH9_SMB_HOSTC, ICH9_SMB_HOSTC_HST_EN); } +static void intel_igd_opregion_setup(struct pci_device *dev, void *arg) +{ +u16 bdf = dev->bdf; +u32 orig; +void *opregion; +int size = 8; + +if (!CONFIG_QEMU) +return; + +orig = pci_config_readl(bdf, 0xFC); + +realloc: +opregion = malloc_high(size * 1024); +if (!opregion) { +warn_noalloc(); +return; +} + +/* + * QEMU maps the OpRegion into system memory at the address written here, + * this overlaps our malloc, which marks the range e820 reserved. + */ +pci_config_writel(bdf, 0xFC, cpu_to_le32((u32)opregion)); + +if (memcmp(opregion, "IntelGraphicsMem", 16)) { +pci_config_writel(bdf, 0xFC, orig); +free(opregion); +return; /* the opregion didn't magically appear, not supported */ +} + +if (size == le32_to_cpu(*(u32 *)(opregion + 16))) { +dprintf(1, "Intel IGD OpRegion enabled on %02x:%02x.%x\n", +pci_bdf_to_bus(bdf), pci_bdf_to_dev(bdf), pci_bdf_to_fn(bdf)); +return; /* success! */ +} + +pci_config_writel(bdf, 0xFC, orig); +free(opregion); + +if (size == 8) { /* try once more with a new size */ +size = le32_to_cpu(*(u32 *)(opregion + 16)); +goto realloc; +} +} + static const struct pci_device_id pci_device_tbl[] = { /* PIIX3/PIIX4 PCI to ISA bridge */ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82371SB_0, @@ -290,6 +336,10 @@ static const struct pci_device_id pci_device_tbl[] = { PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0017, 0xff00, apple_macio_setup), PCI_DEVICE_CLASS(PCI_VENDOR_ID_APPLE, 0x0022, 0xff00, apple_macio_setup), +/* Intel IGD OpRegion setup */ +PCI_DEVICE_CLASS(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, + intel_igd_opregion_setup), + PCI_DEVICE_END, }; ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [PATCH] seabios: update to 1.7.3
Don't we really want this too? commit 2a9aeabdfb34374ecac25e7a8d21c9e368618cd4 Author: Kevin O'Connor ke...@koconnor.net Date: Sun Jul 14 13:55:52 2013 -0400 Fix USB EHCI detection that was broken in hlist conversion of PCIDevices. Make sure the PCI device list is ordered in bus order. Don't iterate past the end of the list when detecting EHCI devices. Signed-off-by: Kevin O'Connor ke...@koconnor.net seabios 1.7.3 is actually a regression for me when trying to assign a secondary passthrough graphics card because the PCI devices are in the wrong order and seabios tries to run the passthrough VBIOS rather than the emulated VBIOS. Based on the comment, something is broken with EHCI too. Thanks, Alex On Wed, 2013-07-24 at 15:46 +0200, Gerd Hoffmann wrote: Changes summary (git shortlog rel-1.7.2.2..rel-1.7.3): Alex Williamson (4): seabios q35: Enable all PIRQn IRQs at startup seabios q35: Add new PCI slot to irq routing function seabios: Add a dummy PCI slot to irq mapping function pciinit: Enable default VGA device Asias He (2): virtio-scsi: Set _DRIVER_OK flag before scsi target scanning virtio-scsi: Pack struct virtio_scsi_{req_cmd,resp_cmd} Avik Sil (1): USB-EHCI: Fix null pointer assignment Christian Gmeiner (5): geodevga: fix errors in geode_fp_* functions geodevga: move framebuffer setup geodevga: move output setup to own function geodevga: add debug to msr functions geodevga: fix wrong define name David Woodhouse (26): Add macros for pushing and popping struct bregs Clean up #if in pirtable.c. CONFIG_PIRTABLE can't be set if CONFIG_COREBOOT is post: Export functions which will be used individually by CSM Export callrom() for CSM to use Export copy_smbios() from biostables.c Import LegacyBios.h from OVMF Complete and checksum EFI_COMPATIBILITY16_TABLE at build time Add pic_save_mask() and pic_restore_mask() functions Add CSM support Add README.CSM Add find_pmtimer() function Enable PMTIMER for CSM build Fix rom_reserve()/rom_confirm() for CSM oprom dispatch Don't calibrate TSC if PMTIMER is already set up Move find_pmtimer() to ACPI table setup where it logically belongs Use find_pmtimer() after copying Xen ACPI tables Use find_pmtimer() after copying coreboot ACPI tables Unify return path for CSM to go via csm_return() Make CONFIG_OPTIONROMS_DEPLOYED depend on CONFIG_QEMU Implement !CONFIG_OPTIONROMS support for CSM Implement !CONFIG_BOOT for CSM Enable VGA output when settings bochs-specific mode Disable CONFIG_THREAD_OPTIONROMS for CSM build Fix return type of le64_to_cpu() and be64_to_cpu() Rename find_pmtimer() to find_acpi_features() Add acpi_reboot() reset method using RESET_REG Gerd Hoffmann (3): config: allow DEBUG_IO for !QEMU coreboot: add qemu detection tweak coreboot qemu detection Hu Tao (1): Add pvpanic device driver Kevin O'Connor (99): pmm: Use 'struct segoff_s' in pmm header. Minor: Update README - variable changes are now reset on soft-reboots. Normalize POST initialization function name suffixes. POST: Reorganize post init functions for better grouping and reusability. Fix rebase error in commit 8a0a972f that broke LOWMEM variables. Support calling a function other than maininit() from reloc_preinit(). Ensure exported symbols are visible in the final link POST: Move QEMU specific ramsize and BIOS table setup to paravirt.c. POST: Reorganize post entry and preinit functions. POST: Move cpu caching and dma setup to platform_hardware_setup(). Undo incorrect assumptions about Xen in commit 6ca0460f. Determine century during init and store in VARLOW mem during runtime. No need to check both CONFIG_THREADS and CONFIG_THREAD_OPTIONROMS. Add runningOnQEMU() and runningOnXen() for runtime platform detection. Consistently use CONFIG_COREBOOT, CONFIG_QEMU, and runningOnXen(). Convert kvm_para_available() to runningOnKVM(). Minor - move definitions to paravirt.c from paravirt.h. Only perform SMP setup on QEMU. Start device_hardware_setup in mainint even with CONFIG_THREAD_OPTIONROMS. The mathcp setup touches the PIC and thus move to the setup phase. Update tools/acpi_extract.py to handle iasl 20130117 release. Support skipping content when reading from QEMU fw_cfg romfile entries. Convert fw_cfg ACPI entries into romfile entries. Convert fw_cfg SMBIOS entries into romfile entries. Convert basic integer fw_cfg entries into romfile entries. Convert fw_cfg NUMA entries
Re: [SeaBIOS] Call for Proposals: 2013 Linux Plumbers Virtualization Microconference
Reminder, there's one week left to submit proposals for the virtualization micro-conference at LPC. Please see below for details and note the update to submit proposals through the Linux Plumbers website: http://www.linuxplumbersconf.org/2013/ocw/events/LPC2013/proposals/new Thanks, Alex On Sun, 2013-07-14 at 15:59 -0600, Alex Williamson wrote: On Fri, 2013-07-12 at 14:38 -0600, Alex Williamson wrote: The Call for Proposals for the 2013 Linux Plumbers Virtualization Microconference is now open. This uconf is being held as part of Linux Plumbers Conference in New Orleans, Louisiana, USA September 18-20th and is co-located with LinuxCon North America. For more information see: http://www.linuxplumbersconf.org/2013/ The tentative deadline for proposals is August 1st. To submit a topic please email a brief abstract to lpc2013-virt...@codemonkey.ws If you require travel assistance (extremely limited) in order to attend, please note that in your submission. Also, please keep an eye on: http://www.linuxplumbersconf.org/2013/submitting-topic/ http://www.linuxplumbersconf.org/2013/participate/ We've setup the above email submission as an interim approach until the LPC program committee brings the official submission tool online. I'll send a follow-up message when that occurs, but please send your proposals as soon as possible. Thanks, And the official tool is now online. Please see: http://www.linuxplumbersconf.org/2013/microconference-discussion-topic-bof-submissions-now-open/ for instructions to propose a discussion topic for the virtualization microconference. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] Call for Proposals: 2013 Linux Plumbers Virtualization Microconference
On Fri, 2013-07-12 at 14:38 -0600, Alex Williamson wrote: The Call for Proposals for the 2013 Linux Plumbers Virtualization Microconference is now open. This uconf is being held as part of Linux Plumbers Conference in New Orleans, Louisiana, USA September 18-20th and is co-located with LinuxCon North America. For more information see: http://www.linuxplumbersconf.org/2013/ The tentative deadline for proposals is August 1st. To submit a topic please email a brief abstract to lpc2013-virt...@codemonkey.ws If you require travel assistance (extremely limited) in order to attend, please note that in your submission. Also, please keep an eye on: http://www.linuxplumbersconf.org/2013/submitting-topic/ http://www.linuxplumbersconf.org/2013/participate/ We've setup the above email submission as an interim approach until the LPC program committee brings the official submission tool online. I'll send a follow-up message when that occurs, but please send your proposals as soon as possible. Thanks, And the official tool is now online. Please see: http://www.linuxplumbersconf.org/2013/microconference-discussion-topic-bof-submissions-now-open/ for instructions to propose a discussion topic for the virtualization microconference. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] Call for Proposals: 2013 Linux Plumbers Virtualization Microconference
The Call for Proposals for the 2013 Linux Plumbers Virtualization Microconference is now open. This uconf is being held as part of Linux Plumbers Conference in New Orleans, Louisiana, USA September 18-20th and is co-located with LinuxCon North America. For more information see: http://www.linuxplumbersconf.org/2013/ The tentative deadline for proposals is August 1st. To submit a topic please email a brief abstract to lpc2013-virt...@codemonkey.ws If you require travel assistance (extremely limited) in order to attend, please note that in your submission. Also, please keep an eye on: http://www.linuxplumbersconf.org/2013/submitting-topic/ http://www.linuxplumbersconf.org/2013/participate/ We've setup the above email submission as an interim approach until the LPC program committee brings the official submission tool online. I'll send a follow-up message when that occurs, but please send your proposals as soon as possible. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [RESEND PATCH v2] pciinit: Enable default VGA device
On Wed, 2013-03-20 at 08:17 +0100, Gerd Hoffmann wrote: Hi, Turns out it's this: commit 76e58028d28e78431f9de3cee0b3c88d807fa39d Author: Kevin O'Connor ke...@koconnor.net Date: Wed Mar 6 21:50:09 2013 -0500 acpi: Eliminate BDAT parameter passing to DSDT code. The BDAT construct is the only ACPI mechanism that relies on SeaBIOS reserved memory. Replace it with the SSDT based template system. Signed-off-by: Kevin O'Connor ke...@koconnor.net That is most likely dsdt + bios.bin not being in sync. Simply using 'qemu -bios /path/to/seabios/out/bios.bin' doesn't fly due to qemu using a non-matching dsdt then. With qemu/master you can just use 'qemu -L /path/to/seabios/out' instead and qemu will pick up both bios.bin and dsdt from the fresh seabios build directory then (and anything else it doesn't find there from the default locations). Thanks, yes that's it. QEMU only seems to look in the -L path or the current directory for the missing files, so it's a bit messy but works. I'll just go ahead and roll a v3 of the VGA patch that converts to wmask since I'm muddied the waters on v2 here. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [RESEND PATCH v2] pciinit: Enable default VGA device
On Wed, 2013-03-20 at 17:46 +0100, Gerd Hoffmann wrote: Hi, With qemu/master you can just use 'qemu -L /path/to/seabios/out' instead and qemu will pick up both bios.bin and dsdt from the fresh seabios build directory then (and anything else it doesn't find there from the default locations). Thanks, yes that's it. QEMU only seems to look in the -L path or the current directory for the missing files, so it's a bit messy but works. You can specify -L multiple times now (master only, not in 1.4.0) and create a search path that way. Awesome. Thanks ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH v3] pciinit: Enable default VGA device
As QEMU gains PCI bridge and PCIe root port support, we won't always find the VGA device on the root bus. We therefore need to add support to find and enable a VGA device and the path to it through the VGA Enable support in the PCI bridge control register. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v3: use pci_config_maskw() to trim out some code v2: move to qemu specific pciinit.c src/optionroms.c |2 +- src/pciinit.c| 40 src/util.h |1 + 3 files changed, 42 insertions(+), 1 deletion(-) diff --git a/src/optionroms.c b/src/optionroms.c index caa2151..ac92613 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -213,7 +213,7 @@ run_file_roms(const char *prefix, int isvga, u64 *sources) / // Verify device is a vga device with legacy address decoding enabled. -static int +int is_pci_vga(struct pci_device *pci) { if (pci-class != PCI_CLASS_DISPLAY_VGA) diff --git a/src/pciinit.c b/src/pciinit.c index ce0a4cc..bb9355f 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -316,6 +316,44 @@ static void pci_bios_init_devices(void) } } +static void pci_enable_default_vga(void) +{ +struct pci_device *pci; + +foreachpci(pci) { +if (is_pci_vga(pci)) { +dprintf(1, PCI: Using %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); +return; +} +} + +pci = pci_find_class(PCI_CLASS_DISPLAY_VGA); +if (!pci) { +dprintf(1, PCI: No VGA devices found\n); +return; +} + +dprintf(1, PCI: Enabling %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +pci_config_maskw(pci-bdf, PCI_COMMAND, 0, + PCI_COMMAND_IO | PCI_COMMAND_MEMORY); + +while (pci-parent) { +pci = pci-parent; + +dprintf(1, PCI: Setting VGA enable on bridge %02x:%02x.%x\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +pci_config_maskw(pci-bdf, PCI_BRIDGE_CONTROL, 0, PCI_BRIDGE_CTL_VGA); +pci_config_maskw(pci-bdf, PCI_COMMAND, 0, + PCI_COMMAND_IO | PCI_COMMAND_MEMORY); +} +} / * Platform device initialization @@ -804,4 +842,6 @@ pci_setup(void) pci_bios_init_devices(); free(busses); + +pci_enable_default_vga(); } diff --git a/src/util.h b/src/util.h index af029fc..99aff78 100644 --- a/src/util.h +++ b/src/util.h @@ -344,6 +344,7 @@ void vgahook_setup(struct pci_device *pci); // optionroms.c void call_bcv(u16 seg, u16 ip); +int is_pci_vga(struct pci_device *pci); void optionrom_setup(void); void vgarom_setup(void); void s3_resume_vga(void); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH v3] pciinit: Enable default VGA device
On Wed, 2013-03-20 at 19:05 +0100, Paul Menzel wrote: Dear Alex, Am Mittwoch, den 20.03.2013, 10:58 -0600 schrieb Alex Williamson: As QEMU gains PCI bridge and PCIe root port support, could you give a commit or version for QEMU please. This would be targeted towards QEMU 1.5. Michael recently sent a pull request with this patch to enable VGA routing in bridges: https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/commit/?h=pciid=e475aeac3e2e7f6632c3ea7e83f4431c0ba3a467 It's difficult to make use of without an assigned VGA device, which I plan to start posting soon. we won't always find the VGA device on the root bus. We therefore need to add support to find and enable a VGA device and the path to it through the VGA Enable support in the PCI bridge control register. Just to be sure, did you test this with older QEMU too? Yes, it works with the QEMU 1.2.x shipped in F18. Any existing QEMU configuration is going to have the VGA device on the host bridge and therefore bail out on the first loop, having done nothing other than add a debug print. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v3: use pci_config_maskw() to trim out some code v2: move to qemu specific pciinit.c src/optionroms.c |2 +- src/pciinit.c| 40 src/util.h |1 + 3 files changed, 42 insertions(+), 1 deletion(-) diff --git a/src/optionroms.c b/src/optionroms.c index caa2151..ac92613 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -213,7 +213,7 @@ run_file_roms(const char *prefix, int isvga, u64 *sources) / // Verify device is a vga device with legacy address decoding enabled. -static int +int is_pci_vga(struct pci_device *pci) { if (pci-class != PCI_CLASS_DISPLAY_VGA) diff --git a/src/pciinit.c b/src/pciinit.c index ce0a4cc..bb9355f 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -316,6 +316,44 @@ static void pci_bios_init_devices(void) } } +static void pci_enable_default_vga(void) +{ +struct pci_device *pci; + +foreachpci(pci) { +if (is_pci_vga(pci)) { +dprintf(1, PCI: Using %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); As this is used several times, a function returning a string with %02x:% 02x.%x would be handy. +return; +} +} […] Acked-by: Paul Menzel paulepan...@users.sourceforge.net Thanks! Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RESEND PATCH v2] pciinit: Enable default VGA device
On Tue, 2013-03-19 at 19:08 -0400, Kevin O'Connor wrote: On Mon, Mar 18, 2013 at 09:07:21PM -0600, Alex Williamson wrote: As QEMU gains PCI bridge and PCIe root port support, we won't always find the VGA device on the root bus. We therefore need to add support to find and enable a VGA device and the path to it through the VGA Enable support in the PCI bridge control register. Thanks. I'm okay with the change. It would be nice if someone more familiar with the QEMU side could ack this patch. As a minor nitpick - there is a pci_config_maskw() that would simplify your code. Ah, that does clean it up a bit. Actually, hold off on this patch. I'm getting VGA routing setup as I expect, but guests are not happy with the other apertures on the bridges and I'm not sure if I'm contributing to that problem yet. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RESEND PATCH v2] pciinit: Enable default VGA device
As QEMU gains PCI bridge and PCIe root port support, we won't always find the VGA device on the root bus. We therefore need to add support to find and enable a VGA device and the path to it through the VGA Enable support in the PCI bridge control register. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/optionroms.c |2 +- src/pciinit.c| 45 + src/util.h |1 + 3 files changed, 47 insertions(+), 1 deletion(-) diff --git a/src/optionroms.c b/src/optionroms.c index caa2151..ac92613 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -213,7 +213,7 @@ run_file_roms(const char *prefix, int isvga, u64 *sources) / // Verify device is a vga device with legacy address decoding enabled. -static int +int is_pci_vga(struct pci_device *pci) { if (pci-class != PCI_CLASS_DISPLAY_VGA) diff --git a/src/pciinit.c b/src/pciinit.c index ce0a4cc..eb49a76 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -316,6 +316,49 @@ static void pci_bios_init_devices(void) } } +static void pci_enable_default_vga(void) +{ +struct pci_device *pci; + +foreachpci(pci) { +if (is_pci_vga(pci)) { +dprintf(1, PCI: Using %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); +return; +} +} + +pci = pci_find_class(PCI_CLASS_DISPLAY_VGA); +if (!pci) { +dprintf(1, PCI: No VGA devices found\n); +return; +} + +dprintf(1, PCI: Enabling %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); + +while (pci-parent) { +pci = pci-parent; + +dprintf(1, PCI: Setting VGA enable on bridge %02x:%02x.%x\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u8 ctrl = pci_config_readb(pci-bdf, PCI_BRIDGE_CONTROL); +ctrl |= PCI_BRIDGE_CTL_VGA; +pci_config_writeb(pci-bdf, PCI_BRIDGE_CONTROL, ctrl); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); +} +} / * Platform device initialization @@ -804,4 +847,6 @@ pci_setup(void) pci_bios_init_devices(); free(busses); + +pci_enable_default_vga(); } diff --git a/src/util.h b/src/util.h index af029fc..99aff78 100644 --- a/src/util.h +++ b/src/util.h @@ -344,6 +344,7 @@ void vgahook_setup(struct pci_device *pci); // optionroms.c void call_bcv(u16 seg, u16 ip); +int is_pci_vga(struct pci_device *pci); void optionrom_setup(void); void vgarom_setup(void); void s3_resume_vga(void); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH v2] pciinit: Enable default VGA device
As QEMU gains PCI bridge and PCIe root port support, we won't always find the VGA device on the root bus. We therefore need to add support to find and enable a VGA device and the path to it through the VGA Enable support in the PCI bridge control register. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/optionroms.c |2 +- src/pciinit.c| 45 + src/util.h |1 + 3 files changed, 47 insertions(+), 1 deletion(-) diff --git a/src/optionroms.c b/src/optionroms.c index caa2151..ac92613 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -213,7 +213,7 @@ run_file_roms(const char *prefix, int isvga, u64 *sources) / // Verify device is a vga device with legacy address decoding enabled. -static int +int is_pci_vga(struct pci_device *pci) { if (pci-class != PCI_CLASS_DISPLAY_VGA) diff --git a/src/pciinit.c b/src/pciinit.c index ce0a4cc..eb49a76 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -316,6 +316,49 @@ static void pci_bios_init_devices(void) } } +static void pci_enable_default_vga(void) +{ +struct pci_device *pci; + +foreachpci(pci) { +if (is_pci_vga(pci)) { +dprintf(1, PCI: Using %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); +return; +} +} + +pci = pci_find_class(PCI_CLASS_DISPLAY_VGA); +if (!pci) { +dprintf(1, PCI: No VGA devices found\n); +return; +} + +dprintf(1, PCI: Enabling %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); + +while (pci-parent) { +pci = pci-parent; + +dprintf(1, PCI: Setting VGA enable on bridge %02x:%02x.%x\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u8 ctrl = pci_config_readb(pci-bdf, PCI_BRIDGE_CONTROL); +ctrl |= PCI_BRIDGE_CTL_VGA; +pci_config_writeb(pci-bdf, PCI_BRIDGE_CONTROL, ctrl); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); +} +} / * Platform device initialization @@ -804,4 +847,6 @@ pci_setup(void) pci_bios_init_devices(); free(busses); + +pci_enable_default_vga(); } diff --git a/src/util.h b/src/util.h index 306a8bf..e1df295 100644 --- a/src/util.h +++ b/src/util.h @@ -343,6 +343,7 @@ void vgahook_setup(struct pci_device *pci); // optionroms.c void call_bcv(u16 seg, u16 ip); +int is_pci_vga(struct pci_device *pci); void optionrom_setup(void); void vgarom_setup(void); void s3_resume_vga(void); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] RFC: Primary VGA
On Thu, 2013-02-28 at 08:50 +0200, Michael S. Tsirkin wrote: On Wed, Feb 27, 2013 at 02:16:23PM -0700, Alex Williamson wrote: When we start adding root ports and bridges to systems we need some concept of a primary VGA device.The differentiation of the primary device is that it's the default one that responds to the Legacy VGA address ranges. PCs often have a BIOS selection for this. Seabios already seems to have some concept of this and looks for a VGA class device for which the parent devices all have VGA routing enabled. This seems to work today if QEMU initializes VGA routing for the path it considers the primary. The first question is whether this bridge path pre-configuration is what we want to keep as the way QEMU communicates the primary VGA device to Seabios? Obviously we could switch to some kind of fwcfg interface, but I tend to think what we have is sufficient. If it is sufficient, then I think we need to rebuild that path on system reset and we need some way to specify which device to use. One option would be some kind of per PCIDevice property, perhaps primary_vga. A downside is that users can abuse it by trying to set it for more than one device. Maybe a better approach would be to add a machine property for it, -machine primary_vga=$id. Yes. And a command to change it when we support hotplug in the future? Looking at how this would happen on bare metal, there are ACPI methods _GPD (Get Post Device), _SPD (Set Post Device), and _VPD (Video Post Options). So I imagine that if we supported VGA hotplug we'd use those to let the guest specify the primary and continue to use an algorithm in SeaBIOS to post to the lowest B:D.F VGA device if unspecified. I don't know that we need some kind of QMP/QAPI runtime command to change the VGA post device externally on the next boot, real systems likely doesn't have such a hook. For now I'll just start with making SeaBIOS have a way to enable and post the first VGA device when it's behind bridges. Thanks, Alex We'll also need some reasonable way to pick a default if unspecified. Does anyone have any thoughts on managing this? Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] pci vga: Support VGA behind bridges
We currently expect to find VGA devices on the root bus but we will also support them below bridges iff the VGA routing across the bridges is pre-configured. This patch maintains that behavior, but also enables SeaBIOS to enable VGA routing to the first VGA class device it finds when there is no preconfigured device. This allows us to support VGA devices behind root ports and bridges without special setup from QEMU. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/optionroms.c | 46 -- 1 file changed, 40 insertions(+), 6 deletions(-) diff --git a/src/optionroms.c b/src/optionroms.c index caa2151..1c2a714 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -439,13 +439,47 @@ vgarom_setup(void) memset((void*)BUILD_ROM_START, 0, rom_get_max() - BUILD_ROM_START); // Find and deploy PCI VGA rom. -struct pci_device *pci; +struct pci_device *pci, *pci_vga = NULL; +// Search for devices already enabled foreachpci(pci) { -if (!is_pci_vga(pci)) -continue; -vgahook_setup(pci); -init_pcirom(pci, 1, NULL); -break; +if (is_pci_vga(pci)) { +pci_vga = pci; +break; +} +} + +// Find and enable the first VGA device +if (!pci_vga (pci = pci_find_class(PCI_CLASS_DISPLAY_VGA)) != NULL) { +dprintf(4, Enabling device %02x:%02x.%x for primary VGA\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); + +pci_vga = pci; + +while (pci-parent) { +pci = pci-parent; + +dprintf(4, Setting VGA enable for bridge %02x:%02x.%x\n, +pci_bdf_to_bus(pci-bdf), pci_bdf_to_dev(pci-bdf), +pci_bdf_to_fn(pci-bdf)); + +u8 ctrl = pci_config_readb(pci-bdf, PCI_BRIDGE_CONTROL); +ctrl |= PCI_BRIDGE_CTL_VGA; +pci_config_writeb(pci-bdf, PCI_BRIDGE_CONTROL, ctrl); + +u16 cmd = pci_config_readw(pci-bdf, PCI_COMMAND); +cmd |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY; +pci_config_writew(pci-bdf, PCI_COMMAND, cmd); +} +} + +if (pci_vga) { +vgahook_setup(pci_vga); +init_pcirom(pci_vga, 1, NULL); } // Find and deploy CBFS vga-style roms not associated with a device. ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] RFC: Primary VGA
When we start adding root ports and bridges to systems we need some concept of a primary VGA device.The differentiation of the primary device is that it's the default one that responds to the Legacy VGA address ranges. PCs often have a BIOS selection for this. Seabios already seems to have some concept of this and looks for a VGA class device for which the parent devices all have VGA routing enabled. This seems to work today if QEMU initializes VGA routing for the path it considers the primary. The first question is whether this bridge path pre-configuration is what we want to keep as the way QEMU communicates the primary VGA device to Seabios? Obviously we could switch to some kind of fwcfg interface, but I tend to think what we have is sufficient. If it is sufficient, then I think we need to rebuild that path on system reset and we need some way to specify which device to use. One option would be some kind of per PCIDevice property, perhaps primary_vga. A downside is that users can abuse it by trying to set it for more than one device. Maybe a better approach would be to add a machine property for it, -machine primary_vga=$id. We'll also need some reasonable way to pick a default if unspecified. Does anyone have any thoughts on managing this? Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] RFC: Primary VGA
On Wed, 2013-02-27 at 20:11 -0500, Kevin O'Connor wrote: On Wed, Feb 27, 2013 at 02:16:23PM -0700, Alex Williamson wrote: When we start adding root ports and bridges to systems we need some concept of a primary VGA device.The differentiation of the primary device is that it's the default one that responds to the Legacy VGA address ranges. PCs often have a BIOS selection for this. Seabios already seems to have some concept of this and looks for a VGA class device for which the parent devices all have VGA routing enabled. This seems to work today if QEMU initializes VGA routing for the path it considers the primary. The first question is whether this bridge path pre-configuration is what we want to keep as the way QEMU communicates the primary VGA device to Seabios? Obviously we could switch to some kind of fwcfg interface, but I tend to think what we have is sufficient. SeaBIOS uses the VGA PCI routing information to determine which option rom to run. However, it doesn't actually initialize the VGA PCI routing in it's PCI initialization code (pciinit.c). This isn't an issue for the default QEMU setup because QEMU does not place the VGA card behind a bus. However, should QEMU support multiple VGA cards and/or VGA cards behind a bus, then the PCI init code would likely need to handle VGA setup. (I suppose QEMU could setup the bus forwarding of the legacy VGA ranges, but it would seem ugly to have SeaBIOS do most of the PCI init while QEMU did this small part.) Supporting VGA behind root ports is inevitable, modern topologies demand it and I've got rough working code to enable it for assigned devices. Once we have root ports, there's no reason bridges shouldn't also be supported and maybe we can even deprecate the QEMU -vga option in favor if a regular -device option. It does make some sense that SeaBIOS initializes PCI, so should be responsible for this programming. Should we just start with a bus ordered search for the first PCI VGA class device we find and enable routing for that device? If we do that, perhaps we can defer the question of whether QEMU can specify an alternate primary device. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH] seabios: Add a dummy PCI slot to irq mapping function
This should never get called, but if we somehow get a new chipset that fails to implement their own pci_slot_get_irq function, fail gracefully and add a debug log message. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/pciinit.c b/src/pciinit.c index 70e4c07..a495742 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,7 +91,15 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; -static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); +static int dummy_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +dprintf(1, pci_slot_get_irq called with unknown routing\n); + +return 0xff; /* PCI defined unknown or no connection for x86 */ +} + +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin) = +dummy_pci_slot_get_irq; // Return the global irq number corresponding to a host bus device irq pin. static int piix_pci_slot_get_irq(struct pci_device *pci, int pin) ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [Qemu-devel] [RESEND PATCH v2 2/2] seabios q35: Add new PCI slot to irq routing function
On Wed, 2013-02-20 at 23:29 -0500, Kevin O'Connor wrote: On Fri, Feb 15, 2013 at 02:11:41PM -0700, Alex Williamson wrote: q35/ich9 doesn't use the same interrupt mapping function as i440fx/piix. PIRQA:D and PIRQE:H are programmed identically, but we start at index 0, not index -1. Slots 25 through 31 are also programmed independently. When running qemu w/o this patch, a device at address 0:6.0 will have its PCI interrupt line register programmed with irq 10 (as seen by info pci), but it actually uses irq 11 (as reported the guest). Half of the interrupt lines are misprogrammedi like this. Functionally, a fully emulated qemu guest doesn't care much, but when we try to use device assignment, we really need to know the correct irqs. Thanks. Before committing this, I have a couple of questions. [...] --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); This looks like it will cause a null pointer dereference if (for what ever odd reason) neither the piix nor mch chipset is detected. That's correct. We could initialize it to a dummy function that just does a dprintf. I can send a follow-up patch if that sounds ok. [...] +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +int irq, slot, pin_addend = 0; + +while (pci-parent != NULL) { +pin_addend += pci_bdf_to_dev(pci-bdf); +pci = pci-parent; +} +slot = pci_bdf_to_dev(pci-bdf); + +switch (slot) { +/* Slots 0-24 rotate slot:pin mapping similar to piix above, but + with a different starting index - see q35-acpi-dsdt.dsl */ +case 0 ... 24: +irq = pci_irqs[(pin - 1 + pin_addend + slot) 3]; +break; +/* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */ +case 25 ... 31: +irq = pci_irqs[(pin - 1 + pin_addend) 3]; +break; +} I'm not sure I understand this. Since the pin to irq mapping for each pci slot is motherboard dependent (and not chipset dependent), why would the qemu motherboard use such a weird mapping? Why wouldn't we just have qemu do something more consistent? PCI of course has 4 interrupt lines. 440fx only supports 4 interrupt lines, so to attempt to evenly distribute device interrupts over those 4 lines, we shift the connections on each slot. Q35 supports 8 interrupt lines. My take on this distribution is that we're reserving 4 of them for root complex devices (everything directly attached to the host bridge) and use the other 4 for add-in devices (everything behind PCIe root ports or the PCIe-to-PCI bridge). Besides load balancing, an obvious benefit for a vendor to do this kind of layout is that it reduces the chances of sharing interrupts between onboard devices and plugin devices. I'd guess that this layout likely came from looking at real hardware and trying to do something similar. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RESEND PATCH v2 0/2] seabios q35: Fix seabios IRQ mapping and setup
This enables interrupts to work pre-boot for assigned devices. I had self nak'd and resent a v2 patch for 2/2, but there were never any comments, so resending the whole series as v2. Thanks, Alex --- Alex Williamson (2): seabios q35: Enable all PIRQn IRQs at startup seabios q35: Add new PCI slot to irq routing function src/pciinit.c | 39 ++- 1 file changed, 34 insertions(+), 5 deletions(-) ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RESEND PATCH v2 1/2] seabios q35: Enable all PIRQn IRQs at startup
We seem to use the IRQEN bit of the PIRQn registers interchangeably to select APIC mode or to disable an IRQ. I can't decide if we're intending to disable the IRQ or select APIC mode here, but in either case it prevents PIC mode assigned devices from working. When seabios writes IRQEN to these registers, qemu interprets that as APIC mode, so while the boot ROM driver is waiting for an interrupt on ISA compatible IRQ 10 or 11, KVM is injecting interrupts to APIC pins 16 - 23. Devices on the root bus use PIRQE:H while the root ports use PIRQA:D. Enable them all so we don't limit where we support boot ROMs. The guest will later disable unused IRQs with the ACPI _DIS method. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/pciinit.c b/src/pciinit.c index 1d34653..05c0fe8 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -143,11 +143,9 @@ void mch_isa_bridge_setup(struct pci_device *dev, void *arg) /* activate irq remapping in LPC */ /* PIRQ[A-D] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, irq); /* PIRQ[E-H] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, irq); } outb(elcr[0], ICH9_LPC_PORT_ELCR1); outb(elcr[1], ICH9_LPC_PORT_ELCR2); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RESEND PATCH v2 2/2] seabios q35: Add new PCI slot to irq routing function
q35/ich9 doesn't use the same interrupt mapping function as i440fx/piix. PIRQA:D and PIRQE:H are programmed identically, but we start at index 0, not index -1. Slots 25 through 31 are also programmed independently. When running qemu w/o this patch, a device at address 0:6.0 will have its PCI interrupt line register programmed with irq 10 (as seen by info pci), but it actually uses irq 11 (as reported the guest). Half of the interrupt lines are misprogrammedi like this. Functionally, a fully emulated qemu guest doesn't care much, but when we try to use device assignment, we really need to know the correct irqs. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/src/pciinit.c b/src/pciinit.c index 05c0fe8..4cdc777 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); + // Return the global irq number corresponding to a host bus device irq pin. -static int pci_slot_get_irq(struct pci_device *pci, int pin) +static int piix_pci_slot_get_irq(struct pci_device *pci, int pin) { int slot_addend = 0; @@ -104,6 +106,31 @@ static int pci_slot_get_irq(struct pci_device *pci, int pin) return pci_irqs[(pin - 1 + slot_addend) 3]; } +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +int irq, slot, pin_addend = 0; + +while (pci-parent != NULL) { +pin_addend += pci_bdf_to_dev(pci-bdf); +pci = pci-parent; +} +slot = pci_bdf_to_dev(pci-bdf); + +switch (slot) { +/* Slots 0-24 rotate slot:pin mapping similar to piix above, but + with a different starting index - see q35-acpi-dsdt.dsl */ +case 0 ... 24: +irq = pci_irqs[(pin - 1 + pin_addend + slot) 3]; +break; +/* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */ +case 25 ... 31: +irq = pci_irqs[(pin - 1 + pin_addend) 3]; +break; +} + +return irq; +} + /* PIIX3/PIIX4 PCI to ISA bridge */ static void piix_isa_bridge_setup(struct pci_device *pci, void *arg) { @@ -292,6 +319,8 @@ void i440fx_mem_addr_setup(struct pci_device *dev, void *arg) pcimem_start = 0x8000; else if (RamSize = 0xc000) pcimem_start = 0xc000; + +pci_slot_get_irq = piix_pci_slot_get_irq; } void mch_mem_addr_setup(struct pci_device *dev, void *arg) @@ -310,6 +339,8 @@ void mch_mem_addr_setup(struct pci_device *dev, void *arg) /* setup pci i/o window (above mmconfig) */ pcimem_start = addr + size; + +pci_slot_get_irq = mch_pci_slot_get_irq; } static const struct pci_device_id pci_platform_tbl[] = { ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH 1/2] q35: Enable all PIRQn IRQs at startup
We seem to use the IRQEN bit of the PIRQn registers interchangeably to select APIC mode or to disable an IRQ. I can't decide if we're intending to disable the IRQ or select APIC mode here, but in either case it prevents PIC mode assigned devices from working. When seabios writes IRQEN to these registers, qemu interprets that as APIC mode, so while the boot ROM driver is waiting for an interrupt on ISA compatible IRQ 10 or 11, KVM is injecting interrupts to APIC pins 16 - 23. Devices on the root bus use PIRQE:H while the root ports use PIRQA:D. Enable them all so we don't limit where we support boot ROMs. The guest will later disable unused IRQs with the ACPI _DIS method. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/pciinit.c b/src/pciinit.c index a406bbd..857e8af 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -143,11 +143,9 @@ void mch_isa_bridge_init(struct pci_device *dev, void *arg) /* activate irq remapping in LPC */ /* PIRQ[A-D] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, irq); /* PIRQ[E-H] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, irq); } outb(elcr[0], ICH9_LPC_PORT_ELCR1); outb(elcr[1], ICH9_LPC_PORT_ELCR2); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH 2/2] q35: Add new PCI slot to irq routing function
q35/ich9 doesn't use the same interrupt mapping function as i440fx/piix. PIRQA:D and PIRQE:H are programmed identically, but we start at index 0, not index -1. Slots 25 through 31 are also programmed independently. When running qemu w/o this patch, a device at address 0:6.0 will have its PCI interrupt line register programmed with irq 10 (as seen by info pci), but it actually uses irq 11 (as reported the guest). Half of the interrupt lines are misprogrammedi like this. Functionally, a fully emulated qemu guest doesn't care much, but when we try to use device assignment, we really need to know the correct irqs. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/src/pciinit.c b/src/pciinit.c index 857e8af..ddac7e7 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); + // Return the global irq number corresponding to a host bus device irq pin. -static int pci_slot_get_irq(struct pci_device *pci, int pin) +static int piix_pci_slot_get_irq(struct pci_device *pci, int pin) { int slot_addend = 0; @@ -104,6 +106,33 @@ static int pci_slot_get_irq(struct pci_device *pci, int pin) return pci_irqs[(pin - 1 + slot_addend) 3]; } +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +int irq, slot_addend = 0; + +while (pci-parent != NULL) { +slot_addend += pci_bdf_to_dev(pci-bdf); +pci = pci-parent; +} +slot_addend += pci_bdf_to_dev(pci-bdf); + +switch (slot_addend) { +/* Slots 0-24 rotate slot:pin mapping similar to piix above, but + with a different starting index - see q35-acpi-dsdt.dsl */ +case 0 ... 24: +irq = pci_irqs[(pin - 1 + slot_addend) 3]; +break; +/* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */ +case 25 ... 31: +irq = pci_irqs[pin - 1]; +break; +default: +irq = 0; +} + +return irq; +} + /* PIIX3/PIIX4 PCI to ISA bridge */ static void piix_isa_bridge_init(struct pci_device *pci, void *arg) { @@ -292,6 +321,8 @@ void i440fx_mem_addr_init(struct pci_device *dev, void *arg) pcimem_start = 0x8000; else if (RamSize = 0xc000) pcimem_start = 0xc000; + +pci_slot_get_irq = piix_pci_slot_get_irq; } void mch_mem_addr_init(struct pci_device *dev, void *arg) @@ -310,6 +341,8 @@ void mch_mem_addr_init(struct pci_device *dev, void *arg) /* setup pci i/o window (above mmconfig) */ pcimem_start = addr + size; + +pci_slot_get_irq = mch_pci_slot_get_irq; } static const struct pci_device_id pci_platform_tbl[] = { ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH 2/2] q35: Add new PCI slot to irq routing function
On Tue, 2013-01-22 at 15:12 -0700, Alex Williamson wrote: q35/ich9 doesn't use the same interrupt mapping function as i440fx/piix. PIRQA:D and PIRQE:H are programmed identically, but we start at index 0, not index -1. Slots 25 through 31 are also programmed independently. When running qemu w/o this patch, a device at address 0:6.0 will have its PCI interrupt line register programmed with irq 10 (as seen by info pci), but it actually uses irq 11 (as reported the guest). Half of the interrupt lines are misprogrammedi like this. Functionally, a fully emulated qemu guest doesn't care much, but when we try to use device assignment, we really need to know the correct irqs. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/src/pciinit.c b/src/pciinit.c index 857e8af..ddac7e7 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); + // Return the global irq number corresponding to a host bus device irq pin. -static int pci_slot_get_irq(struct pci_device *pci, int pin) +static int piix_pci_slot_get_irq(struct pci_device *pci, int pin) { int slot_addend = 0; @@ -104,6 +106,33 @@ static int pci_slot_get_irq(struct pci_device *pci, int pin) return pci_irqs[(pin - 1 + slot_addend) 3]; } +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +int irq, slot_addend = 0; + +while (pci-parent != NULL) { +slot_addend += pci_bdf_to_dev(pci-bdf); +pci = pci-parent; +} +slot_addend += pci_bdf_to_dev(pci-bdf); + +switch (slot_addend) { Nak, I'm not accounting for the bridges properly here. +/* Slots 0-24 rotate slot:pin mapping similar to piix above, but + with a different starting index - see q35-acpi-dsdt.dsl */ +case 0 ... 24: +irq = pci_irqs[(pin - 1 + slot_addend) 3]; +break; +/* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */ +case 25 ... 31: +irq = pci_irqs[pin - 1]; +break; +default: +irq = 0; +} + +return irq; +} + /* PIIX3/PIIX4 PCI to ISA bridge */ static void piix_isa_bridge_init(struct pci_device *pci, void *arg) { @@ -292,6 +321,8 @@ void i440fx_mem_addr_init(struct pci_device *dev, void *arg) pcimem_start = 0x8000; else if (RamSize = 0xc000) pcimem_start = 0xc000; + +pci_slot_get_irq = piix_pci_slot_get_irq; } void mch_mem_addr_init(struct pci_device *dev, void *arg) @@ -310,6 +341,8 @@ void mch_mem_addr_init(struct pci_device *dev, void *arg) /* setup pci i/o window (above mmconfig) */ pcimem_start = addr + size; + +pci_slot_get_irq = mch_pci_slot_get_irq; } static const struct pci_device_id pci_platform_tbl[] = { ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [PATCH v2] q35: Add new PCI slot to irq routing function
q35/ich9 doesn't use the same interrupt mapping function as i440fx/piix. PIRQA:D and PIRQE:H are programmed identically, but we start at index 0, not index -1. Slots 25 through 31 are also programmed independently. When running qemu w/o this patch, a device at address 0:6.0 will have its PCI interrupt line register programmed with irq 10 (as seen by info pci), but it actually uses irq 11 (as reported the guest). Half of the interrupt lines are misprogrammedi like this. Functionally, a fully emulated qemu guest doesn't care much, but when we try to use device assignment, we really need to know the correct irqs. Signed-off-by: Alex Williamson alex.william...@redhat.com --- v2: separate the root bus slot for the switch statement, always 0-31 src/pciinit.c | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/src/pciinit.c b/src/pciinit.c index 857e8af..8675adb 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = { 10, 10, 11, 11 }; +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin); + // Return the global irq number corresponding to a host bus device irq pin. -static int pci_slot_get_irq(struct pci_device *pci, int pin) +static int piix_pci_slot_get_irq(struct pci_device *pci, int pin) { int slot_addend = 0; @@ -104,6 +106,31 @@ static int pci_slot_get_irq(struct pci_device *pci, int pin) return pci_irqs[(pin - 1 + slot_addend) 3]; } +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin) +{ +int irq, slot, pin_addend = 0; + +while (pci-parent != NULL) { +pin_addend += pci_bdf_to_dev(pci-bdf); +pci = pci-parent; +} +slot = pci_bdf_to_dev(pci-bdf); + +switch (slot) { +/* Slots 0-24 rotate slot:pin mapping similar to piix above, but + with a different starting index - see q35-acpi-dsdt.dsl */ +case 0 ... 24: +irq = pci_irqs[(pin - 1 + pin_addend + slot) 3]; +break; +/* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */ +case 25 ... 31: +irq = pci_irqs[(pin - 1 + pin_addend) 3]; +break; +} + +return irq; +} + /* PIIX3/PIIX4 PCI to ISA bridge */ static void piix_isa_bridge_init(struct pci_device *pci, void *arg) { @@ -292,6 +319,8 @@ void i440fx_mem_addr_init(struct pci_device *dev, void *arg) pcimem_start = 0x8000; else if (RamSize = 0xc000) pcimem_start = 0xc000; + +pci_slot_get_irq = piix_pci_slot_get_irq; } void mch_mem_addr_init(struct pci_device *dev, void *arg) @@ -310,6 +339,8 @@ void mch_mem_addr_init(struct pci_device *dev, void *arg) /* setup pci i/o window (above mmconfig) */ pcimem_start = addr + size; + +pci_slot_get_irq = mch_pci_slot_get_irq; } static const struct pci_device_id pci_platform_tbl[] = { ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH 0/2] Q35/ICH9 interrupt routing
I was trying to revive the patch to add Qemu INTx routing support for Q35 and stumbled on some rather broken interrupt routing problems. Sadly a seabios release claiming Q35 support snuck out already, but I'd like to at least discuss these before Qemu 1.4 even though they may not get fixed for that release either. ICH9 adds 4 more PCI interrupt routing registers, so in addition to the previous PIRQ[A:D], we now also have PIRQ[E:H]. AIUI, each pin can operate in either PIC or APIC mode based on whether the IRQEN bit of each register is clear or set respectively. In PIC mode bits 3:0 of the PIRQn register identifiy the ISA compatible interrupt to trigger. In APIC mode each PIRQn is statically mapped to an APIC pin, where PIRQA-16, PIRQB-17, ..., PIRQH-23. The first problem we encounter is that the system boots into PIC mode but we're programming the PIRQn registers into APIC mode. After fixing that, we quickly notice that seabios has a single function for programming PCI interrupt line registers based on the PIIX mapping of PIRQn registers. A comment in the Qemu source indicates the slot:pin to PIRQn mapping is arbitrary so long as Qemu and ACPI match, disregarding that boot ROMs may also use interrupts and have no ACPI support. Solutions to this are either to create an ICH9 specific mapping function or to adjust slot:pin mappings to match PIIX. I've done the latter here, but it's incomplete as slots 25-31 have directly programmable mappings. I'm looking for comments on how to proceed. Do we want to try to meld ICH9 to be more PIIX compatible or do we want to do our own thing. I suspect there are still a number of holes in ICH9 irq programming no matter which path we take and these in particular are likely to be challenging for migration compatibility. Thanks, Alex --- Alex Williamson (2): q35: Fix PIC-mode interrupt setup q35: Fix ACPI _PRT routing to match PIIX src/pciinit.c |6 +-- src/q35-acpi-dsdt.dsl | 100 + 2 files changed, 52 insertions(+), 54 deletions(-) ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
[SeaBIOS] [RFC PATCH 1/2] q35: Fix PIC-mode interrupt setup
We're initializing the ICH9 PIRQn registers with the IRQEN bit set, which actuall makes them operate in APIC mode rather than PIC mode (see 13.1.17 13.1.9). AFAICT, the system boots up in PIC mode and trying to make use of APIC IRQs in boot ROMs does not work. Fix this to use the ISA compatible IRQs. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/pciinit.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/pciinit.c b/src/pciinit.c index a406bbd..857e8af 100644 --- a/src/pciinit.c +++ b/src/pciinit.c @@ -143,11 +143,9 @@ void mch_isa_bridge_init(struct pci_device *dev, void *arg) /* activate irq remapping in LPC */ /* PIRQ[A-D] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQA_ROUT + i, irq); /* PIRQ[E-H] routing */ -pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, - irq | ICH9_LPC_PIRQ_ROUT_IRQEN); +pci_config_writeb(bdf, ICH9_LPC_PIRQE_ROUT + i, irq); } outb(elcr[0], ICH9_LPC_PORT_ELCR1); outb(elcr[1], ICH9_LPC_PORT_ELCR2); ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] vga bios woes
On Thu, 2013-01-03 at 20:49 -0700, Alex Williamson wrote: On Thu, 2013-01-03 at 20:57 -0500, Kevin O'Connor wrote: On Thu, Jan 03, 2013 at 06:41:35PM -0700, Alex Williamson wrote: Thanks for filling in that piece of the puzzle for me. If the above is just a reporting problem, how do I fix it so I can actually step through the rom? Thanks, One other note - I think Darmawan may have been doing something similar with PCI rom debugging - you may wish to reach out to him. Thanks for the connection. For now the link Daniel provided has given me enough to make progress. It looks like this is a case where the VGA bios manages to get the physical address of the device through a legacy VGA register (0x3c3) and shoots itself by using that rather than the emulated address found through config space. I can kludge the offset and get the VGA option rom to finish, but something is still wrong since it doesn't trigger the monitor to sync. Still digging. Thanks for the help, I enabled unassigned memory debugging in qemu and get some peculiar output during the vga bios execution. Given this state: ---[ STACK ]--- 6E02 4942 B5D4 B5E5 6DAE 6DB2 B56A B572 6DBA 6E0A 6DA6 8001 ---[ DS:SI ]--- C000: 55 AA 75 E9 21 02 00 00 00 00 00 00 00 00 00 00 U.u.!... C010: 00 00 00 00 00 00 00 00 D4 01 00 00 00 00 49 42 ..IB C020: 4D 25 00 00 00 00 00 00 00 00 00 00 00 00 00 04 M%.. C030: 20 37 36 31 32 39 35 35 32 30 00 00 00 00 00 00 .761295520.. ---[ ES:DI ]--- 6E0A: 1A 6E 00 00 00 20 28 03 E8 FD 00 00 E8 FD 00 00 .n(. 6E1A: 00 00 A4 0D 10 00 00 00 10 00 00 00 6E 02 28 17 n.(. 6E2A: 00 00 90 D1 00 00 00 00 00 00 00 00 00 00 FF FF 6E3A: 00 00 FF FF 00 00 00 00 00 00 10 00 00 00 00 F0 [ CPU ] AX: 6DBA BX: B5E5 CX: DX: 0001 SI: DI: 6E0A SP: 6DA2 BP: 6E02 CS: C000 DS: C000 ES: SS: IP: 460E EIP:460E CS:IP: C000:460E (0xC460E) SS:SP: :6DA2 (0x06DA2) SS:BP: :6E02 (0x06E02) OF 0 DF 0 IF 1 TF 0 SF 1 ZF 0 AF 1 PF 1 CF 1 ID 0 VIP 0 VIF 0 AC 0 VM 0 RF 0 NT 0 IOPL 0 ---[ CODE ] 0xc460e: movbp,sp 0xc4610: push bx 0xc4611: push cx 0xc4612: push dx 0xc4613: push di 0xc4614: push ax 0xc4615: movbx,ax 0xc4617: movcx,ss 0xc4619: moves,cx 0xc461b: movsi,WORD PTR es:[bx+0x2] How does that mov generate this: Unassigned mem read b5e5b5d4 Real-mode tcg bug? Here's the next state: ---[ STACK ]--- 6E02 4942 B5D4 B5E5 6DAE 6DB2 B56A B572 6DBA 6E0A 6DA6 8001 ---[ DS:SI ]--- C000: 55 AA 75 E9 21 02 00 00 00 00 00 00 00 00 00 00 U.u.!... C010: 00 00 00 00 00 00 00 00 D4 01 00 00 00 00 49 42 ..IB C020: 4D 25 00 00 00 00 00 00 00 00 00 00 00 00 00 04 M%.. C030: 20 37 36 31 32 39 35 35 32 30 00 00 00 00 00 00 .761295520.. ---[ ES:DI ]--- 6E0A: 1A 6E 00 00 00 20 28 03 E8 FD 00 00 E8 FD 00 00 .n(. 6E1A: 00 00 A4 0D 10 00 00 00 10 00 00 00 6E 02 28 17 n.(. 6E2A: 00 00 90 D1 00 00 00 00 00 00 00 00 00 00 FF FF 6E3A: 00 00 FF FF 00 00 00 00 00 00 10 00 00 00 00 F0 [ CPU ] AX: 6DBA BX: B5E5 CX: DX: 0001 SI: DI: 6E0A SP: 6DA2 BP: 6DA2 CS: C000 DS: C000 ES: SS: IP: 4610 EIP:4610 CS:IP: C000:4610 (0xC4610) SS:SP: :6DA2 (0x06DA2) SS:BP: :6DA2 (0x06DA2) OF 0 DF 0 IF 1 TF 0 SF 1 ZF 0 AF 1 PF 1 CF 1 ID 0 VIP 0 VIF 0 AC 0 VM 0 RF 0 NT 0 IOPL 0 ---[ CODE ] 0xc4610: push bx 0xc4611: push cx 0xc4612: push dx 0xc4613: push di 0xc4614: push ax 0xc4615: movbx,ax 0xc4617: movcx,ss 0xc4619: moves,cx 0xc461b: movsi,WORD PTR es:[bx+0x2] 0xc461f: movsi,WORD PTR es:[si+0x2] Here's another odd one, this state: ---[ STACK ]--- 6E02 4942 B5D4 B5E7 6DAE 6DB2 B56A B572 6DBA 6E0A 6DA6 8001 ---[ DS:SI ]--- C004: 21 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 !... C014: 00 00 00 00 D4 01 00 00 00 00 49 42 4D 25 00 00 ..IBM%.. C024: 00 00 00 00 00 00 00 00 00 00 00 04 20 37 36 31 .761 C034: 32 39 35 35 32 30 00 00 00 00 00 00 19 02 00 00 295520.. ---[ ES:DI ]--- 6E0A: 1A 6E 00 00 00 20 28 03 E8 FD 00 00 E8 FD 00 00 .n(. 6E1A: 00 00 A4 0D 10 00 00 00 10 00 00 00 6E 02 28 17 n.(. 6E2A: 00 00 90 D1 00 00 00 00 00
[SeaBIOS] vga bios woes
Hi, I was playing a bit with vfio-based PCI device assignment of VGA in qemu and I seem to be hitting a wall just trying to jump into the VGA BIOS. I'm booting qemu with -vga none and assigning a radeon hd5450 via vfio-pci with some extra code to handle passing legacy accesses through to the host. Legacy access hardly seems to matter though as the experiment quickly dies when the vcpu starts executing zero'd memory. gdb shows me something like this: 0x000f257c __callrom+72: 66 c7 44 24 16 ff ffmovw $0x,0x16(%esp) 0x000f2583 __callrom+79: 66 c7 44 24 1a ff ffmovw $0x,0x1a(%esp) 0x000f258a __callrom+86: 66 c7 44 24 08 00 f0movw $0xf000,0x8(%esp) 0x000f2591 __callrom+93: b8 80 d1 0f 00 mov$0xfd180,%eax 0x000f2596 __callrom+98: 66 89 44 24 0a mov%ax,0xa(%esp) 0x000f259b __callrom+103: c1 e5 10shl$0x10,%ebp 0x000f259e __callrom+106: 0f b7 d7movzwl %di,%edx 0x000f25a1 __callrom+109: 09 ea or %ebp,%edx 0x000f25a3 __callrom+111: 89 54 24 26 mov%edx,0x26(%esp) 0x000f25a7 __callrom+115: 89 e0 mov%esp,%eax 0x000f25a9 __callrom+117: 3d 00 70 00 00 cmp$0x7000,%eax 0x000f25ae __callrom+122: 76 0a jbe0xf25ba __callrom+134 0x000f25ba __callrom+134: 89 f0 mov%esi,%eax 0x000f25bc __callrom+136: bb 58 68 00 00 mov$0x6858,%ebx 0x000f25c1 __callrom+141: e8 31 98 00 00 call 0xfbdf7 0x000fbdf7: ba 01 be 00 00 mov$0xbe01,%edx 0x000fbdfc: e9 0e ff ff ff jmp0xfbd0f 0x000fbd0f: 89 c1 mov%eax,%ecx 0x000fbd11: b8 30 00 00 00 mov$0x30,%eax 0x000fbd16: 8e d8 mov%eax,%ds 0x000fbd18: 8e c0 mov%eax,%es 0x000fbd1a: 8e d0 mov%eax,%ss 0x000fbd1c: 8e e0 mov%eax,%fs 0x000fbd1e: 8e e8 mov%eax,%gs 0x000fbd20: 66 ea 26 bd 28 00 ljmpw $0x28,$0xbd26 0xbd26: 00 00 add%al,(%eax) (qemu) xp /16x 0xbd26 bd26: 0x 0x 0x 0x bd36: 0x 0x 0x 0x bd46: 0x 0x 0x 0x bd56: 0x 0x 0x 0x (qemu) xp /16x 0x000c 000c: 0xe975aa55 0x0221 0x 0x 000c0010: 0x 0x 0x01d4 0x4249 000c0020: 0x254d 0x 0x 0x0400 000c0030: 0x31363720 0x35353932 0x3032 0x Trying to follow the code into __callrom(), I'm really confused how the option rom init vector is actually used since callrom() passes the option rom header offset to the init vector rather than anything actually resembling the value of the init vector. I really don't know x86 though, so maybe I'm missing something. The option rom is loaded as a PCI expansion rom from the device, it appears to be non-PnP. The seabios log shows this: Scan for VGA option rom Attempting to init PCI bdf 00:02.0 (vd 1002:68f9) Attempting to map option rom on dev 00:02.0 Option rom sizing returned febe fffe Inspecting possible rom at 0xfebe (vd=1002:68f9 bdf=00:02.0) Copying option rom (size 59904) from 0xfebe to 0x000c handle_08 Checking rom 0x000c (sig aa55 size 117) Running option rom at c000:0003 here we go into the weeds, I'll continue to see periodic handle_08 if I let it run Appreciate any hints to make this work. Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] vga bios woes
On Thu, 2013-01-03 at 19:13 -0500, Kevin O'Connor wrote: On Thu, Jan 03, 2013 at 02:31:43PM -0700, Alex Williamson wrote: Hi, I was playing a bit with vfio-based PCI device assignment of VGA in qemu and I seem to be hitting a wall just trying to jump into the VGA BIOS. I'm booting qemu with -vga none and assigning a radeon hd5450 via vfio-pci with some extra code to handle passing legacy accesses through to the host. Legacy access hardly seems to matter though as the experiment quickly dies when the vcpu starts executing zero'd memory. gdb shows me something like this: [...] 0x000fbd20: 66 ea 26 bd 28 00 ljmpw $0x28,$0xbd26 0xbd26: 00 00 add%al,(%eax) Everything looks okay except for here. I'd guess it's likely just a reporting issue. The code being run is actually at 0xfbd26 - as part of jumping into real-mode, the code has a segment offset (0xf) that must be added in. Trying to follow the code into __callrom(), I'm really confused how the option rom init vector is actually used since callrom() passes the option rom header offset to the init vector rather than anything actually resembling the value of the init vector. I really don't know x86 though, so maybe I'm missing something. The option rom should actually have code (eg, a jmp instruction) at offset 3 of the option rom. So, the goal really is to jump to the 3rd byte of the option rom to execute it. Yes! (gdb) x/i 0xc0003 0xc0003: jmp0xc0229 (gdb) x/10i 0xc0229 0xc0229: push %ax 0xc022b: push %cx 0xc022d: push %dx 0xc022f: push %bx 0xc0231: push %bp 0xc0233: push %si 0xc0235: push %di 0xc0237: push %cs 0xc0238: pop%ds 0xc0239: mov%eax,0xe8c01c2 Thanks for filling in that piece of the puzzle for me. If the above is just a reporting problem, how do I fix it so I can actually step through the rom? Thanks, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] vga bios woes
On Thu, 2013-01-03 at 18:43 -0700, Daniel Verkamp wrote: On Thu, Jan 3, 2013 at 2:31 PM, Alex Williamson alex.william...@redhat.com wrote: Hi, I was playing a bit with vfio-based PCI device assignment of VGA in qemu and I seem to be hitting a wall just trying to jump into the VGA BIOS. I'm booting qemu with -vga none and assigning a radeon hd5450 via vfio-pci with some extra code to handle passing legacy accesses through to the host. Legacy access hardly seems to matter though as the experiment quickly dies when the vcpu starts executing zero'd memory. gdb shows me something like this: 0x000f257c __callrom+72: 66 c7 44 24 16 ff ffmovw $0x,0x16(%esp) 0x000f2583 __callrom+79: 66 c7 44 24 1a ff ffmovw $0x,0x1a(%esp) 0x000f258a __callrom+86: 66 c7 44 24 08 00 f0movw $0xf000,0x8(%esp) 0x000f2591 __callrom+93: b8 80 d1 0f 00 mov$0xfd180,%eax 0x000f2596 __callrom+98: 66 89 44 24 0a mov%ax,0xa(%esp) 0x000f259b __callrom+103: c1 e5 10shl$0x10,%ebp 0x000f259e __callrom+106: 0f b7 d7movzwl %di,%edx 0x000f25a1 __callrom+109: 09 ea or %ebp,%edx 0x000f25a3 __callrom+111: 89 54 24 26 mov%edx,0x26(%esp) 0x000f25a7 __callrom+115: 89 e0 mov%esp,%eax 0x000f25a9 __callrom+117: 3d 00 70 00 00 cmp$0x7000,%eax 0x000f25ae __callrom+122: 76 0a jbe0xf25ba __callrom+134 0x000f25ba __callrom+134: 89 f0 mov%esi,%eax 0x000f25bc __callrom+136: bb 58 68 00 00 mov$0x6858,%ebx 0x000f25c1 __callrom+141: e8 31 98 00 00 call 0xfbdf7 0x000fbdf7: ba 01 be 00 00 mov$0xbe01,%edx 0x000fbdfc: e9 0e ff ff ff jmp0xfbd0f 0x000fbd0f: 89 c1 mov%eax,%ecx 0x000fbd11: b8 30 00 00 00 mov$0x30,%eax 0x000fbd16: 8e d8 mov%eax,%ds 0x000fbd18: 8e c0 mov%eax,%es 0x000fbd1a: 8e d0 mov%eax,%ss 0x000fbd1c: 8e e0 mov%eax,%fs 0x000fbd1e: 8e e8 mov%eax,%gs 0x000fbd20: 66 ea 26 bd 28 00 ljmpw $0x28,$0xbd26 0xbd26: 00 00 add%al,(%eax) I think GDB is probably getting lost at the far jump (ljmpw). To my knowledge, GDB does not understand the x86 segmented real-mode memory model. The trace above is __callrom() - farcall16big() - call16big() - __call16big() - transition16big in romlayout.S, which contains the ljmpw in question. Trying to follow the code into __callrom(), I'm really confused how the option rom init vector is actually used since callrom() passes the option rom header offset to the init vector rather than anything actually resembling the value of the init vector. I really don't know x86 though, so maybe I'm missing something. The far address of the ROM initialization vector (seg:offset) is being loaded into br.code, which will get used later on in the farcall helper functions noted above. The offset is set to OPTION_ROM_INITVECTOR by callrom(), which is 3; this is relative to the segment of the ROM header (hence the FLATPTR_TO_SEG in __callrom). The ROM header should have a jump at that location to the real initialization code. I don't have any real tips on how to make this work - hopefully someone else will know the magic incantation to clue in GDB to the new code segment. This blog post also looks relevant: http://crustynation.net/wiki/doku.php?id=blog:gdb_real_mode Awesome! That gdbinit is magic. Thanks for the pointer, Alex ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] vga bios woes
On Thu, 2013-01-03 at 20:57 -0500, Kevin O'Connor wrote: On Thu, Jan 03, 2013 at 06:41:35PM -0700, Alex Williamson wrote: Thanks for filling in that piece of the puzzle for me. If the above is just a reporting problem, how do I fix it so I can actually step through the rom? Thanks, One other note - I think Darmawan may have been doing something similar with PCI rom debugging - you may wish to reach out to him. Thanks for the connection. For now the link Daniel provided has given me enough to make progress. It looks like this is a case where the VGA bios manages to get the physical address of the device through a legacy VGA register (0x3c3) and shoots itself by using that rather than the emulated address found through config space. I can kludge the offset and get the VGA option rom to finish, but something is still wrong since it doesn't trigger the monitor to sync. Still digging. Thanks for the help, Alex On Sat, Nov 12, 2011 at 03:41:15PM +0700, Darmawan Salihun wrote: I made a blogpost detailing the steps to debug PCI Option ROM with Coreboot+SeaBIOS and a GDB-server-compatible debugger: http://bioshacking.blogspot.com/search/label/PCI%20Option%20ROM Hopefully would help those in need because it takes quite a while to get it right. Thanks to Kevin O'Connor for the helps :-) Regards, Darmawan ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [RESEND PATCH v3] hotplug: add device per func in ACPI DSDT tables
On Wed, 2012-05-09 at 15:24 +0800, Amos Kong wrote: Boot up a Linux VM with 8 pci block devices which are the 8 functions in one pci slot. | # qemu-kvm ... | -drive file=images/u0,if=none,id=drv0,format=qcow2,cache=none \ | -device virtio-blk-pci,drive=drv0,id=v0,multifunction=on,addr=0x03.0 \ | | -drive file=images/u7,if=none,id=drv7,format=qcow2,cache=none \ | -device virtio-blk-pci,drive=drv7,id=v7,multifunction=on,addr=0x03.7 \ Check devices in guest. | vm)# ls /dev/vd* |vda vdb vdc vde vdf vdg vdh | vm)# lspci |grep block | 00:03.0 SCSI storage controller: Red Hat, Inc Virtio block device |... | 00:03.7 SCSI storage controller: Red Hat, Inc Virtio block device | Func1~7 still exist in guest after hot-removing the whole slot through qemu monitor. | vm)# lspci |grep block(00:03.0 disappeared) | 00:03.1 SCSI storage controller: Red Hat, Inc Virtio block device (rev ff) |... | 00:03.7 SCSI storage controller: Red Hat, Inc Virtio block device (rev ff) | vm)# ls /dev/vd* (vda disappeared) |vdb vdc vde vdf vdg vdh | vm)# mkfs /dev/vdb |INFO: task mkfs.ext2:1784 blocked for more than 120 seconds. (task hung) Currently only func0 is defined in ACPI DSDT table of seabios, and only hot-adding func0 would cause a ACPI event for notification. Other funcs except func0 wouldn't be registered in linux pci driver. (we can only found func0 in slot-funcs list of pci driver). When VM pci driver receives an ACPI event for hot-removing, it will only clean functions in slot-funcs list, the other funcs could not be cleaned. This patch adds device per function in ACPI DSDT tables, then all funcs will be registered in slot-funcs list. It's coincident with microsoft's example: http://www.microsoft.com/china/whdc/system/pnppwr/hotadd/hotplugpci.mspx#EUH Have tested with linux/winxp/win7, hot-adding/hot-remving, single/multiple function devices, they are all fine(all added devices can be removed). This patch includes some bits mst wrote, thanks! --- old discussion: http://marc.info/?l=kvmm=132428400917405w=2 Signed-off-by: Amos Kong ak...@redhat.com CC: Michael S. Tsirkin m...@redhat.com --- src/ssdt-pcihp.dsl | 17 src/ssdt-pcihp.hex | 8869 +++- 2 files changed, 8781 insertions(+), 105 deletions(-) diff --git a/src/ssdt-pcihp.dsl b/src/ssdt-pcihp.dsl index 4b435b8..2a3c326 100644 --- a/src/ssdt-pcihp.dsl +++ b/src/ssdt-pcihp.dsl @@ -17,14 +17,23 @@ DefinitionBlock (ssdt-pcihp.aml, SSDT, 0x01, BXPC, BXSSDTPCIHP, 0x1) // at runtime, if the slot is detected to not support hotplug. // Extract the offset of the address dword and the // _EJ0 name to allow this patching. -#define hotplug_slot(slot) \ -Device (S##slot) { \ +#define hotplug_func(slot, fn) \ +Device (S##slot##fn) { \ ACPI_EXTRACT_NAME_DWORD_CONST aml_adr_dword \ - Name (_ADR, 0x##slot##) \ I would have guessed it to be sufficient to change _ADR to 0x##slot##, does that not work? Thanks, Alex + Name (_ADR, 0x##slot##000##fn) \ ACPI_EXTRACT_METHOD_STRING aml_ej0_name \ Method (_EJ0, 1) { Return(PCEJ(0x##slot)) } \ Name (_SUN, 0x##slot)\ } +#define hotplug_slot(slot) \ +hotplug_func(slot, 0) \ +hotplug_func(slot, 1) \ +hotplug_func(slot, 2) \ +hotplug_func(slot, 3) \ +hotplug_func(slot, 4) \ +hotplug_func(slot, 5) \ +hotplug_func(slot, 6) \ +hotplug_func(slot, 7) hotplug_slot(01) hotplug_slot(02) @@ -59,7 +68,7 @@ DefinitionBlock (ssdt-pcihp.aml, SSDT, 0x01, BXPC, BXSSDTPCIHP, 0x1) hotplug_slot(1f) #define gen_pci_hotplug(slot) \ -If (LEqual(Arg0, 0x##slot)) { Notify(S##slot, Arg1) } +If (LEqual(Arg0, 0x##slot)) { Notify(S##slot##0, Arg1) } Method(PCNT, 2) { gen_pci_hotplug(01) ___ SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
Re: [SeaBIOS] [PATCH] seabios: acpi: Add _STA for PCI hotplug slots
On Sun, 2012-03-04 at 20:53 +0200, Michael S. Tsirkin wrote: On Fri, Feb 24, 2012 at 04:21:17PM -0700, Alex Williamson wrote: When a Status method is provided on a slot, the OSPM evaluates _STA in response to the device check notify on the slot. This allows some degree of a handshake between the platform and the OSPM that the hotplug has been acknowledged. In order to implement _STA, we need to know which slots have devices. A slot with device returns 0x0F, a slot without a device returns Zero. We get this information from Qemu using the 0xae08 I/O port register. This was previously the read-side of the register written to commit a device eject and always returned 0 on read. It now returns a bitmap of present slots, so we know that reading 0 means we have and old Qemu and dynamically modify our SSDT to rename the _STA methods. This is necessary to allow backwards compatibility. Interesting. Isn't the UP register sufficient for _STA? No, UP only reports the current slot being added, so we'd only be able to report a present value for that slot and not static or previously added slots. Assuming we want to implement _STA - for which the only motivation seems the handshake hack below. The _STA method also writes the slot identifier to I/O port register 0xae00 as an acknowledgment of the hotplug request. This part looks a bit like a hack. _STA is not intended as an acknowledgement - it's a query for state. ACPI spec 5.0 requires that _STA is called before _INI, but ACPI 1.0b doesn't. Did you try some 1.0 OSPMs (e.g. XP) to see what they do? I did test with XP. Section 6.3 of ACPI spec 1.0b references the _STA method during hotplug. I also found this reference for Windows ACPI procedure for hotplug/unplug: http://www.microsoft.com/china/whdc/system/pnppwr/hotadd/hotplugpci.mspx#EYH I agree, _STA is not intended as an acknowledgment, but that doesn't mean we can't use it as one. The OSPM can call _STA at any point in time, however calling it after we've done a notify for device check is about the best indication we can get that the OSPM is processing it. It doesn't hurt anything if _STA is called spuriously. I also think I see how this can cause a race, see below. Signed-off-by: Alex Williamson alex.william...@redhat.com Your description of the qemu patches made me think that all you really want is detect an OS without OSPM. If that is the case, I would suggest adding an _INI method at top level as a simpler and more robust procedure. No, having OSPM is a prerequisite, but does not imply supporting hotplug. Otherwise, how about implementing _PS0 (and probably _PS3) to manage slot power? Maybe this what you are really after, and it seems like a better interface than 'acknowledge' which does not seem to make sense for real hardware. I tried this, _PS0/3 also requires _STA. Implementing both caused interrupts to stop working on Linux guests. Note that _PS0/3 is even less closely associated with device removal in 1.0b than _STA even though the MSFT document references it. --- src/acpi-dsdt.dsl | 36 ++- src/acpi-dsdt.hex | 124 ++ src/acpi.c | 27 ++ src/ssdt-pcihp.dsl |3 src/ssdt-pcihp.hex | 658 5 files changed, 686 insertions(+), 162 deletions(-) diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 7082b65..6b87086 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -119,17 +119,15 @@ DefinitionBlock ( prt_slot3(0x001f), }) -OperationRegion(PCST, SystemIO, 0xae00, 0x08) +OperationRegion(PCST, SystemIO, 0xae00, 0x0c) Field (PCST, DWordAcc, NoLock, WriteAsZeros) { -PCIU, 32, -PCID, 32, -} - -OperationRegion(SEJ, SystemIO, 0xae08, 0x04) -Field (SEJ, DWordAcc, NoLock, WriteAsZeros) -{ -B0EJ, 32, +// PCI Up/ACK +PUPA, 32, +// PCI Down +PDWN, 32, +// PCI Present/Eject +PPEJ, 32, Note on the comment: this only affects bus0 not all of PCI. As has always been the case. } Name (_CRS, ResourceTemplate () @@ -462,10 +460,20 @@ DefinitionBlock ( /* Methods called by hotplug devices */ Method (PCEJ, 1, NotSerialized) { // _EJ0 method - eject callback -Store(ShiftLeft(1, Arg0), B0EJ) +Store(ShiftLeft(1, Arg0), PPEJ) Return (0x0) } +Method (PSTA, 1, NotSerialized) { +Store(ShiftLeft(1, Arg0), PUPA) So this looks wrong to me. Specifically _STA is also called at the end after _EJ0. If the device is ejected then insterted, you get a window where
[SeaBIOS] [PATCH] seabios: acpi: Add _STA for PCI hotplug slots
When a Status method is provided on a slot, the OSPM evaluates _STA in response to the device check notify on the slot. This allows some degree of a handshake between the platform and the OSPM that the hotplug has been acknowledged. In order to implement _STA, we need to know which slots have devices. A slot with device returns 0x0F, a slot without a device returns Zero. We get this information from Qemu using the 0xae08 I/O port register. This was previously the read-side of the register written to commit a device eject and always returned 0 on read. It now returns a bitmap of present slots, so we know that reading 0 means we have and old Qemu and dynamically modify our SSDT to rename the _STA methods. This is necessary to allow backwards compatibility. The _STA method also writes the slot identifier to I/O port register 0xae00 as an acknowledgment of the hotplug request. Signed-off-by: Alex Williamson alex.william...@redhat.com --- src/acpi-dsdt.dsl | 36 ++- src/acpi-dsdt.hex | 124 ++ src/acpi.c | 27 ++ src/ssdt-pcihp.dsl |3 src/ssdt-pcihp.hex | 658 5 files changed, 686 insertions(+), 162 deletions(-) diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 7082b65..6b87086 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -119,17 +119,15 @@ DefinitionBlock ( prt_slot3(0x001f), }) -OperationRegion(PCST, SystemIO, 0xae00, 0x08) +OperationRegion(PCST, SystemIO, 0xae00, 0x0c) Field (PCST, DWordAcc, NoLock, WriteAsZeros) { -PCIU, 32, -PCID, 32, -} - -OperationRegion(SEJ, SystemIO, 0xae08, 0x04) -Field (SEJ, DWordAcc, NoLock, WriteAsZeros) -{ -B0EJ, 32, +// PCI Up/ACK +PUPA, 32, +// PCI Down +PDWN, 32, +// PCI Present/Eject +PPEJ, 32, } Name (_CRS, ResourceTemplate () @@ -462,10 +460,20 @@ DefinitionBlock ( /* Methods called by hotplug devices */ Method (PCEJ, 1, NotSerialized) { // _EJ0 method - eject callback -Store(ShiftLeft(1, Arg0), B0EJ) +Store(ShiftLeft(1, Arg0), PPEJ) Return (0x0) } +Method (PSTA, 1, NotSerialized) { +Store(ShiftLeft(1, Arg0), PUPA) +Store(PPEJ, Local0) +If (And(Local0, ShiftLeft(1, Arg0))) { +Return(0x0F) +} Else { +Return(Zero) +} +} + /* Hotplug notification method supplied by SSDT */ External (\_SB.PCI0.PCNT, MethodObj) @@ -473,12 +481,16 @@ DefinitionBlock ( Method(PCNF, 0) { // Local0 = iterator Store (Zero, Local0) +// Local1 = slots marked up +Store (PUPA, Local1) +// Local2 = slots marked down +Store (PDWN, Local2) While (LLess(Local0, 31)) { Increment(Local0) -If (And(PCIU, ShiftLeft(1, Local0))) { +If (And(Local1, ShiftLeft(1, Local0))) { PCNT(Local0, 1) } -If (And(PCID, ShiftLeft(1, Local0))) { +If (And(Local2, ShiftLeft(1, Local0))) { PCNT(Local0, 3) } } diff --git a/src/acpi-dsdt.hex b/src/acpi-dsdt.hex index 5dc7bb4..6d99f53 100644 --- a/src/acpi-dsdt.hex +++ b/src/acpi-dsdt.hex @@ -3,12 +3,12 @@ static unsigned char AmlCode[] = { 0x53, 0x44, 0x54, -0xd3, +0xeb, 0x10, 0x0, 0x0, 0x1, -0x2d, +0x10, 0x42, 0x58, 0x50, @@ -110,16 +110,16 @@ static unsigned char AmlCode[] = { 0x47, 0x42, 0x10, -0x44, -0x81, +0x40, +0x80, 0x5f, 0x53, 0x42, 0x5f, 0x5b, 0x82, -0x4c, -0x80, +0x48, +0x7f, 0x50, 0x43, 0x49, @@ -2014,47 +2014,27 @@ static unsigned char AmlCode[] = { 0x0, 0xae, 0xa, -0x8, +0xc, 0x5b, 0x81, -0x10, +0x15, 0x50, 0x43, 0x53, 0x54, 0x43, 0x50, -0x43, -0x49, 0x55, +0x50, +0x41, 0x20, 0x50, -0x43, -0x49, 0x44, +0x57, +0x4e, 0x20, -0x5b, -0x80, -0x53, -0x45, -0x4a, -0x5f, -0x1, -0xb, -0x8, -0xae, -0xa, -0x4, -0x5b, -0x81, -0xb, -0x53, -0x45, -0x4a, -0x5f, -0x43, -0x42, -0x30, +0x50, +0x50, 0x45, 0x4a, 0x20, @@ -3039,8 +3019,8 @@ static unsigned char AmlCode[] = { 0x4a, 0x20, 0x10, -0x46, -0x5, +0x42, +0x8, 0x2e, 0x5f, 0x53, @@ -3062,14 +3042,52 @@ static unsigned char AmlCode[] = { 0x1, 0x68, 0x0, -0x42, -0x30, +0x50, +0x50, 0x45, 0x4a, 0xa4, 0x0, 0x14, -0x38, +0x25, +0x50, +0x53, +0x54, +0x41, +0x1, +0x70, +0x79, +0x1, +0x68, +0x0, +0x50, +0x55, +0x50, +0x41, +0x70, +0x50, +0x50, +0x45, +0x4a, +0x60, +0xa0, +0xb, +0x7b, +0x60, +0x79, +0x1, +0x68, +0x0, +0x0, +0xa4, +0xa, +0xf, +0xa1, +0x3, +0xa4, +0x0, +0x14, +0x3e, 0x50, 0x43, 0x4e, @@ -3078,8 +3096,20