Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Michael S. Tsirkin
On Mon, Jun 10, 2013 at 01:58:41PM -0500, Anthony Liguori wrote:
> "Michael S. Tsirkin"  writes:
> 
> > This adds support for device hotplug behind
> > pci bridges. Bridge devices themselves need
> > to be pre-configured on qemu command line.
> >
> > One of the goals of this project was to
> > demonstrate the advantages of generating
> > ACPI tables in QEMU.
> > In my opinion, it does this successfully.
> 
> Since you've gone out of your way to make this difficult to actually
> review...

What's wrong?

> >
> >* This saves the need to provide a new interface for firmware
> >  to discover bus number to pci brige mapping
> 
> Do you mean fw_cfg?  The BIOS normally does bus numbering.  I see no
> reason for it not to.

I did my best to explain this. ACPI spec explicitly states
config access to devices not on bus 0 is forbidden.
This is to allow OS to renumber the bus at any time.
So you have pci bus number but can't use it.

> >* No need for yet another interface to bios to detect qemu version
> >  to check if it's safe to activate new code,
> >  and to ship multiple table versions:
> 
> We only care about supporting one version of SeaBIOS.  SeaBIOS should
> only care about supporting one version of QEMU.  We're not asking it to
> support multiple versions.

We do e.g. when we run with -M1.2 and migrate to old QEMU
early in boot process.


> >  we generated code so we know this is new qemu
> >* Use of glib's GArray makes it much easier to build
> >  up tables in code without need for iasl and code patching
> 
> Adding a dynamic array to SeaBIOS isn't rocket science.

Glib reimplementation in bios will be more code than using existing
glib.

> >
> > I have also tried to address the concerns that
> > Anthony has raised with this approach, please see below.
> >
> > Design:
> > - each bus gets assigned a number 0-255
> > - generated ACPI code writes this number
> >   to a new BSEL register, then uses existing
> >   UP/DOWN registers to probe slot status;
> >   to eject, write number to BSEL register,
> >   then slot into existing EJ
> >
> > This is to address the ACPI spec requirement to
> > avoid config cycle access to any bus except PCI roots.
> >
> > Portability:
> > - Non x86 (or any Linux) platforms don't need any of this code.
> >   They can keep happily using SHPC the way
> >   they always did.
> >
> > Things to note:
> > - this is on top of acpi patchset posted a month ago,
> >   with a small patch adding a core function to walk all
> >   pci buses, on top.
> >   Can also be found in my git tree
> > git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git acpi
> >
> > - Extensive use of glib completely removes
> >   pointer math in new code: we use
> >   g_array_append_vals exclusively.
> >   There's no code patching in new code.
> >   This is in response to comments about security
> >   concerns with adding code to qemu.
> >   In this sense it is more secure than existing
> >   code in hw/acpi/core.c - that can be switched to
> >   glib if desired.
> >
> > - New code (not imported from seabios)
> >   uses glib to reduce dependency on iasl.
> >   With time all code can be rewritted to
> >   use glib and avoid iasl, if desired.
> >
> > - As was the case previously,
> >   systems that lack working iasl are detected at configure time,
> >   pre-generated hex files in source tree are used in this case.
> >   This addresses the concern with iasl/big-endian
> >   systems.
> >
> > - Compile tested only. Migration is known to be
> >   broken - there are a bunch of TODO tags in code.
> >   It was agreed on the last qemu conf meeting that
> >   this code is posted without looking at migration,
> >   with understanding that it is addressed before
> >   being merged. Please do not mistake this
> >   limitation for a fundamental one - I have a
> >   very good idea how to fix it, including
> >   cross-version migration.
> >
> > - Cross version migration: when running with -M 1.5
> >   and older, all ACPI table generation should be disabled.
> >   We'll present FW_CFG interface compatible with 1.5.
> 
> So TL;DR, you break a bunch of stuff

Confused.
When you asked me to write this up during last conf call,
you explicitly said not to bother with migration?
Almost any change affecting guest needs migration
related work.

> and introduce a mess of code.

What did you expect? New functionality with less code?

>  It
> would be less code and wouldn't break anything to add this logic to
> SeaBIOS.

It will be much more code - we would need to add a bunch of interfaces
to qemu then code using that to seabios, worry about running this
seabios on old qemu, etc etc.

> How is this even a discussion?

Interface change is minimized.

Interface to bios is not changed at all.
Interface to gu

Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Anthony Liguori
Hi Kevin,

On Mon, Jun 10, 2013 at 7:23 PM, Kevin O'Connor  wrote:
> On Mon, Jun 10, 2013 at 06:34:29PM -0500, Anthony Liguori wrote:
>> Kevin O'Connor  writes:
>>
>> For the MADT table, right now SeaBIOS needs the CPU count.  That can be
>> found by counting the number of CPU nodes.  Today cpus are unattached so
>> they appear in /machine/unattached but we should put them in a
>> /machine/cpu container for clarity.
>
> SeaBIOS needs much more than the CPU count.  The madt contains info on
> each interrupt - its global irq number, the legacy irq number, the
> acpi defined type of the irq, and the acpi defined flags for the irq.
> It also contains similar info on each cpu (including apic id), the io
> apic, and NMIs.

See below.

>> QOM is the full representation of the device state so we should not ever
>> need to add additional things to fw_cfg.  More likely than not, when
>> SeaBIOS needs more information, it's already there so added
>> functionality will probably Just Work with older QEMUs.
>>
>> > I think it would also be
>> > useful if you could do the same for a couple DSDT entries (eg,
>> > \_SB.PCI0, \_SB.PCI0.ISA) and also describe how you plan to have the
>> > guest build the AML from that info.
>>
>> Likewise the slot count should be available too.  We hard code slots
>> today but it is something we should model properly.  We would model it
>> using QOM of course.
>>
>> Internally within QEMU, this initial discussion started by saying that
>> any ACPI generation within QEMU should happen strictly with QOM
>> methods.  This was the crux of my argument, if QOM is the only interface
>> we need, everything is there for the firmware to do the same job that
>> QEMU would be doing.
>
> I think we keep talking past each other.  You seem to be pointing out
> that the information seabios uses to patch its hardcoded tables can be
> passed in via QOM.  Agreed, it can.  I'm pointing out all the info
> that is hardcoded in seabios - I don't see how that can be passed via
> QOM.

No, we aren't talking past each other.  We both have the same goal.

A lot of what is hard coded in SeaBIOS is hard coded in QEMU too.  IRQ
routing is a good example of that.  We want to make this information
dynamic.

Part of the trouble is that we haven't had a need to not hard code it
because this is only information consumed by the BIOS.

> All the hardcoded data in seabios is a problem, because when it
> changes (and it frequently does) it requires changes to both QEMU and
> SeaBIOS and those changes have to be coordinated.  The key reason for
> moving the tables into QEMU is not that it can better patch the tables
> - the advantage is that it can hardcode the tables that need patching.

So let's fix it.  It's very easy to add read-only properties to QOM so
we can hard code the bits that are in SeaBIOS now as read-only
properties.  For the MADT table, the per-CPU and IOAPIC info can
simply be properties of those devices.  We already represent irqs in
QOM as integers so I suspect much of the information SeaBIOS needs is
already there.

> I can cite several recent examples of seabios change requests that
> require changing of the hardcoded tables: liguang wants to add an
> "embedded controller" hardware device which requires changes to
> seabios' dsdt, Corey Minyard wants to add IPMI hardware support (both
> smbios changes and DSDT changes), and Corey Bryant wants to add TPM
> hardware support.
>
> How do we solve the problem of seabios having a ton of hardcoded
> information about the qemu hardware, and seabios having to change with
> the hardware modifications that qemu makes?

I think that we can pretty much touch a table once pulling all of the
info from QOM and then from a SeaBIOS point of view, never have to
touch it again.

The benefit is that solving this problem for x86 solves it for other
architectures too and also lays the ground work to let a user actually
control these bits too.

Regards,

Anthony Liguori

> -Kevin

___
SeaBIOS mailing list
SeaBIOS@seabios.org
http://www.seabios.org/mailman/listinfo/seabios


Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Anthony Liguori
David Woodhouse  writes:

> On Mon, 2013-06-10 at 18:34 -0500, Anthony Liguori wrote:
>> Internally within QEMU, this initial discussion started by saying that
>> any ACPI generation within QEMU should happen strictly with QOM
>> methods.  This was the crux of my argument, if QOM is the only
>> interface we need, everything is there for the firmware to do the same
>> job that QEMU would be doing.
>
> That's nice in theory, but I'm not sure how it works as things evolve
> and new things / new features get exposed. The firmware's
> *interpretation* of the QOM tree needs to be kept in sync with qemu.
>
> Hm, make that: The firmwares' *interpretation*…
>
> Let's take a specific, recent example. We fixed the PIIX4 code to
> actually handle the hard reset on port 0xcf9. We need to fix the ACPI
> tables to indicate a usable RESET_REG.

Very good example.

Normally, we try to be bug-for-bug compatible for guests such that -M
pc-1.4 would behave exactly the same.

In this case, we failed to introduce a property to control this behavior
like we should have.  If this changes the guest ACPI tables, it
definitely should definitely be set based on a property.

This is a good example of why this approach is good for QEMU, it helps
us catch stuff like this.

> How is that exposed in the QOM tree, and how does it all work? With qemu
> exposing ACPI tables in their close-to-final form, it's just fine. Boot
> with a recent qemu and it's all nice and shiny; boot with an old qemu
> and it doesn't reset properly.

If we did this right in QEMU, we'd have to introduce a QOM property
anyway as that's how we trigger differences in machine behavior.  And -M
pc-1.4 ought to generate the same tables as qemu 1.4 did.

Regards,

Anthony Liguori

> But if the firmware has to be updated to interpret the new feature
> advertised in the QOM tree and translate it into the ACPI table, then we
> haven't really got much of an improvement.
>
> Please explain how this is supposed to work in *practice*.
>
> -- 
> dwmw2

___
SeaBIOS mailing list
SeaBIOS@seabios.org
http://www.seabios.org/mailman/listinfo/seabios


Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Kevin O'Connor
On Mon, Jun 10, 2013 at 04:45:53PM -0500, Anthony Liguori wrote:
> This discussion comes down to two things I think: (a) our existing
> firmware interface is pretty poor (b) we are duplicating work because of
> firmware licensing.
> 
> We can fix (a) and there's lots of value in doing that in terms of
> improving support for other architectures.  We've discussed (b) in other
> threads and I've stated my opinion on the direction we need to take.

I'm not concerned about (b).

I'm quite curious how you are planning to solve (a).  I think it would
help move this conversation forward if you could take a couple acpi
tables in use today (eg, madt, srat) and describe the future format
and location for each field in those tables.  I think it would also be
useful if you could do the same for a couple DSDT entries (eg,
\_SB.PCI0, \_SB.PCI0.ISA) and also describe how you plan to have the
guest build the AML from that info.

-Kevin

___
SeaBIOS mailing list
SeaBIOS@seabios.org
http://www.seabios.org/mailman/listinfo/seabios


Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread David Woodhouse
On Mon, 2013-06-10 at 18:34 -0500, Anthony Liguori wrote:
> Internally within QEMU, this initial discussion started by saying that
> any ACPI generation within QEMU should happen strictly with QOM
> methods.  This was the crux of my argument, if QOM is the only
> interface we need, everything is there for the firmware to do the same
> job that QEMU would be doing.

That's nice in theory, but I'm not sure how it works as things evolve
and new things / new features get exposed. The firmware's
*interpretation* of the QOM tree needs to be kept in sync with qemu.

Hm, make that: The firmwares' *interpretation*…

Let's take a specific, recent example. We fixed the PIIX4 code to
actually handle the hard reset on port 0xcf9. We need to fix the ACPI
tables to indicate a usable RESET_REG.

How is that exposed in the QOM tree, and how does it all work? With qemu
exposing ACPI tables in their close-to-final form, it's just fine. Boot
with a recent qemu and it's all nice and shiny; boot with an old qemu
and it doesn't reset properly.

But if the firmware has to be updated to interpret the new feature
advertised in the QOM tree and translate it into the ACPI table, then we
haven't really got much of an improvement.

Please explain how this is supposed to work in *practice*.

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
SeaBIOS mailing list
SeaBIOS@seabios.org
http://www.seabios.org/mailman/listinfo/seabios


Re: [SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Anthony Liguori
"Michael S. Tsirkin"  writes:

> This adds support for device hotplug behind
> pci bridges. Bridge devices themselves need
> to be pre-configured on qemu command line.
>
> One of the goals of this project was to
> demonstrate the advantages of generating
> ACPI tables in QEMU.
> In my opinion, it does this successfully.

Since you've gone out of your way to make this difficult to actually
review...

>
>* This saves the need to provide a new interface for firmware
>  to discover bus number to pci brige mapping

Do you mean fw_cfg?  The BIOS normally does bus numbering.  I see no
reason for it not to.

>* No need for yet another interface to bios to detect qemu version
>  to check if it's safe to activate new code,
>  and to ship multiple table versions:

We only care about supporting one version of SeaBIOS.  SeaBIOS should
only care about supporting one version of QEMU.  We're not asking it to
support multiple versions.

>  we generated code so we know this is new qemu
>* Use of glib's GArray makes it much easier to build
>  up tables in code without need for iasl and code patching

Adding a dynamic array to SeaBIOS isn't rocket science.

>
> I have also tried to address the concerns that
> Anthony has raised with this approach, please see below.
>
> Design:
> - each bus gets assigned a number 0-255
> - generated ACPI code writes this number
>   to a new BSEL register, then uses existing
>   UP/DOWN registers to probe slot status;
>   to eject, write number to BSEL register,
>   then slot into existing EJ
>
> This is to address the ACPI spec requirement to
> avoid config cycle access to any bus except PCI roots.
>
> Portability:
> - Non x86 (or any Linux) platforms don't need any of this code.
>   They can keep happily using SHPC the way
>   they always did.
>
> Things to note:
> - this is on top of acpi patchset posted a month ago,
>   with a small patch adding a core function to walk all
>   pci buses, on top.
>   Can also be found in my git tree
> git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git acpi
>
> - Extensive use of glib completely removes
>   pointer math in new code: we use
>   g_array_append_vals exclusively.
>   There's no code patching in new code.
>   This is in response to comments about security
>   concerns with adding code to qemu.
>   In this sense it is more secure than existing
>   code in hw/acpi/core.c - that can be switched to
>   glib if desired.
>
> - New code (not imported from seabios)
>   uses glib to reduce dependency on iasl.
>   With time all code can be rewritted to
>   use glib and avoid iasl, if desired.
>
> - As was the case previously,
>   systems that lack working iasl are detected at configure time,
>   pre-generated hex files in source tree are used in this case.
>   This addresses the concern with iasl/big-endian
>   systems.
>
> - Compile tested only. Migration is known to be
>   broken - there are a bunch of TODO tags in code.
>   It was agreed on the last qemu conf meeting that
>   this code is posted without looking at migration,
>   with understanding that it is addressed before
>   being merged. Please do not mistake this
>   limitation for a fundamental one - I have a
>   very good idea how to fix it, including
>   cross-version migration.
>
> - Cross version migration: when running with -M 1.5
>   and older, all ACPI table generation should be disabled.
>   We'll present FW_CFG interface compatible with 1.5.

So TL;DR, you break a bunch of stuff and introduce a mess of code.  It
would be less code and wouldn't break anything to add this logic to
SeaBIOS.

How is this even a discussion?

Regards,

Anthony Liguori

>
> Cc: Jordan Justen 
> Cc: Anthony Liguori 
> Cc: Laszlo Ersek 
> Cc: Gerd Hoffmann 
> Cc: seabios@seabios.org
> Cc: David Woodhouse 
> Cc: Kevin O'Connor 
> Cc: Peter Maydell 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  docs/specs/acpi_pci_hotplug.txt  |   8 +
>  include/hw/i386/pc.h |   4 +-
>  hw/i386/acpi-dsdt.dsl|  36 +++-
>  hw/i386/ssdt-pcihp.dsl   |  51 -
>  hw/acpi/piix4.c  | 145 ++
>  hw/i386/acpi-build.c | 411 
> ++-
>  hw/i386/Makefile.objs|   2 +-
>  hw/i386/ssdt-pcihp.hex.generated | 108 --
>  8 files changed, 510 insertions(+), 255 deletions(-)
>  delete mode 100644 hw/i386/ssdt-pcihp.dsl
>  delete mode 100644 hw/i386/ssdt-pcihp.hex.generated
>
> diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
> index a839434..951b99a 100644
> --- a/docs/specs/acpi_pci_hotplug.txt
> +++ b/docs/specs/acpi_pci_hotplug.txt
> @@ -43,3 +43,11 @@ PCI removability status (IO port 0xae0c-0xae0f, 4-byte 
> access):
>  
>  Used by ACPI BIOS _RMV m

[SeaBIOS] [PATCH] qemu: piix: PCI bridge ACPI hotplug support

2013-06-10 Thread Michael S. Tsirkin
This adds support for device hotplug behind
pci bridges. Bridge devices themselves need
to be pre-configured on qemu command line.

One of the goals of this project was to
demonstrate the advantages of generating
ACPI tables in QEMU.
In my opinion, it does this successfully.

   * This saves the need to provide a new interface for firmware
 to discover bus number to pci brige mapping
   * No need for yet another interface to bios to detect qemu version
 to check if it's safe to activate new code,
 and to ship multiple table versions:
 we generated code so we know this is new qemu
   * Use of glib's GArray makes it much easier to build
 up tables in code without need for iasl and code patching

I have also tried to address the concerns that
Anthony has raised with this approach, please see below.

Design:
- each bus gets assigned a number 0-255
- generated ACPI code writes this number
  to a new BSEL register, then uses existing
  UP/DOWN registers to probe slot status;
  to eject, write number to BSEL register,
  then slot into existing EJ

This is to address the ACPI spec requirement to
avoid config cycle access to any bus except PCI roots.

Portability:
- Non x86 (or any Linux) platforms don't need any of this code.
  They can keep happily using SHPC the way
  they always did.

Things to note:
- this is on top of acpi patchset posted a month ago,
  with a small patch adding a core function to walk all
  pci buses, on top.
  Can also be found in my git tree
git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git acpi

- Extensive use of glib completely removes
  pointer math in new code: we use
  g_array_append_vals exclusively.
  There's no code patching in new code.
  This is in response to comments about security
  concerns with adding code to qemu.
  In this sense it is more secure than existing
  code in hw/acpi/core.c - that can be switched to
  glib if desired.

- New code (not imported from seabios)
  uses glib to reduce dependency on iasl.
  With time all code can be rewritted to
  use glib and avoid iasl, if desired.

- As was the case previously,
  systems that lack working iasl are detected at configure time,
  pre-generated hex files in source tree are used in this case.
  This addresses the concern with iasl/big-endian
  systems.

- Compile tested only. Migration is known to be
  broken - there are a bunch of TODO tags in code.
  It was agreed on the last qemu conf meeting that
  this code is posted without looking at migration,
  with understanding that it is addressed before
  being merged. Please do not mistake this
  limitation for a fundamental one - I have a
  very good idea how to fix it, including
  cross-version migration.

- Cross version migration: when running with -M 1.5
  and older, all ACPI table generation should be disabled.
  We'll present FW_CFG interface compatible with 1.5.

Cc: Jordan Justen 
Cc: Anthony Liguori 
Cc: Laszlo Ersek 
Cc: Gerd Hoffmann 
Cc: seabios@seabios.org
Cc: David Woodhouse 
Cc: Kevin O'Connor 
Cc: Peter Maydell 
Signed-off-by: Michael S. Tsirkin 
---
 docs/specs/acpi_pci_hotplug.txt  |   8 +
 include/hw/i386/pc.h |   4 +-
 hw/i386/acpi-dsdt.dsl|  36 +++-
 hw/i386/ssdt-pcihp.dsl   |  51 -
 hw/acpi/piix4.c  | 145 ++
 hw/i386/acpi-build.c | 411 ++-
 hw/i386/Makefile.objs|   2 +-
 hw/i386/ssdt-pcihp.hex.generated | 108 --
 8 files changed, 510 insertions(+), 255 deletions(-)
 delete mode 100644 hw/i386/ssdt-pcihp.dsl
 delete mode 100644 hw/i386/ssdt-pcihp.hex.generated

diff --git a/docs/specs/acpi_pci_hotplug.txt b/docs/specs/acpi_pci_hotplug.txt
index a839434..951b99a 100644
--- a/docs/specs/acpi_pci_hotplug.txt
+++ b/docs/specs/acpi_pci_hotplug.txt
@@ -43,3 +43,11 @@ PCI removability status (IO port 0xae0c-0xae0f, 4-byte 
access):
 
 Used by ACPI BIOS _RMV method to indicate removability status to OS. One
 bit per slot.  Read-only
+
+PCI bus selector (IO port 0xae10-0xae13, 4-byte access):
+---
+
+Used by ACPI BIOS methods to select the current PCI bus.
+When written, makes all the other PCI registers (0xae00 - 0xae0f)
+to refer to the appropriate bus.
+0 selects PCI root bus (default).
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f8d0871..66ec787 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -38,7 +38,9 @@ struct PcGuestInfo {
 bool s3_disabled;
 bool s4_disabled;
 uint8_t s4_val;
-DECLARE_BITMAP(slot_hotplug_enable, PCI_SLOT_MAX);
+#define GUEST_INFO_MAX_HOTPLUG_BUS 256
+PCIBus *hotplug_buses[GUEST_INFO_MAX_HOTPLUG_BUS];
+DECLARE_BITMAP(hotplug_enable[GUEST_INFO_MAX_HOTPLUG_BUS], PCI_SLO