On Wed, May 08, 2013 at 02:07:24PM +0300, Michael S. Tsirkin wrote:
> On Wed, May 08, 2013 at 01:59:12PM +0300, Gleb Natapov wrote:
> > On Wed, May 08, 2013 at 01:43:25PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, May 08, 2013 at 01:34:59PM +0300, Gleb Natapov wrote:
> > > > On Wed, May 08, 2013 at 01:29:12PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, May 08, 2013 at 12:31:50PM +0300, Gleb Natapov wrote:
> > > > > > On Tue, May 07, 2013 at 07:01:13PM -0400, Kevin O'Connor wrote:
> > > > > > > On Tue, May 07, 2013 at 09:00:48PM +0300, Michael S. Tsirkin 
> > > > > > > wrote:
> > > > > > > > On Thu, Apr 25, 2013 at 12:02:20PM +0300, Michael S. Tsirkin 
> > > > > > > > wrote:
> > > > > > > > > Untested yet, but I thought I'd share the
> > > > > > > > > BIOS bits so we can agree on direction.
> > > > > > > > > 
> > > > > > > > > In particular check out ROM sizes:
> > > > > > > > > - Before patchset with DSDT enabled
> > > > > > > > >     Total size: 127880  Fixed: 59060  Free: 3192 (used 97.6% 
> > > > > > > > > of 128KiB rom)
> > > > > > > > > - Before patchset with DSDT disabled
> > > > > > > > >     Total size: 122844  Fixed: 58884  Free: 8228 (used 93.7% 
> > > > > > > > > of 128KiB rom)
> > > > > > > > > - After patchset:
> > > > > > > > >     Total size: 128776  Fixed: 59100  Free: 2296 (used 98.2% 
> > > > > > > > > of 128KiB rom)
> > > > > > > > > - Legacy disabled at build time:
> > > > > > > > >     Total size: 119836  Fixed: 58996  Free: 11236 (used 91.4% 
> > > > > > > > > of 128KiB rom)
> > > > > > > > > 
> > > > > > > > > As can be seen from this, most size savings come
> > > > > > > > > from dropping DSDT, but we do save a bit by removing
> > > > > > > > > other tables. Of course the real reason to move tables to QEMU
> > > > > > > > > is so that ACPI can better match hardware.
> > > > > > > > > 
> > > > > > > > > This patchset adds an option to move all code for formatting 
> > > > > > > > > acpi tables
> > > > > > > > > out of BIOS. With this, QEMU has full control over the table 
> > > > > > > > > layout.
> > > > > > > > > All tables are loaded from the new "/etc/acpi/" directory.
> > > > > > > > > Any entries in this directory cause BIOS to disable
> > > > > > > > > ACPI table generation completely.
> > > > > > > > > A generic linker script, controlled by QEMU, is
> > > > > > > > > loaded from "/etc/linker-script". It is used to
> > > > > > > > > patch in table pointers and checksums.
> > > > > > > > 
> > > > > > > > After some thought, there are two additional
> > > > > > > > options worth considering, in that they simplify
> > > > > > > > bios code somewhat:
> > > > > > > > 
> > > > > > > > - bios could get size from qemu, allocate a buffer
> > > > > > > >   (e.g. could be one buffer for all tables)
> > > > > > > >   and pass the address to qemu.
> > > > > > > >   qemu does all the patching
> > > > > > > > 
> > > > > > > > - further, qemu could do the copy of tables into
> > > > > > > >   that address directly
> > > > > > > 
> > > > > > > This seems more complex than necessary to me.
> > > > > > > 
> > > > > > > The important task is to get the tables generated in QEMU - I'd 
> > > > > > > focus
> > > > > > > on getting the tables generated in QEMU (one table per fw_cfg 
> > > > > > > "file").
> > > > > > > Once that is done, the SeaBIOS side can be easily implemented, 
> > > > > > > and we
> > > > > > > can add any enhancements on top if we feel it is necessary.
> > > > > > > 
> > > > > > +1. This "copy of tables into that address directly" is just an 
> > > > > > ad-hoc PV
> > > > > > isa DMA device in disguise. Such device was refused when libguestfs
> > > > > > asked for it, and they wanted it for much better reason - 
> > > > > > performance.
> > > > > > There is existing mechanism to pass data into firmware. Use it 
> > > > > > please.
> > > > > 
> > > > > Yes I can code it up using FW_CFG for now.
> > > > > 
> > > > > One issue with QEMU_CFG_FILE_DIR is that it's broken wrt migration,
> > > > > unless we pass in very small bits of data which we
> > > > > can guarantee never changes across qemu versions.
> > > > > 
> > > > Shouldn't we guaranty that ACPI tables do not change for the same
> > > > machine type anyway?
> > > 
> > > That's not practical. They are too big to stay completely unchanged.
> > > 
> > I will not be surprised if this will cause us problem somehow. Guest
> > will see new tables only after reboot/resume from S4 so damage is
> > limited, but one thing that comes to mind is table's size change. If
> > they grow from one version to the other after resuming a guest from S4
> > on new QEMU version part of the tables may be corrupted.
> 
> Why would it be corrupted?
> 
Because ACPI tables are stored in the memory marked as "ACPI data" (IIRC
seabios mark them as reserved). Guests do not save reserved memory during
S4 (don't know about "ACPI data", but if guest does not copy tables from
it to another location I doubt it saves the memory, anyway ACPI spec does
not mandate it), so what happens on resume if the memory grows? Part
of it, that was not marked as reserved before S4, is re-written with
whatever data guest had there and rest contains now corrupted ACPI tables
that BIOS put there during boot. We can hope that guest is smart enough
to see that memory map changed and refuse to resume or we can put ACPI
tables into NVS memory which has to be saved and restored during S4 by
OSPM.

> In any case, FACS has a hardware signature value for
> just such a case. If we know VM can not be resumed on new QEMU,
> we can change the signature and it will cold-boot instead.
> 
Nice, does Linux check it?

> > > > > Off-list, I suggested fixing it and migrating file
> > > > > content, but Anthony thinks it's a bad idea.
> > > > > 
> > > > Why is this a bad idea to fix device migration?
> > > 
> > > You misunderstand I think.
> > > Question is whether we should be putting so much info in fw_cfg.
> > > If we keep fw_cfg for small things we don't need to
> > > migrate it. In that case ACPI tables have to be passed in
> > > using some other mechanism.
> > > 
> > Where this notion that fw_cfg is only for a small things is coming
> > from? I can assure you this was not the case when the device was
> > introduced. In fact it is used today for not so small things like
> > bootindex splash screen bitmaps, option rom loading and kernel/initrd
> > loading. Some of those are bigger then ACPI tables will ever be.
> > And they all should be migrated, so fw_cfg should be fixed anyway.
> > 
> > --
> >                     Gleb.
> 
> I'm not arguing with that. Convince Anthony please.
> 
Convince him in what? That fw_cfg is broken vrt migration and there are
cases that will fail _today_ without any ACPI related changes? This is
knows for ages.

--
                        Gleb.

Reply via email to