On Fri, 17 Feb 2017 11:06:54 +0100 Laszlo Ersek <ler...@redhat.com> wrote:
> CC Kevin > > On 02/17/17 07:10, Ben Warren wrote: > > Hi Laszlo > >> On Feb 9, 2017, at 12:24 AM, Laszlo Ersek <ler...@redhat.com > >> <mailto:ler...@redhat.com>> wrote: > >> > >> On 02/09/17 09:17, Laszlo Ersek wrote: > >>> Ben, > >>> > >>> On 02/05/17 18:09, b...@skyportsystems.com > >>> <mailto:b...@skyportsystems.com> wrote: > >>>> From: Ben Warren <b...@skyportsystems.com > >>>> <mailto:b...@skyportsystems.com>> > >>>> > >>>> This command is similar to ADD_POINTER, but instead of patching > >>>> memory, it writes the pointer back to QEMU over the DMA interface. > >>>> > >>>> Signed-off-by: Ben Warren <b...@skyportsystems.com > >>>> <mailto:b...@skyportsystems.com>> > >>>> --- > >>>> src/fw/romfile_loader.c | 37 +++++++++++++++++++++++++++++++++++++ > >>>> src/fw/romfile_loader.h | 16 ++++++++++------ > >>>> 2 files changed, 47 insertions(+), 6 deletions(-) > >>> > >>> while working on the OVMF patches for WRITE_POINTER, something else > >>> occurred to me. > >>> > >>> As I mentioned in the QEMU thread, > >>> > >>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg428495.html > >>> > >>> the VMGENID device (or more generally, each device that produces > >>> WRITE_POINTER command(s)) should have a reset handler that makes the > >>> device forget the GPA(s) that the firmware communicated to it, via > >>> WRITE_POINTER. This is because after system reset, all earlier GPAs > >>> become meaningless. > >>> > >>> What I want to mention here is the corollary to the above. ACPI S3 > >>> resume is basically a reset, but the GPAs remain valid, because system > >>> memory is preserved. Therefore, at S3 resume, the guest firmware has to > >>> replay all the WRITE_POINTER commands. The QEMU reset handler will > >>> forcibly forget about the GPAs (which is correct), so the firmware has > >>> to re-store them. > >> > >> ... By that I don't necessarily mean to re-run the exact same > >> linker-loader logic; it should be okay to save the *results* of those > >> operations in a simpler array (that is, write exactly what values to > >> what fw_cfg files at what offsets). > >> > >> And, you can detect whether this is needed at all, the > >> "etc/system-states" fw_cfg file can be used to see if ACPI S3 is enabled > >> or disabled in QEMU. (Please grep QEMU for "disable_s3" and > >> "etc/system-states".) > >> > > I’ve made some changes to SeaBIOS that I believe should do this, but > > don’t really know how to test it. > > Here’s what I’ve tried: > > • Instrumented QEMU to print a message when a fw_cfg write comes from > > the guest > > • called QEMU with the additional arguments "-global > > PIIX4_PM.disable_s3=0”. Note that my machine type is pc-i440fx-2.8 > > • Booted a Linux guest. Once logged in, typed “pm-suspend” > > • Noted the following in the QEMU monitor: > > • (QEMU) > > {u'timestamp': {u'seconds': 1487310498, u'microseconds': 971046}, > > u'event': u'SUSPEND’} > > • Woke the guest up from QEMU monitor: > > • (QEMU) system_wakeup > > {"return": {}} > > (QEMU) > > {u'timestamp': {u'seconds': 1487310558, u'microseconds': > > 135357}, u'event': u'WAKEUP’} > > Yes, this is exactly how it should be tested. > > > > > But didn’t see the fw_cfg getting written. I don’t understand the ACPI > > states well enough to know if the above sequence is exercising S3, so > > don’t really want to spend more effort without knowing I’m doing the > > right thing. > > You are doing the right thing. > > For additional confirmation, you can instrument SeaBIOS too, by adding > dprintf()s to handle_resume() in "resume.c": > > https://www.seabios.org/Execution_and_code_flow#Resume_and_reboot > > Actually, if you see > > dprintf(1, "In resume (status=%d)\n", status); > > in the log (from the QEMU debug port), then you can be sure you're resuming. > > On resume, QEMU will set the CMOS_RESET_CODE register to 0xFE, so the > SeaBIOS code will jump to handle_resume32(), and then to s3_resume(). > This last function is which is where I think the commands should be > replayed (likely from a more condensed form). > > Roughly, what the OVMF code does is the following (omitting all the > PI/UEFI terms for now): > > - during normal (first) boot, when processing the WRITE_POINTER > commands, OVMF stashes them in a condensed form in a buffer that is > allocated as "reserved memory" (so that the OS stays away from it) > > - in addition, a small buffer is similarly pre-allocated in reserved > memory; this buffer provides storage for one fw_cfg DMA access object, > plus one uint64_t data element, to be transferred (written) by an > appropriately formatted fw_cfg DMA access object > > - during S3 resume, the code iterates through the condensed > WRITE_POINTER commands; for each, the fw_cfg DMA request is formatted > into the pre-allocated buffer, and the pointer value to write is also > formatted into the appropriate part of that buffer. The idea is that we > can replay the fw_cfg writes without any linker/loader script processing > at this point, and without needing dynamic memory (there's no such thing > during S3 resume). > > Because the memory requirements are so strict for S3 resume -- i.e., you > can only use (some) stack, and pre-reserved / static data -- it is quite > possible that you cannot use the existing fw_cfg helper functions. You > may have to factor out (or reimplement) very low level fw_cfg access. > > About the "condensed" form, think something like: > - uint16_t dst_file_key > - uint32_t dst_file_offset > - uint64_t pointer_value > - uint8_t pointer_size > > This contains all the information you need for two fw_cfg DMA > operations: (1) a combined select + skip operation (based on the first > two fields), and (2) a write operation (based on the last two fields). > For each WRITE_POINTER command that you encounter during normal boot, > you can "distill" such a structure (after the dst_file name has been > resolved (--> dst_file_key), and the src blob's base address and the src > offset from the command itself have been summed (--> pointer_value)). Here is an example on how it's been handled for similar albeit simpler case 023b1d0d6a5 ("add helpers to read etc/boot-cpus at resume time") > > Here's a recent example I recall for S3 memory management: commit > 54e3a88609da ("smp: restore MSRs on S3 resume", 2016-07-07). > > The "smp_msr" array (statically allocated with a fixed size, for 32 > elements) is populated during normal boot with wrmsr_smp(). (The fact > that it overlaps with "SMP" is a complication, but we can mostly ignore > it for now; the point is the lifetime of the "smp_msr" array.) If client > code tries to call wrmsr_smp() for saving more than 32 entries, during > normal boot, the overflow is simply dropped (with warnings). Then, the > array is "replayed" during S3 resume, on the following call chain -- and > I'm ignoring SMP again -- > > ... > handle_resume32() > s3_resume() > smp_resume() > smp_write_msrs() > > I think a similar pattern should work here -- it should be okay to use a > small, fixed size, static array for stashing WRITE_POINTER commands in > condensed form (static data in SeaBIOS will automatically end up > reserved memory IIRC). And, because you are reprogramming devices, not > CPU registers that need to be set on each CPU simultaneously, it should > be okay to do the obvious thing -- just let the boot processor write the > data. > > > Any advice you can give here would be great. > > > > I’m posting the patch set without this change so that at least we have > > something, although I’d really like to get this S3 resume code in place > > too. > > I agree, it would be a good thing. > > Thanks, > Laszlo > > _______________________________________________ > SeaBIOS mailing list > SeaBIOS@seabios.org > https://www.coreboot.org/mailman/listinfo/seabios _______________________________________________ SeaBIOS mailing list SeaBIOS@seabios.org https://www.coreboot.org/mailman/listinfo/seabios