Do you have a dump of the failing DSDT table? There was a recent change around NVSA, but it looks like that is a different issue.
Raul Rangel <rran...@chromium.org> wrote on Thu, Jan 21, 2021 at 4:24 AM:

> Over the weekend I had the realization that SMI logging was enabled
> and interfering with WinDbg. Once I flashed a non-serial firmware
> WinDbg became a lot more stable and I was able to reliably attach to
> the boot loader debugger, i.e., `/bootdebug {default}`. The OS debugger
> (`/debug {default} on`) was still not functioning though. I wasn't
> sure if the BSOD was happening in the boot loader or the OS kernel, so
> I stepped through the boot loader until I saw it jump to the OS. From
> there WinDbg failed to restore the connection. The exception happens
> before the OS is capable of writing a kernel dump. I also suspected
> it was happening before the debugger was set up, since the
> connection could not be re-established.
>
> I then saw Felix's reply:
>
> > To decode the bug check values and their parameters, see
> > https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0xa5--acpi-bios-error
> >
> > The third parameter you posted decodes to _UID (that one is 4 char ASCII
> > stored as little endian number).
>
> The parameters I gathered last week:
>
> > 0x0000000000000000
> > 0xFFFFD38AC66EC7F0
> > 0x000000004449555F
> > 0x0000000000000000
>
> The first parameter 0x0 wasn't listed in the table. So this led me to
> believe that maybe there was a problem parsing the ACPI tables. Felix
> suggested I use the
> [Microsoft ASL compiler](https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/microsoft-asl-compiler)
> to decompile the AML and verify that the tables were valid.
>
> I used Linux to dump the ACPI tables via `/sys/firmware/acpi/tables/`,
> added the `.AML` suffix, and ran `asl.exe /u DSDT.AML`. This printed
> an error saying `NVSA was already defined`.
> Using iasl to decompile the table I saw the following:
>
> ```
> External (NVSA)
> Name (NVSA, 0xCA6B2000)
> OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)
> ```
>
> The external reference was defined in `.asl`:
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/globalnvs.asl;l=12
>
> The `Name` node was created by acpigen:
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi.c;l=429
>
> Removing the External from the `.asl` results in the `iasl` compiler
> complaining about a missing reference. So I moved the `Name` node to
> the SSDT table. This resulted in Linux complaining that `GNVS` was
> invalid because it couldn't find `NVSA`. The ACPI spec says the
> following:
>
> > OperationRegion (RegionName, RegionSpace, Offset, Length)
> >
> > Operation regions are regions in some space that contain hardware
> > registers for exclusive use by ACPI control methods. ...
> >
> > The entire Operation Region can be allocated for exclusive use to the
> > ACPI subsystem in the host OS.
> >
> > Operation Regions that are defined within the scope of a method are the
> > exception to this rule. These Operation Regions are known as “Dynamic”
> > since the OS has no idea that they exist or what registers they use until
> > the control method is executed.
>
> I'm guessing that we can't move the `NVSA` node to the SSDT because
> the value is required when instantiating `OperationRegion`.
>
> I'm not quite sure how to solve this. For now I just hard coded the
> address in the `OperationRegion`. I'm open to suggestions.
>
> The second problem was `OIPG`.
> It is also defined twice:
>
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/vendorcode/google/chromeos/acpi/chromeos.asl;l=8
>
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/vendorcode/google/chromeos/acpi.c;l=15
>
> Changing the callback to write to the SSDT table fixed that problem
> and I was finally able to decode `DSDT.AML` using the `Microsoft ASL
> Compiler`.
>
> I then disabled `/bootdebug` and `/debug` since they weren't providing
> any value and were preventing me from seeing the BSOD error codes. One
> thing I noticed was that when rebooting after a BSOD the boot loader
> would boot a "system restore" image. This image used a different
> registry than the OS, so the error codes were not printed on the screen.
> Rebooting again, the boot loader would load the OS. So each test
> required a double reboot. I'm also using Tianocore to boot Windows,
> which is super slow...
>
> The BSOD this time looked identical to the previous one... But upon
> closer inspection the error code was different:
>
> > 0x000000000000000D
>
> ... I went back and looked at the photo I took of the original BSOD
> and it was indeed 0x000000000000000D! The font made this easy to miss
> the first time around. Google Lens didn't pick it up either.
>
> The exception now made sense:
>
> > ACPI could not find a required method or object in the namespace. This
> > bug check code is used if there is no _HID or _ADR present.
>
> and to re-quote Felix:
>
> > The third parameter you posted decodes to _UID (that one is 4 char ASCII
> > stored as little endian number).
>
> So a device was missing a _UID. I manually audited all the Device
> nodes in the DSDT and SSDT and indeed we had devices that were missing
> `_UID` and some devices that even had duplicate `_UID`s. When I fixed
> this I got a new BSOD:
>
> > 0x06 - ACPI tried to find a named object, but it could not find the
> > object.
> > 0x<some pointer>
> > 0x<some pointer>
> > 0x<some pointer>
>
> This was discouraging since I didn't have a way of dereferencing the
> pointers. I decided to double check the `SPCR` and the `DBG2` tables.
> The `DBG2` table was using `MMIO` while the `SPCR` was using `IO`. So
> I switched it over to `IO` since I knew that worked, set `/debug
> {default} on` and voila, OS kernel debugger!
>
> Doing `!analyze -v` showed the error and parameters. It did lock up
> trying to print the details section. Hitting the `Break` button
> cancelled the operation and I was able to continue. A simple `!nsobj
> <pointer>` showed that the FUR0 power resource wasn't being found:
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/sb_fch.asl;l=169;drc=18593759918afe7ed67c097d444be7555575f50e
>
> I suspect it's because the `AOAC`
> [node](https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/aoac.asl;l=126)
> is defined as a bridge device. I commented out all the `AOAC` and
> power resources and was then greeted with:
>
> > 0x1000D - _PRW specified with no wake-capable interrupts and at least
> > one GPIO interrupt
> >
> > A device used both GPE and GPIO interrupts, which is not supported.
>
> If I understand it correctly, that means that we can't use a GPE in
> the _PRW and a GPIO in the _CRS.
>
> i.e.,
>
>     Device (CRFP)
>     {
>         Name (_HID, "PRP0001")  // _HID: Hardware ID
>         Name (_UID, Zero)  // _UID: Unique ID
>         Name (_DDN, "Fingerprint Reader")  // _DDN: DOS Device Name
>
>         Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
>         {
>             UartSerialBusV2 (0x002DC6C0, DataBitsEight, StopBitsOne,
>                 0x00, LittleEndian, ParityTypeNone, FlowControlNone,
>                 0x0040, 0x0040, "\\_SB.FUR1",
>                 0x00, ResourceConsumer, , Exclusive,
>                 )
>             GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
>                 "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>                 )
>                 {   // Pin list
>                     0x0006
>                 }
>         })
>         Name (_S0W, 0x04)  // _S0W: S0 Device Wake State
>         Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
>         {
>             0x0A,
>             0x03
>         })
>     }
>
> I'm guessing the _PRW needs to reference the GPIO controller, and that
> controller must have an `_AEI` defining the pin. Not really sure why
> Windows has a problem with mixed event types. For now I just commented
> out all the I2C and UART peripherals.
>
> With all that I was finally able to boot into Windows!
>
> Now on to making some CLs and fixing the remaining issues.
>
>
> On Fri, Jan 15, 2021 at 5:52 PM Felix Held <felix-coreb...@felixheld.de> wrote:
> >
> > Forgot to add that to find out what the cause is the easiest way is
> > probably having the installed image configured in a way that it'll write
> > full kernel memory dumps to disk and then use !analyze -v in WinDbg on
> > that generated kernel dump. At least that's what I remember from more
> > than 1.5 years ago, so some of the info might not be 100% accurate.
> >
> > Regards,
> > Felix
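
For anyone following along, the decoding Felix describes for the third bug check parameter (a 4 character ASCII name stored as a little-endian number) can be reproduced in a few lines of Python. This is only a sketch: the helper name `decode_acpi_name` is made up for illustration and is not part of WinDbg or coreboot, and it assumes the name sits in the low 32 bits of the parameter.

```python
import struct


def decode_acpi_name(param: int) -> str:
    """Decode a bug check parameter that holds a 4-character ACPI name.

    Hypothetical helper for illustration. Assumes the name is packed into
    the low 32 bits as a little-endian integer, so the lowest byte is the
    first character of the name.
    """
    low32 = param & 0xFFFFFFFF
    return struct.pack("<I", low32).decode("ascii")


# Third parameter from the BSOD quoted above.
print(decode_acpi_name(0x000000004449555F))  # prints "_UID"
```

Going the other way, `int.from_bytes(b"_UID", "little")` gives back `0x4449555F`, which is handy when searching a dump for a particular object name.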