Hey Stephan, thanks for such a great email! The outcome is excellent!

On Thu, Jul 1, 2021 at 11:07 AM Stephan Gerhold <step...@gerhold.net> wrote:
>
> Hi!
>
> at the moment the U-Boot ports for both DragonBoard 410c and 820c are
> designed to be loaded as an Android boot image after Qualcomm's LK
> bootloader. This is simple to set up but LK is redundant in this case,
> since everything done by LK can be also done directly by U-Boot.
>
> Dropping LK entirely would have at least the following advantages:
>   - Easier installation/board code (no need for Android boot images)
>   - (Slightly) faster boot
>   - Boot directly in 64-bit without a round trip to 32-bit for LK
>
> This was not possible so far because of some unsolved problems.
> For clarity I try to describe them together with some background here,
> but I want to apologize for the long text. It's all quite complicated. :)
>
> 1. "Signing" 64-bit U-Boot
> ==========================
>
> Ramon already tried to eliminate LK for DB410c 3 years ago [1].
> One of the open problems back then was to have a proper "signing"
> tool with 64-bit support. The firmware expects an ELF image with a few
> Qualcomm-specific ELF headers. Normally this is used for secure boot
> setups. This is not used on DragonBoards, but the firmware still insists
> on having a dummy (self-signed) certificate chain in the ELF images.

Yeah, the signing was the last step we missed. We were able to sign
using internal / non open source tools.. but never finalized the boot
process completely.. I am very happy you persisted with that!

>
> Linaro uses signlk [2] to sign their builds of LK. It looks like Nicolas
> extended it with ELF64 support after Ramon's mail [3]. However, for some
> reason signlk literally works only for LK for me.
> I tried to "sign" U-Boot and some other firmware, but everything except
> LK is always rejected with the following message on boot:
>
>   B - 1031113 - Error code 302e at boot_config.c Line 296
>
> I tried to track down the issue in the source code for quite some time
> but did not manage to find the problem. Perhaps it's some subtle mistake
> with some of the ELF modifications, I'm not sure. (For some reason,
> signlk makes subtle changes to all of the existing ELF headers...)
>
> After reading about the image format myself I decided to try to make my
> own "signing" tool, qtestsign: https://github.com/msm8916-mainline/qtestsign
> It's based on a mixture of the specification [4] and some missing bits
> taken from signlk, put in a simple and clean Python tool. I still don't
> know what exactly qtestsign does differently, but unlike signlk it can
> successfully "sign" U-Boot and all other firmware from DragonBoard 410c.

There is no specific reason to restrict ourselves to using signlk.. if
you have something better, which works, that's perfect!

>
> [1]:
> https://lore.kernel.org/u-boot/CA+Kvs9kS=dbjknaixk_3tz+3iwnrasp0gjdz8ekrzaskor6...@mail.gmail.com/
> [2]: https://git.linaro.org/landing-teams/working/qualcomm/signlk.git/
> [3]:
> https://git.linaro.org/landing-teams/working/qualcomm/signlk.git/commit/?id=1f61c03322c3728f35b3f0cd4ff04f73522f1e67
> [4]:
> https://www.qualcomm.com/media/documents/files/secure-boot-and-image-authentication-technical-overview-v1-0.pdf
>
> My solution
> -----------
>
> Now we have all we need to install U-Boot without LK. For DragonBoard 410c
> the following steps end up in the U-Boot prompt without going through LK:
>
>   1. Change dragonboard410c_defconfig as follows:
>
>      -CONFIG_SYS_TEXT_BASE=0x80080000
>      +CONFIG_SYS_TEXT_BASE=0x8F600000
>      +CONFIG_OF_EMBED=y   (I discuss this at the end of the mail)
>
>   2. $ make
>   3. Sign the ELF image: $ qtestsign.py aboot <out>/u-boot [5]
>   4. Flash "<out>/u-boot-test-signed.mbn" to the "aboot" partition
>
> [5]: https://github.com/msm8916-mainline/qtestsign
>
> 2. Linux gets stuck when loaded by 64-bit U-Boot without LK
> ===========================================================
>
> This should work well enough to get the U-Boot prompt on serial.
> However, once you load Linux you will likely notice a problem:
>
>   [ 0.059043] smp: Bringing up secondary CPUs ...
>   [ 5.120691] CPU1: failed to come online
>   [ 10.246760] CPU2: failed to come online
>   [ 15.372848] CPU3: failed to come online
>   [ 15.406275] CPU: All CPU(s) started at EL1
>   ...
>   [ 16.185527] genirq: irq_chip msmgpio did not update eff. affinity mask of irq 79
>   Board freezes forever. :(
>
> My investigations have shown this is a bug in the PSCI implementation on
> DB410c (part of the TrustZone/"tz" firmware). In short, since we
> have never done the 32-bit -> 64-bit switch in LK, the PSCI implementation
> seems to believe we are still running in 32-bit mode and starts all
> further CPUs in 32-bit mode. The other CPU cores crash immediately when
> coming up and CPU 0 hangs once CPU idle suspends it for the first time.
>
> I have described this problem together with a workaround in detail here:
> https://github.com/msm8916-mainline/qhypstub#boot-flow
>
> The idea is to execute the TZ syscall to switch from 32-bit -> 64-bit
> even though we are already running in 64-bit mode. This will make the
> PSCI implementation aware that we want all further CPU cores booted in
> 64-bit mode as well.

You haven't asked.. but just in case.. the chances of getting a fix for
this firmware are close to 0 (really close). I am glad you have a
workaround.

>
> My solution
> -----------
>
> The workaround is applied automatically when using my open-source "hyp"
> firmware replacement qhypstub: https://github.com/msm8916-mainline/qhypstub
> As a bonus, both U-Boot and Linux start in EL2, making it possible to
> use virtualization (e.g. KVM in Linux).
>
>   $ git clone https://github.com/msm8916-mainline/qhypstub.git
>   $ cd qhypstub
>   $ make CROSS_COMPILE=aarch64-linux-gnu-
>   $ qtestsign.py hyp qhypstub.elf
>   # Flash "qhypstub-test-signed.mbn" to "hyp" partition and reboot.
>
> Now it works:
>
>   [ 0.063411] CPU1: Booted secondary processor 0x0000000001 [0x410fd030]
>   [ 0.064184] CPU2: Booted secondary processor 0x0000000002 [0x410fd030]
>   [ 0.064906] CPU3: Booted secondary processor 0x0000000003 [0x410fd030]
>   [ 0.123032] CPU: All CPU(s) started at EL2
>   [ 0.448743] kvm [1]: Hyp mode initialized successfully
>   ...
>
> And with that U-Boot is fully working as far as I can tell.
> (I have only tested serial, SD card and USB so far. If something is
> broken, it's likely some missing register initialization that should
> be ported from LK/Linux...)

Do you mean you tested serial, SD and USB from U-Boot, or from Linux once
booted from U-Boot? What's the overall status in Linux when you boot with
this new boot flow?

>
> 3. Remaining open questions
> ===========================
>
> I still see 3 questions that we need to discuss:
>
>   1. This is a quite fundamental change.
>      Can we just make it to dragonboard410c_defconfig?
>      Does it make sense to keep the old setup with LK?
>      When would it be used?

I believe it's used by distros. IIRC, at least Archlinux, Fedora and
Ubuntu have some level of support (and instructions) for the DB410c, and
they are using this U-Boot config. So we need to check at least that we
are not breaking any Linux features with this boot flow. There is indeed
no reason to keep LK in the boot flow if nothing breaks once we remove
it. However, it's going to change their installation instructions, since
U-Boot becomes 'aboot' and 'boot' is no longer used. In other words,
this change is not transparent for users. But this is a great
improvement, and I am hoping it will replace the legacy boot.

>
>   2. Workaround for PSCI bug: I'm not sure if we want to make qhypstub [6]
>      a requirement for U-Boot.
>      On the one hand it's open-source, solves
>      the problem nicely without changes in U-Boot and provides EL2
>      additionally. I'm also not aware of any problem/disadvantage when
>      using it (if you find a problem, please let me know!).
>
>      But I realize it's unofficial. If we want to support using Qualcomm's
>      "hyp" firmware as well I could try porting the PSCI workaround
>      from qhypstub to U-Boot. It should be ~10 lines of ARM64 assembly [7]
>      placed e.g. in board/qualcomm/dragonboard410c/head.S.
>
>      However, I will need to make sure to detect if U-Boot was started
>      in EL2 by qhypstub because otherwise doing the workaround twice
>      will conflict and U-Boot might demote itself back to EL1.

I think we want (we need?) to support both HYP implementations. I would
prefer to have a workaround in U-Boot to support existing users (with
QCOM hyp). In everything we've done in Linux for QCOM, we always tend to
(try to) support the default firmware released by QCOM, to get a chance
to support more users...

>
>   3. CONFIG_OF_EMBED: There is a big warning about this in the build log:
>      "This option should only be used for debugging purposes. Please use
>      CONFIG_OF_SEPARATE for boards in mainline."
>
>      The important part here is that we need an ELF image with both
>      U-Boot and the DTB. CONFIG_OF_EMBED is convenient for that because
>      we can just use the ELF image built by the linker and it already
>      contains the DTB.
>
>      If CONFIG_OF_EMBED is really so bad it might be possible to build
>      a new boot image based on "u-boot-dtb.bin" (which is U-Boot with
>      DTB appended). I'm not sure if this is really much better though.
>
> Bonus question: Could something similar also work for DB820c? I don't
> have one myself but I think a similar setup should also work on it.
> If someone is interested in testing this I would be happy to help. :)

The clk, regulators, ... implementations on 820 are different from 410
in general, and I don't remember how we left things on 820..
but in general it should work. If you want to get a DB820c, I should be
able to help with that (ping me privately ;-).

>
> Thanks for reading!

Thanks for your amazing and interesting work on this platform!

> Stephan
>
> [6]: https://github.com/msm8916-mainline/qhypstub
> [7]:
> https://github.com/msm8916-mainline/qhypstub/blob/c9c3fd0f66ea60032812b06b51da39f25e678638/qhypstub.s#L197-L204
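
For anyone who wants to sanity-check the image before "signing" it: the
sketch below reads the generic ELF header of a file such as <out>/u-boot
and reports whether it is ELF32 or ELF64, whether the machine type is
AArch64, the entry point and the number of program headers. It is only an
illustration and is not part of qtestsign or signlk; the names
read_elf_summary and summarize_file are made up here. It looks at the
standard ELF header only and knows nothing about the Qualcomm-specific
headers or the certificate chain.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
EM_AARCH64 = 183  # e_machine value for AArch64 in the ELF specification


def read_elf_summary(data):
    """Return a few fields from the generic ELF header in `data` (bytes).

    Hypothetical helper for illustration; not taken from qtestsign/signlk.
    """
    if len(data) < 16 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in ELF_CLASSES:
        raise ValueError("unknown ELF class: %d" % ei_class)
    if ei_data not in (1, 2):
        raise ValueError("unknown ELF data encoding: %d" % ei_data)
    endian = "<" if ei_data == 1 else ">"
    # Fields following the 16-byte e_ident. Addresses and file offsets
    # are 8 bytes wide in ELF64 and 4 bytes wide in ELF32.
    addr = "Q" if ei_class == 2 else "I"
    fmt = endian + "HHI" + addr * 3 + "IHHH"
    (e_type, e_machine, _version, e_entry, e_phoff, _shoff,
     _flags, _ehsize, _phentsize, e_phnum) = struct.unpack_from(fmt, data, 16)
    return {
        "class": ELF_CLASSES[ei_class],
        "type": e_type,
        "is_aarch64": e_machine == EM_AARCH64,
        "entry": e_entry,
        "phoff": e_phoff,
        "phnum": e_phnum,
    }


def summarize_file(path):
    """Read the first 64 bytes of `path` and summarize its ELF header."""
    with open(path, "rb") as f:
        return read_elf_summary(f.read(64))
```

Calling summarize_file("<out>/u-boot") (with the real output path filled
in) should report ELF64 and AArch64 for a 64-bit build. Whether the
reported entry point matches CONFIG_SYS_TEXT_BASE is an assumption worth
checking on the actual image, not something this sketch guarantees.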