Hi Peter,

Thanks for your detailed explanation. I still have one question: 

On normal platforms, there are more boot options created by UEFI other than the
hard disk boot option created by OS installation, like PXE network boot, USB
stick boot, etc. If we use the PlatformRecovery#### method, will UEFI first
try all normal boot options like PXE and USB? If PXE boot environment is always
available, then the system may enter PXE OS installation again, which is not
what we expect.

Thanks and regards,

Gary (Heyi Guo)

On Wed, Feb 07, 2018 at 04:16:55PM -0500, Peter Jones wrote:
> Coming late to the party because Leif called my attention to this
> thread...
> 
> On Mon, Jan 29, 2018 at 11:16:21AM +0000, Leif Lindholm wrote:
> > This type of system behaviour has been seen multiple times to break
> > installations in the real world.
> 
> I can't agree more; that's why there's a pile of language in 2.6 and
> later that says how to do a better job of this.
> 
> > Note: my main objections here are really with regards to:
> > 1) the expectation that variable store is erased on fw update
> > 2) automatically rewriting boot variables
> > 
> > If (1) was resolved, then I could potentially see a use for a
> > last-ditch fallback option (but even then, I don't think it should be
> > enabled by default).
> 
> I certainly agree resolving #1 is the thing to do here, but there is is
> actually useful functionality to have in general on the fallback path -
> though I don't think this patch implements it in a correct or preferred
> way.  In particular, it would be better to follow the advice in 3.1.1 of
> the UEFI 2.7 spec, where we say that while yes, the firmware is allowed
> to do boot order maintenance, it shouldn't remove anything or change the
> order itself except in the most dire of circumstances.  In particular:
> 
> | The firmware should not, under normal operation, automatically remove
> | any correctly formed Boot#### variable currently referenced by the
> | BootOrder or BootNext variables. Such removal should be limited to
> | scenarios where the firmware is guided by direct user interaction.
> 
> The right thing to do here is to publish some PlatformRecovery####
> variables as specified in 3.4.2, which can be a static list that's the
> same every time you boot up, and when there's a failure here BootNext
> and BootOrder have been exhausted, proceed according to 3.4.1 and 3.4.2,
> which I'll attempt to summarize below.
> 
> The PlatformRecovery#### variables here should have device paths that
> are something like:
> 
> PlatformRecovery0000: File(\EFI\BOOT\BOOTAA64.EFI)
> PlatformRecovery0001:
> File(\EFI\centos\grubaa64.efi)/EndInstance/File(\EFI\debian\grubaa64.efi)/EndInstance/File(\EFI\GRUB2\GRUBAA64.EFI)/EndInstance/File(\EFI\Microsoft\Boot\bootmgfw.efi)/EndInstance/File(\EFI\redhat\grub.efi)/EndInstance/...
>  etc .../EndEntire
> 
> The EFI_OS_INDICATIONS_START_OS_RECOVERY and
> EFI_OS_INDICATIONS_START_PLATFORM_RECOVERY bits should be set in
> OsIndicationsSupported.
> 
> (As an aside, I don't think we've explicitly said that multi-instance
> device paths do or don't work in boot variables, so that may require
> some work, or if you have a stronger preference about order you could
> just make each one its own PlatformRecovery#### variable.  For what it's
> worth, I don't think that the list in this patch is great - at the least
> the arch suffixes should be generated according to what the build target
> is, and what you've got currently doesn't correctly match several
> shipping OSes - many of which *do* provide a file at
> \EFI\BOOT\BOOT${ARCH}.EFI which will fix the BootOrder for you.)
> 
> If you supply those static variables, then in the normal boot path, if
> you've exhausted BootNext and all of BootOrder, continue according to
> chapter 3.4.2 and then 3.4.3 of the spec.  That basically says there's a
> list of things to try as if they're Boot#### options:
> 
> - If OsRecoveryOrder exists, it's a list of GUIDs under which there may
>   be OsRecovery#### variables.  The GUIDs are processed in the order
>   they're listed, and the OsRecovery#### variables under each GUID are
>   processed in hexadecimal numerical order.
> - PlatformRecovery#### variables are processed in hexadecimal numerical
>   order if OsRecovery variables are exhausted without successfully
>   booting anything.
> 
> For everything in that list, the variables basically get treated exactly
> like Boot#### variables.  For each #### variable:
> - parse the variable like you'd parse Boot####, and use the normal
>   discovery method to iterate across any files that match it.
>   - if so, see if there's a Boot#### that matches that (it isn't
>     necessarily in BootOrder); if not create one.
>   - Try to boot it like any other boot entry: set BootCurrent to the
>     Boot#### number, do LoadImage() and StartImage(), etc.  There's no
>     modification of BootOrder here.
>   - If the binary there can't be found, loaded, or returns an
>     error, continue iterating the normal way to see if there are more
>     matching files, and if there aren't, then proceed to the next
>     variable.
> 
> If you exhaust this list, it's time for an error message and a
> menu or something ;)
> 
> There are some important characteristics here that we need to maintain:
> 1) We haven't really talked enough about when it's really okay to
>    Boot#### entries, because in general removing them is undoing
>    something that was done on purpose (even if that purpose has been
>    obviated now.) A pretty good rule is: don't remove Boot#### variables
>    unless there are dire circumstances, like you've nearly completely
>    run out of flash.  The user or the OS set these for a reason.  If you
>    absolutely must prune them, start with the ones that aren't in
>    BootOrder or BootNext, and then proceed to the high-numbered ones
>    that are duplicates of other ones.  And even then, only remove until
>    you're under some known safe storage threshold.
> 2) Don't change BootOrder ever.  The user set that for a reason.  If you
>    absolutely have to change it, append to the end.  But you really
>    don't need to change it unless the user told you to.
> 3) Don't probe the disk except to try to match a boot entry you're
>    attempting to boot from.  If you need to read a partition table to
>    match an HD(), that's fine.  If you need to read a disk to match a
>    File(), that's fine.  But only when you're trying to boot those
>    device paths and they require iterating the disks.  All probing the
>    disk accomplishes in other cases is slowing things down and powering
>    up devices unnecessarily.
> 
> -- 
>         Peter
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to