The move of RISC-V to UEFI was driven by all the usual suspects. It's
been a huge disappointment.

On Wed, May 20, 2026 at 7:47 PM Dan Cross <[email protected]> wrote:
>
> On Tue, May 19, 2026 at 12:20 AM ron minnich <[email protected]> wrote:
> > There's a reason I started the RISC-V port, a year ago.
> >
> > background: I wrote the first SBI after the berkeley one, in 2014. I
> > put a lot of work into making it small, in every sense: space, time,
> > complexity.
> >
> > The current standard, OpenSBI, is big in every sense: space, time, 
> > complexity.
>
> My sense was that they started out trying to do something analogous to
> DEC's "PALcode" for Alpha, but it's metastasized into something closer
> to UEFI in scope. Is that basically correct?
>
> > So I intend to show an alternate path, starting with this port.
> >
> > As part of main(), the kernel will do 3 things:
> > 1) wipe out the opensbi firmware from RAM
> > 2) replace it with a minimal, simple, version which does not even have
> >     its own memory.  This version will hark back to an earlier day, when
> >     all sbi did was fiddle bits in registers, at the direction of the kernel
> > 3) all higher level functions will be done in the kernel or even a server
>
> This raises the immediate question: if the _kernel_ is overwriting SBI,
> then do you need SBI at all once the kernel is loaded? Is the kernel
> still making use of SBI after that point?
>
> > I am fairly sure I can do this, having done something like it before.
> >
> > I intend to show that risc-v does not need OpenSBI, UEFI, or ACPI.
> >
> > I do think there's a way out of the firmware mess we're in; I don't
> > expect it to ever happen on x86 or ARM.
>
> It can be done, and what we did at Oxide is an existence proof
> (https://rfd.shared.oxide.computer/rfd/0241). Whether it can be done
> _generally_ is an open question. It only works at Oxide because of the
> luxurious flexibility we built ourselves by co-designing the hardware
> and software.
>
> > Even though I did do an experiment, years ago, with moving SMM to
> > kernel space: 
> > https://docs.google.com/presentation/d/1ECTPzrt4hT37Ef729GAXh0o9lMvx4MJrkJxN88-k9Ps/edit?usp=sharing
> >
> > I've fought and lost this SMM battle for 27 years. Why do vendors like
> > it? Because UEFI/ACPI/SMM are a way to implement vendor lock-in. I'm
> > too old for this nonsense. RISC-V offers a way out and I'm going to
> > try to take it before things get much worse.
>
> I think it's slightly more complex than just lock-in.  Everyone is
> over everyone else's barrel, the system has found some kind of
> uncomfortable equilibrium at a local maximum, and the fear is that if
> anyone tries to shift, everyone's spines will snap.
>
> I know you know all of this, Ron, but others may not, so indulge me a
> bit: a BIOS (by which I mean a general category, including things like
> UEFI and OpenBoot) has three primary uses:
>
> 1. It provides a platform-independent, least-common-denominator way to do IO;
> 2. It provides a place for vendors to put all of their device-specific
> platform enablement code;
> 3. Related to (2), it addresses the boot problem.
>
> (1) is rather obvious. The concept of a BIOS from the CP/M and DOS
> days meant that simple early OSes could offload the nuts and bolts of
> driving IO devices to code stored in ROM, freeing up precious RAM for
> user programs. It's also helpful for things like standalone boot
> loaders, where you need a minimum of functionality to load the
> actual OS, which (presumably) has real device drivers (cf the 9front
> loaders).
>
> The other two are a little less obvious.
>
> Starting a modern CPU is a bit like how I imagine starting a container
> ship: you don't just press a button and boom, the ship's main engines
> are going and the screws are turning and you're ready to get under
> way. Rather, you press a button, and that engages a small electric
> starter motor, which starts something analogous to a car engine, that
> in turn drives a pump that starts building compression in the main
> ship's engines, primes the fuel system, and so on.  Only after all of
> that is ready can you start the _actual_ engines.
>
> Similarly, when you press the power button on a recent machine, it's
> not like the main CPUs immediately come out of reset and start trying
> to boot the OS; instead, a little microcontroller or FPGA or CPLD or
> something springs to life driven by firmware that lives in an onboard
> flash part, and that starts on a long list of tasks including power
> sequencing, ramping up the voltage regulators, turning on the DRAM
> controllers, applying early power to the DIMMs and the main SoC so
> they can charge up internal capacitors and start speaking whatever
> protocol (e.g., the DIMMs probably speak I2C to identify themselves),
> turning _off_ things that are not electrically present, and so on.
> Eventually you're far enough along you can signal the main CPU to
> start, and another little embedded controller inside of that thing
> starts loading the BIOS out of flash and arranging for the big cores
> to start; after all of that, the main CPU comes out of reset, and
> now you're running BIOS code. But wait, we're not done: depending on
> the CPU/vendor, DRAM may still need to be trained and the full BIOS
> image loaded, the IO buses might need to be configured and trained,
> configuration needs to be loaded from non-volatile storage, the CPUs
> are enumerated, tables built describing the system for the actual OS,
> and so on. Once all of that is done, the system can try booting from
> the configured devices, and assuming that's successful, only now are
> you running your actual system software; by this point, many millions
> (if not billions) of instructions have already been retired.
> Critically, much of the pre-OS work depends on the exact configuration
> of CPU, board, and devices.
>
> I mentioned the "boot problem": If the OS is going to be loaded from
> some device at the far end of a PCIe link that has to be set up and
> initialized by software running on the main CPU cores (as it is on AMD
> EPYC SoCs, for example) then you can't boot it until _that's_ been
> done, and _something_ has to do that work. The solution so far has
> been to shove it into UEFI.
>
> Similarly, operating systems have abdicated much of the responsibility
> for actually running the machine, giving it over to the BIOS, which
> doesn't actually go away after you've booted. For example, Ron
> mentioned SMM on x86: this was meant to patch over the mismatch
> between old OSes and newer machines so that firmware could emulate old
> systems for compatibility with old software. Windows 95 doesn't have an
> xhci stack for the HID devices on your workstation? No problem; BIOS
> implements an emulation layer that makes those look like a PS/2
> keyboard and mouse, and runs it in SMM mode: Win95 is none the wiser,
> and users are happy because their software runs. But if you've got an
> OS that _does_ have drivers for that stuff, and doesn't want it, you
> can't get rid of it: oh well, can't make all the people happy all the
> time. And since we've got system services in UEFI, we can stick all of
> the deeply platform-specific C-state and power management stuff in
> there; the OS doesn't have to do the super fiddly bits.
>
> Why do they do all of this? Because otherwise, they have to have
> precise details of every machine they may run on, and drivers for all
> of the parts that the user cares about. The BIOS
>
> The upshot is that unless you're tying the version of the OS tightly
> to hardware, you need some kind of interface between the machine and
> the OS; interfaces in that abstract sense are good. The problem is
> that the concrete interfaces we have, and their implementations, are
> bad; they were not well-designed, and instead (as Mothy Roscoe said at
> OSDI'21) they have "congealed" into something big, heavy, and utterly
> opaque.
>
> In the other thread, Vester mentioned this nightmare world where the
> firmware actually runs the machine and you can't get around it. Well,
> we're already there, and have been for quite some time.
>
> Btw, I mentioned we worked around this at Oxide, and that's true. We
> are using EPYC CPUs, but we have no UEFI and we do not use any AGESA
> code; the x86 cores come out of reset running Oxide-authored code from
> the first instruction. However, we are still reliant on some AMD
> firmware blobs to drive the ASP (née PSP) which does DRAM training and
> is responsible for loading the "BIOS image" out of flash. The trick we
> use is to package up our code to look like the BIOS, as far as the ASP
> is concerned: it assumes it's loading AGESA but it's really loading
> the Oxide OS. But obviously that's not going to work generally; first,
> we have a fair bit of code in the OS that is specific to the
> microarchitectures we explicitly support (and in some cases, even the
> silicon spin, if we need to work around chip errata). Second, since we
> designed the boards, we know exactly how the IO devices are lined out
> to the PHYs on individual pins on the CPU: that's something that the
> third-party board vendors modify AGESA for, but not something you can
> (easily) tell the OS or a boot loader in a far more loosely coupled
> environment.
>
>         - Dan C.
>
>
>
>
> > Anyway, there are three of us now working on the port, we've gotten to
> > the point of 'root is from' working over 9p, and we're going to rebase
> > on 9front tip in the next week or so. At that point, I hope to start
> > working on the simple SBI.
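
For readers following along, here is a minimal sketch, in Python rather
than kernel C, of what an SBI that "does not even have its own memory"
might look like: a dispatcher that only fiddles bits in registers at the
kernel's direction. The extension IDs 0x00 (set timer) and 0x01 (console
putchar), the STIP bit position, and the -2 "not supported" code follow
the legacy calls in the published SBI specification as I understand
them. The handle_ecall function, the regs and hw dictionaries, and the
uart_tx list are hypothetical stand-ins for real registers; none of this
is code from the port described above.

```python
# Hypothetical sketch: a stateless SBI-style dispatcher over simulated
# registers. Nothing here is taken from the actual 9front RISC-V port.

SBI_SUCCESS = 0
SBI_ERR_NOT_SUPPORTED = -2

# Legacy SBI extension IDs, passed by the caller in register a7.
EID_SET_TIMER = 0x00
EID_CONSOLE_PUTCHAR = 0x01

# Supervisor timer interrupt pending bit in the mip CSR.
MIP_STIP = 1 << 5


def handle_ecall(regs, hw):
    """Service one ecall using only the caller's registers (regs) and
    simulated hardware state (hw); the handler keeps no memory of its own."""
    eid = regs["a7"]
    if eid == EID_SET_TIMER:
        # Program the next timer event and clear the pending timer bit.
        hw["mtimecmp"] = regs["a0"]
        hw["mip"] &= ~MIP_STIP
        regs["a0"] = SBI_SUCCESS
    elif eid == EID_CONSOLE_PUTCHAR:
        # Write one byte to the (simulated) UART transmit register.
        hw["uart_tx"].append(regs["a0"] & 0xFF)
        regs["a0"] = SBI_SUCCESS
    else:
        # Everything else is left to the kernel or a server.
        regs["a0"] = SBI_ERR_NOT_SUPPORTED
    return regs


if __name__ == "__main__":
    hw = {"mtimecmp": 0, "mip": MIP_STIP, "uart_tx": []}
    print(handle_ecall({"a7": EID_SET_TIMER, "a0": 1000}, hw), hw)
    print(handle_ecall({"a7": EID_CONSOLE_PUTCHAR, "a0": ord("9")}, hw), hw)
    print(handle_ecall({"a7": 0x10, "a0": 0}, hw))
```

All state lives in the caller's regs and hw; the handler keeps nothing
between calls, which is the property item 2 of the quoted plan asks for.
Anything it does not recognize is refused, leaving higher-level
functions to the kernel or a server, as in item 3.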
