On Sun, May 24, 2026 at 4:43 AM Paulo Duarte <[email protected]> wrote:
>
> The imported boot.S places the boot stack inside the .bss segment:
>
>         .bss
>     .boot_stack:
>         .space 4096
>     .boot_stack_end:
>
> c_boot_entry() is the first C function called from _start, with sp
> already pointing at .boot_stack_end. Its first action is to call
> zero_out_bss(), which memsets [__bss_start, __bss_end), the whole
> .bss range, including the very boot stack the kernel is *currently
> running on*. That wipes the saved x29/x30 and any locals the
> compiler spilled on entry, so the next return / function call
> branches to 0 and the kernel hangs in EL1.
>
> Move the boot stack into its own `.boot_stack` nobits section and
> place that section after `__bss_end` in the linker script so
> zero_out_bss() leaves it alone:
>
>         .section .boot_stack, "aw", %nobits
>     boot_stack:
>         .space 4096
>     .boot_stack_end:
>
> Brought up under qemu-system-aarch64 -M virt the bug fires
> immediately; wip-aarch64 likely never exercised the
> zero_out_bss-from-_start path because its testing was on a
> different boot route.

Could you expand? What different boot route? The patch makes sense, but
it is really interesting that this was not causing issues for us at the
time.

> ---
>  aarch64/aarch64/boot.S | 10 ++++++++--
>  aarch64/ldscript       |  3 +++
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/aarch64/aarch64/boot.S b/aarch64/aarch64/boot.S
> index 85d3b944..ab736489 100644
> --- a/aarch64/aarch64/boot.S
> +++ b/aarch64/aarch64/boot.S
> @@ -92,8 +92,14 @@ ENTRY(_start)
>  	b	EXT(c_boot_entry)
>  END(_start)
>
> -	.bss
> -.boot_stack:
> +	/*
> +	 * Put the boot stack in its own nobits section so it lives outside
> +	 * [__bss_start, __bss_end). Otherwise c_boot_entry's call to
> +	 * zero_out_bss() (which memsets the whole BSS region) would clobber
> +	 * its own saved x29/x30, sending us to PC=0 on ret.
> +	 */
> +	.section .boot_stack, "aw", %nobits
> +boot_stack:
>  	.space 4096
>  .boot_stack_end:
>
> diff --git a/aarch64/ldscript b/aarch64/ldscript
> index 236fc6f8..a5aec69d 100644
> --- a/aarch64/ldscript
> +++ b/aarch64/ldscript
> @@ -27,6 +27,9 @@ SECTIONS
>  		__bss_start = .;
>  		*(.bss);
>  		__bss_end = .;
> +		/* Boot stack lives in its own nobits region after __bss_end so
> +		   it survives zero_out_bss() running from within itself. */
> +		*(.boot_stack);
>  	}
>  	_image_end = .;
>  }
> --
> 2.54.0
>
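
For anyone following the thread, here is a rough hosted-C toy of the
failure mode the commit message describes. It is only a sketch and not
kernel code: the flat `image` array, the sizes, `fake_lr`, and the
helper names are all made up for illustration. It models the two
layouts (stack inside the zeroed range vs. stack placed after the end
of the zeroed range) and reads back a fake saved return address after
the memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OTHER_BSS_SIZE  256   /* hypothetical size of the ordinary .bss data */
#define BOOT_STACK_SIZE 4096  /* matches the ".space 4096" in boot.S */

/* Hypothetical stand-in for the tail of the image: .bss, then the boot stack. */
static unsigned char image[OTHER_BSS_SIZE + BOOT_STACK_SIZE];

/* Same idea as zero_out_bss(): clear the half-open range [start, end). */
static void zero_range(size_t start, size_t end)
{
    memset(image + start, 0, end - start);
}

/*
 * Store a fake "saved x30" in the top slot of the boot stack, zero the
 * BSS range, and return whatever is left in that slot afterwards.
 * stack_in_bss != 0 models the old layout (stack inside the range);
 * stack_in_bss == 0 models the patched layout (range ends before the stack).
 */
static uint64_t saved_lr_after_zeroing(int stack_in_bss)
{
    const uint64_t fake_lr = 0x40080123ULL;  /* arbitrary non-zero value */
    const size_t stack_base = OTHER_BSS_SIZE;
    const size_t slot = stack_base + BOOT_STACK_SIZE - sizeof(fake_lr);
    const size_t bss_end = stack_in_bss ? stack_base + BOOT_STACK_SIZE
                                        : stack_base;
    uint64_t out;

    memcpy(image + slot, &fake_lr, sizeof(fake_lr));
    zero_range(0, bss_end);
    memcpy(&out, image + slot, sizeof(out));
    return out;
}

/* Checks both layouts: the old one loses the saved value, the new one keeps it. */
static void check_layouts(void)
{
    assert(saved_lr_after_zeroing(1) == 0);
    assert(saved_lr_after_zeroing(0) == 0x40080123ULL);
}
```

In the old layout the slot reads back as 0, which corresponds to the
"ret to PC=0" symptom; with the stack outside the zeroed range the value
survives. This says nothing about why the original wip-aarch64 testing
did not hit it, which is still the open question above.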
