On 5/5/22 14:16, Joel Stanley wrote:
On Thu, 5 May 2022 at 03:31, Alexey Kardashevskiy <a...@ozlabs.ru> wrote:



On 5/5/22 05:16, Fabiano Rosas wrote:
Alexey Kardashevskiy <a...@ozlabs.ru> writes:

tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on.

QEMU loads the kernel at 0x400000 by default which works most of
the time as Linux kernels are relocatable, 64bit and compiled with "-pie"
(position independent code). This works for a little endian zImage too.

However a big endian zImage is compiled without -pie, is 32bit, linked to
0x4000000 so current QEMU ends up loading it at
0x4400000 but keeps spapr->kernel_addr unchanged so booting fails.

This uses the kernel address returned from load_elf().
If the default kernel_addr is used, there is no change in behavior (as
translate_kernel_address() takes care of this), which is:
LE/BE vmlinux and LE zImage boot, BE zImage does not.
If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU
prints a warning and BE zImage boots.

I think we can fix this without needing a different command line for BE
zImage (apart from x-vof, which is a separate matter).

If you look at translate_kernel_address, it cannot really work when the
ELF PhysAddr is != 0. We would always hit this sort of 0x4400000 issue,
so if we fix that function like this...

static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
{
      SpaprMachineState *spapr = opaque;

      return addr ? addr : spapr->kernel_addr;
}


The qemu elf loader is supposed to handle relocations which should be
calling this hook more than once, now I wonder why it is not doing so.


...then we could always use the ELF PhysAddr if it is different from 0
and only use the default load addr if the ELF PhysAddr is 0. If the user
gives kernel_addr on the cmdline, we honor that, even if puts the kernel
over the firmware (we have code to detect that).


ELF address is 0 for LE zImage only, vmlinux BE/LE uses
0xc000000000000000. And we are already chopping all these tops bits off
in translate_kernel_address() and I do not really know why _exactly_ it
is 0x0fffffff and not let's say 0x7fffffff.



@@ -2988,6 +2990,12 @@ static void spapr_machine_init(MachineState *machine)
               exit(1);
           }

+        if (spapr->kernel_addr != loaded_addr) {

This could be:

          if (spapr->kernel_addr == KERNEL_LOAD_ADDR &&
           spapr->kernel_addr != loaded_addr) {

So the precedence would be:

1- ELF PhysAddr, if != 0. After all, that is what it's for. BE zImage
     falls here;

2- KERNEL_LOAD_ADDR. Via translate_kernel_address, LE/BE vmlinux fall
     here;

3- kernel_addr. The user is probably hacking something, just use what
     they gave us. QEMU will yell if they load the kernel over the fw.


imho too complicated.

What if the user runs QEMU with kernel-addr=0x400000? (0x400000 is
KERNEL_LOAD_ADDR noooow but not necessarily forever). Is it 2) or 3)?

I am basically fixing a bug when we ignore where load_elf() loaded the
ELF and just assume it is KERNEL_LOAD_ADDR. Now the code checks if the
ELF was loaded where we want it to be. Everything else can be done but
on top of this.

It would be good to fix this so we don't need to specify kernel-addr=0.


This means the pseries code in QEMU needs to read the ELF header and figure out if it is position independent and what is the base address. And because it is Linux which is special, just reading the ELF header is not enough, need more heuristics (there is some already in translate_kernel_address()).

LE vmlinux is 64bit EXEC type and entry=0xc000000000000000
BE vmlinux is 64bit EXEC type and entry=0xc000000000000000
LE zImage is 64bit DYN type and entry=0x0
BE zImage is 32bit EXEC type and entry=0x4000000

And the default address for these in QEMU is 0x400000. Asking kernel-addr=0 and vof=on looks like a small evil :)

And also worth mentioning that with this hack it should be possible to boot grub.elf via -kernel which might be interesting for debugging, and that thing is linked to 0x200000 or so, and probably also 32bit BE (I do not have one handy).



I only recently learnt the pseries machine doesn't support loading the zImage:

  https://github.com/linuxppc/issues/issues/402

So whatever the fix is, writing down what is expected to work and what
isn't would be useful.

I tested your patch and it worked with this command line:

  qemu-system-ppc64 -M pseries,kernel-addr=0,x-vof=on -nographic
-kernel arch/powerpc/boot/zImage.pseries -serial mon:stdio -nodefaults

Tested-by: Joel Stanley <j...@jms.id.au>

Cool thanks!


Cheers,

Joel



+            warn_report("spapr: kernel_addr changed from 0x%lx to 0x%lx",
+                        spapr->kernel_addr, loaded_addr);
+            spapr->kernel_addr = loaded_addr;
+        }
+
           /* load initrd */
           if (initrd_filename) {
               /* Try to locate the initrd in the gap between the kernel


Reply via email to