On Sat, 22 May 2021, Alexey Kardashevskiy wrote:
On 21/05/2021 19:05, BALATON Zoltan wrote:
On Fri, 21 May 2021, Alexey Kardashevskiy wrote:
On 21/05/2021 07:59, BALATON Zoltan wrote:
On Thu, 20 May 2021, Alexey Kardashevskiy wrote:

The PAPR platform describes an OS environment that's presented by a combination of a hypervisor and firmware. The features it specifies require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has been implemented as a 20 byte shim which simply forwards it to a hypercall implemented in qemu. The boot time firmware component is SLOF - but a build that's specific to qemu, and has always needed to be updated in sync with it. Even though we've managed to limit the amount of runtime communication we need between qemu and SLOF, there's some, and it has become increasingly awkward to handle as we've implemented new features.

This implements a boot time OF client interface (CI) which is enabled by a new "x-vof" pseries machine option (stands for "Virtual OpenFirmware"). When enabled, QEMU implements the custom H_OF_CLIENT hcall which implements Open Firmware Client Interface (OF CI). This allows using a smaller stateless firmware which does not have to manage the device tree.

The new "vof.bin" firmware image is included with source code under pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd working. In particular, this implements the device tree fetching and a simple memory allocator - "claim" (an OF CI memory allocator) - and updates "/memory@0/available" to report available memory to the client.

This implements changing some device tree properties which we know how to deal with, the rest is ignored. To allow changes, this skips fdt_pack() when x-vof=on as not packing the blob leaves some room for appending.

In absence of SLOF, this assigns phandles to device tree nodes to make device tree traversing work.

When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map ihandle -> [phandle].

Before the guest starts, the used memory is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into kernel directly. However this is just enough for the petitboot kernel and initramdisk to boot from any possible source. Note this requires a reasonably recent guest kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735

The immediate benefit is much faster booting time which is especially crucial with fully emulated early CPU bring up environments. Also this may come in handy when/if GRUB-in-the-userspace sees the light of day.

This separates VOF and sPAPR in a hope that VOF bits may be reused by other POWERPC boards which do not support pSeries.

This is coded in assumption that later on we might be adding support for booting from QEMU backends (blockdev is the first candidate) without devices/drivers in between as OF1275 does not require that and it is quite easy to do so.

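To illustrate the "claim" bookkeeping mentioned in the commit message, here is a minimal self-contained C sketch. It is not the code from the patch: range_t, claim_map_t, claim_fixed() and available_ranges() are hypothetical names, the table size is arbitrary, and only fixed-address claims are modelled. It tracks claimed regions and derives the gaps that a property such as "/memory@0/available" would report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One region [start, start + size). Hypothetical type, not from the patch. */
typedef struct {
    uint64_t start;
    uint64_t size;
} range_t;

#define MAX_RANGES 16

typedef struct {
    range_t claimed[MAX_RANGES]; /* sorted by start, never overlapping */
    size_t n;
    uint64_t top;                /* end of usable memory (assumed value) */
} claim_map_t;

/* Claim a region at a fixed address. Returns 0 on success, -1 on failure. */
int claim_fixed(claim_map_t *m, uint64_t start, uint64_t size)
{
    size_t i, pos = 0;

    assert(m != NULL);
    if (size == 0 || start + size < start || start + size > m->top ||
        m->n == MAX_RANGES) {
        return -1;
    }
    for (i = 0; i < m->n; i++) {
        const range_t *r = &m->claimed[i];

        if (start < r->start + r->size && r->start < start + size) {
            return -1; /* overlaps an existing claim */
        }
        if (r->start < start) {
            pos = i + 1; /* new entry goes after this one */
        }
    }
    for (i = m->n; i > pos; i--) {
        m->claimed[i] = m->claimed[i - 1];
    }
    m->claimed[pos] = (range_t){ start, size };
    m->n++;
    return 0;
}

/* Collect the unclaimed gaps below m->top; returns how many were written. */
size_t available_ranges(const claim_map_t *m, range_t *out, size_t max)
{
    uint64_t cursor = 0;
    size_t i, n = 0;

    assert(m != NULL && out != NULL);
    for (i = 0; i < m->n; i++) {
        if (m->claimed[i].start > cursor && n < max) {
            out[n++] = (range_t){ cursor, m->claimed[i].start - cursor };
        }
        cursor = m->claimed[i].start + m->claimed[i].size;
    }
    if (cursor < m->top && n < max) {
        out[n++] = (range_t){ cursor, m->top - cursor };
    }
    return n;
}
```

Claiming 0..e60 (firmware) and 8000..10000 (stack) from the layout above leaves two gaps: e60..8000 and everything from 10000 up to the assumed top of memory.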
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
The example command line is:

/home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 8G \
-machine pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
-kernel pbuild/kernel-le-guest/vmlinux \
-initrd pb/rootfs.cpio.xz \
-drive id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw \
-global spapr-nvram.drive=DRIVE0 \
-snapshot \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
-mon chardev=SOCKET0,mode=control

---
Changes:
v20:
* compile vof.bin with -mcpu=power4 for better compatibility
* s/std/stw/ in entry.S to make it work on ppc32
* fixed dt_available property to support both 32 and 64bit
* shuffled prom_args handling code
* do not enforce 32bit in MSR (again, to support 32bit platforms)

[...]

diff --git a/default-configs/devices/ppc64-softmmu.mak b/default-configs/devices/ppc64-softmmu.mak
index ae0841fa3a18..9fb201dfacfa 100644
--- a/default-configs/devices/ppc64-softmmu.mak
+++ b/default-configs/devices/ppc64-softmmu.mak
@@ -9,3 +9,4 @@ CONFIG_POWERNV=y
 # For pSeries
 CONFIG_PSERIES=y
 CONFIG_NVDIMM=y
+CONFIG_VOF=y
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index e51e0e5e5ac6..964510dfc73d 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -143,3 +143,6 @@ config FW_CFG_PPC
 config FDT_PPC
     bool
+
+config VOF
+    bool

I think you should just add "select VOF" to config PSERIES section in Kconfig instead of adding it to default-configs/devices/ppc64-softmmu.mak.

oh well, can do that too.

I think most config options should be selected by Kconfig and the default config should only include machines, otherwise VOF would also be added when you don't compile PSERIES or PEGASOS2. With select in Kconfig it will be added when needed. That's why it's better to use select in this case.

That should do it, it works in my updated pegasos2 patch:
https://osdn.net/projects/qmiga/scm/git/qemu/commits/3c1fad08469b4d3c04def22044e52b2d27774a61

[...]

diff --git a/pc-bios/vof/entry.S b/pc-bios/vof/entry.S
new file mode 100644
index 000000000000..569688714c91
--- /dev/null
+++ b/pc-bios/vof/entry.S
@@ -0,0 +1,51 @@
+#define LOAD32(rn, name)    \
+    lis     rn,name##@h;    \
+    ori     rn,rn,name##@l
+
+#define ENTRY(func_name)    \
+    .text;                  \
+    .align  2;              \
+    .globl  .func_name;     \
+    .func_name:             \
+    .globl  func_name;      \
+    func_name:
+
+#define KVMPPC_HCALL_BASE       0xf000
+#define KVMPPC_H_RTAS           (KVMPPC_HCALL_BASE + 0x0)
+#define KVMPPC_H_VOF_CLIENT     (KVMPPC_HCALL_BASE + 0x5)
+
+    . = 0x100 /* Do exactly as SLOF does */
+
+ENTRY(_start)
+#    LOAD32(%r31, 0) /* Go 32bit mode */
+#    mtmsrd %r31,0
+    LOAD32(2, __toc_start)
+    b       entry_c
+
+ENTRY(_prom_entry)
+    LOAD32(2, __toc_start)
+    stwu    %r1,-112(%r1)
+    stw     %r31,104(%r1)
+    mflr    %r31
+    bl      prom_entry
+    nop
+    mtlr    %r31
+    ld      %r31,104(%r1)

It's getting there, now I see the first client call from the guest boot code but then it crashes on this ld opcode which apparently is 64 bit only:

Oh right.

Hopefully this is the last such opcode left before I can really test this.

Make it lwz, and test it?

Yes, figured that out too after sending this message. Replacing with lwz works but I wonder: now that you have stwu/lwz, do the stack offsets need adjusting too or do you just waste 4 bytes now?

Well, this assumes the 64bit client and that ABI. I think ideally the firmware is supposed to use its own stack but I did not bother here. I do not know the 32bit ABI at all so I cannot say whether the existing code should just work or not :-/

It seems to work, so that's OK. I just thought that if the firmware is 32 bit it does not need 64 bit values on the stack, but if that is also potentially used by a 64 bit kernel then it may be better to keep it that way to avoid confusion. With the 64 bit opcodes replaced it seems to work on pegasos2, and the guest can call CI functions and get a reply, so maybe it's just a few wasted bytes, which is not a big deal.
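
The stack slot question can be modelled in plain C. This is an illustrative sketch, not part of the patch: the helper names are made up, a byte array stands in for the 112 byte frame allocated by "stwu %r1,-112(%r1)", and big-endian mode is assumed. On a real 32 bit CPU ld is an illegal instruction, which is the crash reported above; the model only shows what each access would see in the slot at offset 104.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE 112u /* from "stwu %r1,-112(%r1)" */
#define SLOT       104u /* from "stw %r31,104(%r1)" */

/* Model of a big-endian 32 bit store (stw). */
void be_stw(uint8_t *frame, unsigned off, uint32_t v)
{
    assert(off + 4 <= FRAME_SIZE);
    frame[off + 0] = (uint8_t)(v >> 24);
    frame[off + 1] = (uint8_t)(v >> 16);
    frame[off + 2] = (uint8_t)(v >> 8);
    frame[off + 3] = (uint8_t)v;
}

/* Model of a big-endian 32 bit load (lwz). */
uint32_t be_lwz(const uint8_t *frame, unsigned off)
{
    assert(off + 4 <= FRAME_SIZE);
    return ((uint32_t)frame[off + 0] << 24) |
           ((uint32_t)frame[off + 1] << 16) |
           ((uint32_t)frame[off + 2] << 8) |
           (uint32_t)frame[off + 3];
}

/* Model of a big-endian 64 bit load (ld): reads 8 bytes, not 4. */
uint64_t be_ld(const uint8_t *frame, unsigned off)
{
    assert(off + 8 <= FRAME_SIZE);
    return ((uint64_t)be_lwz(frame, off) << 32) | be_lwz(frame, off + 4);
}
```

With stw/lwz only bytes 104..107 of the slot are written and read back; bytes 108..111 stay reserved in the frame but unused, which is the "4 wasted bytes" case. A 64 bit load from the same offset would also pull in those four untouched bytes.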
With lwz here I found no further 64 bit opcodes and the guest boot code could walk the device tree. It failed later but I think that's because I'll need to fill more info about the machine in the device tree. I'll experiment with that but it looks like it could work at least for MorphOS. I'll have to try Linux too.

There are plenty of tracepoints, enable them all.

I'm running with -trace enable="vof*" but it does not give me too much info as a lot of calls (such as peer, child, etc.) don't log anything other than that there was a hypercall, so I only get info about opening paths and querying some props. The MorphOS boot.img just walks the device tree gathering some data about the machine, then calls quiesce and boots into the OS, which later tries to use the gathered info, at which point it crashes without any logs if some info is not as expected. This does not make it easy to debug but I think once I fill the device tree with all the needed info it should work. Currently I'm missing info about PCI devices that it may need.
Do you have some info on how the stdout works in VOF? I think I'll need that to test with Linux and get output but I'm not sure what's needed on the machine side.

VOF opens stdout and stores the ihandle (in the fdt) which the client (==kernel) uses for writing. To make it work properly, you need to hook up that instance to a device backend similar to what I have for spapr-vty:
https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd39cc5dad5b6385965d
This is not a part of this patch as I'm trying to keep things simpler and accessing backends from VOF is still unsettled. But there is a workaround which is trace_vof_write, I use this. Thanks,

The above patch is about stdin but stdout seems to be added by the current vof patch. What is spapr-vty?

It is pseries' paravirtual serial device, pegasos does not have it.

I don't think I have something similar in pegasos2 where I just have a normal serial port created by ISASuperIO in the vt8231 model.

Correct.

Can I use that backend somehow or do I have to create some other serial device to connect to stdout?

Does trace_vof_write work for stuff output by the guest? I guess that's only for things printed by VOF itself

VOF itself does not print anything in this patch.

However it seems to be needed for linux as the first thing it does seems to be getting /chosen/stdout and calls exit if it returns nothing. So I'll need this at least for linux. (I think MorphOS may also query it to print a banner or some messages but not sure it needs it, at least it does not abort right away if not found.)
but to see Linux output do I need a stdout in VOF or will it just open the serial with its own driver and use that? So I'm not sure what the stdout parts in the current vof patch do and if I need that for anything. I'll try to experiment with it some more but fixing the ld and Kconfig seems to be enough to get it to work for me.

So for the client to print something, /chosen/stdout needs to have a valid ihandle. The only way to get a valid ihandle is having a valid phandle which vof_client_open() can open. A valid phandle is a phandle of any node in the device tree. On spapr we pick some spapr-vty, open it and store the result in /chosen/stdout.

From this point output from the client can be seen via a tracepoint.

Now if we want proper output without tracepoints - we need to hook it up with some chardev backend (not a device such as vt8231 or spapr-vty but a backend).
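
The chain described above (a node's phandle, an opened instance, the ihandle stored in /chosen/stdout) can be sketched as a small table. This is a self-contained model, not QEMU code: node_t, vof_model_t, model_open() and model_instance_to_package() are hypothetical stand-ins for the device tree and for vof_client_open(), the node paths and phandle values in the usage note are made up, and treating 0 as "no handle" is a convention of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A device tree node reduced to what this sketch needs (hypothetical). */
typedef struct {
    const char *path;
    uint32_t phandle;
} node_t;

#define MAX_INST 8

typedef struct {
    const node_t *nodes;
    size_t n_nodes;
    uint32_t inst_phandle[MAX_INST]; /* slot + 1 is the ihandle; 0 = free */
    uint32_t chosen_stdout;          /* models the /chosen/stdout property */
} vof_model_t;

/* Open a node by path: returns a new ihandle, or 0 if it cannot be opened. */
uint32_t model_open(vof_model_t *m, const char *path)
{
    size_t i, slot;

    assert(m != NULL && path != NULL);
    for (i = 0; i < m->n_nodes; i++) {
        if (strcmp(m->nodes[i].path, path) != 0) {
            continue;
        }
        for (slot = 0; slot < MAX_INST; slot++) {
            if (m->inst_phandle[slot] == 0) {
                m->inst_phandle[slot] = m->nodes[i].phandle;
                return (uint32_t)slot + 1;
            }
        }
        return 0; /* instance table is full */
    }
    return 0; /* no such node, hence no valid phandle to open */
}

/* Map an ihandle back to the phandle it was opened on; 0 if invalid. */
uint32_t model_instance_to_package(const vof_model_t *m, uint32_t ihandle)
{
    assert(m != NULL);
    if (ihandle == 0 || ihandle > MAX_INST) {
        return 0;
    }
    return m->inst_phandle[ihandle - 1];
}
```

A machine model would call something like model_open() on its serial node (for example an illustrative "/vdevice/vty@71000110") and store the returned ihandle in chosen_stdout; a client asking for /chosen/stdout then gets a handle that resolves back to that node.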
I don't know much about it but devices are also connected to some backend, so is it possible to use the same backend for VOF as used for the normal serial port? But I need a way to find that and connect it to VOF and I'm not sure how to do that yet. Or do I need to create a separate serial backend and connect that to VOF? I'll try to look at spapr-vty to see what it does.
https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd3 does this:

1. when a phandle is opened, QEMU will search for the DeviceState* for the specific FDT node and get a chardev from the device.
2. when write() is called, QEMU calls qemu_chr_fe_write_all() on the chardev from 1.

From this point you do not need a tracepoint and the output will appear in the console you set up for stdout.

Now if you want input from this console, things get tricky. First, on powernv/pseries we only need this for grub as otherwise the kernel has all the drivers needed and will not use the client interface. For grub, we need to provide a valid ihandle for /chosen/stdin which is easy, but implementing read() on this is not, as there is no simple device-type-independent way of reading from a chardev. I hacked it for spapr-vty but other serial devices will need special handling, or we'll have to introduce some VOF_SERIAL_READ interface for those which will face opposition :)

Makes sense?

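
The two steps above could be modelled roughly as follows. This is only a sketch under stated assumptions: in QEMU the lookup would go through DeviceState and the write would end in qemu_chr_fe_write_all(), whereas here a function pointer with a similar shape (buffer plus length, returning the number of bytes written) stands in for the chardev backend so the example compiles on its own. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a chardev backend (hypothetical, not the QEMU type). */
typedef struct backend_t backend_t;
struct backend_t {
    int (*write_all)(backend_t *be, const uint8_t *buf, int len);
    char captured[64]; /* the sample backend below stores output here */
    size_t used;
};

/* A device behind an FDT node; be is NULL if it has no backend. */
typedef struct {
    uint32_t phandle;
    backend_t *be;
} device_t;

/* An opened instance remembers the backend found at open time. */
typedef struct {
    uint32_t phandle;
    backend_t *be;
} instance_t;

/* Step 1: on open, find the device for the phandle and take its backend. */
int instance_open(instance_t *inst, const device_t *devs, size_t n,
                  uint32_t phandle)
{
    size_t i;

    assert(inst != NULL && devs != NULL);
    for (i = 0; i < n; i++) {
        if (devs[i].phandle == phandle) {
            inst->phandle = phandle;
            inst->be = devs[i].be;
            return 0;
        }
    }
    return -1;
}

/* Step 2: on write, forward the bytes to the backend bound in step 1. */
int instance_write(instance_t *inst, const void *buf, int len)
{
    assert(inst != NULL);
    if (inst->be == NULL || inst->be->write_all == NULL) {
        return -1; /* nothing attached: only a tracepoint could show this */
    }
    return inst->be->write_all(inst->be, buf, len);
}

/* Sample backend for the sketch: appends output to an in-memory buffer. */
int capture_write_all(backend_t *be, const uint8_t *buf, int len)
{
    if (len < 0 || be->used + (size_t)len > sizeof(be->captured)) {
        return -1;
    }
    memcpy(be->captured + be->used, buf, (size_t)len);
    be->used += (size_t)len;
    return len;
}
```

Binding the backend once at open time keeps the write path independent of the device type, which is why output is the easy direction; the sketch deliberately has no read side, matching the point that input has no equally generic path.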
It explains things a bit but still not entirely clear how can I get something to add as a stdout. With the pegasos2 firmware it puts the serial device there normally that it inits and opens. Without that firmware we have to somehow do that from QEMU so find the serial backend used by the serial device within the vt8231 model (or use a different backend just for this?) then open it and put it in the device tree. If that's correct or how to do it is not clear yet.
Regards. BALATON Zoltan