On 17/06/2021 19:16, BALATON Zoltan wrote:
On Thu, 17 Jun 2021, Alexey Kardashevskiy wrote:
On 16/06/2021 20:34, BALATON Zoltan wrote:
On Wed, 16 Jun 2021, Alexey Kardashevskiy wrote:
On 6/15/21 20:29, BALATON Zoltan wrote:
On Tue, 15 Jun 2021, Alexey Kardashevskiy wrote:

The PAPR platform describes an OS environment that's presented by
a combination of a hypervisor and firmware. The features it specifies
require collaboration between the firmware and the hypervisor.

Since the beginning, the runtime component of the firmware (RTAS) has
been implemented as a 20 byte shim which simply forwards it to
a hypercall implemented in qemu. The boot time firmware component is
SLOF - but a build that's specific to qemu, and has always needed to be
updated in sync with it. Even though we've managed to limit the amount
of runtime communication we need between qemu and SLOF, there's some,
and it has become increasingly awkward to handle as we've implemented
new features.

This implements a boot time OF client interface (CI) which is
enabled by a new "x-vof" pseries machine option (stands for "Virtual
Open Firmware"). When enabled, QEMU implements the custom H_OF_CLIENT
hcall which implements Open Firmware Client Interface (OF CI). This
allows using a smaller stateless firmware which does not have to manage
the device tree.

The new "vof.bin" firmware image is included with source code under
pc-bios/. It also includes RTAS blob.

This implements a handful of CI methods just to get -kernel/-initrd
working. In particular, this implements the device tree fetching and
simple memory allocator - "claim" (an OF CI memory allocator) and
updates "/memory@0/available" to inform the client about available
memory.

This implements changing some device tree properties which we know how
to deal with, the rest is ignored. To allow changes, this skips
fdt_pack() when x-vof=on as not packing the blob leaves some room for
appending.

In absence of SLOF, this assigns phandles to device tree nodes to make
device tree traversing work.
When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.

This adds basic instances support which are managed by a hash map
ihandle -> [phandle].

Before the guest started, the used memory is:
0..e60 - the initial firmware
8000..10000 - stack
400000.. - kernel
3ea0000.. - initramdisk

This OF CI does not implement "interpret".

Unlike SLOF, this does not format uninitialized nvram. Instead, this
includes a disk image with pre-formatted nvram.

With this basic support, this can only boot into kernel directly.
However this is just enough for the petitboot kernel and initramdisk to
boot from any possible source. Note this requires reasonably recent
guest kernel with:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735

The immediate benefit is much faster booting time which is especially
crucial with fully emulated early CPU bring up environments. Also this
may come handy when/if GRUB-in-the-userspace sees light of the day.

This separates VOF and sPAPR in a hope that VOF bits may be reused by
other POWERPC boards which do not support pSeries.

This makes VOF optional, it is disabled by default, add --enable-vof
to ./configure to enable it.

This assumes potential support for booting from QEMU backends
such as blockdev or netdev without devices/drivers used.
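
For readers following along, here is a minimal sketch of how an OF CI
"claim" style allocator might track the regions listed above. It is not
the code from hw/ppc/vof.c: the Claim and Claims types, MAX_CLAIMS and
the linear search are assumptions made for illustration. The only parts
taken from the patch are the (virt, size, align) arguments of
vof_claim(), and the align == 0 behaviour follows the IEEE 1275 client
interface definition of "claim".

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CLAIMS 16            /* hypothetical fixed capacity */

typedef struct {                 /* hypothetical; not QEMU's type */
    uint64_t start;
    uint64_t size;
} Claim;

typedef struct {
    Claim claims[MAX_CLAIMS];
    int count;
    uint64_t top;                /* end of usable memory, like top_addr */
} Claims;

/* Return 1 if [start, start + size) intersects any recorded claim. */
static int overlaps(const Claims *c, uint64_t start, uint64_t size)
{
    for (int i = 0; i < c->count; i++) {
        const Claim *k = &c->claims[i];
        if (start < k->start + k->size && k->start < start + size) {
            return 1;
        }
    }
    return 0;
}

/*
 * align == 0: claim exactly at virt.
 * align != 0: virt is ignored and the lowest free multiple of align is
 *             used.
 * Returns the claimed address, or UINT64_MAX on failure.
 */
static uint64_t claim(Claims *c, uint64_t virt, uint64_t size,
                      uint64_t align)
{
    assert(c->count <= MAX_CLAIMS);
    if (size == 0 || size > c->top || c->count == MAX_CLAIMS) {
        return UINT64_MAX;
    }
    if (align == 0) {
        if (virt > c->top - size || overlaps(c, virt, size)) {
            return UINT64_MAX;
        }
    } else {
        for (virt = 0; ; virt += align) {
            if (virt > c->top - size) {
                return UINT64_MAX;   /* ran out of memory below top */
            }
            if (!overlaps(c, virt, size)) {
                break;
            }
        }
    }
    c->claims[c->count].start = virt;
    c->claims[c->count].size = size;
    c->count++;
    return virt;
}
```

With the firmware (0..e60), the stack (8000..10000) and the kernel
(400000..) claimed at fixed addresses first, an aligned request such as
claim(&c, 0, 0x1000, 0x1000) would land in the first free page after
the firmware, and a fixed request inside the stack would be refused.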
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---

The example command line is:

/home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-enable-kvm \
-m 8G \
-machine pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
-kernel pbuild/kernel-le-guest/vmlinux \
-initrd pb/rootfs.cpio.xz \
-drive id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw \
-global spapr-nvram.drive=DRIVE0 \
-snapshot \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
-mon chardev=SOCKET0,mode=control

I haven't looked at it in detail yet, just some quick comments I have on
first skim through.

---
Changes:
v21:
* s/ld/ldz/ in entry.S

Typo? Has this become lwz?

Yup, lwz.

* moved CONFIG_VOF from default-configs/devices/ppc64-softmmu.mak to Kconfig
* made CONFIG_VOF optional

This won't work for pegasos2, see below.

* s/l.lds/vof.lds/
* force 32 BE in spapr_machine_reset() instead of the firmware
* added checks for non-null methods of VofMachineIfClass
* moved OF_STACK_SIZE to vof.h, renamed to VOF_..., added a better comment
* added path_offset wrapper for handling mixed case for addresses
after "@" in node names
* changed getprop() to check for actual "name" property in the fdt
* moved VOF_MEM_READ/VOF_MEM_WRITE to vof.h for sharing as (unlike similar
rtas_ld/ldl_be_*) they return error codes
* VOF_MEM_READ uses now address_space_read (it was address_space_read_full
before, not sure why)

[...]

---
 configure | 9 +
 pc-bios/vof/Makefile | 23 +
 include/hw/ppc/spapr.h | 25 +-
 include/hw/ppc/vof.h | 55 ++
 pc-bios/vof/vof.h | 43 ++
 hw/ppc/spapr.c | 87 +++-
 hw/ppc/spapr_hcall.c | 29 +-
 hw/ppc/spapr_vof.c | 153 ++++++
 hw/ppc/vof.c | 1052 +++++++++++++++++++++++++++++++++++++++
 pc-bios/vof/bootmem.c | 14 +
 pc-bios/vof/ci.c | 91 ++++
 pc-bios/vof/libc.c | 92 ++++
 pc-bios/vof/main.c | 21 +
 tests/qtest/rtas-test.c | 17 +-
 MAINTAINERS | 12 +
 hw/ppc/Kconfig | 3 +
 hw/ppc/meson.build | 3 +
 hw/ppc/trace-events | 24 +
 meson.build | 1 +
 pc-bios/README | 2 +
 pc-bios/vof-nvram.bin | Bin 0 -> 16384 bytes
 pc-bios/vof.bin | Bin 0 -> 3784 bytes
 pc-bios/vof/entry.S | 49 ++
 pc-bios/vof/vof.lds | 48 ++
 24 files changed, 1840 insertions(+), 13 deletions(-)
 create mode 100644 pc-bios/vof/Makefile
 create mode 100644 include/hw/ppc/vof.h
 create mode 100644 pc-bios/vof/vof.h
 create mode 100644 hw/ppc/spapr_vof.c
 create mode 100644 hw/ppc/vof.c
 create mode 100644 pc-bios/vof/bootmem.c
 create mode 100644 pc-bios/vof/ci.c
 create mode 100644 pc-bios/vof/libc.c
 create mode 100644 pc-bios/vof/main.c
 create mode 100644 pc-bios/vof-nvram.bin
 create mode 100755 pc-bios/vof.bin
 create mode 100644 pc-bios/vof/entry.S
 create mode 100644 pc-bios/vof/vof.lds

[...]

diff --git a/include/hw/ppc/vof.h b/include/hw/ppc/vof.h
new file mode 100644
index 000000000000..65ca2fed0d41
--- /dev/null
+++ b/include/hw/ppc/vof.h
@@ -0,0 +1,55 @@
+/*
+ * Virtual Open Firmware
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#ifndef HW_VOF_H
+#define HW_VOF_H
+
+typedef struct Vof {
+    uint64_t top_addr; /* copied from rma_size */
+    GArray *claimed; /* array of SpaprOfClaimed */
+    uint64_t claimed_base;
+    GHashTable *of_instances; /* ihandle -> SpaprOfInstance */
+    uint32_t of_instance_last;
+    char *bootargs;
+    long fw_size;
+} Vof;
+
+int vof_client_call(MachineState *ms, Vof *vof, void *fdt,
+                    target_ulong args_real);
+uint64_t vof_claim(Vof *vof, uint64_t virt, uint64_t size, uint64_t align);
+void vof_init(Vof *vof, uint64_t top_addr, Error **errp);
+void vof_cleanup(Vof *vof);
+void vof_build_dt(void *fdt, Vof *vof);
+uint32_t vof_client_open_store(void *fdt, Vof *vof, const char *nodename,
+                               const char *prop, const char *path);
+
+#define TYPE_VOF_MACHINE_IF "vof-machine-if"
+
+typedef struct VofMachineIfClass VofMachineIfClass;
+DECLARE_CLASS_CHECKERS(VofMachineIfClass, VOF_MACHINE, TYPE_VOF_MACHINE_IF)
+
+struct VofMachineIfClass {
+    InterfaceClass parent;
+    target_ulong (*client_architecture_support)(MachineState *ms, CPUState *cs,
+                                                target_ulong vec);
+    void (*quiesce)(MachineState *ms);
+    bool (*setprop)(MachineState *ms, const char *path, const char *propname,
+                    void *val, int vallen);
+};
+
+/*
+ * Initial stack size is from
+ * https://www.devicetree.org/open-firmware/bindings/ppc/release/ppc-2_1.html
+ */
+#define VOF_STACK_SIZE 0x8000

Maybe also add a define for RTAS_SIZE here? We'll need to put that in
the device tree but it depends on the rtas shim size that's part of VOF
so it should be defined here instead of hardcoding it in boards that use
VOF so it can be updated later at one place if needed.

This is rtas-size for pseries:

_FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", RTAS_ERROR_LOG_MAX +
                      ms->smp.max_cpus * sizeof(uint64_t)*2 +
                      sizeof(uint64_t)));

=> depends on cpus => depends on the command line.

RTAS_SIZE is not used by anything in pseries anymore, I'll send a patch
to ditch it.

I mean you need to have at least the size of code in pc-bios/vof/entry.S
hv_rtas where also hv_rtas_size is defined but that value is not
available in QEMU where one needs to add it to the device tree. So a
define for that should be here in vof.h. Currently I've counted
instructions and have

qemu_fdt_setprop_cell(fdt, "/rtas", "rtas-size", 20);

in pegasos2.c but that 20 should be some VOF_RTAS_SIZE instead that you
define corresponding to hv_rtas_size. You'll probably need the same even
after changing above rtas size calculation in spapr because client has
to allocate memory for instantiate-rtas.

Ah fair point. I do not like "20" here and I think the right thing will
be adding whatever number of bytes to rtas-size in the firmware itself
and update it in QEMU via "setprop" as we do for "linux,rtas-base".
And then do the same in SLOF.

This is not the base address but the size of the shim with the hypercall
that instantiate-rtas copies. Why does it need to be updated?
The vm kernel allocates the space for it.
And why does it need to be more bytes than necessary?
What is necessary? It is definitely way more than 20 bytes.
I don't know what you do for spapr and why do you need larger rtas-size than this but for pegasos2 this /rtas/rtas-size property is only used by guests to allocate memory for rtas so all I need is how many bytes are needed for hv_rtas in pc-bios/vof/entry.S which is what should be #defined in vof.h. I've found 20 is just enough so you could add that to vof.h.
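
To make the two sizes being compared concrete, here is a small sketch
of the arithmetic. VOF_RTAS_SIZE is the name proposed in this thread
and does not exist in the posted patch. VOF_RTAS_SHIM_INSNS,
PPC_INSN_BYTES and pseries_rtas_size() are hypothetical names. The
instruction count of 5 is inferred from the 20 bytes counted above
divided by the 4 byte fixed instruction width, and the value of
RTAS_ERROR_LOG_MAX is passed in as a parameter rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: fixed 4 byte instructions in the hv_rtas shim. */
#define PPC_INSN_BYTES 4u
/* Inferred: 20 bytes counted for hv_rtas / 4 bytes per instruction. */
#define VOF_RTAS_SHIM_INSNS 5u
/* Name proposed in the thread; would mirror hv_rtas_size in entry.S. */
#define VOF_RTAS_SIZE (VOF_RTAS_SHIM_INSNS * PPC_INSN_BYTES)

/*
 * The pseries "rtas-size" formula quoted earlier in the thread:
 * error log area + two 64-bit words per CPU + one more 64-bit word.
 * Computed in 64 bits, then checked to fit a 32-bit device tree cell.
 */
static uint32_t pseries_rtas_size(uint32_t error_log_max,
                                  uint32_t max_cpus)
{
    uint64_t total = (uint64_t)error_log_max +
                     (uint64_t)max_cpus * sizeof(uint64_t) * 2 +
                     sizeof(uint64_t);

    assert(total <= UINT32_MAX);
    return (uint32_t)total;
}
```

Under these assumptions a board that only needs the shim would put
VOF_RTAS_SIZE into /rtas/rtas-size, while the pseries value grows with
the number of CPUs given on the command line.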
I am thinking now that maybe the property should be created by vof.bin and not QEMU, QEMU just has to tell how many bytes on top it needs.
-- Alexey