On Thu, May 27, 2021 at 02:42:39PM +0200, BALATON Zoltan wrote:
> On Thu, 27 May 2021, David Gibson wrote:
> > On Tue, May 25, 2021 at 12:08:45PM +0200, BALATON Zoltan wrote:
> > > On Tue, 25 May 2021, David Gibson wrote:
> > > > On Mon, May 24, 2021 at 12:55:07PM +0200, BALATON Zoltan wrote:
> > > > > On Mon, 24 May 2021, David Gibson wrote:
> > > > > > On Sun, May 23, 2021 at 07:09:26PM +0200, BALATON Zoltan wrote:
> > > > > > > On Sun, 23 May 2021, BALATON Zoltan wrote:
> > > > > > > > On Sun, 23 May 2021, Alexey Kardashevskiy wrote:
> > > > > > > > > One thing to note about PCI is that normally I think the
> > > > > > > > > client expects the firmware to do PCI probing and SLOF
> > > > > > > > > does it. But VOF does not and Linux scans PCI bus(es)
> > > > > > > > > itself. Might be a problem for your kernel.
> > > > > > > >
> > > > > > > > I'm not sure what info does MorphOS get from the device tree
> > > > > > > > and what it probes itself but I think it may at least need
> > > > > > > > device ids and info about the PCI bus to be able to access
> > > > > > > > the config regs, after that it should set the devices up
> > > > > > > > hopefully. I could add these from the board code to device
> > > > > > > > tree so VOF does not need to do anything about it. However
> > > > > > > > I'm not getting to that point yet because it crashes on
> > > > > > > > something that it's missing and couldn't yet find out what
> > > > > > > > is that.
> > > > > > > >
> > > > > > > > I'd like to get Linux working now as that would be enough to
> > > > > > > > test this and then if for MorphOS we still need a ROM it's
> > > > > > > > not a problem if at least we can boot Linux without the
> > > > > > > > original firmware.
> > > > > > > > But I can't make Linux open a serial console and I don't
> > > > > > > > know what it needs for that. Do you happen to know? I've
> > > > > > > > looked at the sources in Linux/arch/powerpc but not sure how
> > > > > > > > it would find and open a serial port on pegasos2. It seems
> > > > > > > > to work with the board firmware and now I can get it to boot
> > > > > > > > with VOF but then it does not open serial so it probably
> > > > > > > > needs something in the device tree or expects the firmware
> > > > > > > > to set something up that we should add in pegasos2.c when
> > > > > > > > using VOF.
> > > > > > > >
> > > > > > > I've now found that Linux uses rtas methods read-pci-config and
> > > > > > > write-pci-config for PCI access on pegasos2 so this means that
> > > > > > > we'll probably need rtas too (I hoped we could get away without
> > > > > > > it if it were only used for shutdown/reboot or so but seems
> > > > > > > Linux needs it for PCI as well and does not scan the bus and
> > > > > > > won't find some devices without it).
> > > > > >
> > > > > > Yes, definitely sounds like you'll need an RTAS implementation.
> > > > >
> > > > > I plan to fix that after managed to get serial working as that
> > > > > seems to not need it. If I delete the rtas-size property from /rtas
> > > > > on the original firmware that makes Linux skip instantiating rtas,
> > > > > but I still get serial output just not accessing PCI devices. So I
> > > > > think it should work and keeps things simpler at first. Then I'll
> > > > > try rtas later.
> > > > >
> > > > > > > While VOF can do rtas, this causes a problem with the hypercall
> > > > > > > method using sc 1 that goes through vhyp but trips the assert
> > > > > > > in ppc_store_sdr1() so cannot work after guest is past quiesce.
> > > > > >
> > > > > > > So the question is why is that assert there
> > > > > >
> > > > > > Ah.. right. So, vhyp was designed for the PAPR use case, where we
> > > > > > want to model the CPU when it's in supervisor and user mode, but
> > > > > > not when it's in hypervisor mode. We want qemu to mimic the
> > > > > > behaviour of the hypervisor, rather than attempting to actually
> > > > > > execute hypervisor code in the virtual CPU.
> > > > > >
> > > > > > On systems that have a hypervisor mode, SDR1 is hypervisor
> > > > > > privileged, so it makes no sense for the guest to attempt to set
> > > > > > it. That should be caught by the general SPR code and turned into
> > > > > > a 0x700, hence the assert() if we somehow reach ppc_store_sdr1().
> > > > > >
> > > > > > So, we are seeing a problem here because you want the 'sc 1'
> > > > > > interception of vhyp, but not the rest of the stuff that goes
> > > > > > with it.
> > > > > >
> > > > > > > and would using sc 1 for hypercalls on pegasos2 cause other
> > > > > > > problems later even if the assert could be removed?
> > > > > >
> > > > > > At least in the short term, I think you probably can remove the
> > > > > > assert. In your case the 'sc 1' calls aren't truly to a
> > > > > > hypervisor, but a special case escape to qemu for the firmware
> > > > > > emulation. I think it's unlikely to cause problems later, because
> > > > > > nothing on a 32-bit system should be attempting an 'sc 1'. The
> > > > > > only thing I can think of that would fail is some test case which
> > > > > > explicitly verified that 'sc 1' triggered a 0x700 (SIGILL from
> > > > > > userspace).
> > > > >
> > > > > OK so the assert should check if the CPU has an HV bit. I think
> > > > > there was a #define for that somewhere that I can add to the assert
> > > > > then I can try that.
> > > > > What I wasn't sure about is that sc 1 would conflict with the
> > > > > guest's usage of normal sc calls or are these going through
> > > > > different paths and only sc 1 will trigger vhyp callback not
> > > > > affecting normal sc calls?
> > > >
> > > > The vhyp shouldn't affect normal system calls, 'sc 1' is specifically
> > > > for hypercalls, as opposed to normal 'sc' (a.k.a. 'sc 0'), and the
> > > > vhyp only intercepts the hypercall version (after all Linux on PAPR
> > > > certainly uses its own system calls, and hypercalls are active for
> > > > the lifetime of the guest there).
> > > >
> > > > > (Or if this causes an otherwise unnecessary VM exit on KVM even
> > > > > when it works then maybe looking for a different way in the future
> > > > > might be needed.
> > > >
> > > > What you're doing here won't work with KVM as it stands. There are
> > > > basically two paths into the vhyp hypercall path: 1) from TCG, if we
> > > > interpret an 'sc 1' instruction we enter vhyp, 2) from KVM, if we get
> > > > a KVM_EXIT_PAPR_HCALL KVM exit then we also go to the vhyp path.
> > > >
> > > > The second path is specific to the PAPR (ppc64) implementation of
> > > > KVM, and will not work for a non-PAPR platform without substantial
> > > > modification of the KVM code.
> > >
> > > OK so then at that point when we try KVM we'll need to look at
> > > alternative ways, I think MOL OSI worked with KVM at least in MOL but
> > > will probably make all syscalls exit KVM but since we'll probably need
> > > to use KVM PR it will exit anyway. For now I keep this vhyp as it does
> > > not run with KVM for other reasons yet so that's another area to clean
> > > up so as a proof of concept first version of using VOF vhyp will do.
> >
> > Eh, since you'll need to modify KVM anyway, it probably makes just as
> > much sense to modify it to catch the 'sc 1' as MoL's magic thingy.
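To make the 'sc' versus 'sc 1' distinction above concrete, here is a minimal standalone C sketch that pulls the LEV field out of a PowerPC sc instruction word. It assumes the Power ISA encoding (primary opcode 17, LEV in the seven bits starting at bit 5 counting from the least significant bit, bit 1 set). The helper names are made up for illustration and are not QEMU functions; QEMU's own decoder may be organised quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; these are not QEMU APIs. */

/* The primary opcode occupies the top six bits of the instruction word. */
static inline unsigned primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* True if the word has the fixed bits of an sc instruction. */
static inline bool is_sc(uint32_t insn)
{
    return primary_opcode(insn) == 17 && (insn & 0x2) != 0;
}

/* LEV field of an sc instruction: 7 bits starting at bit 5 (LSB is bit 0). */
static inline unsigned sc_lev(uint32_t insn)
{
    return (insn >> 5) & 0x7f;
}

/* Only the LEV == 1 form is the hypercall that a vhyp-style hook would see. */
static inline bool is_hypercall(uint32_t insn)
{
    return is_sc(insn) && sc_lev(insn) == 1;
}

/* Small self-check: plain 'sc' versus 'sc 1'. */
void sc_decode_selfcheck(void)
{
    assert(is_sc(0x44000002u) && !is_hypercall(0x44000002u)); /* sc   */
    assert(is_hypercall(0x44000022u));                        /* sc 1 */
}
```

Under these assumptions a plain guest system call never matches is_hypercall(), which is the property relied on when saying the two kinds of call do not interfere.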
>
> I'm not sure how KVM works for this case so I also don't know why and what
> would need to be modified. I think we'll only have KVM PR working as newer
> POWER CPUs having HV (besides being rare among potential users) are
> probably too different to run the OSes that expect at most a G4 on pegasos2
> so likely it won't work with KVM HV.

Oh, it definitely won't work with KVM HV.

> If we have KVM PR doesn't sc already trap so we could add MOL OSI without
> further modification to KVM itself only needing change in QEMU?

Uh... I guess so?

> I also hope that MOL OSI could be useful for porting some paravirt drivers
> from MOL for running Mac OS X on Mac emulation but I don't know about that
> for sure so I'm open to any other solution too.

Maybe. I never knew much about MOL to begin with, and anything I did
know was a decade or more ago so I've probably forgotten.

> For now I'm going with vhyp which is enough for testing with TCG and if
> somebody wants KVM they could use the original firmware for now so this
> could be improved in a later version unless a simple solution is found
> before the freeze for 6.1. If we're in KVM PR what happens for sc 1 could
> that be used too so maybe what we have now could work?

Note that if you do go down the MOL path it wouldn't be that complex
to make a "vMOL" interface so you can use the same mechanism for KVM
and TCG.

> > > [...]
> > > > > > > I've tested that the missing rtas is not the reason for getting
> > > > > > > no output via serial though, as even when disabling rtas on
> > > > > > > pegasos2.rom it boots and I still get serial output just some
> > > > > > > PCI devices are not detected (such as USB, the video card and
> > > > > > > the not emulated ethernet port but these are not fatal so it
> > > > > > > might even work as a first try without rtas, just to boot a
> > > > > > > Linux kernel for testing it would be enough if I can fix the
> > > > > > > serial output). I still don't know why it's not finding serial
> > > > > > > but I think it may be some missing or wrong info in the device
> > > > > > > tree I generate. I'll try to focus on this for now and leave
> > > > > > > the above rtas question for later.
> > > > > >
> > > > > > Oh.. another thought on that.
> > > > > > You have an ISA serial port on Pegasos, I believe. I wonder if
> > > > > > the PCI->ISA bridge needs some configuration / initialization
> > > > > > that the firmware is expected to do. If so you'll need to mimic
> > > > > > that setup in qemu for the VOF case.
> > > > >
> > > > > That's what I begin to think because I've added everything to the
> > > > > device tree that I thought could be needed and I still don't get it
> > > > > working so it may need some config from the firmware. But how do I
> > > > > access device registers from board code? I've tried adding a
> > > > > machine reset method and write to memory mapped device registers
> > > > > but all my attempts failed. I've tried cpu_stl_le_data and even
> > > > > memory_region_dispatch_write but these did not get to the device.
> > > > > What's the way to access guest mmio regs from QEMU?
> > > >
> > > > That's odd, cpu_stl() and memory_region_dispatch_write() should work
> > > > from board code (after the relevant memory regions are configured, of
> > > > course). As an ISA serial port, it's probably accessed through IO
> > > > space, not memory space though, so you'd need &address_space_io. And
> > > > if there is some bridge configuration then it's the bridge control
> > > > registers you need to look at not the serial registers - you'd have
> > > > to look at the bridge documentation for that. Or, I guess the bridge
> > > > implementation in qemu, which you wrote part of.
> > >
> > > I've found at last that stl_le_phys() works. There are so many of these
> > > that I never know when to use which.
> > >
> > > I think the address_space_rw calls in vof_client_call() in vof.c could
> > > also use these for somewhat shorter code. I've ended up with
> > > stl_le_phys(CPU(cpu)->as, addr, val) in my machine reset method but I
> > > don't even need that now as it works without additional setup.
> > > Also VOF's memory access is basically the same as the already existing
> > > rtas_st() and co. so maybe that could be reused to make code smaller?
> >
> > rtas_ld() and rtas_st() should only be used for reading/writing RTAS
> > parameters to and from memory. Accessing IO shouldn't be done with
> > those.
> >
> > For IO you probably want the cpu_st*() variants in most cases, since
> > you're trying to emulate an IO access from the virtual cpu.
>
> I think I've tried that but what worked to access mmio device registers are
> stl_le_phys and similar that are wrappers around address_space_stl_*. But I
> did not mean that for rtas_ld/_st but the part when vof accessing the
> parameters passed by its hypercall which is memory access:
>
> https://github.com/patchew-project/qemu/blob/patchew/20210520090557.435689-1-aik%40ozlabs.ru/hw/ppc/vof.c
>
> line 893, and vof_client_call before that is very similar to what h_rtas
> does here:
>
> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc/spapr_hcall.c;h=f25014afda408002ee1ec1027a0dd7a6025eca61;hb=HEAD#l639
>
> and I also need to do the same for rtas in pegasos2 for which I'm just
> using ldl_be_phys for now but I wonder if we really need 3 ways to do the
> same or the rtas_ld/_st could be made more generic and reused here?

For your rtas implementation you could definitely re-use them. For
the client call I'm a bit less confident, but if the in-guest-memory
structures are really the same, then it would make sense.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
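
The question of whether the RTAS and client-call arguments share an in-guest-memory layout can be sketched in standalone C. This sketch assumes both interfaces pass an array of 32-bit big-endian cells (a leading cell holding the RTAS token or the client service, then nargs, nret, then the argument cells followed by the return cells), as the RTAS and Open Firmware client interfaces are usually described for 32-bit guests. Whether VOF's structures really match is exactly the open point above. Guest memory is modelled as a plain byte array, and cell_ld(), cell_st() and the call_* helpers are hypothetical stand-ins, not the QEMU rtas_ld()/rtas_st() functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy guest memory: a flat byte array indexed by "physical" offset. */
struct guest_mem {
    uint8_t *bytes;
    size_t   size;
};

/* Load big-endian 32-bit cell number n of the array starting at base. */
static inline uint32_t cell_ld(const struct guest_mem *m, uint32_t base,
                               uint32_t n)
{
    size_t off = (size_t)base + 4u * (size_t)n;
    assert(off + 4 <= m->size);
    const uint8_t *p = m->bytes + off;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store val as big-endian 32-bit cell number n of the array at base. */
static inline void cell_st(struct guest_mem *m, uint32_t base, uint32_t n,
                           uint32_t val)
{
    size_t off = (size_t)base + 4u * (size_t)n;
    assert(off + 4 <= m->size);
    uint8_t *p = m->bytes + off;
    p[0] = (uint8_t)(val >> 24);
    p[1] = (uint8_t)(val >> 16);
    p[2] = (uint8_t)(val >> 8);
    p[3] = (uint8_t)val;
}

/* Assumed common header: cell 0 = token or service, then nargs, nret. */
struct call_hdr {
    uint32_t first;
    uint32_t nargs;
    uint32_t nret;
};

static inline struct call_hdr call_hdr_read(const struct guest_mem *m,
                                            uint32_t base)
{
    struct call_hdr h = {
        .first = cell_ld(m, base, 0),
        .nargs = cell_ld(m, base, 1),
        .nret  = cell_ld(m, base, 2),
    };
    return h;
}

/* Argument cells follow the three header cells; return cells follow them. */
static inline uint32_t call_args_base(uint32_t base)
{
    return base + 12;
}

static inline uint32_t call_rets_base(uint32_t base, uint32_t nargs)
{
    return base + 12 + 4 * nargs;
}
```

If the two layouts do match, a single pair of accessors like these could serve an RTAS handler and a client-call handler alike, with only the interpretation of the first cell differing.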