On Mon, 10 Dec 2018 17:20:43 +1100 David Gibson <da...@gibson.dropbear.id.au> wrote:
> On Mon, Nov 12, 2018 at 03:12:26PM +1100, Alexey Kardashevskiy wrote: > > > > > > On 12/11/2018 05:10, Greg Kurz wrote: > > > Hi Alexey, > > > > > > Just a few remarks. See below. > > > > > > On Thu, 8 Nov 2018 12:44:06 +1100 > > > Alexey Kardashevskiy <a...@ozlabs.ru> wrote: > > > > > >> SLOF receives a device tree and updates it with various properties > > >> before switching to the guest kernel and QEMU is not aware of any changes > > >> made by SLOF. Since there is no real RTAS (QEMU implements it), it makes > > >> sense to pass the SLOF final device tree to QEMU to let it implement > > >> RTAS related tasks better, such as PCI host bus adapter hotplug. > > >> > > >> Specifially, now QEMU can find out the actual XICS phandle (for PHB > > >> hotplug) and the RTAS linux,rtas-entry/base properties (for firmware > > >> assisted NMI - FWNMI). > > >> > > >> This stores the initial DT blob in the sPAPR machine and replaces it > > >> in the KVMPPC_H_UPDATE_DT (new private hypercall) handler. > > >> > > >> This adds an @update_dt_enabled machine property to allow backward > > >> migration. > > >> > > >> SLOF already has a hypercall since > > >> https://github.com/aik/SLOF/commit/e6fc84652c9c0073f9183 > > >> > > >> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > > >> --- > > >> include/hw/ppc/spapr.h | 7 ++++++- > > >> hw/ppc/spapr.c | 29 ++++++++++++++++++++++++++++- > > >> hw/ppc/spapr_hcall.c | 32 ++++++++++++++++++++++++++++++++ > > >> hw/ppc/trace-events | 2 ++ > > >> 4 files changed, 68 insertions(+), 2 deletions(-) > > >> > > >> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h > > >> index ad4d7cfd97..f5dcaf44cb 100644 > > >> --- a/include/hw/ppc/spapr.h > > >> +++ b/include/hw/ppc/spapr.h > > >> @@ -100,6 +100,7 @@ struct sPAPRMachineClass { > > >> > > >> /*< public >*/ > > >> bool dr_lmb_enabled; /* enable dynamic-reconfig/hotplug of > > >> LMBs */ > > >> + bool update_dt_enabled; /* enable KVMPPC_H_UPDATE_DT */ > > >> bool use_ohci_by_default; /* use USB-OHCI instead of XHCI */ > > >> bool pre_2_10_has_unused_icps; > > >> bool legacy_irq_allocation; > > >> @@ -136,6 +137,9 @@ struct sPAPRMachineState { > > >> int vrma_adjust; > > >> ssize_t rtas_size; > > >> void *rtas_blob; > > >> + uint32_t fdt_size; > > >> + uint32_t fdt_initial_size; > > > > > > I don't quite see the purpose of fdt_initial_size... it seems to be only > > > used to print a trace. > > > > > > Ah, lost in rebase. The purpose was to test if the new device tree has > > not grown too much. > > > > > > > > > > > >> + void *fdt_blob; > > >> long kernel_size; > > >> bool kernel_le; > > >> uint32_t initrd_base; > > >> @@ -462,7 +466,8 @@ struct sPAPRMachineState { > > >> #define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1) > > >> /* Client Architecture support */ > > >> #define KVMPPC_H_CAS (KVMPPC_HCALL_BASE + 0x2) > > >> -#define KVMPPC_HCALL_MAX KVMPPC_H_CAS > > >> +#define KVMPPC_H_UPDATE_DT (KVMPPC_HCALL_BASE + 0x3) > > >> +#define KVMPPC_HCALL_MAX KVMPPC_H_UPDATE_DT > > >> > > >> typedef struct sPAPRDeviceTreeUpdateHeader { > > >> uint32_t version_id; > > >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > >> index c08130facb..5e2d4d211c 100644 > > >> --- a/hw/ppc/spapr.c > > >> +++ b/hw/ppc/spapr.c > > >> @@ -1633,7 +1633,10 @@ static void spapr_machine_reset(void) > > >> /* Load the fdt */ > > >> qemu_fdt_dumpdtb(fdt, fdt_totalsize(fdt)); > > >> cpu_physical_memory_write(fdt_addr, fdt, fdt_totalsize(fdt)); > > >> - g_free(fdt); > > >> + g_free(spapr->fdt_blob); > > >> + spapr->fdt_size = fdt_totalsize(fdt); > > >> + spapr->fdt_initial_size = spapr->fdt_size; > > >> + spapr->fdt_blob = fdt; > > > > > > Hmm... It looks weird to store state in a reset handler. I'd rather zeroe > > > both fdt_blob and fdt_size here. > > > > The device tree is built from the reset handler and the idea is that we > > want to always have some tree in the machine. > > Yes, I think the approach here is fine. Otherwise when we want to > look up the current fdt state in RTAS calls or whatever we'd always > have to do > if (fdt_blob) > look up that > else > look up qemu created fdt. > No. We only have one fdt blob: the initial one, I'd rather call reset time one, or the updated one. > Incidentally 'fdt' and 'fdt_blob' names do a terrible job of > distinguishing what the difference is. Renaming fdt to fdt_initial > (to match fdt_initial_size) and fdt_blob to fdt should make that > clearer. > As mentioned earlier in this thread, spapr->fdt_initial_size is only used for tracing if the received fdt blob fails fdt_check_full()... $ git grep -H fdt_initial_size hw/ppc/spapr.c: spapr->fdt_initial_size = spapr->fdt_size; hw/ppc/spapr.c: VMSTATE_UINT32(fdt_initial_size, sPAPRMachineState), hw/ppc/spapr_hcall.c: trace_spapr_update_dt_failed(spapr->fdt_initial_size, cb, include/hw/ppc/spapr.h: uint32_t fdt_initial_size; Not sure it is helpful, and anyway, it is expected to be the same in source and destination, so why put it in the migration stream ? The only case where we want to migrate something is when h_update_dt() has succeeded, ie, the guest passed a valid DT blob. This implies that its size isn't 0, otherwise fdt_check_full() would return -FDT_ERR_TRUNCATED. I would suggest rather to: - completely drop spapr->fdt_initial_size - clear spapr->fdt_size at machine reset - migrate if spapr->fdt_size is not zero Also, I've just realized another problem... nothing prevents a malicious guest to pass an insanely great size to h_update_dt, which would cause g_malloc0() to abort... The passed size should be checked against FDT_MAX_SIZE. Cheers, -- Greg
pgphVMVcWhTV0.pgp
Description: OpenPGP digital signature