On Thu, 14 Sep 2017 13:48:26 +0530 Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
> On Thu, Sep 14, 2017 at 10:00:11AM +0200, Igor Mammedov wrote: > > On Thu, 14 Sep 2017 12:31:18 +0530 > > Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote: > > > > > Hi, > > > > > > QEMU hits the below assert > > > > > > qemu-system-ppc64: used ring relocated for ring 2 > > > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion `r > > > >= 0' failed. > > > > > > in the following scenario: > > > > > > 1. Boot guest with vhost=on > > > -netdev tap,id=mynet0,script=qemu-ifup,downscript=qemu-ifdown,vhost=on > > > -device virtio-net-pci,netdev=mynet0 > > > 2. Hot add a DIMM device > > > 3. Reboot > > > When the guest reboots, we can see > > > vhost_virtqueue_start:vq->used_phys getting assigned an address that > > > falls in the hotplugged memory range. > > > 4. Remove the DIMM device > > > Guest refuses the removal as the hotplugged memory is under use. > > > 5. Reboot > > > > > QEMU forces the removal of the DIMM device during reset and that's > > > when we hit the above assert. > > I don't recall implementing forced removal om DIMM, > > could you point out to the related code, pls? > > This is ppc specific. We have DR Connector objects for each LMB (multiple > LMBs make up one DIMM device) and during reset we invoke the > release routine for these LMBs which will further invoke > pc_dimm_memory_unplug(). > > See hw/ppc/spapr_drc.c: spapr_drc_reset() > hw/ppc/spapr.c: spapr_lmb_release() > > > > > > Any pointers on why we are hitting this assert ? Shouldn't vhost be > > > done with using the hotplugged memory when we hit reset ? > > > > >From another point of view, > > DIMM shouldn't be removed unless guest explicitly ejects it > > (at least that should be so in x86 case). > > While that is true for ppc also, shouldn't we start fresh from reset ? we should. when it aborts vhost should print out error from vhost_verify_ring_mappings() if (r == -ENOMEM) { error_report("Unable to map %s for ring %d", part_name[j], i); } else if (r == -EBUSY) { error_report("%s relocated for ring %d", part_name[j], i); that might give a clue where that memory stuck in. Michael might point out where to start look at, but he's on vacation so ... > Related comment from hw/ppc/spapr_drc.c: spapr_drc_reset() > > /* immediately upon reset we can safely assume DRCs whose devices > * are pending removal can be safely removed. > */ > if (drc->unplug_requested) { > spapr_drc_release(drc); > } > > So essentially in the scenario I listed, the unplug request is rejected > by the guest, but during next reboot, QEMU excersies the above code > and removes any devices (memory, CPU etc) that are marked for removal. I'd remove pending removal on reset for DIMM/CPU if it's acceptable from PPC HW pov. > > Regards, > Bharata. >