On Wed, 20 May 2020, Stefano Stabellini wrote: > On Wed, 20 May 2020, Christian Schoenebeck wrote: > > On Mittwoch, 20. Mai 2020 03:47:12 CEST Stefano Stabellini wrote: > > > From: Stefano Stabellini <stefano.stabell...@xilinx.com> > > > > > > Instead of truncating replies, which is problematic, wait until the > > > client reads more data and frees bytes on the reply ring. > > > > > > Do that by calling qemu_coroutine_yield(). The corresponding > > > qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon > > > receiving the next notification from the client. > > > > > > We need to be careful to avoid races in case xen_9pfs_bh and the > > > coroutine are both active at the same time. In xen_9pfs_bh, wait until > > > either the critical section is over (ring->co == NULL) or until the > > > coroutine becomes inactive (qemu_coroutine_yield() was called) before > > > continuing. Then, simply wake up the coroutine if it is inactive. > > > > > > Signed-off-by: Stefano Stabellini <stefano.stabell...@xilinx.com> > > > --- > > > > In general this patch makes sense to me, and much better and cleaner > > solution > > than what we discussed before. Just one detail ... > > > > > hw/9pfs/xen-9p-backend.c | 28 ++++++++++++++++++++++------ > > > 1 file changed, 22 insertions(+), 6 deletions(-) > > > > > > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c > > > index fc197f6c8a..3939539028 100644 > > > --- a/hw/9pfs/xen-9p-backend.c > > > +++ b/hw/9pfs/xen-9p-backend.c > > > @@ -37,6 +37,7 @@ typedef struct Xen9pfsRing { > > > > > > struct iovec *sg; > > > QEMUBH *bh; > > > + Coroutine *co; > > > > > > /* local copies, so that we can read/write PDU data directly from > > > * the ring */ > > > @@ -198,16 +199,18 @@ static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU > > > *pdu, g_free(ring->sg); > > > > > > ring->sg = g_new0(struct iovec, 2); > > > - xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, size); > > > + ring->co = qemu_coroutine_self(); > > > + smp_wmb(); > > > > > > +again: > > > + xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, size); > > > buf_size = iov_size(ring->sg, num); > > > if (buf_size < size) { > > > - xen_pv_printf(&xen_9pfs->xendev, 0, "Xen 9pfs request type %d" > > > - "needs %zu bytes, buffer has %zu\n", pdu->id, size, > > > - buf_size); > > > - xen_be_set_state(&xen_9pfs->xendev, XenbusStateClosing); > > > - xen_9pfs_disconnect(&xen_9pfs->xendev); > > > + qemu_coroutine_yield(); > > > + goto again; > > > } > > > + ring->co = NULL; > > > + smp_wmb(); > > > > > > *piov = ring->sg; > > > *pniov = num; > > > @@ -292,6 +295,19 @@ static int xen_9pfs_receive(Xen9pfsRing *ring) > > > static void xen_9pfs_bh(void *opaque) > > > { > > > Xen9pfsRing *ring = opaque; > > > + bool wait; > > > + > > > +again: > > > + wait = ring->co != NULL && qemu_coroutine_entered(ring->co); > > > + smp_rmb(); > > > + if (wait) { > > > + cpu_relax(); > > > + goto again; > > > + } > > > + > > > + if (ring->co != NULL) { > > > + qemu_coroutine_enter_if_inactive(ring->co); > > > > ... correct me if I am wrong, but AFAIK qemu_coroutine_enter_if_inactive() > > will simply run the coroutine directly on caller's thread, it will not > > dispatch the coroutine onto the thread which yielded the coroutine before. > > Yes, that is correct. I thought it would be fine because the caller here > is a bh function so it should have no problems entering the coroutine. > > But I am not that much of an expert on coroutines... Do you think there > could be issues? > > > > > + } > > > xen_9pfs_receive(ring); > > > } > > > > AFAICS you have not addressed the problem msize >> xen ringbuffer size, in > > which case I would expect the Xen driver to loop forever. Am I missing > > something or have you postponed addressing this? > > Yes, I postponed addressing that issue because of a couple of reasons. > > For starter strictly speaking it should not be required: msize cannot be > bigger than the ring, but it can be equal to the ring increasing the > chances of having to wait in QEMU. It should still be correct but the > performance might not be great. > > The other reason is that I already have the patches for both QEMU and > Linux, but I am seeing a strange error setting order = 10. Order = 9 > works fine. I would like to do a bit more investigation before sending > those patches.
As it turns out, order 9 is the max that the protocol allows at the moment, so I am going to add a patch increasing the max order supported by QEMU to 9 in the next version of this series. The msize limitation to 8 (order 9 - 1) is done at the Linux side (https://marc.info/?l=linux-kernel&m=159000008631935&w=2).