9pfs: yield when there isn't enough room on the ring

Stefano Stabellini Wed, 20 May 2020 11:49:10 -0700

On Wed, 20 May 2020, Stefano Stabellini wrote:
> On Wed, 20 May 2020, Christian Schoenebeck wrote:
> > On Mittwoch, 20. Mai 2020 03:47:12 CEST Stefano Stabellini wrote:
> > > From: Stefano Stabellini <stefano.stabell...@xilinx.com>
> > > 
> > > Instead of truncating replies, which is problematic, wait until the
> > > client reads more data and frees bytes on the reply ring.
> > > 
> > > Do that by calling qemu_coroutine_yield(). The corresponding
> > > qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon
> > > receiving the next notification from the client.
> > > 
> > > We need to be careful to avoid races in case xen_9pfs_bh and the
> > > coroutine are both active at the same time. In xen_9pfs_bh, wait until
> > > either the critical section is over (ring->co == NULL) or until the
> > > coroutine becomes inactive (qemu_coroutine_yield() was called) before
> > > continuing. Then, simply wake up the coroutine if it is inactive.
> > > 
> > > Signed-off-by: Stefano Stabellini <stefano.stabell...@xilinx.com>
> > > ---
> > 
> > In general this patch makes sense to me, and much better and cleaner 
> > solution 
> > than what we discussed before. Just one detail ...
> > 
> > >  hw/9pfs/xen-9p-backend.c | 28 ++++++++++++++++++++++------
> > >  1 file changed, 22 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
> > > index fc197f6c8a..3939539028 100644
> > > --- a/hw/9pfs/xen-9p-backend.c
> > > +++ b/hw/9pfs/xen-9p-backend.c
> > > @@ -37,6 +37,7 @@ typedef struct Xen9pfsRing {
> > > 
> > >      struct iovec *sg;
> > >      QEMUBH *bh;
> > > +    Coroutine *co;
> > > 
> > >      /* local copies, so that we can read/write PDU data directly from
> > >       * the ring */
> > > @@ -198,16 +199,18 @@ static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU
> > > *pdu, g_free(ring->sg);
> > > 
> > >      ring->sg = g_new0(struct iovec, 2);
> > > -    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, size);
> > > +    ring->co = qemu_coroutine_self();
> > > +    smp_wmb();
> > > 
> > > +again:
> > > +    xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, size);
> > >      buf_size = iov_size(ring->sg, num);
> > >      if (buf_size  < size) {
> > > -        xen_pv_printf(&xen_9pfs->xendev, 0, "Xen 9pfs request type %d"
> > > -                "needs %zu bytes, buffer has %zu\n", pdu->id, size,
> > > -                buf_size);
> > > -        xen_be_set_state(&xen_9pfs->xendev, XenbusStateClosing);
> > > -        xen_9pfs_disconnect(&xen_9pfs->xendev);
> > > +        qemu_coroutine_yield();
> > > +        goto again;
> > >      }
> > > +    ring->co = NULL;
> > > +    smp_wmb();
> > > 
> > >      *piov = ring->sg;
> > >      *pniov = num;
> > > @@ -292,6 +295,19 @@ static int xen_9pfs_receive(Xen9pfsRing *ring)
> > >  static void xen_9pfs_bh(void *opaque)
> > >  {
> > >      Xen9pfsRing *ring = opaque;
> > > +    bool wait;
> > > +
> > > +again:
> > > +    wait = ring->co != NULL && qemu_coroutine_entered(ring->co);
> > > +    smp_rmb();
> > > +    if (wait) {
> > > +        cpu_relax();
> > > +        goto again;
> > > +    }
> > > +
> > > +    if (ring->co != NULL) {
> > > +        qemu_coroutine_enter_if_inactive(ring->co);
> > 
> > ... correct me if I am wrong, but AFAIK qemu_coroutine_enter_if_inactive() 
> > will simply run the coroutine directly on caller's thread, it will not 
> > dispatch the coroutine onto the thread which yielded the coroutine before.
> 
> Yes, that is correct. I thought it would be fine because the caller here
> is a bh function so it should have no problems entering the coroutine.
> 
> But I am not that much of an expert on coroutines... Do you think there
> could be issues?
> 
> 
> > > +    }
> > >      xen_9pfs_receive(ring);
> > >  }
> > 
> > AFAICS you have not addressed the problem msize >> xen ringbuffer size, in 
> > which case I would expect the Xen driver to loop forever. Am I missing 
> > something or have you postponed addressing this?
> 
> Yes, I postponed addressing that issue because of a couple of reasons.
> 
> For starter strictly speaking it should not be required: msize cannot be
> bigger than the ring, but it can be equal to the ring increasing the
> chances of having to wait in QEMU. It should still be correct but the
> performance might not be great.
> 
> The other reason is that I already have the patches for both QEMU and
> Linux, but I am seeing a strange error setting order = 10. Order = 9
> works fine. I would like to do a bit more investigation before sending
> those patches.


As it turns out, order 9 is the max that the protocol allows at the
moment, so I am going to add a patch increasing the max order supported
by QEMU to 9 in the next version of this series.

The msize limitation to 8 (order 9 - 1) is done at the Linux side
(https://marc.info/?l=linux-kernel&m=159000008631935&w=2).

Re: [PATCH 2/2] xen/9pfs: yield when there isn't enough room on the ring

Reply via email to