On Wed, Jan 22, 2025 at 06:33:04PM +0100, Christian Borntraeger wrote:
> Am 22.01.25 um 15:44 schrieb Boyer, Andrew:
> [...]
> 
> > > > > --- a/drivers/block/virtio_blk.c
> > > > > +++ b/drivers/block/virtio_blk.c
> > > > > @@ -379,14 +379,10 @@ static void virtio_commit_rqs(struct 
> > > > > blk_mq_hw_ctx *hctx)
> > > > >  {
> > > > >    struct virtio_blk *vblk = hctx->queue->queuedata;
> > > > >    struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
> > > > > -   bool kick;
> > > > >    spin_lock_irq(&vq->lock);
> > > > > -   kick = virtqueue_kick_prepare(vq->vq);
> > > > > +   virtqueue_kick(vq->vq);
> > > > >    spin_unlock_irq(&vq->lock);
> > > > > -
> > > > > -   if (kick)
> > > > > -           virtqueue_notify(vq->vq);
> > > > >  }
> > > > 
> > > > I would assume this will be a performance nightmare for normal IO.
> > > 
> > 
> > Hello Michael and Christian and Jason,
> > Thank you for taking a look.
> > 
> > Is the performance concern that the vmexit might lead to the underlying 
> > virtual storage stack doing the work immediately? Any other job posting to 
> > the same queue would presumably be blocked on a vmexit when it goes to 
> > attempt its own notification. That would be almost the same as having the 
> > other job block on a lock during the operation, although I guess if you are 
> > skipping notifications somehow it would look different.
> 
> The performance concern is that you hold a lock and then exit. Exits are 
> expensive and can schedule so you will increase the lock holding time 
> significantly. This is begging for lock holder preemption.
> Really, dont do it.


The issue is with hardware that wants a copy of an index sent to
it in a notification. Now, you have a race:


thread 1:

        index = 1
                ->                      -> send 1 to hardware
           

thread 2:

        index = 2
                -> send 2 to hardware




the spec unfortunately does not say whether that is legal.

As far as I could tell, the device can easily use the
wrap counter inside the notification to detect this
and simply discard the "1" notification.


If not, I'd like to understand why.

Thanks!


-- 
MST


Reply via email to