On Wed, Jan 22, 2025 at 06:33:04PM +0100, Christian Borntraeger wrote:
> Am 22.01.25 um 15:44 schrieb Boyer, Andrew:
> [...]
>
> > > > > --- a/drivers/block/virtio_blk.c
> > > > > +++ b/drivers/block/virtio_blk.c
> > > > > @@ -379,14 +379,10 @@ static void virtio_commit_rqs(struct
> > > > > blk_mq_hw_ctx *hctx)
> > > > > {
> > > > > struct virtio_blk *vblk = hctx->queue->queuedata;
> > > > > struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
> > > > > - bool kick;
> > > > > spin_lock_irq(&vq->lock);
> > > > > - kick = virtqueue_kick_prepare(vq->vq);
> > > > > + virtqueue_kick(vq->vq);
> > > > > spin_unlock_irq(&vq->lock);
> > > > > -
> > > > > - if (kick)
> > > > > - virtqueue_notify(vq->vq);
> > > > > }
> > > >
> > > > I would assume this will be a performance nightmare for normal IO.
> > >
> >
> > Hello Michael and Christian and Jason,
> > Thank you for taking a look.
> >
> > Is the performance concern that the vmexit might lead to the underlying
> > virtual storage stack doing the work immediately? Any other job posting to
> > the same queue would presumably be blocked on a vmexit when it goes to
> > attempt its own notification. That would be almost the same as having the
> > other job block on a lock during the operation, although I guess if you are
> > skipping notifications somehow it would look different.
>
> The performance concern is that you hold a lock and then exit. Exits are
> expensive and can schedule, so you will increase the lock holding time
> significantly. This is begging for lock holder preemption.
> Really, don't do it.
The issue is with hardware that wants a copy of an index sent to
it in a notification. Now, you have a race:
thread 1:
    index = 1
    -> send 1 to hardware
thread 2:
    index = 2
    -> send 2 to hardware
The spec unfortunately does not say whether that is legal.
As far as I could tell, the device can easily use the
wrap counter inside the notification to detect this
and simply discard the "1" notification.
If not, I'd like to understand why.
Thanks!
--
MST