On 12/20/2012 05:41 PM, Alexander Graf wrote: > > > On 20.12.2012, at 09:46, Asias He <as...@redhat.com> wrote: > >> On 12/20/2012 12:15 PM, Rusty Russell wrote: >>> Alexander Graf <ag...@suse.de> writes: >>> >>>> When a file system is mounted on a virtio-blk disk, we then remove it >>>> and then reattach it, the reattached disk gets the same disk name and >>>> ids as the hot removed one. >>>> >>>> This leads to very nasty effects - mostly rendering the newly attached >>>> device completely unusable. >>>> >>>> Trying what happens when I do the same thing with a USB device, I saw >>>> that the sd node simply doesn't get free'd when a device gets forcefully >>>> removed. >>>> >>>> Imitate the same behavior for vd devices. This way broken vd devices >>>> simply are never free'd and newly attached ones keep working just fine. >>>> >>>> Signed-off-by: Alexander Graf <ag...@suse.de> >>> >>> I think deserves a CC:stable, no? > > I wanted to make sure people can comment on it first. But yes, it should go > into stable when accepted. > >>> >>> I've put it in my pending queue for *next* merge window for now... >> >> Thanks for looking into this, Alexander! >> >> I also noticed this problem. The problem is that, if you hot-unplug a >> mounted or opened disk, the disk is in opened state. Next time, when you >> hotplug the same disk. The kernel thought it was opened already. The >> driver will use the wrong gendisk data structure in bdev. > > Exactly :) > >> >> blkdev_open >> blkdev_get >> __blkdev_get >> if (!bdev->bd_openers) { <-- Here, bd_disk not got updated >> still points to old one >> bdev->bd_disk = disk; >> bdev->bd_queue = disk->queue; >> ... >> >> I tried something like this: >> >> @@ -854,6 +862,19 @@ static int __devinit virtblk_probe(struct >> virtio_device *vdev) >> blk_queue_io_opt(q, blk_size * opt_io_size); >> >> add_disk(vblk->disk); >> + >> + for (i = 0; i < 1 << PART_BITS; i++) { >> + bdev = bdget_disk(vblk->disk, i); >> + if (bdev) { >> + bdev->bd_disk = vblk->disk; >> + bdev->bd_queue = q; >> + bdput(bdev); > > Does this stack? What if the device was mounted twice? Or mounted and dd'ed > at the same time? Then your refcount here is wrong.
This is in the probe path. And there is no extra refcount here. bdev->bd_openers stays what it was before the remove. > Also this means that you're free'ing a bdev even though there's still a > reference held to it. That means whoever holds it could later use that > pointer and access bogus memory. The bdev is not freed. > Alex > >> + } >> + } >> >> >> 1) Before: >> ---> hot-plug >> [ 35.730183] virtio_blk: virtblk_probe: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, q=ffff88007f88b3d8 >> [ 35.735352] virtio_blk: virtblk_ioctl: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8 >> ---> hot-unplug >> >> ---> hot-plug >> [ 83.570480] virtio_blk: virtblk_probe: vblk=ffff880078b0e000, >> disk=ffff880078d55800, q=ffff88007f88bb40 >> [ 83.575614] virtio_blk: virtblk_ioctl: vblk=ffff880078b0e000, >> disk=ffff880078d54c00, bdev=ffff88007c45cc00, q=ffff88007f88b3d8 >> >> The disk points to old one ffff880078d54c00. The queue also points to >> old one ffff88007f88b3d8. >> >> 2) After: >> ---> hot-plug >> [ 68.035063] virtio_blk: virtblk_probe: vblk=ffff880079b20000, >> disk=ffff88007f9ebc00, q=ffff8800784d8ed0 >> [ 68.041140] virtio_blk: virtblk_ioctl: vblk=ffff880079b20000, >> disk=ffff88007f9ebc00, bdev=ffff88007ab2c000, q=ffff8800784d8ed0 >> ---> hot-unplug >> >> ---> hot-plug >> [ 86.317706] virtio_blk: virtblk_probe: vblk=ffff880079b20000, >> disk=ffff88007f9eb000, q=ffff8800784d9638 >> [ 86.322535] virtio_blk: virtblk_ioctl: vblk=ffff880079b20000, >> disk=ffff88007f9eb000, bdev=ffff88007ab2c000, q=ffff8800784d9638 >> >> The disk and queue are updated correctly. The attached disk works and >> still uses the old name. >> >> -- >> Asias -- Asias _______________________________________________ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization