On 01/04/14 17:48, Markus Armbruster wrote:
Heinz Graalfs <graa...@linux.vnet.ibm.com> writes:

Hi Kevin,

doing a

      virsh detach-device ...

ends up in the following QEMU monitor commands:

1. device_del ...
2. drive_del ...
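
For illustration, the two monitor commands correspond roughly to the following QMP messages (the device and drive IDs here are hypothetical; real ones come from the domain XML / -device and -drive options, and drive_del is an HMP command, so it goes through human-monitor-command):

```python
import json

# Hypothetical IDs for a virtio-blk device and its backing drive.
device_del = {"execute": "device_del",
              "arguments": {"id": "virtio-disk0"}}
drive_del = {"execute": "human-monitor-command",
             "arguments": {"command-line": "drive_del drive-virtio-disk0"}}

# This is the order libvirt issues them in: device_del first, then drive_del.
for cmd in (device_del, drive_del):
    print(json.dumps(cmd))
```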

qmp_device_del() performs the device unplug path.
In case of a block device do_drive_del() tries to
prevent further IO against the host device.

However, the bdrv_find() during drive_del() results in
an error, because the device is already gone. Because of
this error, the bdrv_xxx calls to quiesce the block
driver, as well as all other processing, are skipped.

Is the sequence that libvirt triggers OK?
Shouldn't drive_del be executed first?

No.

OK, I see. The drive is deleted implicitly (release_drive()).
Undoing a device_del() therefore requires another drive_add() AND device_add().
(Doing just a device_add() complains about the missing drive,
and a subsequent "info qtree" makes QEMU abort.)


drive_del is nasty.  Its purpose is to revoke access to an image even
when the guest refuses to cooperate.  To the guest, this looks like
hardware failure.

Deleting a device during active IO is nasty and it should look like a
hardware failure. I would expect lots of errors.


If you drive_del before device_del, even a perfectly well-behaved
guest is exposed to a terminally broken device between drive_del and
completion of unplug.

The early drive_del() would mean that no further IO would be
possible.


Always try a device_del first, and only if that does not complete within
reasonable time, and you absolutely must revoke access to the image,
then whack it over the head with drive_del.
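
The recommended procedure above can be sketched as follows. This is only an illustration, not libvirt's actual implementation: the monitor object, its command() and poll_event() methods, and the 30-second default are all assumptions; the DEVICE_DELETED event, however, is what QEMU emits once the guest completes the unplug.

```python
import time

def safe_detach(monitor, dev_id, drive_id, timeout=30.0):
    """Polite unplug first; forcible drive_del only as a last resort.

    `monitor` is a hypothetical QMP client exposing command() and
    poll_event(); `timeout` is the "reasonable time" to wait for the
    guest to complete the unplug.
    """
    monitor.command("device_del", id=dev_id)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        ev = monitor.poll_event()
        if ev and ev.get("event") == "DEVICE_DELETED" \
              and ev.get("data", {}).get("device") == dev_id:
            return "unplugged"      # guest cooperated; drive released too
        time.sleep(0.05)
    # Guest did not release the device in time: revoke access to the
    # image anyway.  To the guest this looks like hardware failure.
    monitor.command("human-monitor-command",
                    command_line="drive_del %s" % drive_id)
    return "revoked"
```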

What is this reasonable time?

On s390 we experience problems (QEMU abort) when asynchronous block IO
completes and the virtqueues are already gone. I suppose the
bdrv_drain_all() in bdrv_close() is a little late. I don't see such
problems with an early bdrv_drain_all() (drive_del) and an unplug
(device_del) afterwards.
