Re: [Qemu-devel] [PATCH 0/3] v4 Decouple block device removal from device removal

Ryan Harper Tue, 02 Nov 2010 09:54:00 -0700

* Michael S. Tsirkin <m...@redhat.com> [2010-11-02 10:56]:
> On Tue, Nov 02, 2010 at 09:22:01AM -0500, Ryan Harper wrote:
> > * Michael S. Tsirkin <m...@redhat.com> [2010-11-02 08:59]:
> > > On Tue, Nov 02, 2010 at 08:46:22AM -0500, Ryan Harper wrote:
> > > > * Markus Armbruster <arm...@redhat.com> [2010-11-02 04:40]:
> > 
> > > > > >> >> I'd like to have some consistency among net, block and char 
> > > > > >> >> device
> > > > > >> >> commands, i.e. a common set of operations that work the same 
> > > > > >> >> for all of
> > > > > >> >> them.  Can we agree on such a set?
> > > > > >> >
> > > > > >> > Yeah; the current trouble (or at least what I perceive to be 
> > > > > >> > trouble) is
> > > > > >> > that in the case where the guest responds to device_del induced 
> > > > > >> > ACPI
> > > > > >> > removal event; the current qdev code already does the host-side 
> > > > > >> > device
> > > > > >> > tear down.  Not sure if it is OK to do a blockdev_del() 
> > > > > >> > immediately
> > > > > >> > after the device_del.  What happens when we do:
> > > > > >> >
> > > > > >> > device_del
> > > > > >> > ACPI to guest
> > > > > >> > blockdev_del /* removes host-side device */
> > > > > >> 
> > > > > >> Fails in my tree, because the blockdev's still in use.  See below.
> > > > > >> 
> > > > > >> > guest responds to ACPI
> > > > > >> > qdev calls pci device removal code
> > > > > >> > qemu attempts to destroy the associated host-side block
> > > > > >> >
> > > > > >> > That may just work today; and if not, it shouldn't be hard to 
> > > > > >> > fix up the
> > > > > >> > code to check for NULLs
> > > > > >> 
> > > > > >> I hate the automatic deletion of host part along with the guest 
> > > > > >> part.
> > > > > >> device_del should undo device_add.  {block,net,char}dev_{add,del} 
> > > > > >> should
> > > > > >> be similarly paired.
> > > > > >
> > > > > > Agreed.
> > > > > >> 
> > > > > >> In my blockdev branch, I keep the automatic delete only for 
> > > > > >> backwards
> > > > > >> compatibility: if you create the drive with drive_add, it gets
> > > > > >> auto-deleted, but if you use blockdev_add, it stays around.
> > > > > >
> > > > > > But what to do about the case where we're doing drive_add and then a
> > > > > > device_del()  That's the urgent situation that needs to be resolved.
> > > > > 
> > > > > What's the exact problem we need to solve urgently?
> > > > > 
> > > > > Is it "provide means to cut the connection to the host part 
> > > > > immediately,
> > > > > even with an uncooperative guest"?
> > > > 
> > > > Yes, need to ensure that if the mgmt layer (libvirt) has done what it
> > > > believes should have disassociated the host block device from the guest,
> > > > we want to ensure that the host block device is no longer accessible
> > > > from the guest.
> > > > 
> > > > > 
> > > > > Does this need to be separate from device_del?
> > > > 
> > > > no, it doesn't have to be.  Honestly, I didn't see a clear way to do
> > > > something like unplug early in the device_del because that's all pci
> > > > device code which has no knowledge of host block devices; having it
> > > > disconnect seemed like a layering violation.
> > > 
> > > We invoke the cleanup callback, isn't that enough?
> > 
> > Won't that look a bit strange?  on device_del, call the cleanup callback
> > first;, then notify the guest, if the guest responds, I suppose as long
> > as the cleanup callback can handle being called a second time that'd
> > work.
> 
> Well this is exactly what happens with surpise removal.
> If you yank a card out the slot, guest only gets notification
> afterwards.


Right, though the card ripper can (in some systems) press the removal
button which would send notification.  I think I'm fine with not
bothering to notify; this was mgmt interface driven anyhow so who ever
is doing it should have already ensured they weren't using the device.

> 
> > I like the idea of disconnect; if part of the device_del method was to
> > invoke a disconnect method, we could implement that for block, net, etc;
> > 
> > I'd think we'd want to send the notification, then disconnect.
> > Struggling with whether it's worth having some reasonable timeout
> > between notification and disconnect.  
> 
> The problem with this is that it has no analog in real world.
> In real world, you can send some notifications to the guest, and you can
> remove the card.  Tying them together is what created the problem in the
> first place.
> 
> Timeouts can be implemented by management, maybe with a nice dialog
> being shown to the user.

Very true.  I'm fine with forcing a disconnect during the removal path
prior to notification.  Do we want a new disconnect method at the device
level (pci)? or just use the existing removal callback and call that
during the initial hotremov event?


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com

Re: [Qemu-devel] [PATCH 0/3] v4 Decouple block device removal from device removal

Reply via email to