* Michael S. Tsirkin <m...@redhat.com> [2010-11-02 10:56]:
> On Tue, Nov 02, 2010 at 09:22:01AM -0500, Ryan Harper wrote:
> > * Michael S. Tsirkin <m...@redhat.com> [2010-11-02 08:59]:
> > > On Tue, Nov 02, 2010 at 08:46:22AM -0500, Ryan Harper wrote:
> > > > * Markus Armbruster <arm...@redhat.com> [2010-11-02 04:40]:
> > > > > >> >> I'd like to have some consistency among net, block and char device
> > > > > >> >> commands, i.e. a common set of operations that work the same for all of
> > > > > >> >> them. Can we agree on such a set?
> > > > > >> >
> > > > > >> > Yeah; the current trouble (or at least what I perceive to be trouble) is
> > > > > >> > that in the case where the guest responds to device_del induced ACPI
> > > > > >> > removal event; the current qdev code already does the host-side device
> > > > > >> > tear down. Not sure if it is OK to do a blockdev_del() immediately
> > > > > >> > after the device_del. What happens when we do:
> > > > > >> >
> > > > > >> > device_del
> > > > > >> > ACPI to guest
> > > > > >> > blockdev_del /* removes host-side device */
> > > > > >>
> > > > > >> Fails in my tree, because the blockdev's still in use. See below.
> > > > > >>
> > > > > >> > guest responds to ACPI
> > > > > >> > qdev calls pci device removal code
> > > > > >> > qemu attempts to destroy the associated host-side block
> > > > > >> >
> > > > > >> > That may just work today; and if not, it shouldn't be hard to fix up the
> > > > > >> > code to check for NULLs
> > > > > >>
> > > > > >> I hate the automatic deletion of host part along with the guest part.
> > > > > >> device_del should undo device_add. {block,net,char}dev_{add,del} should
> > > > > >> be similarly paired.
> > > > > >
> > > > > > Agreed.
> > > > > >>
> > > > > >> In my blockdev branch, I keep the automatic delete only for backwards
> > > > > >> compatibility: if you create the drive with drive_add, it gets
> > > > > >> auto-deleted, but if you use blockdev_add, it stays around.
> > > > > >
> > > > > > But what to do about the case where we're doing drive_add and then a
> > > > > > device_del()? That's the urgent situation that needs to be resolved.
> > > > >
> > > > > What's the exact problem we need to solve urgently?
> > > > >
> > > > > Is it "provide means to cut the connection to the host part immediately,
> > > > > even with an uncooperative guest"?
> > > >
> > > > Yes, need to ensure that if the mgmt layer (libvirt) has done what it
> > > > believes should have disassociated the host block device from the guest,
> > > > we want to ensure that the host block device is no longer accessible
> > > > from the guest.
> > > >
> > > > >
> > > > > Does this need to be separate from device_del?
> > > >
> > > > no, it doesn't have to be. Honestly, I didn't see a clear way to do
> > > > something like unplug early in the device_del because that's all pci
> > > > device code which has no knowledge of host block devices; having it
> > > > disconnect seemed like a layering violation.
> > >
> > > We invoke the cleanup callback, isn't that enough?
> >
> > Won't that look a bit strange? on device_del, call the cleanup callback
> > first, then notify the guest, if the guest responds, I suppose as long
> > as the cleanup callback can handle being called a second time that'd
> > work.
>
> Well this is exactly what happens with surprise removal.
> If you yank a card out the slot, guest only gets notification
> afterwards.

Right, though the card ripper can (in some systems) press the removal
button, which would send a notification. I think I'm fine with not
bothering to notify; this was mgmt interface driven anyhow, so whoever is
doing it should have already ensured they weren't using the device.

>
> > I like the idea of disconnect; if part of the device_del method was to
> > invoke a disconnect method, we could implement that for block, net, etc;
> >
> > I'd think we'd want to send the notification, then disconnect.
> > Struggling with whether it's worth having some reasonable timeout
> > between notification and disconnect.
>
> The problem with this is that it has no analog in real world.
> In real world, you can send some notifications to the guest, and you can
> remove the card. Tying them together is what created the problem in the
> first place.
>
> Timeouts can be implemented by management, maybe with a nice dialog
> being shown to the user.

Very true. I'm fine with forcing a disconnect during the removal path
prior to notification. Do we want a new disconnect method at the device
level (PCI), or should we just use the existing removal callback and call
it during the initial hot-remove event?

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com
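
To make the ordering discussed above concrete, here is a rough sketch of
"disconnect first, notify second, and let the cleanup callback tolerate a
second call". It is only an illustration of the idea, not QEMU code: the
classes and functions (BlockBackend, PciBlockDevice, device_del,
guest_ack_eject) are hypothetical names made up for this example and do
not correspond to the real qdev or block layer interfaces.

```python
class BlockBackend:
    """Hypothetical stand-in for a host-side block device."""

    def __init__(self, name):
        self.name = name
        self.is_open = True

    def close(self):
        self.is_open = False


class PciBlockDevice:
    """Hypothetical guest-visible device that references a host backend."""

    def __init__(self, dev_id, backend):
        self.dev_id = dev_id
        self.backend = backend
        self.events = []  # notifications sent to the guest, in order

    def cleanup(self):
        """Drop the host-side backend; safe to call more than once."""
        if self.backend is None:
            return False  # already disconnected, nothing left to do
        self.backend.close()
        self.backend = None
        return True

    def guest_io(self):
        """Simulate the guest touching the device."""
        if self.backend is None:
            raise OSError("no host backend attached to " + self.dev_id)
        return "ok"


def device_del(dev):
    """Assumed ordering: cut host access first, then tell the guest."""
    dev.cleanup()
    dev.events.append("eject-request")


def guest_ack_eject(dev):
    """A cooperative guest answers later; the normal removal path runs
    the same cleanup callback a second time."""
    return dev.cleanup()


if __name__ == "__main__":
    backend = BlockBackend("drive0")
    dev = PciBlockDevice("virtio-disk0", backend)
    device_del(dev)
    print(backend.is_open)       # host side is already closed
    print(guest_ack_eject(dev))  # second cleanup is a harmless no-op
```

With this ordering an uncooperative guest never regains access to the
host block device, and a cooperative guest's later response only hits the
"already disconnected" branch, which is the "check for NULLs" behaviour
mentioned earlier in the thread.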