On 05/19/2013 05:51 PM, Michal Privoznik wrote: > On 18.05.2013 05:31, Li Wei wrote: >> On 05/17, Michal Privoznik wrote: >>> On 17.05.2013 11:37, Li Wei wrote: >>>> On 05/17/2013 04:52 PM, Michal Privoznik wrote: >>>>> On 17.05.2013 10:16, Li Wei wrote: >>>>>> On 05/17/2013 04:07 PM, Michal Privoznik wrote: >>>>>>> On 17.05.2013 07:47, Li Wei wrote: >>>>>>>> Hi Michal, Daniel >>>>>>>> >>>>>>>> It seems there is something wrong with the 2/2 part of this patchset. >>>>>>>> When I do an "change-media" command in virsh, it doesn't do any better >>>>>>>> than before, but even worse(I must to wait 5 secs to see the error). >>>>>>>> >>>>>>>> I'm not family with libvirt, just add some log things in the >>>>>>>> qemu_hotplug.c >>>>>>>> and found the tray_status never change to open, but with michal's >>>>>>>> original >>>>>>>> patch (which do active poll on tray_status), I can do "change-media" >>>>>>>> successfully >>>>>>>> every time. >>>>>>> >>>>>>> Yes & no. Thing is - originally we ignored tray status in the guest. So >>>>>>> in case guest has locked the tray and thus ignored our eject request, we >>>>>>> have changed the media anyway, resulting in possible I/O errors within >>>>>>> guest. >>>>>>> >>>>>>> With my patch, we send eject request to the qemu, and actively poll for >>>>>>> the tray to open. We check for up to 10 times every 200ms. So guest has >>>>>>> 2 seconds to open the tray. If the guest doesn't eject media and open >>>>>>> the tray we don't proceed with forcibly changing the media. >>>>>>> >>>>>>> These patches are actually a bug fix for >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=892289 >>>>>> >>>>>> Hmm... I think you misunderstood my meaning(sorry for my pool english), >>>>>> simply to say, I can get expected behavior by apply patch V1 but not V2. >>>>>> >>>>>> In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just >>>>>> wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but >>>>>> this never happened. >>>>>> >>>>>> Thanks, >>>>>> Wei >>>>> >>>>> Aaah, now I understand. So you are saying that the event is not being >>>>> delivered. What's the qemu version? >>>> >>>> >>>> I have the following version: >>>> >>>> virsh # version >>>> Compiled against library: libvirt 1.0.5 >>>> Using library: libvirt 1.0.5 >>>> Using API: QEMU 1.0.5 >>>> Running hypervisor: QEMU 1.4.1 >>>> >>>>> >>>>> If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them? >>>>> http://wiki.libvirt.org/page/DebugLogs >>>>> >>>>> Or maybe the timeout is just short for your guest to eject the tray. So >>>>> after a while, does 'virsh change-media' succeed? >>>> >>>> Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after >>>> the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for >>>> tray to open. Retries left 0". >>>> does that mean the timeout is too short? >>>> >>>> If the guest needs too much time to eject, I'm wondering why when I use >>>> the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there >>>> are other problems cause tray_status not updated in time? >>>> >>>> I appended the log file for your reference, if you need any other >>>> information, >>>> please tell me. >>>> >>>> Thanks, >>>> Wei >>> >>> I am pasting some interesting knobs from the log: >>> >>> 2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : >>> QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 >>> msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} >>> fd=-1 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: >>> obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : >>> QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 >>> buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"} >>> len=92 ret=92 errno=2 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : >>> OBJECT_UNREF: obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: >>> obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : >>> QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", >>> "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is >>> locked"}} >>> len=102 >>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : >>> OBJECT_UNREF: obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 >>> : internal error unable to execute QEMU command 'eject': Device >>> 'drive-ide0-1-0' is locked >>> 2013-05-17 09:04:46.235+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 9 >>> 2013-05-17 09:04:46.735+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 8 >>> 2013-05-17 09:04:47.235+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 7 >>> 2013-05-17 09:04:47.735+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 6 >>> 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: >>> obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:48.235+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 5 >>> 2013-05-17 09:04:48.735+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 4 >>> 2013-05-17 09:04:49.235+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 3 >>> 2013-05-17 09:04:49.736+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 2 >>> 2013-05-17 09:04:50.236+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 1 >>> 2013-05-17 09:04:50.736+0000: 27004: debug : >>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. >>> Retries left 0 >>> 2013-05-17 09:04:51.236+0000: 27004: error : >>> qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject >>> media before changing it >>> 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : >>> OBJECT_UNREF: obj=0x7f7ac4000bd0 >>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : >>> plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690 >>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : >>> plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0 >>> 2013-05-17 09:04:51.236+0000: 27004: debug : >>> virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00 >>> 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : >>> driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0 >>> 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : >>> QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": >>> 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": >>> {"device": "drive-ide0-1-0", "tray-open": true}} >>> len=151 >>> >>> >>> What we can see here is: >>> 1) libvirt asks qemu to eject >>> 2) qemu responds with "Device is locked" - which is correct as it emulates >>> guest request >>> 3) However, after a while, the guest decides it's now safe to eject the >>> media (e.g. all reads have finished, and the device has been unmounted) >>> which is done >>> 4) [not in the pasted log] After a while, you've decided to try again and >>> this time it succeeded, as the tray is already opened. >>> >>> What we may do here is lifting the number of retries up. You can try >>> yourself: >> >> I have indeed increased retries to 20 and unfortunately that didn't work >> either, >> we can see in the log file that the DEVICE_TRAY_MOVED event received >> immediately >> after the final retry. After we doubled the retries, that would be more that >> enough >> time for the guest to eject the tray. So, maybe there is something wrong >> otherwhere. >> >> Thanks, >> Wei > > I think I know where the bug is. When waiting on tray to open, we don't > unlock the domain. So later, when an event arrives, the first thing that the > event handler does is locking the domain. Hence, it seems like the event is > delivered right after we've failed waiting. That's just because after we've > finished waiting, we've unlocked the domain. And I think same applies for the > domain monitor. So can you please try this out:
This patch works fine, thanks :) Tested-by: Li Wei <l...@cn.fujitsu.com> Thanks, Wei > > diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c > index d037c9d..767d5a1 100644 > --- a/src/qemu/qemu_hotplug.c > +++ b/src/qemu/qemu_hotplug.c > @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr > driver, > qemuDomainObjEnterMonitor(driver, vm); > ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force); > > + virObjectRef(vm); > /* we don't want to report errors from media tray_open polling */ > while (retries--) { > if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN) > break; > > + qemuDomainObjExitMonitor(driver, vm); > + virObjectUnlock(vm); > VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", > retries); > usleep(500 * 1000); /* sleep 500ms */ > + virObjectLock(vm); > + qemuDomainObjEnterMonitor(driver, vm); > } > + virObjectUnref(vm); > > if (disk->src) { > /* deliberately don't depend on 'ret' as 'eject' may have failed for > the > > > Michal > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list