On 05/19/2013 05:51 PM, Michal Privoznik wrote:
> On 18.05.2013 05:31, Li Wei wrote:
>> On 05/17, Michal Privoznik wrote:
>>> On 17.05.2013 11:37, Li Wei wrote:
>>>> On 05/17/2013 04:52 PM, Michal Privoznik wrote:
>>>>> On 17.05.2013 10:16, Li Wei wrote:
>>>>>> On 05/17/2013 04:07 PM, Michal Privoznik wrote:
>>>>>>> On 17.05.2013 07:47, Li Wei wrote:
>>>>>>>> Hi Michal, Daniel
>>>>>>>>
>>>>>>>> It seems there is something wrong with the 2/2 part of this patchset.
>>>>>>>> When I do an "change-media" command in virsh, it doesn't do any better
>>>>>>>> than before, but even worse(I must to wait 5 secs to see the error).
>>>>>>>>
>>>>>>>> I'm not family with libvirt, just add some log things in the 
>>>>>>>> qemu_hotplug.c
>>>>>>>> and found the tray_status never change to open, but with michal's 
>>>>>>>> original
>>>>>>>> patch (which do active poll on tray_status), I can do "change-media" 
>>>>>>>> successfully
>>>>>>>> every time.
>>>>>>>
>>>>>>> Yes & no. Thing is - originally we ignored tray status in the guest. So
>>>>>>> in case guest has locked the tray and thus ignored our eject request, we
>>>>>>> have changed the media anyway, resulting in possible I/O errors within
>>>>>>> guest.
>>>>>>>
>>>>>>> With my patch, we send eject request to the qemu, and actively poll for
>>>>>>> the tray to open. We check for up to 10 times every 200ms. So guest has
>>>>>>> 2 seconds to open the tray. If the guest doesn't eject media and open
>>>>>>> the tray we don't proceed with forcibly changing the media.
>>>>>>>
>>>>>>> These patches are actually a bug fix for
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=892289
>>>>>>
>>>>>> Hmm... I think you misunderstood my meaning(sorry for my pool english),
>>>>>> simply to say, I can get expected behavior by apply patch V1 but not V2.
>>>>>>
>>>>>> In V1, we do active poll on qemuMonitorGetBlockInfo(), and in V2 we just
>>>>>> wait for origdisk->tray_status to become VIR_DOMAIN_DISK_TRAY_OPEN but
>>>>>> this never happened.
>>>>>>
>>>>>> Thanks,
>>>>>> Wei
>>>>>
>>>>> Aaah, now I understand. So you are saying that the event is not being
>>>>> delivered. What's the qemu version?
>>>>
>>>>
>>>> I have the following version:
>>>>
>>>> virsh # version
>>>> Compiled against library: libvirt 1.0.5
>>>> Using library: libvirt 1.0.5
>>>> Using API: QEMU 1.0.5
>>>> Running hypervisor: QEMU 1.4.1
>>>>
>>>>>
>>>>> If you turn the debug logs on, do you see DEVICE_TRAY_MOVED in them?
>>>>> http://wiki.libvirt.org/page/DebugLogs
>>>>>
>>>>> Or maybe the timeout is just short for your guest to eject the tray. So
>>>>> after a while, does 'virsh change-media' succeed?
>>>>
>>>> Yes, I can see the DEVICE_TRAY_MOVED in the log file, it goes after
>>>> the debug message "qemuDomainChangeEjectableMedia:103 : Waiting 500ms for 
>>>> tray to open. Retries left 0".
>>>> does that mean the timeout is too short? 
>>>>
>>>> If the guest needs too much time to eject, I'm wondering why when I use
>>>> the v1 patch, I can get succeed in about 1 to 2 seconds. I doubt if there
>>>> are other problems cause tray_status not updated in time?
>>>>
>>>> I appended the log file for your reference, if you need any other 
>>>> information,
>>>> please tell me.
>>>>
>>>> Thanks,
>>>> Wei
>>>
>>> I am pasting some interesting knobs from the log:
>>>
>>> 2013-05-17 09:04:46.234+0000: 27004: debug : qemuMonitorSend:892 : 
>>> QEMU_MONITOR_SEND_MSG: mon=0x7f7ac4000bd0 
>>> msg={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"}
>>>  fd=-1
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: 
>>> obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOWrite:458 : 
>>> QEMU_MONITOR_IO_WRITE: mon=0x7f7ac4000bd0 
>>> buf={"execute":"eject","arguments":{"device":"drive-ide0-1-0","force":false},"id":"libvirt-7"}
>>>  len=92 ret=92 errno=2
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : 
>>> OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: 
>>> obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : qemuMonitorIOProcess:350 : 
>>> QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"id": "libvirt-7", 
>>> "error": {"class": "GenericError", "desc": "Device 'drive-ide0-1-0' is 
>>> locked"}}
>>>  len=102
>>> 2013-05-17 09:04:46.234+0000: 27001: debug : virObjectUnref:258 : 
>>> OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:46.235+0000: 27004: error : qemuMonitorJSONCheckError:352 
>>> : internal error unable to execute QEMU command 'eject': Device 
>>> 'drive-ide0-1-0' is locked
>>> 2013-05-17 09:04:46.235+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 9
>>> 2013-05-17 09:04:46.735+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 8
>>> 2013-05-17 09:04:47.235+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 7
>>> 2013-05-17 09:04:47.735+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 6
>>> 2013-05-17 09:04:48.186+0000: 27001: debug : virObjectRef:295 : OBJECT_REF: 
>>> obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:48.235+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 5
>>> 2013-05-17 09:04:48.735+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 4
>>> 2013-05-17 09:04:49.235+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 3
>>> 2013-05-17 09:04:49.736+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 2
>>> 2013-05-17 09:04:50.236+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 1
>>> 2013-05-17 09:04:50.736+0000: 27004: debug : 
>>> qemuDomainChangeEjectableMedia:103 : Waiting 500ms for tray to open. 
>>> Retries left 0
>>> 2013-05-17 09:04:51.236+0000: 27004: error : 
>>> qemuDomainChangeEjectableMedia:117 : operation failed: Unable to eject 
>>> media before changing it
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virObjectUnref:258 : 
>>> OBJECT_UNREF: obj=0x7f7ac4000bd0
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockDiskDetach:282 : 
>>> plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 disk=0x7f7ac4003690
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virDomainLockManagerNew:128 : 
>>> plugin=0x7f7acc007c00 dom=0x7f7acc0a4160 withResources=0
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : 
>>> virLockManagerPluginGetDriver:295 : plugin=0x7f7acc007c00
>>> 2013-05-17 09:04:51.236+0000: 27004: debug : virLockManagerNew:319 : 
>>> driver=0x7f7ae53dc5e0 type=0 nparams=5 params=0x7f7ad7d25830 flags=0
>>> 2013-05-17 09:04:51.236+0000: 27001: debug : qemuMonitorIOProcess:350 : 
>>> QEMU_MONITOR_IO_PROCESS: mon=0x7f7ac4000bd0 buf={"timestamp": {"seconds": 
>>> 1368781488, "microseconds": 186399}, "event": "DEVICE_TRAY_MOVED", "data": 
>>> {"device": "drive-ide0-1-0", "tray-open": true}}
>>>  len=151
>>>
>>>
>>> What we can see here is:
>>> 1) libvirt asks qemu to eject
>>> 2) qemu responds with "Device is locked" - which is correct as it emulates 
>>> guest request
>>> 3) However, after a while, the guest decides it's now safe to eject the 
>>> media (e.g. all reads have finished, and the device has been unmounted) 
>>> which is done
>>> 4) [not in the pasted log] After a while, you've decided to try again and 
>>> this time it succeeded, as the tray is already opened.
>>>
>>> What we may do here is lifting the number of retries up. You can try 
>>> yourself:
>>
>> I have indeed increased retries to 20 and unfortunately that didn't work 
>> either,
>> we can see in the log file that the DEVICE_TRAY_MOVED event received 
>> immediately
>> after the final retry. After we doubled the retries, that would be more that 
>> enough
>> time for the guest to eject the tray. So, maybe there is something wrong 
>> otherwhere.
>>
>> Thanks,
>> Wei
> 
> I think I know where the bug is. When waiting on tray to open, we don't 
> unlock the domain. So later, when an event arrives, the first thing that the 
> event handler does is locking the domain. Hence, it seems like the event is 
> delivered right after we've failed waiting. That's just because after we've 
> finished waiting, we've unlocked the domain. And I think same applies for the 
> domain monitor. So can you please try this out:

This patch works fine, thanks :)

Tested-by: Li Wei <l...@cn.fujitsu.com>

Thanks,
Wei

> 
> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> index d037c9d..767d5a1 100644
> --- a/src/qemu/qemu_hotplug.c
> +++ b/src/qemu/qemu_hotplug.c
> @@ -95,14 +95,20 @@ int qemuDomainChangeEjectableMedia(virQEMUDriverPtr 
> driver,
>      qemuDomainObjEnterMonitor(driver, vm);
>      ret = qemuMonitorEjectMedia(priv->mon, driveAlias, force);
>  
> +    virObjectRef(vm);
>      /* we don't want to report errors from media tray_open polling */
>      while (retries--) {
>          if (origdisk->tray_status == VIR_DOMAIN_DISK_TRAY_OPEN)
>              break;
>  
> +        qemuDomainObjExitMonitor(driver, vm);
> +        virObjectUnlock(vm);
>          VIR_DEBUG("Waiting 500ms for tray to open. Retries left %d", 
> retries);
>          usleep(500 * 1000); /* sleep 500ms */
> +        virObjectLock(vm);
> +        qemuDomainObjEnterMonitor(driver, vm);
>      }
> +    virObjectUnref(vm);
>  
>      if (disk->src) {
>          /* deliberately don't depend on 'ret' as 'eject' may have failed for 
> the
> 
> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Reply via email to