On Tue, Mar 06, 2012 at 10:16:42AM -0600, Anthony Liguori wrote: > On 03/06/2012 09:56 AM, Alon Levy wrote: > >On Tue, Mar 06, 2012 at 07:51:29AM -0600, Anthony Liguori wrote: > >>On 03/06/2012 07:16 AM, Alon Levy wrote: > >>>On Tue, Mar 06, 2012 at 09:24:27AM -0300, Luiz Capitulino wrote: > >>>>On Tue, 06 Mar 2012 08:36:34 +0100 > >>>>Gerd Hoffmann<kra...@redhat.com> wrote: > >>>> > >>>>> Hi, > >>>>> > >>>>>>>How would the parallel execution facility be opaque to the implementer? > >>>>>>>screendump returns, screendump_async needs to pass a closure. You can > >>>>>>>automatically generate any amount of code, but you can only have a > >>>>>>>single function implementation with longjmp/coroutine, or having a > >>>>>>>saparate thread per command but that would mean taking locks for > >>>>>>>anything not trivial, which avoids the no-change-to-implementation. Is > >>>>>>>this what you have in mind? > >>>>>> > >>>>>>It would not be opaque to the implementer. But it would avoid > >>>>>>introducing new commands and events, instead we have a unified mechanism > >>>>>>to signal completion. > >>>>> > >>>>>Ok. We have a async mechanism today: .mhandler.cmd_async = ... > >>>>> > >>>>>I know it has its problems like no cancelation and is deprecated and > >>>>>all. But still: how about using it as interim until QAPI-based async > >>>>>monitor support is ready? We could unbreak qxl screendumps without > >>>>>having to introduce a new (but temporary!) screendump_async command + > >>>>>completion event. > >>>> > >>>>There are a few problems here, but the blocking one is that a command > >>>>can't go from sync to async. This is an incompatible change. > >>>> > >>>>If you mind adding the temporary command and if this issue is so rare > >>>>that none can reproduce it, then I'd suggest to wait for 1.2. > >>>> > >>> > >>>There are two options really: > >>> 1. revert the patches that changed qxl screendump to save the ppm > >>> before (possibly) updating the framebuffer. > >>> 2. introduce a new command that is really async > >>> > >>> The third option, what Gerd proposes, doesn't break the blocking chain > >>> going from the A, the dual purpose spice client and libvirt client, > >>> through libvirt, qemu, spice and back to A. > >>> > >>>If no one can reproduce the block then it would seem 1 makes sense. > >> > >>So let's start with a reproducible test case that demonstrates the > >>problem before we start introducing new commands then if there's > >>doubt about the nature of the problem. > > > >s/the problem/different problem/: > > > > NB: To get this hang I had to disable update_area's from the guest, just > > because otherwise there is a very small window between suspending the > > client and qemu waiting on the same stack trace below issued from the > > guest update_area io out, instead of from the screendump monitor > > command. > > > > spice client, qemu, libvirt, virsh screenshot. > > > > libvirt controlled qemu with qxl device and spice connection. > > qemu ... -vga qxl -spice disable-ticketing,port=5900... > > spicec -h localhost -p 5900 > > [boot until qxl driver is loaded, and guest is doing something (Running > > toms 2d benchmark works), suspend spicec] > > virsh screenshot<vm-name> /tmp/dump.ppm > > > >virsh will hang at: > > #1 virCondWait > > #2 qemuMonitorSend > > #3 qemuMonitorJSONCommandWithFd > > #4 qemuMonitorJSONCommand > > #5 qemuMonitorJSONScreendump > > > >qemu is hung at: > > main thread: > > #0 read > > Is qxl doing a blocking read? If so, that's a bug in qxl. You > cannot do a blocking read while holding qemu_mutex.
What are you saying, that that read should be fixed? then I guess it's good that patches fixing it have landed. That stack was prior to "qxl: make qxl_render_update async", 81fb6f1504fb9ef71f2382f44af34756668296e8 . > > Regards, > > Anthony Liguori >