On Fri, Oct 22, 2010 at 10:59 AM, Amit Shah <amit.s...@redhat.com> wrote: > On (Wed) Oct 20 2010 [08:18:51], Anthony Liguori wrote: >> On 10/20/2010 07:05 AM, Stefan Hajnoczi wrote: >> >On Wed, Oct 20, 2010 at 12:57 PM, Amit Shah<amit.s...@redhat.com> wrote: >> >>On (Tue) Oct 19 2010 [23:12:20], Arun R Bharadwaj wrote: >> >>>Hi, >> >>> >> >>>This is the v6 of the patch-series to have a generic asynchronous task >> >>>offloading framework (called threadlets) within qemu. >> >>> >> >>>Request to consider pulling this series as discussed during the >> >>>Qemu-devel call. >> >>I tried this out with virtio-serial (patch below). Have a couple of >> >>things to note: >> >> >> >>- Guests get a SIGUSR2 on startup sometimes. This doesn't happen with >> >> qemu.git, so looks like it's introduced by this patchset. >> >> >> >>- After running some tests, I get an abort. I still have to look at >> >> what's causing it, but doesn't look like it's related to virtio-serial >> >> code. >> >> >> >>Program received signal SIGABRT, Aborted. >> >>0x0000003dc76329a5 in raise () from /lib64/libc.so.6 >> >>Missing separate debuginfos, use: debuginfo-install >> >>SDL-1.2.14-8.fc13.x86_64 glibc-2.12.1-2.x86_64 >> >>libX11-1.3.1-3.fc13.x86_64 libXau-1.0.5-1.fc12.x86_64 >> >>libpng-1.2.44-1.fc13.x86_64 libxcb-1.5-1.fc13.x86_64 >> >>ncurses-libs-5.7-7.20100130.fc13.x86_64 zlib-1.2.3-23.fc12.x86_64 >> >>(gdb) bt >> >>#0 0x0000003dc76329a5 in raise () from /lib64/libc.so.6 >> >>#1 0x0000003dc7634185 in abort () from /lib64/libc.so.6 >> >>#2 0x00000000004bf829 in qemu_get_ram_ptr (addr=<value optimized out>) >> >>at /home/amit/src/qemu/exec.c:2936 >> >>#3 0x00000000004bf9a7 in lduw_phys (addr=<value optimized out>) at >> >>/home/amit/src/qemu/exec.c:3836 >> >>#4 0x0000000000557c90 in vring_avail_idx (vq=0x17b9320, idx=1333) at >> >>/home/amit/src/qemu/hw/virtio.c:133 >> >>#5 virtqueue_num_heads (vq=0x17b9320, idx=1333) at >> >>/home/amit/src/qemu/hw/virtio.c:252 >> >>#6 0x0000000000557e5e in virtqueue_avail_bytes (vq=0x17b9320, >> >>in_bytes=4096, out_bytes=0) at /home/amit/src/qemu/hw/virtio.c:311 >> >> >> >>- I'm using a threadlet to queue up several work items which are to be >> >> processed in a fifo order. There's no cancel function for a threadlet >> >> that either processes all work and then quits the thread or just >> >> cancels all pending work and quits. >> >> >> >> Amit >> >> >> >> >> >>diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c >> >>index 74ba5ec..caaafbe 100644 >> >>--- a/hw/virtio-serial-bus.c >> >>+++ b/hw/virtio-serial-bus.c >> >>@@ -51,6 +51,14 @@ struct VirtIOSerial { >> >> struct virtio_console_config config; >> >> }; >> >> >> >>+typedef struct VirtIOSerialWork { >> >>+ ThreadletWork work; >> >>+ VirtIOSerialPort *port; >> >>+ VirtQueue *vq; >> >>+ VirtIODevice *vdev; >> >>+ int discard; >> >>+} VirtIOSerialWork; >> >>+ >> >> static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id) >> >> { >> >> VirtIOSerialPort *port; >> >>@@ -113,10 +121,20 @@ static size_t write_to_port(VirtIOSerialPort *port, >> >> return offset; >> >> } >> >> >> >>-static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq, >> >>- VirtIODevice *vdev, bool discard) >> >>+static void async_flush_queued_data(ThreadletWork *work) >> >> { >> >>+ VirtIOSerialPort *port; >> >>+ VirtIOSerialWork *vs_work; >> >>+ VirtQueue *vq; >> >>+ VirtIODevice *vdev; >> >> VirtQueueElement elem; >> >>+ int discard; >> >>+ >> >>+ vs_work = DO_UPCAST(VirtIOSerialWork, work, work); >> >>+ port = vs_work->port; >> >>+ vq = vs_work->vq; >> >>+ vdev = vs_work->vdev; >> >>+ discard = vs_work->discard; >> >> >> >> assert(port || discard); >> >> assert(virtio_queue_ready(vq)); >> >You cannot access guest memory using QEMU RAM functions (or use the >> >virtqueue_pop() function which uses them) from a thread without taking >> >the QEMU global mutex. >> > >> >The abort stack trace is a result of accessing guest RAM from two >> >threads simultaneously. >> > >> >In general it is not safe to use QEMU functions from a thread unless >> >they are explicitly written to work outside the QEMU global mutex. >> >Most functions assume the global mutex, which serializes I/O thread >> >and vcpu changes to global state, is held. >> >> Yes, threadlets are only meant to be used to make synchronous system >> calls asynchronous. They are not meant to add parallelism to QEMU >> (yet). > > Yes -- I realised that. (But I don't see why the virtio rings get > modified elsewhere other than the only function I push/pop from above.)
The particular race condition triggered by virtqueue_pop() is because guest RAM access functions try to optimize frequently accessed memory to keep lookup time to a minimum. The current RAM block is put onto the beginning of the lookup list so the next access to it will be faster. When two threads are looking up guest memory simultaneously one may be modifying the list while the other traverses it... > Anyway, just one question as I've still not read the code: does a > running work item in a threadlet block migration? Do the remaining work > items in a threadlet get migrated fine? There's no migration support in the threadlets infrastructure itself. Threadlets users need to think about how to safely migrate, either through cancellation or waiting for all pending work to complete. For example, migration will quiesce aio requests using qemu_aio_flush() so there is no outstanding work to be migrated. Stefan