> >>> Is that known behavior, or should aio give about the same
> >>> performance as using a thread?
> >>
> >> It depends, but this is what we saw for migration too. You may also
> >> have less contention on the big QEMU lock if you use a thread.

I managed to fix the performance issue by doing the following:

- use O_DIRECT to avoid the host cache on write
- limit write size to at most 4096 bytes (throughput gets a bit lower,
  but the VM is much more responsive during backup)

Now I get about the same behavior/performance as using a thread.

Unfortunately, the bug in bdrv_drain_all() triggers more often.

> > But when I use a thread it triggers the bug in bdrv_drain_all(). So
> > how can I fix bdrv_drain_all() if I use a separate thread to write data?
>
> The bug is, in all likelihood, in your own code. Sorry. :)
>
> > I currently use CoQueue to wait for the output thread.
>
> The simplest way to synchronize the main event loop with another thread is
> a bottom half. There is no example yet, but I will post it soon for
> migration.

The problem is not really how I can synchronize. The problem is that I
need to halt the blockjob when the output buffer is full, and this
causes the bug in bdrv_drain_all().

I am currently a bit out of ideas - will do more testing.
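
For reference, the two write-side changes mentioned above (O_DIRECT plus a 4096-byte cap per write) might look roughly like the sketch below. This is only an illustration, not the actual backup code: backup_open(), backup_write_chunked() and CHUNK_SIZE are hypothetical names, the alignment checks assume a 4096-byte requirement for O_DIRECT, and the fallback define for platforms without O_DIRECT is an assumption of this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0           /* assumption: buffered I/O where O_DIRECT is missing */
#endif

#define CHUNK_SIZE 4096      /* maximum bytes per write() call */

/* Hypothetical helper: open the backup target, bypassing the host page cache. */
int backup_open(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
}

/*
 * Hypothetical helper: write len bytes in CHUNK_SIZE pieces.
 * O_DIRECT typically needs aligned buffers and lengths, so both are
 * checked up front. Returns the number of successful write() calls,
 * or -1 on error (a short write is treated as an error to keep the
 * alignment invariant simple).
 */
int backup_write_chunked(int fd, const uint8_t *buf, size_t len)
{
    int calls = 0;

    assert((uintptr_t)buf % CHUNK_SIZE == 0);
    assert(len % CHUNK_SIZE == 0);

    while (len > 0) {
        ssize_t n = write(fd, buf, CHUNK_SIZE);
        if (n < 0 && errno == EINTR) {
            continue;        /* interrupted before any data was written: retry */
        }
        if (n != CHUNK_SIZE) {
            if (n >= 0) {
                errno = EIO; /* short write */
            }
            return -1;
        }
        buf += n;
        len -= (size_t)n;
        calls++;
    }
    return calls;
}
```

A caller would allocate the buffer with posix_memalign(&buf, CHUNK_SIZE, len) so that it meets the alignment checked above. Keeping each write small bounds how long a single write() can stall, which fits the observation that the VM stays more responsive at the cost of some throughput.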