Liran Schour wrote:
> Jan Kiszka <jan.kis...@siemens.com> wrote on 26/11/2009 19:24:40:
>>>>>> +        bdrv_write(bs, (addr >> SECTOR_BITS),
>>>>>> +                   buf, block_mig_state->sectors_per_block);
>>>>>
>>>>> This synchronous write-back appears to be the reason for an
>>>>> unusable migration (or restore from snapshot) speed: about 100 KB/s
>>>>> here (ie. 22h for a rather small 8G guest :( ). Did you already try
>>>>> to improve this situation?
>>>>
>>>> I have seen this behavior, but it seems that there is a very big
>>>> difference in the performance if the new block device is based on an
>>>> already allocated file (try the same migration to an already
>>>> allocated file of the requested size). I am trying to figure out why
>>>> we see this behavior. (Any ideas?)
>>>
>>> Yes, much faster, more than 6 MB/s. Not really impressive, but that's
>>> now likely due to the tiny block size.
>>>
>>> That 4K also made the unallocated write so awfully slow, as the image
>>> file had to be continuously extended by this amount.
>>>
>>>> Anyway, we can turn the writes async, but we have to synchronize all
>>>> destination writes before completing the migration and moving the
>>>> guest to the destination. When the guest starts to run on the
>>>> destination, all writes should be finished, so we need to wait
>>>> synchronously for the writes anyway. I will look into this further
>>>> next week.
>>>
>>> Well, we actually need these changes:
>>> 1. Reasonable block size (if we cannot overcome the cluster size, we
>>>    need to coalesce blocks on write-out)
>>
>> If I understood the logic correctly, we can easily increase it during
>> phase 2, ie. while in async mode. We just need to save the block size
>> in its header if we vary it.
>
> Right, no problem to increase the block size or to use a changed block
> size.

Yep, I'm using such a patch now: 1 MB block size already helps a lot.

>>> 2. Async write, throttled by the sync read, or both asynchronously.
>>>    Unfortunately, qemu's aio does not work yet at the point we need
>>>    it...
>>>
>>> I will also continue to dig into this.
>>>
>>> Jan
>>
>> At this chance, I realized that the length of the synchronous phase 3
>> depends on the number of dirty blocks collected during phase 2. Thus,
>> the guest downtime will increase when it is under load, ie. typically
>> at the point when the downtime must be minimal. We likely need an
>> approach that takes the number of dirty sectors into account.
>
> True. It is on my todo list. We can do something similar to what is
> done while migrating RAM.
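Exactly. Mirroring ram_save_live() should do: measure the bandwidth
during stage 2 and only allow the switch to the synchronous stage once
the remaining dirty sectors fit into the configured downtime. A rough,
untested sketch (get_remaining_dirty() is a made-up helper, and the
units of migrate_max_downtime() need double-checking):

    /* At the end of a stage 2 iteration: bwidth is the measured
     * transfer rate in bytes per time unit of migrate_max_downtime(),
     * get_remaining_dirty() would count the dirty sectors still
     * queued for transfer. */
    static int blk_mig_switch_to_stage3(double bwidth)
    {
        double expected_downtime;

        expected_downtime = get_remaining_dirty() * (1 << SECTOR_BITS)
                            / bwidth;
        return expected_downtime <= migrate_max_downtime();
    }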
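Independent of that, for 2., the write-out on the destination could go
asynchronous with a simple in-flight counter that we drain before the
guest is started. Again only an untested sketch (pending_writes and the
helper names are made up; it assumes each request carries its own
malloc'ed buffer, and error propagation is left out):

    static int pending_writes;

    static void blk_mig_write_cb(void *opaque, int ret)
    {
        /* TODO: propagate ret as a migration error */
        qemu_free(opaque);          /* per-request sector buffer */
        pending_writes--;
    }

    /* In block_load(), instead of the synchronous bdrv_write(): */
    static void blk_mig_queue_write(BlockDriverState *bs, int64_t addr,
                                    uint8_t *buf, int nr_sectors)
    {
        pending_writes++;
        bdrv_aio_write(bs, (addr >> SECTOR_BITS), buf, nr_sectors,
                       blk_mig_write_cb, buf);
    }

    /* And before the guest is started on the destination: */
    static void blk_mig_flush_writes(void)
    {
        while (pending_writes > 0) {
            qemu_aio_wait();
        }
    }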
FYI, git://git.kiszka.org/qemu.git queues/migration contains my current
patch queue. I was about to post, but now I'm back into debugging. The
issue (EAGAIN when reading a snapshot via "-incoming exec:'cat
snapshot'") may or may not be related to my changes. Once it's
resolved, I will post officially, but maybe you already want to have a
look.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux