Anthony Liguori <anth...@codemonkey.ws> wrote:
> On 12/01/2010 11:01 AM, Avi Kivity wrote:
>> On 12/01/2010 06:56 PM, Anthony Liguori wrote:
>>> On 12/01/2010 10:52 AM, Avi Kivity wrote:
>>>> On 12/01/2010 06:49 PM, Anthony Liguori wrote:
>>>>>> We need actual measurements instead of speculations.
>>>>>
>>>>> Yes, I agree 100%.  I think the place to start is what I suggested
>>>>> in a previous note in this thread: we need to measure actual stall
>>>>> time in the guest.
>>>>
>>>> I'd actually start at the host.  How much time does
>>>> ioctl(KVM_GET_DIRTY_LOG) take?  What's the percentage of time
>>>> qemu_mutex is held?
>>>
>>> The question is, what really are the symptoms of the problem.  It's
>>> not necessarily a bad thing if KVM_GET_DIRTY_LOG takes a long time
>>> while qemu_mutex is held.
>>
>> Whether or not qemu_mutex is held, long KVM_GET_DIRTY_LOG runtimes
>> are bad, since they are a lower bound on your downtime.  And
>> KVM_GET_DIRTY_LOG does a lot of work, and invokes
>> synchronize_srcu_expedited(), which can be very slow.
>
> That's fine, and you're right, it's a useful thing to do, but this
> series originated because of a problem, and we ought to make sure we
> capture what the actual problem is.  That's not to say we shouldn't
> improve things that could stand to be improved.
>
>>> Is the problem that the monitor responds slowly?  Is the problem
>>> that the guest isn't consistently getting execution time?  Is the
>>> problem simply that the guest isn't getting enough total execution
>>> time?
>>
>> All three can happen if qemu_mutex is held too long.
>
> Right, but I'm starting to think that the root of the problem is not
> that it's being held too long but that it's being held too often.
Ok, yesterday I tested dropping qemu_mutex around ram_save_block() (a
crude thing: just qemu_mutex_unlock_iothread(); loop;
qemu_mutex_lock_iothread();).

As requested by Anthony, I measured on the guest how big the stalls
were.  The code is:

    while (1) {
            if (gettimeofday(&t0, NULL) != 0)
                    perror("gettimeofday 1");
            if (usleep(100) != 0)
                    perror("usleep");
            if (gettimeofday(&t1, NULL) != 0)
                    perror("gettimeofday 2");
            t1.tv_usec -= t0.tv_usec;
            if (t1.tv_usec < 0) {
                    t1.tv_usec += 1000000;
                    t1.tv_sec--;
            }
            t1.tv_sec -= t0.tv_sec;
            if (t1.tv_sec || t1.tv_usec > 5000)
                    printf("delay of %ld\n",
                           t1.tv_sec * 1000000 + t1.tv_usec);
    }

I tried it in a guest with 8 vcpus that is completely idle.  When idle,
only some stalls in the 5-8ms range happen (as expected).  (This is
after my series.)

As soon as I start migration, we get several stalls in the 15-200ms
range.  Notice that the stalls are not bigger because I limit the time
that qemu_mutex is held in the iothread to 50ms each time.

Doing the crude qemu_mutex drop in ram_save_live means that these
mini-stalls get way smaller, into the 10-15ms range (some rare ones at
20ms).  And then we have a stall of around 120ms during the non-live
part of the migration.  I can't find where this stall comes from (i.e.
saving all the remaining pages and the normal sections takes much less
time).  But on the other hand, I have no instrumentation yet to measure
how long it takes to move to the other host and restart there.

So, we are still not there, but now we have only a single 120ms stall
in the guest, versus the 1-4 second ones that we used to have.

I don't have access to these machines until next week, so I am spending
this week implementing the ideas given in this thread.

Later, Juan.