Anthony Liguori <anth...@codemonkey.ws> wrote:
> On 12/01/2010 11:01 AM, Avi Kivity wrote:
>> On 12/01/2010 06:56 PM, Anthony Liguori wrote:
>>> On 12/01/2010 10:52 AM, Avi Kivity wrote:
>>>> On 12/01/2010 06:49 PM, Anthony Liguori wrote:
>>>>>> We need actual measurements instead of speculations.
>>>>>
>>>>>
>>>>> Yes, I agree 100%.  I think the place to start is what I
>>>>> suggested in a previous note in this thread, we need to measure
>>>>> actual stall time in the guest.
>>>>
>>>> I'd actually start at the host.  How much time does
>>>> ioctl(KVM_GET_DIRTY_LOG) take?  What's the percentage of time
>>>> qemu_mutex is held?
>>>
>>> The question is, what really are the symptoms of the problem.  It's
>>> not necessarily a bad thing if KVM_GET_DIRTY_LOG takes a long time
>>> while qemu_mutex is held.
>>
>> Whether or not qemu_mutex is held, long KVM_GET_DIRTY_LOG runtimes
>> are bad, since they are a lower bound on your downtime.  And
>> KVM_GET_DIRTY_LOG does a lot of work, and invokes
>> synchronize_srcu_expedited(), which can be very slow.
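
(Timing that from user space is straightforward, by the way; something
like this around the ioctl in kvm_physical_sync_dirty_bitmap() would
do.  kvm_vm_ioctl() and KVM_GET_DIRTY_LOG are the real calls there;
the gettimeofday() wrapping is just throwaway instrumentation:)

    /* throwaway instrumentation: time one KVM_GET_DIRTY_LOG call */
    struct timeval tv0, tv1;

    gettimeofday(&tv0, NULL);
    ret = kvm_vm_ioctl(s, KVM_GET_DIRTY_LOG, &d);
    gettimeofday(&tv1, NULL);
    fprintf(stderr, "KVM_GET_DIRTY_LOG took %ld us\n",
            (tv1.tv_sec - tv0.tv_sec) * 1000000L +
            (tv1.tv_usec - tv0.tv_usec));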
>
> That's fine, and you're right, it's a useful thing to do, but this
> series originated because of a problem and we ought to make sure we
> capture what the actual problem is.  That's not to say we shouldn't
> improve things that could stand to be improved.
>
>>>
>>> Is the problem that the monitor responds slowly?  Is the problem
>>> that the guest isn't consistently getting execution time?  Is the
>>> problem simply that the guest isn't getting enough total execution
>>> time?
>>
>> All three can happen if qemu_mutex is held too long.
>
> Right, but I'm starting to think that the root of the problem is not
> that it's being held too long but that it's being held too often.

OK, yesterday I tested dropping qemu_mutex around ram_save_block (a
crude hack: just qemu_mutex_unlock_iothread(); run the loop;
qemu_mutex_lock_iothread();).
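
In outline, the hack was something like this (a sketch of the idea,
not the literal patch; "blocks_left" is a placeholder for whatever
the real loop condition was):

    /* crude experiment: drop the iothread lock around the RAM-save
     * loop so vcpu threads can run while pages are scanned and sent.
     * Racy, but enough to measure the effect on guest stalls. */
    qemu_mutex_unlock_iothread();
    while (blocks_left > 0) {
            ram_save_block(f);      /* send one block of dirty pages */
            blocks_left--;
    }
    qemu_mutex_lock_iothread();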

As requested by Anthony, I measured on the guest how big the stalls
were.  The code is:

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
        struct timeval t0, t1;

        while (1) {
                if (gettimeofday(&t0, NULL) != 0)
                        perror("gettimeofday 1");
                /* sleep 100us; a much longer elapsed time means the
                 * guest was stalled */
                if (usleep(100) != 0)
                        perror("usleep");
                if (gettimeofday(&t1, NULL) != 0)
                        perror("gettimeofday 2");
                /* t1 -= t0, normalizing the microsecond field */
                t1.tv_usec -= t0.tv_usec;
                if (t1.tv_usec < 0) {
                        t1.tv_usec += 1000000;
                        t1.tv_sec--;
                }
                t1.tv_sec -= t0.tv_sec;

                /* report anything longer than 5ms */
                if (t1.tv_sec || t1.tv_usec > 5000)
                        printf("delay of %ld\n",
                               (long)(t1.tv_sec * 1000000 + t1.tv_usec));
        }
        return 0;
}
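
(To reproduce: build it inside the guest, e.g. "gcc -Wall -o stall
stall.c" with an arbitrary file name, and leave it running while the
migration is in progress.)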

I tried this in a guest with 8 vcpus that is completely idle.  When
idle, there are only some stalls in the 5-8ms range (as expected).

(This is with my series applied.)

As soon as I start migration, we get several stalls in the 15-200ms
range.  Notice that the stalls are not bigger only because I limit the
time that qemu_mutex is held in the iothread to 50ms at a stretch.
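
For reference, that bounding is just a clock check in the save loop,
roughly like this (qemu_get_clock(rt_clock) is QEMU's existing
millisecond clock; the loop shape and "blocks_left" are placeholders):

    /* sketch: cap how long the iothread lock is held per iteration */
    int64_t t0 = qemu_get_clock(rt_clock);

    while (blocks_left > 0) {
            ram_save_block(f);
            if (qemu_get_clock(rt_clock) - t0 > 50) {
                    break;          /* yield; resume next iteration */
            }
    }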

Doing the crude qemu_mutex drop in ram_save_live means that these
mini-stalls get much smaller, into the 10-15ms range (with some rare
ones at 20ms).

And then we have a stall of around 120ms during the non-live part of
the migration.  I can't find where this stall comes from (i.e. saving
the rest of the pages and the normal sections takes much less time).
But on the other hand, I have no instrumentation yet to measure how
long it takes to move to the other host and restart there.
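
One low-tech way to measure that handoff (assuming the two hosts'
clocks are NTP-synced) would be wall-clock markers at both ends:

    /* sketch: source side, right after the last stage is sent */
    struct timeval tv;

    gettimeofday(&tv, NULL);
    fprintf(stderr, "source: last stage done at %ld.%06ld\n",
            (long)tv.tv_sec, (long)tv.tv_usec);

    /* sketch: destination side, just before vm_start() resumes the
     * guest */
    gettimeofday(&tv, NULL);
    fprintf(stderr, "dest: resuming guest at %ld.%06ld\n",
            (long)tv.tv_sec, (long)tv.tv_usec);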

So we are still not there, but now we have only a single 120ms stall
in the guest, versus the 1-4 second ones that we used to have.

I don't have access to these machines until next week, so I am
spending this week implementing the ideas given in this thread.

Later, Juan.
