I've been digging into some of the instability we see when running
larger numbers of guests at the same time.  The test I'm currently using
involves launching 64 1vcpu guests on an 8-way AMD box.  With the latest
kvm-userspace git and kvm.git + Gerd's kvmclock fixes, I can launch all
64 of these 1 second apart, and only a handful (1 to 3) end up not
making it up.  In dmesg on the host, I get a couple of messages:

[321365.362534] vcpu not ready for apic_round_robin

and 

[321503.023788] Unsupported delivery mode 7

Now, the interesting bit for me was that when I used numactl to pin each
guest to a processor, all of the guests came up with no issues at all.  As
I looked into it, that means we're not running any of the vcpu
migration code, which on svm consists of tsc_offset recalibration and
apic migration and, on vmx, a little more per-vcpu work.

I've convinced myself that svm.c's tsc offset calculation works and
handles the migration from cpu to cpu quite well.  I added the following
snippet to trigger if we ever encountered the case where we migrated to
a tsc that was behind:

    rdtscll(tsc_this);
    delta = vcpu->arch.host_tsc - tsc_this;
    old_time = vcpu->arch.host_tsc + svm->vmcb->control.tsc_offset;
    new_time = tsc_this + svm->vmcb->control.tsc_offset + delta;
    if (new_time < old_time) {
        printk(KERN_ERR "ACK! (CPU%d->CPU%d) time goes back %llu\n",
               vcpu->cpu, cpu, old_time - new_time);
    }
    svm->vmcb->control.tsc_offset += delta;

Note that vcpu->arch.host_tsc is the tsc of the previous cpu the vcpu
was running on (see svm_vcpu_put()).  This allows me to check if we are
in fact increasing the guest's view of the tsc.  I've not been able to
trigger this at all when the vcpus are migrating.
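
One thing worth spelling out about the check above: expanding the
arithmetic gives new_time = tsc_this + tsc_offset + (host_tsc - tsc_this)
= host_tsc + tsc_offset = old_time.  Assuming all of these values are the
same unsigned 64-bit type (the declarations aren't shown, so that is an
assumption), the comparison cannot come out true whichever way the two
host tscs differ.  What the delta adjustment does guarantee is that the
guest tsc at load time equals the guest tsc at put time.  A minimal
userspace sketch of that property follows; struct vcpu_model and the
function names are made up for illustration and are not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vcpu state the snippet touches. */
struct vcpu_model {
    uint64_t host_tsc;   /* host tsc saved when the vcpu was last put */
    uint64_t tsc_offset; /* models svm->vmcb->control.tsc_offset */
};

/* Guest-visible tsc: host tsc plus the offset, wrapping mod 2^64. */
uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_now)
{
    return host_now + v->tsc_offset;
}

/* Same adjustment as the snippet, applied when loading on a new cpu. */
void migrate(struct vcpu_model *v, uint64_t tsc_this)
{
    uint64_t delta = v->host_tsc - tsc_this;

    v->tsc_offset += delta;
}

/* Returns 1 if the guest tsc right after migration equals its value at put time. */
int guest_tsc_preserved(uint64_t host_tsc, uint64_t offset, uint64_t tsc_this)
{
    struct vcpu_model v = { host_tsc, offset };
    uint64_t old_time = guest_tsc(&v, host_tsc);

    migrate(&v, tsc_this);
    return guest_tsc(&v, tsc_this) == old_time;
}

/* Usage: destination tsc behind, ahead, and with wrapping values. */
void tsc_demo(void)
{
    assert(guest_tsc_preserved(1000, 50, 400));
    assert(guest_tsc_preserved(1000, 50, 5000));
    assert(guest_tsc_preserved(0, UINT64_MAX, 1));
}
```

So the printk staying silent is consistent with the offset math being
right at the instant of migration, but it would stay silent either way.
Catching a real backwards step would need a guest tsc value sampled
independently of host_tsc, which this sketch does not attempt.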

As for the apic, the migrate code seems to be rather simple, but I've
not yet dived in to see if we've got anything racy in there:

lapic.c:
void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
{
    struct kvm_lapic *apic = vcpu->arch.apic;
    struct hrtimer *timer;

    if (!apic)
        return;

    timer = &apic->timer.dev;
    if (hrtimer_cancel(timer))
        hrtimer_start(timer, timer->expires, HRTIMER_MODE_ABS);
}
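
To make the intent of that function concrete, here is a toy userspace
model of the cancel-and-restart pattern.  It assumes that
hrtimer_cancel() returns nonzero only when the timer was queued, and that
hrtimer_start() queues the timer on the cpu it is called from.  struct
toy_timer and the toy_* helpers are hypothetical stand-ins, not the
hrtimer API, and nothing here models a timer whose callback is running
while the migration happens:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct hrtimer: only what the migrate path relies on. */
struct toy_timer {
    int      active;  /* 1 while queued */
    int      cpu;     /* cpu whose queue holds the timer */
    uint64_t expires; /* absolute expiry time */
};

/* Models hrtimer_cancel(): dequeue, return 1 if the timer was queued. */
int toy_cancel(struct toy_timer *t)
{
    int was_active = t->active;

    t->active = 0;
    return was_active;
}

/* Models hrtimer_start(..., HRTIMER_MODE_ABS) called on this_cpu. */
void toy_start_abs(struct toy_timer *t, uint64_t expires, int this_cpu)
{
    t->expires = expires;
    t->cpu = this_cpu;
    t->active = 1;
}

/* Same shape as __kvm_migrate_apic_timer(), run on the destination cpu. */
void toy_migrate(struct toy_timer *t, int this_cpu)
{
    if (toy_cancel(t))
        toy_start_abs(t, t->expires, this_cpu);
}

/* Usage: a pending timer follows the vcpu, an idle one is left alone. */
void timer_demo(void)
{
    struct toy_timer pending = { 1, 0, 12345 };
    struct toy_timer idle = { 0, 0, 777 };

    toy_migrate(&pending, 3);
    assert(pending.active && pending.cpu == 3 && pending.expires == 12345);

    toy_migrate(&idle, 3);
    assert(!idle.active && idle.cpu == 0 && idle.expires == 777);
}
```

In this model the absolute expiry is carried over unchanged, so the only
thing migration changes is which cpu fires the timer.  Any raciness would
have to come from what the model leaves out.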



Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[EMAIL PROTECTED]
