Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-20 Thread Glauber de Oliveira Costa
On Dec 19, 2007 1:41 PM, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Glauber de Oliveira Costa wrote:
> > Changes in rate do not sound good. It's possibly what's screwing up
> > my paravirt clock implementation in smp.
> >
>
> You should renew the timebase on vcpu migration, and hook cpufreq so
> that changes in frequency are reflected in the timebase.

To be conservative, I do it on every vcpu run, and have every kind of
cpu frequency scaling disabled. It still does not work.

In a trace on the host, I see that vcpu runs happen very often on
vcpu 0 (probably because exits happen often there, so we have to go
back in), and comparatively very few times on vcpu 1.

So what's probably happening is: vcpu 1 does system_time + tsc_delta,
but vcpu 0 has already updated it so many times that the tsc does not
keep up, and it ends up going backwards.
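
A minimal sketch of the effect described above, as a userspace model rather than kvm code. It assumes a TSC of 1 cycle per ns and a guest cycles-to-ns conversion that is slightly off; the struct, the function names and the rate_ppm parameter are all hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical per-vcpu timebase: a host time snapshot plus the TSC
 * value read when the snapshot was taken. */
struct vcpu_timebase {
	uint64_t system_time_ns;
	uint64_t tsc_at_update;
};

/* Host side: refresh a vcpu's timebase before it runs. */
static void timebase_update(struct vcpu_timebase *tb, uint64_t host_ns,
			    uint64_t tsc)
{
	tb->system_time_ns = host_ns;
	tb->tsc_at_update = tsc;
}

/* Guest side: system_time + tsc_delta. The delta is converted to ns with
 * a rate in parts per million; 1000000 means the conversion is exact. */
static uint64_t guest_read_ns(const struct vcpu_timebase *tb, uint64_t tsc,
			      uint64_t rate_ppm)
{
	uint64_t delta_cycles = tsc - tb->tsc_at_update;

	return tb->system_time_ns + delta_cycles * rate_ppm / 1000000;
}

/* vcpu 0 is refreshed at t = stale_ns, vcpu 1 still uses its timebase
 * from t = 0. Time is read on vcpu 0 and then, 100 ns later, on vcpu 1.
 * A positive return value is how far the second read lies in the past. */
int64_t backwards_jump_ns(uint64_t stale_ns, uint64_t rate_ppm)
{
	struct vcpu_timebase vcpu0, vcpu1;
	uint64_t t0, t1;

	timebase_update(&vcpu0, 0, 0);
	timebase_update(&vcpu1, 0, 0);
	timebase_update(&vcpu0, stale_ns, stale_ns);	/* only vcpu 0 */

	t0 = guest_read_ns(&vcpu0, stale_ns, rate_ppm);
	t1 = guest_read_ns(&vcpu1, stale_ns + 100, rate_ppm);
	return (int64_t)t0 - (int64_t)t1;
}
```

With an exact conversion, backwards_jump_ns(1000000, 1000000) is negative, so time moves forward. With a conversion that undercounts by 0.9%, backwards_jump_ns(1000000, 991000) is positive: the rarely refreshed vcpu reports a time that is several microseconds older than what vcpu 0 already returned.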

I'm running the following test in the host, at module load time
(Ingo, please tell me if I'm doing something idiotic in it that
compromises my conclusions):

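void test(int foo)
{
   u64 start, stop;

   start = native_read_tsc();
   udelay(foo);                  /* foo is in microseconds */
   stop = native_read_tsc();
   /* requested delay minus TSC-measured delay, both in nanoseconds */
   printk("%d Result: %lld\n", foo,
          (s64)foo * 1000 - (s64)cycles_2_ns(stop - start));
}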

Output is:

30 Result: -126
90 Result: 576
300 Result: 2627
1000 Result: 9381
3000 Result: 28238
5000 Result: 48086


So the delta is expected to get bigger if a vcpu goes a long time
without having its time updated.
Xen manages to keep the guest tsc stable and steady by doing
synchronization from time to time.
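
As a rough check on the numbers in the output above, each result can be expressed relative to the requested delay. The helper below is a hypothetical illustration, not part of the test module; it takes the udelay() argument in microseconds and the printed result in nanoseconds.

```c
#include <stdint.h>

/* Discrepancy between requested and TSC-measured delay, in parts per
 * million of the requested delay. */
int64_t drift_ppm(int64_t delay_us, int64_t diff_ns)
{
	int64_t requested_ns = delay_us * 1000;

	return diff_ns * 1000000 / requested_ns;
}
```

For the three longest delays this gives 9381, 9412 and 9617 ppm, i.e. a little under 1% each time. That looks more like a rate mismatch than a fixed offset, although these figures alone do not say whether udelay() or the cycles_2_ns() conversion is the side that is off.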

We can either (if I'm right about this, of course):

* put a periodic timer in the host to update the system time from time to time;
* use some sort of global timestamp, instead of the per-cpu one.
* do something akin to what xen does, and still rely on the tsc.
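
One possible reading of the second option, sketched in userspace C11: every reader clamps its per-cpu value against a single global "last value returned". The names are hypothetical and this is not taken from any patch in this thread; the cost of the approach is a cache line shared by all cpus.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global timestamp shared by all vcpus. */
static _Atomic uint64_t last_ns;

/* Return percpu_ns, or the newest value already handed out if that is
 * larger, so that readers never observe time going backwards. */
uint64_t monotonic_ns(uint64_t percpu_ns)
{
	uint64_t last = atomic_load(&last_ns);

	do {
		if (percpu_ns <= last)
			return last;	/* another cpu is already ahead */
		/* on failure, last is reloaded with the current value */
	} while (!atomic_compare_exchange_weak(&last_ns, &last, percpu_ns));

	return percpu_ns;
}
```

A reader whose per-cpu computation lags behind simply gets the newest value seen so far, so the clock may stall briefly but does not warp.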

Any thoughts?
-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Amit Shah
On Wednesday 19 December 2007 21:02:06 Glauber de Oliveira Costa wrote:
> On Dec 19, 2007 12:27 PM, Avi Kivity <[EMAIL PROTECTED]> wrote:
> > Ingo Molnar wrote:
> > > * Avi Kivity <[EMAIL PROTECTED]> wrote:
> > >> Avi Kivity wrote:
> > >>>  Testing shows wrmsr and rdtsc function normally.
> > >>>
> > >>> I'll try pinning the vcpus to cpus and see if that helps.
> > >>
> > >> It does.
> > >
> > > do we let the guest read the physical CPU's TSC? That would be trouble.
> >
> > vmx (and svm) allow us to add an offset to the physical tsc.  We set it
> > on startup to -tsc (so that an rdtsc on boot would return 0), and
> > massage it on vcpu migration so that guest rdtsc is monotonic.
> >
> > The net effect is that tsc on a vcpu can experience large forward jumps
> > and changes in rate, but no negative jumps.
>
> Changes in rate do not sound good. It's possibly what's screwing up
> my paravirt clock implementation in smp.

Do you mean in the case of VM migration, or just starting them on a single 
host?

> Since the host updates guest time prior to putting a vcpu to run, two
> vcpus that start running at different times will have different system
> time values.
>
> Now if the vcpu that started running later probes the time first,
> we'll see the time going backwards. A constant tsc rate is the only way
> my limited mind sees around the problem (besides, obviously, _not_
> making the system time per-vcpu).


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Avi Kivity

Glauber de Oliveira Costa wrote:
> Changes in rate do not sound good. It's possibly what's screwing up
> my paravirt clock implementation in smp.

You should renew the timebase on vcpu migration, and hook cpufreq so
that changes in frequency are reflected in the timebase.

> Since the host updates guest time prior to putting a vcpu to run, two
> vcpus that start running at different times will have different system
> time values.
>
> Now if the vcpu that started running later probes the time first,
> we'll see the time going backwards. A constant tsc rate is the only way
> my limited mind sees around the problem (besides, obviously, _not_
> making the system time per-vcpu).

I tried disabling frequency scaling (rmmod acpi_cpufreq) but that didn't
help my present problems.


--
error compiling committee.c: too many arguments to function



Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Glauber de Oliveira Costa
On Dec 19, 2007 12:27 PM, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Ingo Molnar wrote:
> > * Avi Kivity <[EMAIL PROTECTED]> wrote:
> >
> >
> >> Avi Kivity wrote:
> >>
> >>>  Testing shows wrmsr and rdtsc function normally.
> >>>
> >>> I'll try pinning the vcpus to cpus and see if that helps.
> >>>
> >>>
> >> It does.
> >>
> >
> > do we let the guest read the physical CPU's TSC? That would be trouble.
> >
> >
>
> vmx (and svm) allow us to add an offset to the physical tsc.  We set it
> on startup to -tsc (so that an rdtsc on boot would return 0), and
> massage it on vcpu migration so that guest rdtsc is monotonic.
>
> The net effect is that tsc on a vcpu can experience large forward jumps
> and changes in rate, but no negative jumps.
>

Changes in rate do not sound good. It's possibly what's screwing up
my paravirt clock implementation in smp.
Since the host updates guest time prior to putting a vcpu to run, two
vcpus that start running at different times will have different system
time values.

Now if the vcpu that started running later probes the time first,
we'll see the time going backwards. A constant tsc rate is the only way
my limited mind sees around the problem (besides, obviously, _not_
making the system time per-vcpu).


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Avi Kivity

Ingo Molnar wrote:
> try this test perhaps in an SMP guest:
>
>  http://people.redhat.com/mingo/time-warp-test/time-warp-test.c
>
> you can ignore TSC warps - but no GTOD or CLOCK warps should occur.

On a broken guest kernel, I see gtod and clock warps.  On a good guest
kernel, I do not, presumably because the tsc clocksource is marked as
unstable.

I see tsc warps on both.  8 threads on 4 cpus.

--
error compiling committee.c: too many arguments to function



Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Ingo Molnar

try this test perhaps in an SMP guest:

 http://people.redhat.com/mingo/time-warp-test/time-warp-test.c

you can ignore TSC warps - but no GTOD or CLOCK warps should occur.
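
For readers without the file at hand, the kind of check meant by a "CLOCK warp" can be sketched as below. This is not the code of time-warp-test.c; it is a minimal illustration with hypothetical names, in which several threads read CLOCK_MONOTONIC and count any reading that is older than the newest one seen so far. Reading and comparing happen under one lock so that thread scheduling alone cannot produce a false warp.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t last_ns;	/* newest reading seen by any thread */
static long warps;		/* readings that were older than last_ns */

/* Compare one reading with the newest one so far. Returns 1 on a warp.
 * In threaded use the caller holds the lock. */
int check_sample(uint64_t now_ns)
{
	if (now_ns < last_ns) {
		warps++;
		return 1;
	}
	last_ns = now_ns;
	return 0;
}

uint64_t read_clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Thread body: arg points to a long holding the iteration count. */
void *warp_worker(void *arg)
{
	long n = *(long *)arg;

	for (long i = 0; i < n; i++) {
		pthread_mutex_lock(&lock);
		check_sample(read_clock_ns());
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}
```

To use it, start warp_worker with pthread_create() on more threads than there are cpus, join them, and print warps; any non-zero count means the clock was seen going backwards.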

Ingo


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Avi Kivity

Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>> Avi Kivity wrote:
>>>  Testing shows wrmsr and rdtsc function normally.
>>>
>>> I'll try pinning the vcpus to cpus and see if that helps.
>>
>> It does.
>
> do we let the guest read the physical CPU's TSC? That would be trouble.

vmx (and svm) allow us to add an offset to the physical tsc.  We set it
on startup to -tsc (so that an rdtsc on boot would return 0), and
massage it on vcpu migration so that guest rdtsc is monotonic.

The net effect is that tsc on a vcpu can experience large forward jumps
and changes in rate, but no negative jumps.
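
The offset handling described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not kvm's implementation: the struct and function names are hypothetical, and the host TSC values are passed in as plain numbers.

```c
#include <stdint.h>

/* Hypothetical model of one vcpu's TSC offset state. */
struct vcpu_tsc {
	uint64_t offset;	/* added to the physical TSC (mod 2^64) */
	uint64_t last_guest_tsc;/* guest TSC when the vcpu last stopped */
};

/* What a guest rdtsc returns: physical TSC plus the offset. */
uint64_t guest_rdtsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
	return host_tsc + v->offset;
}

/* Startup: offset = -tsc, so the first guest rdtsc returns 0. */
void vcpu_tsc_init(struct vcpu_tsc *v, uint64_t host_tsc)
{
	v->offset = 0 - host_tsc;
	v->last_guest_tsc = 0;
}

/* The vcpu stops running on a cpu: remember where the guest TSC was. */
void vcpu_tsc_put(struct vcpu_tsc *v, uint64_t host_tsc)
{
	v->last_guest_tsc = guest_rdtsc(v, host_tsc);
}

/* The vcpu starts running on a (possibly different) cpu. If that cpu's
 * TSC would make the guest TSC go backwards, move the offset so the
 * guest resumes where it stopped. A cpu that is ahead is left alone,
 * which is where the large forward jumps come from. */
void vcpu_tsc_load(struct vcpu_tsc *v, uint64_t new_host_tsc)
{
	int64_t delta = (int64_t)(guest_rdtsc(v, new_host_tsc) -
				  v->last_guest_tsc);

	if (delta < 0)
		v->offset = v->last_guest_tsc - new_host_tsc;
}
```

In this model, migrating to a cpu whose TSC is behind leaves the guest TSC unchanged across the move, while migrating to a cpu whose TSC is ahead shows up in the guest as a forward jump.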

--
error compiling committee.c: too many arguments to function


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Ingo Molnar

* Avi Kivity <[EMAIL PROTECTED]> wrote:

> Avi Kivity wrote:
>>  Testing shows wrmsr and rdtsc function normally.
>>
>> I'll try pinning the vcpus to cpus and see if that helps.
>>
>
> It does.

do we let the guest read the physical CPU's TSC? That would be trouble.

Ingo


Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Avi Kivity

Avi Kivity wrote:
> Testing shows wrmsr and rdtsc function normally.
>
> I'll try pinning the vcpus to cpus and see if that helps.

It does.

--
error compiling committee.c: too many arguments to function



Re: [kvm-devel] Guest kernel hangs in smp kvm for older kernels prior to tsc sync cleanup

2007-12-19 Thread Avi Kivity

Avi Kivity wrote:

Ingo Molnar wrote:
  
While the change mentions that it fixes a time warp bug, it also says 
it should be rare.  So clearly kvm smp tsc handling is buggy.  
Ingo/Thomas, (or anybody else), do you have any insight as to what kvm 
can be doing wrong to trigger this behavior?

  
hm. Those time warps were really small, due to the small imperfections 
in the "sync up all CPUs to the same moment and do a WRMSR to clear all 
their TSCs" mechanism. I.e. at most a few usec time warps. I really don't 
know how that should result in udevd hanging. Can you debug udevd in any 
way?


  



Adding debug didn't help.  I'll try some sysrq keys to see what the 
guest thinks is happening.


  


many udev children are exiting; udevd itself is sleeping:


udevd S D5DCDF24  2924   573372   594 629   535 (NOTLB)
   d5dcdf38 0086 0002 d5dcdf24 d5dcdf20  d5dcdefc 
d6169f68
   d7db7f68 d5dcdf68 0001 d5dd7560 c13b8a90 749ae8d2 0002 
000326a1
   d5dd7684 c131c700 0003 d74f8900 892d6946 0402  


Call Trace:
 [] do_nanosleep+0x3b/0x66
 [] hrtimer_nanosleep+0x50/0x106
 [] hrtimer_wakeup+0x0/0x18
 [] sys_nanosleep+0x49/0x59
 [] syscall_call+0x7/0xb
 [] xfrm_state_find+0x49f/0x51e


So likely sleeping is screwed up somehow (though only on smp).


so the only thing that KVM might be doing incorrectly here is the 
emulation of the WRMSR that clears the TSC of each vcpu?
  



By inspection, it is correct.  Of course I may be missing something, so 
I'll write a unit test for it.  It should also be much slower than the 
native wrmsr.


  


Testing shows wrmsr and rdtsc function normally.

I'll try pinning the vcpus to cpus and see if that helps.

--
error compiling committee.c: too many arguments to function
