Re: Timedrift in KVM guests after livemigration.

2010-04-22 Thread Thomas Treutner
On Sunday 18 April 2010 11:33:44 Espen Berg wrote:
 All guests are Debian lenny with the latest upstream kernel, hvm/kvm.

 We are using kvm-clock as the guest source clock.

 cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 kvm-clock

I had to deactivate C1E (AMD CPUs) and use the acpi clocksource (for both
servers and VMs, IIRC). If you can, you should give it a try. After that,
live migration was reasonably stable.


regards, 
thomas
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Timedrift in KVM guests after livemigration.

2010-04-19 Thread Espen Berg

On 18.04.2010 11:56, Gleb Natapov wrote:


There are two different things here:
The issue that Espen is reporting is that the hosts have different
frequencies, and guests that rely on the TSC as a source clock will
notice that after migration. This is indeed a problem that -tdf does
not solve. -tdf only adds compensation for the RTC clock emulation.


It's -rtc-td-hack. -tdf does PIT compensation, but since the in-kernel
PIT is usually used, it does nothing.


So this hack will not solve our problem?

Espen




Re: Timedrift in KVM guests after livemigration.

2010-04-19 Thread Gleb Natapov
On Mon, Apr 19, 2010 at 11:21:47AM +0200, Espen Berg wrote:
 On 18.04.2010 11:56, Gleb Natapov wrote:
 
 There are two different things here:
 The issue that Espen is reporting is that the hosts have different
 frequencies, and guests that rely on the TSC as a source clock will
 notice that after migration. This is indeed a problem that -tdf does
 not solve. -tdf only adds compensation for the RTC clock emulation.
 
 It's -rtc-td-hack. -tdf does PIT compensation, but since the in-kernel
 PIT is usually used, it does nothing.
 
 So this hack will not solve our problem?
 
If your guest uses the RTC for timekeeping, it may help. Otherwise it does
nothing.

--
Gleb.


Re: Timedrift in KVM guests after livemigration.

2010-04-19 Thread Dor Laor

On 04/19/2010 12:29 PM, Gleb Natapov wrote:

On Mon, Apr 19, 2010 at 11:21:47AM +0200, Espen Berg wrote:

On 18.04.2010 11:56, Gleb Natapov wrote:


There are two different things here:
The issue that Espen is reporting is that the hosts have different
frequencies, and guests that rely on the TSC as a source clock will
notice that after migration. This is indeed a problem that -tdf does
not solve. -tdf only adds compensation for the RTC clock emulation.


It's -rtc-td-hack. -tdf does PIT compensation, but since the in-kernel
PIT is usually used, it does nothing.


So this hack will not solve our problem?


As I also stated, in the past the kvmclock MSRs were not synced on live
migration; that was fixed in 1a03675db146dfc760b3b48b3448075189f142cc.

Better check whether your code includes it.




If your guest uses the RTC for timekeeping, it may help. Otherwise it does
nothing.

--
Gleb.


Re: Timedrift in KVM guests after livemigration.

2010-04-18 Thread Dor Laor

On 04/18/2010 02:21 AM, Espen Berg wrote:

On 17.04.2010 22:17, Michael Tokarev wrote:

We have three KVM hosts that support live migration between them, but
one of our problems is time drift. The three frontends have different
CPU frequencies, and the KVM guests adopt the frequency of the host
machine where they were first started.

What do you mean by "adopts"? Note that the CPU frequency
means nothing to modern operating systems, at least
since the days when MS-DOS, which relied on CPU frequency
for its time functions, was in common use. All interesting things
are now done using timers instead, and timers (which, again, don't
depend on CPU frequency) usually work quite well.


The assumption that the tick frequency was calculated from the host's
MHz was based on the fact that greater clock frequency differences
caused higher time drift. A 60 MHz difference caused about 24 min of
drift; a 332 MHz difference caused about 2 h 25 min of drift.



What complicates things is that the cheapest time source that is
accurate enough is the TSC (the time stamp counter register in
the CPU), but it will definitely be different on each
machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think)
introduced a compensation. See for example the -tdf kvm option.


Ah, nice to know. :)


There are two different things here:
The issue that Espen is reporting is that the hosts have different
frequencies, and guests that rely on the TSC as a source clock will
notice that after migration. This is indeed a problem that -tdf does
not solve. -tdf only adds compensation for the RTC clock emulation.


What's the guest type and what's the guest's source clock?
Using the TSC directly as a source clock is not recommended because of
this migration issue (that is not solvable until we trap every rdtsc by
the guest). Using the pv kvmclock in Linux mitigates this issue since it
exposes both the TSC and the host clock, so guests can adjust themselves.
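
To make the kvmclock point concrete, here is a rough sketch (in Python, purely for illustration) of how a guest turns a TSC reading into nanoseconds from the values the host publishes. The field names follow the pvclock time-info layout as I recall it (tsc_timestamp, system_time, tsc_to_system_mul, tsc_shift); mul_for_hz is a hypothetical, simplified helper and not how the host actually derives the multiplier. The point is only that the scale factor comes from the current host, so a destination host with a different TSC frequency can publish a different multiplier.

```python
def pvclock_ns(tsc, tsc_timestamp, system_time, tsc_to_system_mul, tsc_shift):
    """Convert a raw TSC reading to nanoseconds, pvclock style.

    delta ticks are pre-shifted by tsc_shift, then scaled by a 32.32
    fixed-point multiplier and added to the host-provided system_time.
    """
    delta = tsc - tsc_timestamp
    if tsc_shift >= 0:
        delta <<= tsc_shift
    else:
        delta >>= -tsc_shift
    return system_time + ((delta * tsc_to_system_mul) >> 32)


def mul_for_hz(tsc_hz):
    """Hypothetical helper: 32.32 multiplier for a TSC above 1 GHz.

    Simplification: assumes tsc_shift == 0, which only works while the
    result fits in 32 bits, i.e. for tsc_hz > 1e9.
    """
    if tsc_hz <= 10**9:
        raise ValueError("this sketch only handles TSC rates above 1 GHz")
    return (10**9 << 32) // tsc_hz


# One second worth of ticks on two of the hosts from this thread.
for name, hz in (("Host1", 2_394_048_000), ("Host2", 2_659_685_000)):
    ns = pvclock_ns(tsc=hz, tsc_timestamp=0, system_time=0,
                    tsc_to_system_mul=mul_for_hz(hz), tsc_shift=0)
    print(name, "multiplier", mul_for_hz(hz), "-> elapsed ns", ns)
```

Both hosts come out at (almost exactly) one second even though their tick counts differ, which is why the multiplier has to be refreshed after a migration.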


Several months ago a pvclock migration fix was added to pass the pvclock
MSR readings to the destination: 1a03675db146dfc760b3b48b3448075189f142cc






Since this is a cluster in production, I'm not able to try the latest
version either.

Well, that's a difficult one, no? It either works or not.
If you can't try anything else, why ask? :)


What I tried to say was that there are many important virtual servers
running on this cluster at the moment, so trial and error was not an
option. The last time we tried 0.12.x (during the initial tests of the
cluster) there were a lot of stability issues, crashes during migration,
etc.

Regards, Espen



Re: Timedrift in KVM guests after livemigration.

2010-04-18 Thread Espen Berg

On 18.04.2010 11:22, Dor Laor wrote:

What do you mean by "adopts"? Note that the CPU frequency
means nothing to modern operating systems, at least
since the days when MS-DOS, which relied on CPU frequency
for its time functions, was in common use. All interesting things
are now done using timers instead, and timers (which, again, don't
depend on CPU frequency) usually work quite well.

The assumption that the tick frequency was calculated from the host's
MHz was based on the fact that greater clock frequency differences
caused higher time drift. A 60 MHz difference caused about 24 min of
drift; a 332 MHz difference caused about 2 h 25 min of drift.

What complicates things is that the cheapest time source that is
accurate enough is the TSC (the time stamp counter register in
the CPU), but it will definitely be different on each
machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think)
introduced a compensation. See for example the -tdf kvm option.

Ah, nice to know. :)

There are two different things here:
The issue that Espen is reporting is that the hosts have different
frequencies, and guests that rely on the TSC as a source clock will
notice that after migration. This is indeed a problem that -tdf does
not solve. -tdf only adds compensation for the RTC clock emulation.

What's the guest type and what's the guest's source clock?


All guests are Debian lenny with the latest upstream kernel, hvm/kvm.

We are using kvm-clock as the guest source clock.

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock
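
The same check can be scripted if it has to run on many guests. A minimal sketch in Python, assuming the standard sysfs layout shown above plus the sibling file available_clocksource; read_clocksources is just an illustrative name.

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base_dir=SYSFS_DIR):
    """Return (current, available) clocksource names read from sysfs."""
    base = Path(base_dir)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


# Only query the live system when the sysfs directory is present.
if SYSFS_DIR.is_dir():
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", ", ".join(available))
```

The base_dir parameter exists only so the function can be pointed at a copy of the files for testing.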


Regards
Espen


Re: Timedrift in KVM guests after livemigration.

2010-04-18 Thread Gleb Natapov
On Sun, Apr 18, 2010 at 12:22:54PM +0300, Dor Laor wrote:
 On 04/18/2010 02:21 AM, Espen Berg wrote:
 On 17.04.2010 22:17, Michael Tokarev wrote:
 We have three KVM hosts that support live migration between them, but
 one of our problems is time drift. The three frontends have different
 CPU frequencies, and the KVM guests adopt the frequency of the host
 machine where they were first started.
 What do you mean by "adopts"? Note that the CPU frequency
 means nothing to modern operating systems, at least
 since the days when MS-DOS, which relied on CPU frequency
 for its time functions, was in common use. All interesting things
 are now done using timers instead, and timers (which, again, don't
 depend on CPU frequency) usually work quite well.
 
 The assumption that the tick frequency was calculated from the host's
 MHz was based on the fact that greater clock frequency differences
 caused higher time drift. A 60 MHz difference caused about 24 min of
 drift; a 332 MHz difference caused about 2 h 25 min of drift.
 
 
 What complicates things is that the cheapest time source that is
 accurate enough is the TSC (the time stamp counter register in
 the CPU), but it will definitely be different on each
 machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think)
 introduced a compensation. See for example the -tdf kvm option.
 
 Ah, nice to know. :)
 
 There are two different things here:
 The issue that Espen is reporting is that the hosts have different
 frequencies, and guests that rely on the TSC as a source clock will
 notice that after migration. This is indeed a problem that -tdf does
 not solve. -tdf only adds compensation for the RTC clock emulation.
 
It's -rtc-td-hack. -tdf does PIT compensation, but since the in-kernel
PIT is usually used, it does nothing.

 What's the guest type and what's the guest's source clock?
 Using the TSC directly as a source clock is not recommended because of
 this migration issue (that is not solvable until we trap every
 rdtsc by the guest). Using the pv kvmclock in Linux mitigates this
 issue since it exposes both the TSC and the host clock, so guests can
 adjust themselves.
 
 Several months ago a pvclock migration fix was added to pass the
 pvclock MSR readings to the destination:
 1a03675db146dfc760b3b48b3448075189f142cc
 
 
 
 Since this is a cluster in production, I'm not able to try the latest
 version either.
 Well, that's a difficult one, no? It either works or not.
 If you can't try anything else, why ask? :)
 
 What I tried to say was that there are many important virtual servers
 running on this cluster at the moment, so trial and error was not an
 option. The last time we tried 0.12.x (during the initial tests of the
 cluster) there were a lot of stability issues, crashes during migration,
 etc.
 
 Regards, Espen
 

--
Gleb.


Re: Timedrift in KVM guests after livemigration.

2010-04-17 Thread Espen Berg

On 15.04.2010 09:35, Espen Berg wrote:

We have three KVM hosts that support live migration between them, but
one of our problems is time drift. The three frontends have different
CPU frequencies, and the KVM guests adopt the frequency of the host
machine where they were first started.

Host1: cat /proc/cpuinfo
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2394.048

Host2: cat /proc/cpuinfo
model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
cpu MHz : 2659.685

Host3: cat /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
cpu MHz : 2327.507


virsh version
Compiled against library: libvir 0.7.6
Using library: libvir 0.7.6
Using API: QEMU 0.7.6
Running hypervisor: QEMU 0.11.0

Is there any solution to our problems, or is a reboot the only safe
solution?


Is there no one with similar problems here? :\  Guess I should file a
bug report or something if the same problems occur in the latest
version.  I can't see any changes in the changelog after 0.11.x that
relate to this problem.  We can't be the only ones who use different
CPUs in a migration environment.


Since this is a cluster in production, I'm not able to try the latest 
version either.


Espen.


Re: Timedrift in KVM guests after livemigration.

2010-04-17 Thread Michael Tokarev

17.04.2010 23:52, Espen Berg wrote:

On 15.04.2010 09:35, Espen Berg wrote:

We have three KVM hosts that support live migration between them, but
one of our problems is time drift. The three frontends have different
CPU frequencies, and the KVM guests adopt the frequency of the host
machine where they were first started.


What do you mean by "adopts"?  Note that the CPU frequency
means nothing to modern operating systems, at least
since the days when MS-DOS, which relied on CPU frequency
for its time functions, was in common use.  All interesting things
are now done using timers instead, and timers (which, again, don't
depend on CPU frequency) usually work quite well.

What complicates things is that the cheapest time source that is
accurate enough is the TSC (the time stamp counter register in
the CPU), but it will definitely be different on each
machine.  For that, kvm 0.12.3 and kernel 2.6.32 (I think)
introduced a compensation.  See for example the -tdf kvm option.

[]

Is there any solution to our problems, or is a reboot the only safe
solution?


Well, reboot is definitely a safe solution.


Is there no one with similar problems here? :\ Guess I should file a bug
report or something if the same problems occur in the latest version. I
can't see any changes in the changelog after 0.11.x that relate to this
problem. We can't be the only ones who use different CPUs in a
migration environment.


Actually there is a difference in 0.12.


Since this is a cluster in production, I'm not able to try the latest
version either.


Well, that's a difficult one, no?  It either works or not.
If you can't try anything else, why ask? :)

/mjt


Re: Timedrift in KVM guests after livemigration.

2010-04-17 Thread Espen Berg

On 17.04.2010 22:17, Michael Tokarev wrote:

We have three KVM hosts that support live migration between them, but
one of our problems is time drift. The three frontends have different
CPU frequencies, and the KVM guests adopt the frequency of the host
machine where they were first started.

What do you mean by "adopts"? Note that the CPU frequency
means nothing to modern operating systems, at least
since the days when MS-DOS, which relied on CPU frequency
for its time functions, was in common use. All interesting things
are now done using timers instead, and timers (which, again, don't
depend on CPU frequency) usually work quite well.


The assumption that the tick frequency was calculated from the host's
MHz was based on the fact that greater clock frequency differences
caused higher time drift.  A 60 MHz difference caused about 24 min of
drift; a 332 MHz difference caused about 2 h 25 min of drift.
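
The reasoning above can be written down as a simple model. This is only a sketch under an assumption that nothing in this thread confirms: the guest calibrates its tick rate against the first host's frequency and keeps using that value after it has been migrated. drift_seconds is a hypothetical helper name, and the frequencies are the "cpu MHz" values from the first mail.

```python
def drift_seconds(calibrated_mhz, actual_mhz, elapsed_seconds):
    """Clock error accumulated over elapsed_seconds of real time.

    The counter really advances at actual_mhz, but the guest converts
    ticks to time using calibrated_mhz, so guest time runs at the ratio
    actual/calibrated. Positive result: the guest clock runs fast.
    """
    return elapsed_seconds * (actual_mhz / calibrated_mhz - 1.0)


# Guest started on Host3 (2327.507 MHz), then migrated to Host2
# (2659.685 MHz, the 332 MHz case) or Host1 (2394.048 MHz).
for target, mhz in (("Host2", 2659.685), ("Host1", 2394.048)):
    per_hour = drift_seconds(2327.507, mhz, 3600)
    print(f"Host3 -> {target}: {per_hour / 60:.1f} min of drift per hour")
```

Under this model the drift grows linearly with both the frequency difference and the time spent on the other host, so the totals reported above would also depend on how long each guest ran after migration.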




What complicates things is that the cheapest time source that is
accurate enough is the TSC (the time stamp counter register in
the CPU), but it will definitely be different on each
machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think)
introduced a compensation. See for example the -tdf kvm option.


Ah, nice to know. :)


Since this is a cluster in production, I'm not able to try the latest
version either.

Well, that's a difficult one, no? It either works or not.
If you can't try anything else, why ask? :)


What I tried to say was that there are many important virtual servers
running on this cluster at the moment, so trial and error was not an
option.  The last time we tried 0.12.x (during the initial tests of the
cluster) there were a lot of stability issues, crashes during migration,
etc.


Regards, Espen



Timedrift in KVM guests after livemigration.

2010-04-15 Thread Espen Berg
We have three KVM hosts that support live migration between them, but
one of our problems is time drift.  The three frontends have different
CPU frequencies, and the KVM guests adopt the frequency of the host
machine where they were first started.


Host1: cat /proc/cpuinfo
model name  : Intel(R) Core(TM)2 CPU  6600  @ 2.40GHz
cpu MHz : 2394.048

Host2: cat /proc/cpuinfo
model name  : Intel(R) Core(TM)2 CPU  6700  @ 2.66GHz
cpu MHz : 2659.685

Host3: cat /proc/cpuinfo
model name  : Intel(R) Xeon(R) CPU   E5410  @ 2.33GHz
cpu MHz : 2327.507
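
For anyone comparing hosts in a similar setup, here is a minimal sketch (Python, not part of the original report) that pulls the "cpu MHz" value out of /proc/cpuinfo-style text and lists the pairwise frequency differences. The names parse_cpu_mhz and pairwise_differences are made up for illustration; the sample values are the ones listed above. Keep in mind that "cpu MHz" can change with frequency scaling, so it is only a rough indicator.

```python
import re
from itertools import combinations


def parse_cpu_mhz(cpuinfo_text):
    """Return the first 'cpu MHz' value found in /proc/cpuinfo-style text."""
    match = re.search(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'cpu MHz' line found")
    return float(match.group(1))


def pairwise_differences(mhz_by_host):
    """Map each pair of hosts to the absolute frequency difference in MHz."""
    return {
        (a, b): abs(mhz_by_host[a] - mhz_by_host[b])
        for a, b in combinations(sorted(mhz_by_host), 2)
    }


# Excerpts as posted above; on a live host, read /proc/cpuinfo instead.
samples = {
    "Host1": "model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz\ncpu MHz : 2394.048\n",
    "Host2": "model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz\ncpu MHz : 2659.685\n",
    "Host3": "model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz\ncpu MHz : 2327.507\n",
}
mhz = {host: parse_cpu_mhz(text) for host, text in samples.items()}
for (a, b), diff in pairwise_differences(mhz).items():
    print(f"{a} vs {b}: {diff:.3f} MHz")
```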


virsh version
Compiled against library: libvir 0.7.6
Using library: libvir 0.7.6
Using API: QEMU 0.7.6
Running hypervisor: QEMU 0.11.0

Is there any solution to our problems, or is a reboot the only safe 
solution?


Regards
Espen


