Re: Timedrift in KVM guests after livemigration.
On Sunday 18 April 2010 11:33:44 Espen Berg wrote:
> All guests are Debian lenny with the latest upstream kernel, hvm/kvm. We are using kvm-clock as the guest source clock.
>
> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> kvm-clock

I had to deactivate C1E (AMD CPUs) and use the acpi clocksource (for both servers and VMs, IIRC). If you can, you should give it a try. After that, live migration was somewhat stable.

regards,
thomas

--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Timedrift in KVM guests after livemigration.
On 18.04.2010 11:56, Gleb Natapov wrote:
>> These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.
>
> It's -rtc-td-hack. -tdf does pit compensation, but since the kernel pit is usually used, it does nothing.

So this hack will not solve our problem?

Espen
Re: Timedrift in KVM guests after livemigration.
On Mon, Apr 19, 2010 at 11:21:47AM +0200, Espen Berg wrote:
> On 18.04.2010 11:56, Gleb Natapov wrote:
>>> These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.
>>
>> It's -rtc-td-hack. -tdf does pit compensation, but since the kernel pit is usually used, it does nothing.
>
> So this hack will not solve our problem?

If your guest uses the RTC for timekeeping it may help. Otherwise it does nothing.

--
Gleb.
Re: Timedrift in KVM guests after livemigration.
On 04/19/2010 12:29 PM, Gleb Natapov wrote:
> On Mon, Apr 19, 2010 at 11:21:47AM +0200, Espen Berg wrote:
>> On 18.04.2010 11:56, Gleb Natapov wrote:
>>>> These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.
>>>
>>> It's -rtc-td-hack. -tdf does pit compensation, but since the kernel pit is usually used, it does nothing.
>>
>> So this hack will not solve our problem?

As I also stated, in the past the kvmclock MSRs were not synced upon live migration; this was fixed in 1a03675db146dfc760b3b48b3448075189f142cc. Better check against the code.

> If your guest uses the RTC for timekeeping it may help. Otherwise it does nothing.
>
> --
> Gleb.
Re: Timedrift in KVM guests after livemigration.
On 04/18/2010 02:21 AM, Espen Berg wrote:
> On 17.04.2010 22:17, Michael Tokarev wrote:
>>> We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.
>>
>> What do you mean by "adopts"? Note that CPU frequency means nothing to modern operating systems, at least since the days when MS-DOS, which relied on CPU frequency for its time functions, was in common use. All the interesting things are now done using timers instead, and timers (which, again, don't depend on CPU frequency) usually work quite well.
>
> The assumption that the tick frequency was calculated from the host's MHz was based on the fact that greater clock frequency differences caused higher time drift. A 60 MHz difference caused about 24 min of drift; a 332 MHz difference caused about 2 h 25 min of drift.
>
>> What complicates things is that the cheapest sufficiently accurate time source is the TSC (the time stamp counter register in the CPU), but it will definitely be different on each machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think) introduced a compensation. See for example the -tdf kvm option.
>
> Ah, nice to know. :)

These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.

What's the guest type and what's the guest's source clock? Using the tsc directly as a source clock is not recommended because of this migration issue (which is not solvable until we trap every rdtsc by the guest). Using pv kvmclock in Linux mitigates this issue, since it exposes both the tsc and the host clock so guests can adjust themselves.

Several months ago a pvclock migration fix was added to pass the pvclock MSR readings to the destination: 1a03675db146dfc760b3b48b3448075189f142cc

>>> Since this is a cluster in production, I'm not able to try the latest version either.
>>
>> Well, that's a difficult one, no? It either works or not. If you can't try anything else, why ask? :)
>
> What I tried to say was that there are many important virtual servers running on this cluster at the moment, so trial and error was not an option. The last time we tried 0.12.x (during the initial tests of the cluster) there were a lot of stability issues, crashes during migration etc.
>
> Regards,
> Espen
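
To make the kvmclock point above more concrete, here is a minimal sketch of how a guest can derive time from the values a paravirtual clock exposes. The field names follow Linux's struct pvclock_vcpu_time_info (tsc_timestamp, system_time, tsc_to_system_mul, tsc_shift); the helper names, the fixed shift of 0 and the sample tick counts are assumptions made for illustration, not code taken from KVM.

```python
def scale_for(tsc_hz):
    """Return (mul, shift) such that (ticks * mul) >> 32 is about nanoseconds.

    Simplification (assumption): shift is always 0, which is only adequate
    for TSC frequencies above 1 GHz, as on the hosts in this thread.
    """
    return (10**9 << 32) // tsc_hz, 0


def pvclock_ns(tsc_now, tsc_timestamp, system_time_ns, mul, shift):
    """Guest-side time: host snapshot plus scaled TSC ticks since the snapshot."""
    delta = tsc_now - tsc_timestamp
    delta = delta << shift if shift >= 0 else delta >> -shift
    return system_time_ns + ((delta * mul) >> 32)


HOST_A_HZ = 2_394_048_000  # Host1 in the thread
HOST_B_HZ = 2_659_685_000  # Host2 in the thread

# One second after a snapshot taken on host A.
mul_a, shift_a = scale_for(HOST_A_HZ)
t_a = pvclock_ns(HOST_A_HZ, 0, 0, mul_a, shift_a)

# After migration the destination publishes a fresh snapshot with its own
# scale, so one more second of host B ticks is still one second of guest time.
mul_b, shift_b = scale_for(HOST_B_HZ)
t_b = pvclock_ns(5_000 + HOST_B_HZ, 5_000, t_a, mul_b, shift_b)

# If the guest kept using host A's scale on host B, it would run fast.
t_stale = pvclock_ns(5_000 + HOST_B_HZ, 5_000, t_a, mul_a, shift_a)

if __name__ == "__main__":
    print("after 1 s on A:        ", t_a, "ns")
    print("after 1 s more on B:   ", t_b, "ns")
    print("same, with stale scale:", t_stale, "ns")
```

The point of the sketch is only that the scale factor travels with the snapshot: as long as the destination host republishes it, the guest does not need to know which host it is running on.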
Re: Timedrift in KVM guests after livemigration.
On 18.04.2010 11:22, Dor Laor wrote:
>>> What do you mean by "adopts"? Note that CPU frequency means nothing to modern operating systems, at least since the days when MS-DOS, which relied on CPU frequency for its time functions, was in common use. All the interesting things are now done using timers instead, and timers (which, again, don't depend on CPU frequency) usually work quite well.
>>
>> The assumption that the tick frequency was calculated from the host's MHz was based on the fact that greater clock frequency differences caused higher time drift. A 60 MHz difference caused about 24 min of drift; a 332 MHz difference caused about 2 h 25 min of drift.
>>
>>> What complicates things is that the cheapest sufficiently accurate time source is the TSC (the time stamp counter register in the CPU), but it will definitely be different on each machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think) introduced a compensation. See for example the -tdf kvm option.
>>
>> Ah, nice to know. :)
>
> These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.
>
> What's the guest type and what's the guest's source clock?

All guests are Debian lenny with the latest upstream kernel, hvm/kvm. We are using kvm-clock as the guest source clock.

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

Regards
Espen
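
The check Espen ran by hand can be scripted so it is easy to repeat on every guest. The sketch below reads the same sysfs directory shown above, plus the sibling available_clocksource file; the function names and the short advice strings are assumptions that merely paraphrase what was said in this thread, not an official classification.

```python
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clock sources as reported by sysfs."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def migration_note(current):
    """Paraphrase of the advice given in this thread (hypothetical helper)."""
    if current == "tsc":
        return "raw tsc: not recommended across hosts with different frequencies"
    if current == "kvm-clock":
        return "kvm-clock: paravirtual clock, needs the pvclock migration fix"
    return "other: not discussed in detail in this thread"


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current:  ", current)
    print("available:", ", ".join(available))
    print(migration_note(current))
```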
Re: Timedrift in KVM guests after livemigration.
On Sun, Apr 18, 2010 at 12:22:54PM +0300, Dor Laor wrote:
> On 04/18/2010 02:21 AM, Espen Berg wrote:
>> On 17.04.2010 22:17, Michael Tokarev wrote:
>>>> We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.
>>>
>>> What do you mean by "adopts"? Note that CPU frequency means nothing to modern operating systems, at least since the days when MS-DOS, which relied on CPU frequency for its time functions, was in common use. All the interesting things are now done using timers instead, and timers (which, again, don't depend on CPU frequency) usually work quite well.
>>
>> The assumption that the tick frequency was calculated from the host's MHz was based on the fact that greater clock frequency differences caused higher time drift. A 60 MHz difference caused about 24 min of drift; a 332 MHz difference caused about 2 h 25 min of drift.
>>
>>> What complicates things is that the cheapest sufficiently accurate time source is the TSC (the time stamp counter register in the CPU), but it will definitely be different on each machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think) introduced a compensation. See for example the -tdf kvm option.
>>
>> Ah, nice to know. :)
>
> These are two different things: the issue Espen is reporting is that the hosts have different frequencies, and guests that rely on the tsc as a source clock will notice that after migration. This is indeed a problem that -tdf does not solve; -tdf only adds compensation for the RTC clock emulation.

It's -rtc-td-hack. -tdf does pit compensation, but since the kernel pit is usually used, it does nothing.

> What's the guest type and what's the guest's source clock? Using the tsc directly as a source clock is not recommended because of this migration issue (which is not solvable until we trap every rdtsc by the guest). Using pv kvmclock in Linux mitigates this issue, since it exposes both the tsc and the host clock so guests can adjust themselves.
>
> Several months ago a pvclock migration fix was added to pass the pvclock MSR readings to the destination: 1a03675db146dfc760b3b48b3448075189f142cc
>
>>>> Since this is a cluster in production, I'm not able to try the latest version either.
>>>
>>> Well, that's a difficult one, no? It either works or not. If you can't try anything else, why ask? :)
>>
>> What I tried to say was that there are many important virtual servers running on this cluster at the moment, so trial and error was not an option. The last time we tried 0.12.x (during the initial tests of the cluster) there were a lot of stability issues, crashes during migration etc.
>>
>> Regards,
>> Espen

--
Gleb.
Re: Timedrift in KVM guests after livemigration.
On 15.04.2010 09:35, Espen Berg wrote:
> We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.
>
> Host1: cat /proc/cpuinfo
> model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
> cpu MHz : 2394.048
>
> Host2: cat /proc/cpuinfo
> model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
> cpu MHz : 2659.685
>
> Host3: cat /proc/cpuinfo
> model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
> cpu MHz : 2327.507
>
> virsh version
> Compiled against library: libvir 0.7.6
> Using library: libvir 0.7.6
> Using API: QEMU 0.7.6
> Running hypervisor: QEMU 0.11.0
>
> Is there any solution to our problems, or is a reboot the only safe solution?

Is there no one with similar problems here? :\

Guess I should file a bug report or something if the same problems occur in the latest version. I can't see any changes in the change log after 0.11.x that relate to this problem. We can't be the only ones that use different CPUs in a migration environment.

Since this is a cluster in production, I'm not able to try the latest version either.

Espen.
Re: Timedrift in KVM guests after livemigration.
17.04.2010 23:52, Espen Berg wrote:
> On 15.04.2010 09:35, Espen Berg wrote:
>> We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.

What do you mean by "adopts"? Note that CPU frequency means nothing to modern operating systems, at least since the days when MS-DOS, which relied on CPU frequency for its time functions, was in common use. All the interesting things are now done using timers instead, and timers (which, again, don't depend on CPU frequency) usually work quite well.

What complicates things is that the cheapest sufficiently accurate time source is the TSC (the time stamp counter register in the CPU), but it will definitely be different on each machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think) introduced a compensation. See for example the -tdf kvm option.

[]
>> Is there any solution to our problems, or is a reboot the only safe solution?

Well, reboot is definitely a safe solution.

> Is there no one with similar problems here? :\
>
> Guess I should file a bug report or something if the same problems occur in the latest version. I can't see any changes in the change log after 0.11.x that relate to this problem. We can't be the only ones that use different CPUs in a migration environment.

Actually there is a difference in 0.12.

> Since this is a cluster in production, I'm not able to try the latest version either.

Well, that's a difficult one, no? It either works or not. If you can't try anything else, why ask? :)

/mjt
Re: Timedrift in KVM guests after livemigration.
On 17.04.2010 22:17, Michael Tokarev wrote:
>> We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.
>
> What do you mean by "adopts"? Note that CPU frequency means nothing to modern operating systems, at least since the days when MS-DOS, which relied on CPU frequency for its time functions, was in common use. All the interesting things are now done using timers instead, and timers (which, again, don't depend on CPU frequency) usually work quite well.

The assumption that the tick frequency was calculated from the host's MHz was based on the fact that greater clock frequency differences caused higher time drift. A 60 MHz difference caused about 24 min of drift; a 332 MHz difference caused about 2 h 25 min of drift.

> What complicates things is that the cheapest sufficiently accurate time source is the TSC (the time stamp counter register in the CPU), but it will definitely be different on each machine. For that, kvm 0.12.3 and kernel 2.6.32 (I think) introduced a compensation. See for example the -tdf kvm option.

Ah, nice to know. :)

>> Since this is a cluster in production, I'm not able to try the latest version either.
>
> Well, that's a difficult one, no? It either works or not. If you can't try anything else, why ask? :)

What I tried to say was that there are many important virtual servers running on this cluster at the moment, so trial and error was not an option. The last time we tried 0.12.x (during the initial tests of the cluster) there were a lot of stability issues, crashes during migration etc.

Regards,
Espen
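
As a rough plausibility check on the observation that larger frequency differences gave larger drift, the sketch below models a guest that keeps converting TSC ticks to time at the rate it calibrated on its first host while actually running on a host with a different frequency. This simple proportional model and the function name are assumptions; the thread does not say over what period the reported drift accumulated, so only a per-hour rate is computed.

```python
# "cpu MHz" values reported for the three hosts in this thread.
HOSTS_MHZ = {"Host1": 2394.048, "Host2": 2659.685, "Host3": 2327.507}


def tsc_drift_seconds(calibrated_mhz, running_mhz, real_seconds):
    """Seconds gained (positive) or lost (negative) by the guest clock.

    Assumed model: the guest believes one tick lasts 1/calibrated_mhz
    microseconds, but ticks actually arrive at running_mhz.
    """
    return real_seconds * (running_mhz / calibrated_mhz - 1.0)


if __name__ == "__main__":
    for src, src_mhz in HOSTS_MHZ.items():
        for dst, dst_mhz in HOSTS_MHZ.items():
            if src != dst:
                drift = tsc_drift_seconds(src_mhz, dst_mhz, 3600)
                print(f"{src} -> {dst}: {drift:+8.1f} s per hour")
```

Under this model the drift rate scales with the frequency difference, which is at least consistent with the pattern reported above, though it says nothing about whether a given clock source actually behaves this way.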
Timedrift in KVM guests after livemigration.
We have three KVM hosts that support live migration between them, but one of our problems is time drift. The three frontends have different CPU frequencies, and each KVM guest adopts the frequency of the host machine where it was first started.

Host1: cat /proc/cpuinfo
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2394.048

Host2: cat /proc/cpuinfo
model name : Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
cpu MHz : 2659.685

Host3: cat /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
cpu MHz : 2327.507

virsh version
Compiled against library: libvir 0.7.6
Using library: libvir 0.7.6
Using API: QEMU 0.7.6
Running hypervisor: QEMU 0.11.0

Is there any solution to our problems, or is a reboot the only safe solution?

Regards
Espen
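
For comparing hosts before setting up migration, the "cpu MHz" lines shown above can be collected with a few lines of Python instead of by eye. This is only a sketch: the function name is made up for illustration, and on hosts with frequency scaling the reported value can change from moment to moment, so it is not a reliable measure of the TSC rate.

```python
def cpu_mhz_values(cpuinfo_text):
    """Return the 'cpu MHz' value of every processor entry in cpuinfo text."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            values.append(float(value))
    return values


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        mhz = cpu_mhz_values(f.read())
    print("cores:", len(mhz))
    print("cpu MHz:", sorted(set(mhz)))
```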