Your message dated Sat, 10 Dec 2011 20:00:59 +0100
with message-id <[email protected]>
and subject line no bug
has caused the Debian Bug report #649141,
regarding xen-hypervisor-4.0-amd64: live migration fails with invalid opcode
due to nonstop_tsc
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
649141: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649141
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: xen-hypervisor-4.0-amd64
Version: 4.0.1-4
Severity: grave
Tags: squeeze
Justification: causes data loss, migrations fail
When migrating from a machine that has nonstop_tsc in /proc/cpuinfo
(A) to a machine that does not have this flag (B), guest VMs either
get a kernel panic, or most/all running processes report invalid
opcodes in dmesg and crash. When migrating in the opposite direction
(B -> A), there seem to be no issues.
The solution would seem to involve masking out CPU features, but I'm
not entirely sure how to do this.
There are Red Hat-specific patches here for the same issue:
https://bugzilla.redhat.com/show_bug.cgi?id=526862
and the issue is discussed here as well:
https://bugzilla.redhat.com/show_bug.cgi?id=711322
https://bugzilla.redhat.com/show_bug.cgi?id=694492
xm dmesg doesn't show anything strange. The affected guests need to be
destroyed, although sometimes you can type reboot in a guest, after
which it gets a kernel panic and reboots itself.
This link has some info on CPU masking, but I'm not sure how to apply
it to my hosts for this flag (or whether it should be applied to the
guests or the hosts):
http://zhigang.org/wiki/XenCPUID#cpuid-boot-options-definition
A # cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
cpu MHz : 2400.178
cache size : 12288 KB
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc up rep_good nonstop_tsc aperfmperf pni pclmulqdq est ssse3 cx16 sse4_1 sse4_2 popcnt aes hypervisor lahf_lm ida arat
bogomips : 4800.35
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
A # xm info | grep hw_caps
hw_caps : bfebfbff:2c100800:00000000:00001f40:029ee3ff:00000000:00000001:00000000
B # cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU X3350 @ 2.66GHz
stepping : 7
cpu MHz : 2660.054
cache size : 6144 KB
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc up rep_good aperfmperf pni est ssse3 cx16 sse4_1 hypervisor lahf_lm
bogomips : 5320.10
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
B # xm info | grep hw_caps
hw_caps : bfebfbff:20100800:00000000:00000940:0008e3fd:00000000:00000001:00000000
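
To see exactly which flags differ between the two hosts, the two flags lines
above can be compared as sets. This is only a minimal sketch: the names
FLAGS_A, FLAGS_B and flag_diff are made up for illustration, and the flag
strings are copied from the /proc/cpuinfo output of hosts A and B.

```python
# Flags copied from the /proc/cpuinfo output of hosts A and B in this report.
FLAGS_A = (
    "fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx "
    "fxsr sse sse2 ss ht syscall nx lm constant_tsc up rep_good nonstop_tsc "
    "aperfmperf pni pclmulqdq est ssse3 cx16 sse4_1 sse4_2 popcnt aes "
    "hypervisor lahf_lm ida arat"
)
FLAGS_B = (
    "fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx "
    "fxsr sse sse2 ss ht syscall nx lm constant_tsc up rep_good aperfmperf "
    "pni est ssse3 cx16 sse4_1 hypervisor lahf_lm"
)


def flag_diff(source: str, target: str) -> list[str]:
    """Return the flags present on source but absent on target, sorted."""
    return sorted(set(source.split()) - set(target.split()))


if __name__ == "__main__":
    print("on A, not on B:", " ".join(flag_diff(FLAGS_A, FLAGS_B)))
    print("on B, not on A:", " ".join(flag_diff(FLAGS_B, FLAGS_A)) or "(none)")
```

With the values from this report, the first list contains aes, arat, ida,
nonstop_tsc, pclmulqdq, popcnt and sse4_2, and the second is empty, so
nonstop_tsc is not the only flag that A has and B lacks.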
--
Marcus Furlong
--- End Message ---
--- Begin Message ---
This is not a bug; the behaviour is documented. You can't migrate guests
to a CPU with fewer capabilities.
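
As a rough illustration of that rule, the sketch below reads each hw_caps
value from the original report as eight 32-bit feature words and checks
that every bit set on the source host is also set on the target. The
function names are hypothetical, and treating hw_caps as a plain bitmask
comparison is an assumption; this is not how Xen itself validates a
migration.

```python
# hw_caps values copied from the "xm info | grep hw_caps" output in the report.
HW_CAPS_A = "bfebfbff:2c100800:00000000:00001f40:029ee3ff:00000000:00000001:00000000"
HW_CAPS_B = "bfebfbff:20100800:00000000:00000940:0008e3fd:00000000:00000001:00000000"


def parse_hw_caps(caps: str) -> list[int]:
    """Split a colon-separated hw_caps string into integer feature words."""
    return [int(word, 16) for word in caps.split(":")]


def missing_on_target(source: str, target: str) -> list[int]:
    """Per word, the bits that are set on the source but clear on the target."""
    pairs = zip(parse_hw_caps(source), parse_hw_caps(target))
    return [s & ~t for s, t in pairs]


def target_covers_source(source: str, target: str) -> bool:
    """True if the target advertises every feature bit the source does."""
    return not any(missing_on_target(source, target))


if __name__ == "__main__":
    for name, src, dst in (("A -> B", HW_CAPS_A, HW_CAPS_B),
                           ("B -> A", HW_CAPS_B, HW_CAPS_A)):
        missing = ":".join(f"{w:08x}" for w in missing_on_target(src, dst))
        verdict = "ok" if target_covers_source(src, dst) else "target lacks bits"
        print(name, verdict, missing)
```

With the values from the report, A -> B shows missing bits and B -> A does
not, which matches the observation that only migrations from A to B fail.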
Bastian
--
Ahead warp factor one, Mr. Sulu.
--- End Message ---