Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-11 Thread John R Pierce
On 09/11/12 12:06 PM, Tilman Schmidt wrote:
> I run VMware vSphere 4 Essentials with three almost identically
> configured ESXi 4.1 hosts and a mix of 32 and 64 bit guests including
> Windows 2003 and 2008 as well as CentOS 5 and 6. Recently I updated one
> of the hosts to build 800380. The new build runs Windows and CentOS 5
> VMs fine, but CentOS 6 guests won't come up.
>
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message
>
> PANIC: early exception 0d rip 10:81038879 error 0 cr2 0
>
> on the bottom of the virtual console. Both run perfectly fine again once
> I move them back to the host with the older ESXi build.
>
> From one of the failed boot attempts, I captured a VMware debug log
> which shows:
>
> Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero):
> rip=0x810388db count=1
> Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero):
> rip=0x810388db count=2
> Sep 11 17:21:19.629: vcpu-0| X86Fault_Warning:
> vmcore/vmm64/cpu/interp.c:427: cs:eip=0x10:0x81038879 fault=13
> Sep 11 17:21:19.632: vcpu-0| Vix: [1125838 vmxCommands.c:9609]:
> VMAutomation_HandleCLIHLTEvent. Do nothing.
> Sep 11 17:21:19.632: vcpu-0| MsgHint: msg.monitorevent.halt (sent)
> Sep 11 17:21:19.632: vcpu-0| The CPU has been disabled by the guest
> operating system. Power off or reset the virtual machine.
>
> Ideas?
>
>

From here, it appears to be a hardware or VMware issue. NOTHING the
guest OS does should crash the hypervisor. I'd file a bug report with
VMware.


-- 
john r pierce                            N 37, W 122
santa cruz ca mid-left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-11 Thread Laurent
On 2012-09-11 21:06, Tilman Schmidt wrote:
> I run VMware vSphere 4 Essentials with three almost identically
> configured ESXi 4.1 hosts and a mix of 32 and 64 bit guests including
> Windows 2003 and 2008 as well as CentOS 5 and 6. Recently I updated one
> of the hosts to build 800380. The new build runs Windows and CentOS 5
> VMs fine, but CentOS 6 guests won't come up.
>
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message
>

I've found what is probably your post on VMware Communities.
http://communities.vmware.com/message/2112173?tstart=0

It seems there's a second 4.1 update 3 build (811144):
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2020362

It fixes another panic, so trying this build may help.
--
PR722061: When a Linux kernel crashes, the linux kexec feature is used 
to enable booting into a special kdump kernel and gathering crash dump 
files. An SMP Linux guest configured with kexec might cause the virtual 
machine to fail with a monitor panic during this reboot. Error messages 
such as the following might be logged:

vcpu-0| CPU reset: soft (mode 2)
vcpu-0| MONITOR PANIC: vcpu-0:VMM fault 14: src=MONITOR rip=0xfc28c30d regs=0xfc008b50
--

-- 
Laurent.


Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-11 Thread Tilman Schmidt
On 11.09.2012 21:14, John R Pierce wrote:
> On 09/11/12 12:06 PM, Tilman Schmidt wrote:

>> I tried two different CentOS 6 VMs. Both have the latest standard kernel
>> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
>> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
>> both display the GRUB menu alright but freeze immediately afterwards,
>> emitting the message
>>
>> PANIC: early exception 0d rip 10:81038879 error 0 cr2 0
[...]
> from here, it appears to be a hardware or vmware issue.

I tend to rule out hardware issues. The host in question was working fine
before the update.

>   NOTHING the guest OS does should crash the hypervisor.

Perhaps I wasn't quite clear. It's the guest OS that panics. The
hypervisor continues quite unperturbed by its guest's fate.
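
The guest panic line and the VMware debug log can be cross-checked to
confirm that they describe the same event. A minimal sketch in Python,
assuming the early-panic line prints its numbers in hexadecimal and the
VMware log prints "fault=" in decimal; the function names are invented
for this illustration, and the vector table is a small excerpt:

```python
import re

# A few x86 exception vectors and their conventional names (excerpt only).
VECTORS = {
    0x06: "#UD invalid opcode",
    0x0D: "#GP general protection fault",
    0x0E: "#PF page fault",
}

# Guest console line, e.g.
#   PANIC: early exception 0d rip 10:81038879 error 0 cr2 0
PANIC_RE = re.compile(
    r"PANIC: early exception (?P<vec>[0-9a-f]+) "
    r"rip (?P<cs>[0-9a-f]+):(?P<rip>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) cr2 (?P<cr2>[0-9a-f]+)"
)

# VMware log line, e.g.  ... cs:eip=0x10:0x81038879 fault=13
FAULT_RE = re.compile(
    r"cs:eip=0x(?P<cs>[0-9a-f]+):0x(?P<rip>[0-9a-f]+) fault=(?P<vec>\d+)"
)


def parse_panic(line):
    """Parse the guest's early-panic line; all fields are read as hex."""
    m = PANIC_RE.search(line)
    if m is None:
        return None
    return {name: int(value, 16) for name, value in m.groupdict().items()}


def parse_fault(line):
    """Parse a VMware X86Fault_Warning line; 'fault=' is read as decimal."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    return {
        "cs": int(m.group("cs"), 16),
        "rip": int(m.group("rip"), 16),
        "vec": int(m.group("vec")),
    }


def same_event(panic, fault):
    """True if both records name the same vector at the same cs:rip."""
    return all(panic[key] == fault[key] for key in ("vec", "cs", "rip"))


def describe(vector):
    """Human-readable name for a vector, if it is in the excerpt above."""
    return VECTORS.get(vector, "vector %d" % vector)
```

Fed the panic line and the re-joined X86Fault_Warning line from the
original post, both records name vector 13 (a general protection fault)
at cs:rip 0x10:0x81038879, which is consistent with the fault being
raised inside the guest kernel.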

Thanks,
Tilman





Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-11 Thread Tilman Schmidt
On 11.09.2012 21:57, Laurent wrote:
> I've found what is probably your post on VMware Communities.
> http://communities.vmware.com/message/2112173?tstart=0

Indeed.

> It seems there's a second 4.1 update 3 build (811144):
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2020362

Thanks, I'll give it a try.






Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-12 Thread Tilman Schmidt
On 11.09.2012 21:57, Laurent wrote:
> On 2012-09-11 21:06, Tilman Schmidt wrote:

>> I tried two different CentOS 6 VMs. Both have the latest standard kernel
>> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
>> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
>> both display the GRUB menu alright but freeze immediately afterwards,
[...]
> It seems there's a second 4.1 update 3 build (811144):
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2020362

I can't find that second build anywhere.

I have installed patch ESXi410-201208201-UG on the problem host.
It refers to http://kb.vmware.com/kb/2020373 for details, which states:

Build
800380
811144 (security-only)

I have also installed ESXi410-Update03 ("VMware ESXi 4.1 Complete Update
3"), which is named in the title of that KB article. Update Manager does
not offer me anything else to install. Still, vSphere Client reports
build 800380.

-- 
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany





Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-12 Thread Tilman Schmidt
On 11.09.2012 21:14, John R Pierce wrote:
> On 09/11/12 12:06 PM, Tilman Schmidt wrote:

>> I tried two different CentOS 6 VMs. Both have the latest standard kernel
>> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
>> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
>> both display the GRUB menu alright but freeze immediately afterwards,
>> emitting the message
>>
>> PANIC: early exception 0d rip 10:81038879 error 0 cr2 0

> I'd file a bug report with vmware.

Well, yes, I'm working on that. It's a tedious process trying to
convince VMware support that I really have bought support.

Meanwhile I'd like to understand what's going wrong here, and ideally
how to work around it. I found this blog post

http://www.basemont.com/panic_early_exception_i3_i5_i7_vmware_virtualbox_parallels

which seems to hint that the Linux kernel might be involved in the
problem after all. The processor in the problem host is a Xeon
E3-1270V2, while the other one which works fine has an E3-1230. Alas the
"nosmep" boot option did not have any effect.
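
To see whether a guest is actually offered SMEP, the CPU flags can be
compared between a guest on each host. A minimal sketch in Python, to be
run inside a guest that does boot; it assumes the guest kernel lists the
feature as "smep" on the "flags" line of /proc/cpuinfo (as mainline
kernels do), and the function names are invented for this illustration:

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_flag(cpuinfo_text, flag):
    """True if the named feature flag is reported for the first CPU."""
    return flag in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Not a Linux system, or /proc is not mounted.
        print("cannot read /proc/cpuinfo")
    else:
        print("smep reported:", has_flag(text, "smep"))
```

The guest only sees the flags the hypervisor passes through, so a
difference between the two hosts here would point at what the new CPU
or the new ESXi build exposes rather than at the guest configuration.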

-- 
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany





Re: [CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380

2012-09-12 Thread Tilman Schmidt
On 11.09.2012 21:06, Tilman Schmidt wrote:
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message
> 
> PANIC: early exception 0d rip 10:81038879 error 0 cr2 0
> 
> on the bottom of the virtual console. Both run perfectly fine again once
> I move them back to the host with the older ESXi build.

Two and a half new data points:

- The problem host has a Xeon E3-1270V2 processor while the one which
  runs the CentOS 6 guests fine has an E3-1230. I'm not sufficiently
  up to date with Intel processor types to tell whether this would
  make a difference.

- Another CentOS 6 VM with older kernel 2.6.32-220.7.1.el6.x86_64
  does come up on the problem host. It does a panic blink (Caps Lock
  and Scroll Lock blinking in unison while the VM has the keyboard)
  but I get a working login prompt (I don't get any further because I
  don't have a logon for the machine) and I can shut it down normally
  by sending Ctrl-Alt-Del.

- (the half point, no idea if it matters) The CentOS 6 VMs which
  die with "PANIC: early exception 0d" do *not* do a panic blink.

So it would seem that something related to the problem was changed
in the CentOS kernel between releases 2.6.32-220.7.1 and
2.6.32-279.5.2.

-- 
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany


