My unauthoritative answer: I've been tracking this bug (or several with the same symptoms) for going on a couple of years now. It's ridiculously common and apparently well known to the Xen/XenSource guys, judging by the number of reports/bugs posted, but I haven't seen any mention of it actually being addressed and resolved. Unfortunately I see the same issue cropping up after VM moves, and it occurs /every/ time there is a VM migration, once per processor in the VM; it doesn't matter whether there is any I/O on the Dom0 or DomU. Occasionally VMs die during a migration and have to be manually destroyed/restarted.

I do see evidence of significant instability (not implying it is related to the above softlockup issues), however, both in VM moves migrating from a Xeon (5345) Dom0 to an Opteron Dom0, and in high-utilization DomUs, which are just plain flaky and reboot/die semi-frequently even when never moved from their starting Dom0.

For me, it currently means running only low-priority, non-production services in a VM, and not shelling out for RHEL5 support for the project (contrary to what I had planned), since the issue isn't being addressed. I'd be curious whether this is being addressed in the Xen 3.2 release for RHEL5*...
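One hedged workaround sketch, in case the messages are a side effect of the guest clock jumping forward during migration rather than a real lockup (that's an assumption on my part, not a confirmed fix): the DomU can be told to keep its own wallclock instead of tracking Dom0's, which is a commonly suggested tweak for post-migration time weirdness on the 2.6.18-era Xen guest kernels.

```shell
# Inside the affected DomU (assumes a CentOS/RHEL5 Xen guest kernel that
# exposes the Xen independent_wallclock sysctl): decouple the guest clock
# from Dom0 so a migration doesn't yank guest time forward.
echo 1 > /proc/sys/xen/independent_wallclock

# To make it persistent across reboots, add this line to /etc/sysctl.conf:
#   xen.independent_wallclock = 1
```

No guarantee this addresses the lockups themselves, but since the soft-lockup detector is timestamp-based, a large clock jump at migration time can at least produce the same console messages.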

Cheers,

/eli



Brett Worth wrote:
Hello All.

I've just started looking into Xen and have a test environment in place. I'm seeing an
annoying problem that I thought worthy of a post.

Config:

I have 2 x HP DL585 servers, each with 4 dual-core Opterons (non-VMX) and 16GB RAM, configured as Xen servers. These run CentOS 5.1 with the latest updates applied. Both systems attach to an iSCSI target, an HP DL385 running ietd and serving SAN-based storage.

I have a test VM running CentOS 5.1 also updated.

Problem:

If I run the VM on a single server everything is OK. If I migrate the VM to the other server, I start getting random "BUG: soft lockup detected on CPU#?" messages on the VM console. The messages seem to coincide with I/O, but not every time. A reboot of the VM on the new server stops the messages.

I've also left the VM running overnight a couple of times, and when I do, I find that any external sessions (ssh) are hung in the morning but the console session is not. New ssh sessions can be started and seem to work.

After much googling, it looks like these kernel messages can occur if dom0 is very busy, but mine is not.

Any suggestions?

Regards

Brett Worth

_______________________________________________
CentOS-virt mailing list
CentOS-virt@centos.org
http://lists.centos.org/mailman/listinfo/centos-virt
