Bugs item #2351676, was opened at 2008-11-26 09:59
Message generated for change (Comment added) made by clesiuk
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2351676&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chris Jones (c_jones)
Assigned to: Nobody/Anonymous (nobody)
Summary: Guests hang periodically on Ubuntu-8.10

Initial Comment:
I'm seeing periodic hangs on my guests.  I've been unable so far to find a 
trigger - they always boot fine, but after anywhere from 10 minutes to 24 hours 
they eventually hang completely.

My setup:
  * AMD Athlon X2 4850e (2500 MHz dual core)
  * 4Gig memory
  * Ubuntu 8.10 server, 64-bit
  * KVMs tried:
    : kvm-72 (shipped with ubuntu)
    : kvm-79 (built myself, --patched-kernel option)
  * Kernels tried:
    : 2.6.27.7 (kernel.org, self built)
    : 2.6.27-7-server from Ubuntu 8.10 distribution

  In guests
  * Ubuntu 8.10 server, 64-bit (virtual machine install)
  * kernel 2.6.27-7-server from Ubuntu 8.10

I'm running the guests like:
  sudo /usr/local/bin/qemu-system-x86_64        \
     -daemonize                                 \
     -no-kvm-irqchip                            \
     -hda Imgs/ndev_root.img                    \
     -m 1024                                    \
     -cdrom ISOs/ubuntu-8.10-server-amd64.iso   \
     -vnc :4                                    \
     -net nic,macaddr=DE:AD:BE:EF:04:04,model=e1000 \
     -net tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh 

The problem does not happen if I use -no-kvm.

I've tried some other options that have no effect:
  -no-kvm-pit
  -no-acpi
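
A minimal sketch of how the option combinations above could be generated for side-by-side testing. The base arguments are copied from the command line in this report; the helper name command_variants is hypothetical, and the idea of testing one extra switch at a time is an assumption about how one might bisect, not something the report prescribes.

```python
import shlex

# Base command line copied from the report; the paths are the reporter's own.
BASE_ARGS = [
    "/usr/local/bin/qemu-system-x86_64",
    "-daemonize",
    "-no-kvm-irqchip",
    "-hda", "Imgs/ndev_root.img",
    "-m", "1024",
    "-cdrom", "ISOs/ubuntu-8.10-server-amd64.iso",
    "-vnc", ":4",
    "-net", "nic,macaddr=DE:AD:BE:EF:04:04,model=e1000",
    "-net", "tap,ifname=tap4,script=/home/chris/kvm/qemu-ifup.sh",
]

# Extra switches mentioned in the report, added one at a time.
EXTRA_OPTIONS = ["-no-kvm", "-no-kvm-pit", "-no-acpi"]


def command_variants(base=BASE_ARGS, extras=EXTRA_OPTIONS):
    """Yield (label, argv) pairs: the baseline plus one variant per extra switch."""
    yield "baseline", list(base)
    for opt in extras:
        yield opt, list(base) + [opt]


if __name__ == "__main__":
    # Print each variant as a shell-quoted command line for manual use.
    for label, argv in command_variants():
        print(f"# {label}")
        print(shlex.join(argv))
```

Running the script only prints the command lines; it does not start any guest.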

The disk images are raw format.

When the guests hang, I cannot ping them, and the vnc console is hung.  The 
qemu monitor is still accessible, and the guests recover if I issue a 
system_reset command from the monitor.  However, often, the console will not 
accept keyboard input after doing so.

When the guest is hung, kvm_stat shows all 0s for the counters:

efer_relo      exits  fpu_reloa  halt_exit  halt_wake  host_stat  hypercall
+insn_emul  insn_emul     invlpg   io_exits  irq_exits  irq_windo  largepage
+mmio_exit  mmu_cache  mmu_flood  mmu_pde_z  mmu_pte_u  mmu_pte_w  mmu_recyc
+mmu_shado  nmi_windo   pf_fixed   pf_guest  remote_tl  request_i  signal_ex
+tlb_flush
>          0          0          0          0          0          0          0
+0          0          0          0          0          0          0          0
+0          0          0          0          0          0          0          0
+0          0          0          0          0          0
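
The all-zero counters suggest a check that could be automated. The following is a rough sketch, assuming the host exposes the KVM counters as one file per counter under debugfs (the location kvm_stat reads from) and that a hang shows up as counters that stop advancing. The function names read_counters and looks_hung are hypothetical.

```python
import os

# Assumption: KVM counters live here, one integer per file; the mount point
# of debugfs may differ on a given host.
KVM_DEBUGFS = "/sys/kernel/debug/kvm"


def read_counters(path=KVM_DEBUGFS):
    """Return {counter_name: value} for every readable integer file in path."""
    counters = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # skip any subdirectories
        try:
            with open(full) as f:
                counters[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or non-numeric entry
    return counters


def looks_hung(before, after):
    """True if no counter changed between two snapshots (all deltas zero)."""
    return bool(before) and all(after.get(k, v) == v for k, v in before.items())
```

Taking two snapshots a few seconds apart and passing them to looks_hung would flag the state described above. Note that the counters are host-wide, so this only makes sense when a single guest is running.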

gdb shows two threads - both waiting:

(gdb) info threads
  2 Thread 0x414f1950 (LWP 422)  0x00007f36f07a03e1 in sigtimedwait ()
   from /lib/libc.so.6
  1 Thread 0x7f36f1f306e0 (LWP 414)  0x00007f36f084b482 in select ()
   from /lib/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f36f1f306e0 (LWP 414))]#0  0x00007f36f084b482
+in select () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f084b482 in select () from /lib/libc.so.6
#1  0x00000000004094cb in main_loop_wait (timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4719
#2  0x000000000050a7ea in kvm_main_loop ()
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:619
#3  0x000000000040fafc in main (argc=<value optimized out>,
    argv=0x7ffff9f41948) at /home/chris/pkgs/kvm/kvm-79/qemu/vl.c:4871
(gdb) thread 2
[Switching to thread 2 (Thread 0x414f1950 (LWP 422))]#0  0x00007f36f07a03e1 in
+sigtimedwait () from /lib/libc.so.6
(gdb) bt
#0  0x00007f36f07a03e1 in sigtimedwait () from /lib/libc.so.6
#1  0x000000000050a560 in kvm_main_loop_wait (env=0xc319e0, timeout=0)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:284
#2  0x000000000050aaf7 in ap_main_loop (_env=<value optimized out>)
    at /home/chris/pkgs/kvm/kvm-79/qemu/qemu-kvm.c:425
#3  0x00007f36f11ba3ea in start_thread () from /lib/libpthread.so.0
#4  0x00007f36f0852c6d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()
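
For anyone wanting to collect the same information from a hung qemu process without an interactive session, here is a small sketch. It assumes gdb is installed on the host and that the caller has permission to attach to the process; the helper names are hypothetical. The gdb switches used (-p, -batch, -ex) are standard.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb invocation that prints a backtrace of every thread and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtraces(pid):
    """Attach gdb to the process and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # a failed attach still returns output worth reading
    )
    return result.stdout
```

Calling capture_backtraces with the qemu process ID (414 in the session above) should give output comparable to the two backtraces shown.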


Any clues to help me resolve this would be much appreciated.


----------------------------------------------------------------------

Comment By: B. Cameron Lesiuk (clesiuk)
Date: 2009-03-25 10:35

Message:
I have a problem similar to the original poster's. 

I've discovered a possible workaround: disable CPU frequency scaling in
the host:
# apt-get remove powernowd

I'm running with disabled frequency scaling and so far my system is
stable.

I set the host frequency manually: 
# cd /sys/devices/system/cpu/cpu0/cpufreq
# cat scaling_available_frequencies
>     2500000 2400000 2200000 2000000 1800000 1000000 
# cat scaling_available_governors
>     conservative ondemand userspace powersave performance 
# echo powersave > scaling_governor    (minimum frequency)
# echo performance > scaling_governor  (maximum frequency)
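
The manual steps above only touch cpu0. A sketch of applying the same governor to every CPU follows, assuming the standard cpufreq sysfs layout shown in those commands. The function name set_governor and the cpu_root parameter are hypothetical, and writing to these files needs root.

```python
import glob
import os

# Assumption: the cpufreq sysfs layout used in the commands above.
CPU_ROOT = "/sys/devices/system/cpu"


def set_governor(governor, cpu_root=CPU_ROOT):
    """Write `governor` to scaling_governor for every CPU that offers it.

    Returns the list of cpufreq directories that were updated.
    """
    updated = []
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq")
    for cpufreq_dir in sorted(glob.glob(pattern)):
        with open(os.path.join(cpufreq_dir, "scaling_available_governors")) as f:
            available = f.read().split()
        if governor not in available:
            continue  # leave this CPU alone rather than write an invalid value
        with open(os.path.join(cpufreq_dir, "scaling_governor"), "w") as f:
            f.write(governor)
        updated.append(cpufreq_dir)
    return updated


if __name__ == "__main__":
    # Pin every CPU to its maximum frequency, as in the last command above.
    print(set_governor("performance"))
```

Whether pinning the frequency actually avoids the hang is the commenter's observation, not something this sketch verifies.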

Here's my rig: 
* AMD Athlon X2 4850e (2500 MHz dual core)
* 4Gig memory, 800MHz, dual channel
* 780G chipset (Jetway NC81-LF motherboard)

I tried combinations of Host/Guest using:
* Ubuntu 8.10 server, i686, KVM-72 
* Ubuntu 8.10 server, amd64, KVM-72
* Ubuntu 9.04 server, amd64, KVM-84 (22 March 2009 beta)

Stuff I've tried which had no discernible effect: 
* clock source: kvm-clock, acpi_pm
* block device: ide, virtual
* network device: e1000, virtual

----------------------------------------------------------------------

Comment By: Michael Tokarev (mjtsf)
Date: 2009-02-09 05:52

Message:
OK, I have a very similar issue here as well.
Host - 4-core Phenom CPU and AMD 780G chipset, running 2.6.28.4-x86-64
(from kernel.org).
kvm-83 32bits
Guest - 2.6.27.13-i686smp, also from kernel.org.

The guest is running with KVM_GUEST stuff enabled, using the kvm timer and
virtio network and block.  The system is Debian (lenny-to-be) on both, but
I don't think it matters since both use custom-compiled kernels.

The guest - at least one of them - hangs, especially when many guests are
running in parallel (we have 4 Windows machines and 4 Linux machines, mostly
idle).  When it hangs, nothing really works - console, ping, etc.  It
usually continues working after 1..2 minutes or more.  During the hang, the
host is either silent or is spewing tons of "vcpu not ready for
apic_round_robin" messages (several thousand of them) -- but I can't be sure
that message is directly related to the hangs.
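
One way to test whether those messages line up with the hangs is to group them into bursts and compare the burst times with when the guest stalled. The sketch below assumes the host kernel log carries "[seconds.micros]" timestamps (printk timestamps enabled); the function names and the 5-second gap are arbitrary, hypothetical choices.

```python
import re

MESSAGE = "vcpu not ready for apic_round_robin"
# Assumption: each kernel log line starts with a "[seconds.micros]" timestamp.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def message_times(lines, message=MESSAGE):
    """Return the timestamps (seconds since boot) of lines containing message."""
    times = []
    for line in lines:
        if message not in line:
            continue
        m = TIMESTAMP.match(line)
        if m:
            times.append(float(m.group(1)))
    return times


def bursts(times, gap=5.0):
    """Group sorted timestamps into bursts separated by more than `gap` seconds.

    Returns a list of (first_time, last_time, count) tuples.
    """
    groups = []
    for t in times:
        if groups and t - groups[-1][-1] <= gap:
            groups[-1].append(t)
        else:
            groups.append([t])
    return [(g[0], g[-1], len(g)) for g in groups]
```

Feeding it the lines of the host's kernel log gives one tuple per burst, which can then be matched against the times the guest was observed to hang.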

Nothing is logged on the guest.

The only guest affected so far is assigned 2 virtual CPUs; I'll try
rebooting it with a single CPU to see whether that changes anything.

I haven't been able to check gdb/trace/etc. so far, because the guest that
hangs is my main working machine, which is a terminal server, so I have to
run to another room to the server's console and check there.

----------------------------------------------------------------------

Comment By: Dustin Kirkland (dustin_kirkland)
Date: 2009-02-09 04:38

Message:
In the Ubuntu 8.10 guest, can you try the linux-image-virtual kernel?  The
current one points to linux-image-2.6.27-11-virtual.

:-Dustin

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-17 22:18

Message:
New stability information from my side.

Host:
Linux dirus-dom 2.6.28-2-server #3-Ubuntu SMP Thu Dec 4 22:35:12 UTC 2008
x86_64 GNU/Linux


Guest:
2.6.28 x86_64 
- disabled all kvm guest options (with kvm_clock disabled)
- enabled virtio_block 
- started with -smp 1 and -smp 2

They haven't crashed yet, with either -smp 1 or -smp 2. I think disabling
kvm guest support did the trick.
However, using NFS from the guest is quite slow and does not seem very
stable. I have the feeling that the guest lags quite often, but even with
loads up to 11, running crashme, a high -j kernel build and file transfers
didn't crash the machine.

----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-14 23:30

Message:
Update: 

I installed Ubuntu 8.10 server and upgraded to 2.6.29-rc1 and KVM-83. I am
still able to reproduce the problem when running kvm with -smp > 1.  New
behavior in this configuration is the message "Stuck??" printed to the
console, followed shortly by a kernel panic.

KVM Host:

Ubuntu Server 8.10
Linux 2.6.29-RC1
KVM-83 

KVM Guest: 

Ubuntu Server 8.10
2.6.27-9-server



----------------------------------------------------------------------

Comment By: James Thomason (james_thomason)
Date: 2009-01-14 23:20

Message:
Hello, 

I am able to reliably reproduce a condition where a guest goes into a tight
loop or spinlock on all running cores.  The scenario is exactly as described
in bug 2351676, though my environment differs as detailed below.  My
observation is that the issue is correlated to the number of VCPUs assigned
to the guest and CPU load. The higher the number of VCPUs and CPU
utilization, the more easily it is triggered.  If a KVM developer is
interested in debugging live, I might be able to arrange getting the system
in question into a DMZ.  A review of the kvm tracker leads me to believe
that the following bugs are possibly related:

[ 2351676 ] Guests hang periodically on Ubuntu-8.10
[ 2353811 ] Solaris 10 guest unstable
[ 2494730 ] Guests "stalling" on kvm-82
[ 2138079 ] kvm locks up system
[ 2113643 ] guests AND host still getting stuck under CPU load

KVM Host Configuration:

4 x Quad-Core AMD Opteron Processors (8346 HE @ 1.8 GHz)
64GB DDR2 667 MHz
Fedora 10 x64
Kernel 2.6.28
KVM-82 

KVM Guest Configuration:
32GB Memory
1 to 16 VCPUs
Centos 5.2 x64
Kernel 2.6.28
IDE disk
e1000 NIC

----------------------------------------------------------------------

Comment By: Daniel Poelzleithner (poelzi)
Date: 2009-01-13 11:11

Message:
I have a very similar setup.

Host: 
Ubuntu 8.10. 
Kernel 2.6.28-2-server
KVM: 72, 80, 81, 82, 83 tried (using the up to date kvm module, too)

Guests:
Endian Firewall (CentOS based)
Kernel 2.6.22.19-72.endian15
Stable so far; sometimes loses USB devices.

Ubuntu 8.10
Kernel 2.6.27, 2.6.28-2-server, 2.6.28 vanilla home brew
Very unstable.

As the Ubuntu 8.10 guest is also unstable when using the 2.6.28 vanilla
kernel, I'm not so sure it's a guest problem.
I will now compile a 2.6.28 kernel without any kvm guest support.

Things that don't seem to have an effect:
- using ide instead of virtio
- using e1000 instead of virtio

However, it seems that it may be caused by I/O access, but it is not
easily reproducible.

The last thing I tried: using the kernel parameters "clocksource=acpi_pm
notsc" in the guest. I'm still investigating whether it makes the guest
stable.
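
To confirm inside the guest that the clocksource parameter actually took effect, something like the following sketch could be used. It assumes the usual sysfs location for clocksource information on Linux; the function name clocksource_status is hypothetical.

```python
import os

# Assumption: the standard sysfs directory for the system clocksource.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def clocksource_status(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by the kernel."""
    with open(os.path.join(base, "current_clocksource")) as f:
        current = f.read().strip()
    with open(os.path.join(base, "available_clocksource")) as f:
        available = f.read().split()
    return current, available


if __name__ == "__main__":
    current, available = clocksource_status()
    print("current:  ", current)
    print("available:", " ".join(available))
```

After booting the guest with "clocksource=acpi_pm", the current value would be expected to read acpi_pm.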

By the way, with kvm-82 I saw around 100 io_exits when only the crashed
Ubuntu 8.10 guest was running and nothing else.

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-10 12:29

Message:
Actually, I was too quick to say that a Fedora 8 guest is stable.  Even
there, I'm seeing hangs once I get my application fully installed
(basically, once I introduce some load).

I also did an update to kvm-80 and the problem still exists (on all the
guests I've tried).  That's with kvm-80 kernel modules and the kvm-80 user,
running on linux-2.6.27.8.

Thanks,
Chris

----------------------------------------------------------------------

Comment By: Chris Jones (c_jones)
Date: 2008-12-01 11:09

Message:
Alexey,

Thanks for the response.  As you advised, I tried a Fedora 8 guest, and it
does seem to be much more stable.  However, I really need a Debian base
system for my application.  Not necessarily Ubuntu 8.10, but I haven't had
much luck with others either.  Do you have any recommendations on one that
is particularly stable?

Over the weekend I tried:
  Fedora 8       : Seems very stable, but I really need a debian base.
  Ubuntu 8.04LTS : Same periodic hangs I was seeing on 8.10
  Debian 4.0 Etch: Seems stable on the guest, but on the host, qemu
                   process is running 100% busy while the guest is idle.

Any chance you know a workaround for the issue I'm seeing on etch, or can
recommend a Debian base distribution which works well with KVM?

Thanks much,
Chris

----------------------------------------------------------------------

Comment By: Technologov (technologov)
Date: 2008-11-27 04:54

Message:
In my opinion it is not the Ubuntu host that is problematic - but the guest
on KVM.

I mean that Ubuntu 8.10 guest is unstable on KVM. I have not found out
why.

If you try a better tested guest (Fedora 7/8 or Windows XP), it should be
a lot more stable.

And if you try another host (e.g. a Fedora host running an Ubuntu 8.10
guest), it will be unstable.

In short - in my opinion - the problem is not the host OS, but either KVM or
its connection with the guest OS.

-Alexey E. "Technologov", 27.11.2008.

----------------------------------------------------------------------
