Re: random crash in post_kvm_run()
On 07/05/2010 10:42 AM, Avi Kivity wrote:
> Please don't top-post.
>
> On 07/03/2010 05:23 PM, BuraphaLinux Server wrote:
>> Ok, I kept going like you said. Here is what it said:
>>
>> $ git bisect good
>> 44ea2b1758d88ad822e65b1c4c21ca6164494e27 is the first bad commit
>> commit 44ea2b1758d88ad822e65b1c4c21ca6164494e27
>> Author: Avi Kivity a...@redhat.com
>> Date: Sun Sep 6 15:55:37 2009 +0300
>>
>>     KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area
>>
>>     Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
>>     guest/host msr reloading. Since we wish to lazy-restore all the other
>>     msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
>>     the common code.
>>
>>     Signed-off-by: Avi Kivity a...@redhat.com
>
> That doesn't make any sense. This commit shouldn't affect anything in
> user-kernel communications.
>
> Can you describe your environment? I'll try to reproduce it.

I was able to reproduce it, and the commit does make sense. The faulting instruction is

    0x807182a post_kvm_run+10    mov %gs:0x14,%eax

which is a stack guard fetch. It shouldn't ever fault - so it looks like %gs is corrupted, and indeed the commit plays with %gs.

I'll investigate further.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
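As a rough cross-check of the reasoning in the message above: the kernel log from the first report in this thread says `segfault at 14`, and the faulting instruction reads from `%gs:0x14`. The sketch below assumes the faulting linear address is simply the segment base plus the displacement. The helper name `implied_seg_base` is made up for illustration and is not part of qemu-kvm or the kernel.

```shell
#!/bin/sh
# Hypothetical helper: estimate the segment base implied by a fault.
# Assumption: linear address = segment base + displacement.
# usage: implied_seg_base FAULT_ADDR DISPLACEMENT
implied_seg_base() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Fault address from the kernel log ("segfault at 14") and the
# displacement from the instruction "mov %gs:0x14,%eax".
implied_seg_base 0x14 0x14
```

A result of zero would be consistent with the %gs base having been lost on return to userspace, rather than with a bad pointer inside qemu-kvm itself. It is only a sanity check, not a diagnosis.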
Re: random crash in post_kvm_run()
On 07/06/2010 10:46 AM, Avi Kivity wrote:
> I'll investigate further.

Patch posted.

--
error compiling committee.c: too many arguments to function
Re: random crash in post_kvm_run()
On 7/6/10, Avi Kivity a...@redhat.com wrote:
> On 07/06/2010 10:46 AM, Avi Kivity wrote:
>> I'll investigate further.
>
> Patch posted.

Hello Avi Kivity,

I spent the day getting things ready for you to log in, but was amazed to find you already had a patch ready for testing, which was good news. I wasn't sure which kernel it was for. I applied it by hand to 2.6.34.1 and then regenerated the patch for that kernel (attached). That patch on top of 2.6.34.1 works for me. I ran qemu_kvm 5 times in a row, then updated the kernel in the guest to match, and it worked fine as both the host and the guest (and I ran that on the client machine too).

If you want me to test your original unmodified patch, I will be happy to do that too. What kernel version is it for? Your patch updates the line with vmx_set_efer() in exit_lmode() in vmx.c, but that line wasn't in 2.6.34.1, so I guessed it should be added.

Thank you very much for your help,
JGH

[Attachment: ttt (binary data)]
Re: random crash in post_kvm_run()
Please don't top-post.

On 07/03/2010 05:23 PM, BuraphaLinux Server wrote:
> Ok, I kept going like you said. Here is what it said:
>
> $ git bisect good
> 44ea2b1758d88ad822e65b1c4c21ca6164494e27 is the first bad commit
> commit 44ea2b1758d88ad822e65b1c4c21ca6164494e27
> Author: Avi Kivity a...@redhat.com
> Date: Sun Sep 6 15:55:37 2009 +0300
>
>     KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area
>
>     Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
>     guest/host msr reloading. Since we wish to lazy-restore all the other
>     msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
>     the common code.
>
>     Signed-off-by: Avi Kivity a...@redhat.com

That doesn't make any sense. This commit shouldn't affect anything in user-kernel communications.

Can you describe your environment? I'll try to reproduce it.

--
error compiling committee.c: too many arguments to function
Re: random crash in post_kvm_run()
Hello,

Software: We use a 32bit userland with a 64bit kernel. Inside qemu-kvm I run the same distribution (a 64bit kernel with 32bit userland). The userland is compiled with gcc-4.4.4, binutils-2.20.51.0.9, glibc-2.11.2. The kernels are compiled using a crosstool-ng toolchain generated for x86_64, using crosstool-ng-1.7.0. This crosstool toolchain uses gcc-4.4.3, binutils-2.20, glibc-2.9. Except for the KVM issue, kernels built with this toolchain work fine for me. Most people seem to use 64bit userlands with 64bit kernels, so perhaps configurations like mine are not well tested yet.

Hardware: The CPU is a Core i7 950 3GHz part, stepping 5. The machine has 12GB of RAM. The motherboard is a Gigabyte X58A-UD3R, Award BIOS version F3 from 02/06/2010. There are 6 SATA hard disks and one SATA DVD burner. The machine runs fine with the 2.6.32.14 kernel. With 2.6.33.x or 2.6.34 it runs fine except for KVM. The graphics card is an RV710 [Radeon HD 4350], but we turn off KMS for radeon in the kernel configuration since radeon with KMS is unusable (on all of our machines, not just that one; intel + KMS works fine, but Intel refuses to sell separate graphics cards). We use the vesa driver in xorg.conf instead. Normally (and during all the testing and git bisect) we do not run X on that machine, but qemu-kvm wants the libraries. The client machine (Dell Optiplex 330) runs the same setup, but with intel+KMS and gvncviewer from gtk_vnc-0.3.10. The client machine can run qemu without KVM just fine, but of course it is so slow it isn't really practical to use.
Compilation details: The qemu-kvm version 0.12.4 is compiled with this:

    ./configure \
        --prefix=/usr \
        --enable-sdl \
        --audio-drv-list=alsa \
        --audio-card-list=ac97 \
        --enable-mixemu \
        --disable-xen \
        --enable-curses \
        --enable-bluez \
        --enable-kvm \
        --enable-nptl \
        --enable-user-pie \
        --target-list=x86_64-softmmu

It requires this patch first to build on our systems:

    diff -Naur qemu-0.12.3.old/configure qemu-0.12.3.new/configure
    --- qemu-0.12.3.old/configure 2010-02-24 03:54:38.0 +0700
    +++ qemu-0.12.3.new/configure 2010-03-25 14:43:43.0 +0700
    @@ -1072,7 +1072,7 @@
     int main(void) { return 0; }
     EOF
     if compile_prog $sdl_cflags $sdl_libs ; then
    -sdl_libs=$sdl_libs -lX11
    +sdl_libs=$sdl_libs -L/usr/X11R7/lib -lX11
     fi
     if test $mingw32 = yes ; then
     sdl_libs=`echo $sdl_libs | sed s/-mwindows//g` -mconsole

The SDL-1.2.14 is built like this:

    ./configure \
        --prefix=/usr \
        --mandir=/usr/man \
        --infodir=/usr/info \
        --sysconfdir=/etc \
        --disable-rpath \
        --disable-nls \
        --disable-joystick \
        --disable-oss \
        --disable-esd \
        --disable-arts \
        --disable-mintaudio \
        --disable-ipod \
        --disable-input-tslib \
        --disable-atari-ldg \
        --x-includes=/usr/X11R7/include \
        --x-libraries=/usr/X11R7/lib \
        --with-x \
        --build=i686-pc-linux-gnu

Access: I can arrange for you to log in (and have root) on the actual machine. I can make an ISO with the actual distribution for your download, or mail you a DVD with express mail if that is better for you. The machine is dedicated for doing KVM work, so I can run any tests you want if you prefer I do them. I worry about exceeding the list limits, but I can send the kernel config file, or any other information about the environment you want, or put it somewhere for web download.
Other testing: I did test on the only other machine I control that has hardware support for virtualization, an older quad-core Intel machine, and hit identical problems (2.6.32.14 works with KVM, anything 2.6.33 or newer does not, same toolchain and userland), so I don't think it is a hardware problem.

Segfault description: With qemu-kvm, the segfault happens after extlinux starts the kernel and the kernel is uncompressing, but then the client window closes so fast I cannot see exactly how far it gets. It does not get to a login prompt, and the boot inside the KVM is into text mode. The same ISO works with qemu without kvm on the client box with SDL or with gvncviewer, but is too slow to be practical to use.

My suspicion is that there is a 32/64 issue in the kernel kvm (or glibc threads) that people doing 32/32 or 64/64 don't see, and 32/64 is uncommon. Since it works sometimes (maybe 1 in 10 or 15 tries), it acts like a bad read in a program where once in a while you get lucky and the random value read lets the program run. For example, if you load al but mean ax, sometimes ah is 0 so using ax works, but usually it is not, so ax has trash in the top part and does not work (an 8/16 issue).

JGH
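The al/ax analogy in the message above can be sketched with plain shell arithmetic. This only illustrates the analogy (a 16-bit read whose high byte was never initialized), not the actual KVM bug. The function name `read_ax` and the sample byte values are invented for the example.

```shell
#!/bin/sh
# Illustration only: combine a high byte (ah) and a low byte (al)
# into the 16-bit value a program would see when it reads ax.
# usage: read_ax AH AL
read_ax() {
    printf '0x%04x\n' $(( ($1 << 8) | $2 ))
}

read_ax 0x00 0x2a   # high byte happens to be zero: the intended value
read_ax 0x7f 0x2a   # stale high byte: the value is trashed
```

The first call yields the value the programmer meant to load. The second shows how leftover data in the upper half changes the result, which is why such a bug can appear to work on some runs and fail on others.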
Re: random crash in post_kvm_run()
On 07/05/2010 11:52 AM, BuraphaLinux Server wrote:
> Hello,
>
> Software: We use a 32bit userland with a 64bit kernel. Inside qemu-kvm I run the same distribution (a 64bit kernel with 32bit userland). The userland is compiled with gcc-4.4.4, binutils-2.20.51.0.9, glibc-2.11.2. The kernels are compiled using a crosstool-ng toolchain generated for x86_64, using crosstool-ng-1.7.0. This crosstool toolchain uses gcc-4.4.3, binutils-2.20, glibc-2.9. Except for the KVM issue, kernels built with this toolchain work fine for me. Most people seem to use 64bit userlands with 64bit kernels, so perhaps configurations like mine are not well tested yet.
>
> Hardware: The CPU is a Core i7 950 3GHz part, stepping 5. The machine has 12GB of RAM. The motherboard is a Gigabyte X58A-UD3R, Award BIOS version F3 from 02/06/2010. There are 6 SATA hard disks and one SATA DVD burner. The machine runs fine with the 2.6.32.14 kernel. With 2.6.33.x or 2.6.34 it runs fine except for KVM. The graphics card is an RV710 [Radeon HD 4350], but we turn off KMS for radeon in the kernel configuration since radeon with KMS is unusable (on all of our machines, not just that one; intel + KMS works fine, but Intel refuses to sell separate graphics cards). We use the vesa driver in xorg.conf instead. Normally (and during all the testing and git bisect) we do not run X on that machine, but qemu-kvm wants the libraries. The client machine (Dell Optiplex 330) runs the same setup, but with intel+KMS and gvncviewer from gtk_vnc-0.3.10. The client machine can run qemu without KVM just fine, but of course it is so slow it isn't really practical to use.
>
> Compilation details: The qemu-kvm version 0.12.4 is compiled with this:
>
>     ./configure \
>         --prefix=/usr \
>         --enable-sdl \
>         --audio-drv-list=alsa \
>         --audio-card-list=ac97 \
>         --enable-mixemu \
>         --disable-xen \
>         --enable-curses \
>         --enable-bluez \
>         --enable-kvm \
>         --enable-nptl \
>         --enable-user-pie \
>         --target-list=x86_64-softmmu
>
> It requires this patch first to build on our systems:
>
>     diff -Naur qemu-0.12.3.old/configure qemu-0.12.3.new/configure
>     --- qemu-0.12.3.old/configure 2010-02-24 03:54:38.0 +0700
>     +++ qemu-0.12.3.new/configure 2010-03-25 14:43:43.0 +0700
>     @@ -1072,7 +1072,7 @@
>      int main(void) { return 0; }
>      EOF
>      if compile_prog $sdl_cflags $sdl_libs ; then
>     -sdl_libs=$sdl_libs -lX11
>     +sdl_libs=$sdl_libs -L/usr/X11R7/lib -lX11
>      fi
>      if test $mingw32 = yes ; then
>      sdl_libs=`echo $sdl_libs | sed s/-mwindows//g` -mconsole
>
> The SDL-1.2.14 is built like this:
>
>     ./configure \
>         --prefix=/usr \
>         --mandir=/usr/man \
>         --infodir=/usr/info \
>         --sysconfdir=/etc \
>         --disable-rpath \
>         --disable-nls \
>         --disable-joystick \
>         --disable-oss \
>         --disable-esd \
>         --disable-arts \
>         --disable-mintaudio \
>         --disable-ipod \
>         --disable-input-tslib \
>         --disable-atari-ldg \
>         --x-includes=/usr/X11R7/include \
>         --x-libraries=/usr/X11R7/lib \
>         --with-x \
>         --build=i686-pc-linux-gnu
>
> Access: I can arrange for you to log in (and have root) on the actual machine. I can make an ISO with the actual distribution for your download, or mail you a DVD with express mail if that is better for you. The machine is dedicated for doing KVM work, so I can run any tests you want if you prefer I do them.

I wasn't able to reproduce the crash with a 32-on-64 setup, so please send the access details by private mail.

--
error compiling committee.c: too many arguments to function
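For anyone trying to confirm they have the same 32-on-64 layout discussed above, here is a minimal sketch. It assumes `uname -m` reflects the kernel architecture and `getconf LONG_BIT` reflects the word size of the userland that runs it. The function name `describe_setup` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: summarize the userland/kernel word sizes.
# usage: describe_setup KERNEL_ARCH USERLAND_BITS
describe_setup() {
    case "$1" in
        x86_64) kernel_bits=64 ;;
        i?86)   kernel_bits=32 ;;
        *)      kernel_bits=unknown ;;
    esac
    echo "${2}-bit userland on ${kernel_bits}-bit kernel"
}

# On the live system: kernel arch from uname, userland word size from getconf.
describe_setup "$(uname -m)" "$(getconf LONG_BIT)"
```

On the reporter's machine this would presumably print "32-bit userland on 64-bit kernel". Note that a process started under a 32-bit personality (for example via `linux32`) can see `uname -m` report an i686 value even on a 64-bit kernel, so treat the output as a hint.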
Re: random crash in post_kvm_run()
On 07/02/2010 10:08 PM, BuraphaLinux Server wrote:
> Hello,
>
> I tried my best to do the bisection, and the result after many kernels was:
>
> Bisecting: 0 revisions left to test after this (roughly 0 steps)
> [3ce672d48400e0112fec7a3cb6bb2120493c6e11] KVM: SVM: init_vmcb(): remove redundant save-cr0 initialization
>
> So what do I do next?

You still need to test that last kernel. 0 revisions left means that after this test (and the last git bisect good or git bisect bad) you'll have the answer.

> I did not 'make mrproper' between each build - should I have done that?

No need.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
Re: random crash in post_kvm_run()
Ok, I kept going like you said. Here is what it said:

$ git bisect good
44ea2b1758d88ad822e65b1c4c21ca6164494e27 is the first bad commit
commit 44ea2b1758d88ad822e65b1c4c21ca6164494e27
Author: Avi Kivity a...@redhat.com
Date: Sun Sep 6 15:55:37 2009 +0300

    KVM: VMX: Move MSR_KERNEL_GS_BASE out of the vmx autoload msr area

    Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
    guest/host msr reloading. Since we wish to lazy-restore all the other
    msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
    the common code.

    Signed-off-by: Avi Kivity a...@redhat.com

:04 04 fcf14f9e5578a996430650f7806a54bcc8184ef6 24f4b80719c5b7931a5ed5604f3554f78352ed67 M arch
$

On 7/3/10, Avi Kivity a...@redhat.com wrote:
> On 07/02/2010 10:08 PM, BuraphaLinux Server wrote:
>> Hello,
>>
>> I tried my best to do the bisection, and the result after many kernels was:
>>
>> Bisecting: 0 revisions left to test after this (roughly 0 steps)
>> [3ce672d48400e0112fec7a3cb6bb2120493c6e11] KVM: SVM: init_vmcb(): remove redundant save-cr0 initialization
>>
>> So what do I do next?
>
> You still need to test that last kernel. 0 revisions left means that after this test (and the last git bisect good or git bisect bad) you'll have the answer.
>
>> I did not 'make mrproper' between each build - should I have done that?
>
> No need.
Re: random crash in post_kvm_run()
Hello,

I tried my best to do the bisection, and the result after many kernels was:

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[3ce672d48400e0112fec7a3cb6bb2120493c6e11] KVM: SVM: init_vmcb(): remove redundant save-cr0 initialization

So what do I do next?

I did not 'make mrproper' between each build - should I have done that?

JGH

On 7/1/10, Avi Kivity a...@redhat.com wrote:
> On 06/30/2010 09:25 PM, BuraphaLinux Server wrote:
>>> Can you downgrade the kernel to a known good one to see which component causes the failure?
>>
>> Thank you for your suggestion. Changing only the kernel back to 2.6.32.14 and changing nothing else, the qemu/kvm works well. However, I want to run the newer kernel to get all the fixes in newer kernels. But for the short term, at least I can run my KVMs.
>
> Sure. This looks like a regression, and we want to fix it.
>
>> Since I want to move to the newer kernel, I would like to keep working to resolve my problem. Should I send the two kernel configs, or bootup logs, or put them somewhere for download, or... ?
>
> The surest way to find out the cause is to bisect. To do that, install git, and clone the Linux repository:
>
>     $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
>     $ cd linux-2.6
>
> Start by verifying that 2.6.32 (not 2.6.32.14) still works:
>
>     $ git checkout v2.6.32
>     # prune unnecessary modules
>     $ make localmodconfig
>     $ make && sudo make install
>     # do your normal tests
>
> If that fails, it means the fix is somewhere in v2.6.32.y, which should be easy to find. If it works, start your bisect:
>
>     $ git bisect start v2.6.34 v2.6.32 virt/kvm arch/x86/kvm
>
> Git will choose a commit; build it and test it. If it works, do a
>
>     $ git bisect good
>
> otherwise,
>
>     $ git bisect bad
>
> Git will then choose another test point; build, test, and repeat. Eventually it will spit out the commit which caused the problem.
>
> The process is time consuming, but has a high probability of pinpointing the cause of the bug accurately.
Re: random crash in post_kvm_run()
On 06/30/2010 09:25 PM, BuraphaLinux Server wrote:
>> Can you downgrade the kernel to a known good one to see which component causes the failure?
>
> Thank you for your suggestion. Changing only the kernel back to 2.6.32.14 and changing nothing else, the qemu/kvm works well. However, I want to run the newer kernel to get all the fixes in newer kernels. But for the short term, at least I can run my KVMs.

Sure. This looks like a regression, and we want to fix it.

> Since I want to move to the newer kernel, I would like to keep working to resolve my problem. Should I send the two kernel configs, or bootup logs, or put them somewhere for download, or... ?

The surest way to find out the cause is to bisect. To do that, install git, and clone the Linux repository:

    $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    $ cd linux-2.6

Start by verifying that 2.6.32 (not 2.6.32.14) still works:

    $ git checkout v2.6.32
    # prune unnecessary modules
    $ make localmodconfig
    $ make && sudo make install
    # do your normal tests

If that fails, it means the fix is somewhere in v2.6.32.y, which should be easy to find. If it works, start your bisect:

    $ git bisect start v2.6.34 v2.6.32 virt/kvm arch/x86/kvm

Git will choose a commit; build it and test it. If it works, do a

    $ git bisect good

otherwise,

    $ git bisect bad

Git will then choose another test point; build, test, and repeat. Eventually it will spit out the commit which caused the problem.

The process is time consuming, but has a high probability of pinpointing the cause of the bug accurately.

--
error compiling committee.c: too many arguments to function
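The good/bad loop described above is a binary search over history. The sketch below simulates it on a made-up linear history of numbered commits, so it shows the idea rather than git's real algorithm, which works on a commit graph. The function name `bisect_linear` and the sample numbers are assumptions for illustration; the comparison against `first_bad` stands in for "build the kernel, boot it, run qemu-kvm".

```shell
#!/bin/sh
# Simulated bisection over a linear history of N commits (0 .. N-1).
# Every commit at index >= FIRST_BAD is considered bad; the tip is known bad
# and the base before commit 0 is known good.
# usage: bisect_linear N FIRST_BAD
bisect_linear() {
    n=$1
    first_bad=$2
    good=-1               # newest index known to be good (-1 = the good base)
    bad=$(( n - 1 ))      # oldest index known to be bad (starts at the tip)
    steps=0
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        steps=$(( steps + 1 ))
        # Stand-in for the manual test of the commit git picked.
        if [ "$mid" -ge "$first_bad" ]; then
            bad=$mid      # like: git bisect bad
        else
            good=$mid     # like: git bisect good
        fi
    done
    echo "first_bad=$bad steps=$steps"
}

bisect_linear 16 5
```

Each test halves the remaining range, which is why a few dozen kernel builds are enough to cover thousands of commits. When the test can be scripted, `git bisect run` can drive the same loop automatically; here the test (booting a guest) was manual.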
Re: random crash in post_kvm_run()
Reply inline.

On 6/29/10, Brian Jackson i...@theiggy.com wrote:
> On Monday, June 28, 2010 12:28:52 pm BuraphaLinux Server wrote:
>> Hello,
>>
>> I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:
>>
>>     pthread_mutex_lock(qemu_mutex);
>>
>> The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:
>>
>> (gdb) bt
>> #0 post_kvm_run (kvm=0x84cde04, env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:566
>> #1 0x08086ccf in kvm_run (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:619
>> #2 0x080882d0 in kvm_cpu_exec (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1238
>> #3 0x08088cf6 in kvm_main_loop_cpu (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1495
>> #4 0x08088e72 in ap_main_loop (_env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1541
>> #5 0x55598690 in start_thread () from /lib/libpthread.so.0
>> #6 0x55a8ca7e in clone () from /lib/libc.so.6
>> (gdb) list
>> 561 in /tmp/qemu-kvm-201006282359/qemu-kvm.c
>> (gdb) print qemu_mutex
>> $1 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' repeats 23 times, __align = 0}
>> (gdb)
>>
>> I rebuilt the kernel, then glibc, then the entire graphics stack, then qemu_kvm to try and be sure I have no problems about headers. All my other software works, but qemu_kvm does not. About 1 time in 10 it will actually run fine, but the other times it will crash as shown. I use a dedicated LV for this. I have a 32bit userland with a 64bit kernel.
>>
>> Here is the script I use:
>>
>> #! /sbin/bash
>> INSTANCE=0
>> NAME=VM${INSTANCE}
>> FAKEDISK=/dev/mapper/vmland-vmdisk${INSTANCE}
>> ((MACNO=22+INSTANCE))
>> ulimit -S -c unlimited
>> echo qemu-system-x86_64 \
>>     -cpu core2duo -smp 2 -m 512 \
>>     -vga std \
>>     -vnc :${INSTANCE} -monitor stdio \
>>     -localtime -usb -usbdevice mouse \
>>     -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
>>     -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
>>     -name ${NAME} \
>>     -hda ${FAKEDISK} \
>>     -boot c
>> qemu-system-x86_64 \
>>     -cpu core2duo -smp 2 -m 512 \
>
> try without -cpu core2duo

I tried without that option, but I get the same crash. Thank you for the suggestion however, and I guess that rules out problems with the '-cpu' option.

>>     -vga std \
>>     -vnc :${INSTANCE} -monitor stdio \
>>     -localtime -usb -usbdevice mouse \
>>     -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
>>     -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
>>     -name ${NAME} \
>>     -hda ${FAKEDISK} \
>>     -boot c
>> # just in case
>> /usr/sbin/brctl delif br0 tap${INSTANCE}
>>
>> The bridging and taps all worked before. The CPU is a core i7 950, I've got 12GB of RAM, and I'm going nuts trying to debug this. Since it sometimes works, I wonder if there is some uninitialized variable that sometimes is set so I get lucky but usually is set where things crash. I don't want to place blame, I just want to get it working. Any hints?
>>
>> I'm not subscribed, but the page at http://www.linux-kvm.org/page/Lists,_IRC said it's ok to send a message anyway. Please cc: me so I get a copy, or if I need to join the list please tell me.
>>
>> I compile it all from source (similar to linux from scratch) so there is no upstream distro to go ask for help. Since everything else works, I suspect something strange in qemu_kvm. I did google a lot but found nothing helpful. The ISO image used works on real hardware, and uses the same kernel and userland. The isolinux shows the menu and works great, but when it is time to boot the kernel I get the crash. The kernel modules kvm and kvm_intel are loaded when I try to start qemu_kvm.
>>
>> The /var/log/messages just shows this:
>>
>> Jun 29 00:05:47 banpuk kernel: [20299.236926] qemu-system-x86[31375]: segfault at 14 ip 08086a64 sp 5601e180 error 4 in qemu-system-x86_64[8048000+256000]
>>
>> The /var/log/syslog shows this:
>>
>> Jun 29 00:06:00 banpuk kernel: [20312.302498] kvm: 31383: cpu0 unhandled wrmsr: 0x198 data 0
>> Jun 29 00:06:00 banpuk kernel: [20312.302606] kvm: 31383: cpu1 unhandled wrmsr: 0x198 data 0
>>
>> JGH
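The kernel's segfault line quoted above packs several facts into one row. Here is a minimal sketch for reading it, assuming the standard x86 page-fault error code layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper names `ip_offset` and `decode_pf_error` are invented for this example.

```shell
#!/bin/sh
# Hypothetical helpers for reading a kernel "segfault at ..." log line.

# Offset of the faulting instruction inside the mapped binary.
# usage: ip_offset IP MAP_BASE
ip_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Decode the low three bits of an x86 page-fault error code.
# usage: decode_pf_error CODE
decode_pf_error() {
    code=$(( $1 ))
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    echo "$mode-mode $access, $cause"
}

# Values taken from the log line: ip 08086a64, mapping base 8048000, error 4.
ip_offset 0x08086a64 0x8048000
decode_pf_error 4
```

For the logged values this reads as a user-mode read of an unmapped page at address 0x14, which fits a load through a near-zero base rather than a write through a wild pointer. The offset is mainly useful for matching the fault against a disassembly of the same binary.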
Re: random crash in post_kvm_run()
On 06/28/2010 12:28 PM, BuraphaLinux Server wrote:
> Hello,
>
> I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:
>
>     pthread_mutex_lock(qemu_mutex);
>
> The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:
>
> (gdb) bt
> #0 post_kvm_run (kvm=0x84cde04, env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:566
> #1 0x08086ccf in kvm_run (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:619
> #2 0x080882d0 in kvm_cpu_exec (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1238
> #3 0x08088cf6 in kvm_main_loop_cpu (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1495
> #4 0x08088e72 in ap_main_loop (_env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1541
> #5 0x55598690 in start_thread () from /lib/libpthread.so.0
> #6 0x55a8ca7e in clone () from /lib/libc.so.6
> (gdb) list
> 561 in /tmp/qemu-kvm-201006282359/qemu-kvm.c
> (gdb) print qemu_mutex
> $1 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' repeats 23 times, __align = 0}
> (gdb)
>
> I rebuilt the kernel, then glibc, then the entire graphics stack, then qemu_kvm to try and be sure I have no problems about headers. All my other software works, but qemu_kvm does not. About 1 time in 10 it will actually run fine, but the other times it will crash as shown. I use a dedicated LV for this. I have a 32bit userland with a 64bit kernel.

This is an unusual configuration. Is it possible to try a 64-bit userland?

Regards,

Anthony Liguori

> Here is the script I use:
>
> #! /sbin/bash
> INSTANCE=0
> NAME=VM${INSTANCE}
> FAKEDISK=/dev/mapper/vmland-vmdisk${INSTANCE}
> ((MACNO=22+INSTANCE))
> ulimit -S -c unlimited
> echo qemu-system-x86_64 \
>     -cpu core2duo -smp 2 -m 512 \
>     -vga std \
>     -vnc :${INSTANCE} -monitor stdio \
>     -localtime -usb -usbdevice mouse \
>     -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
>     -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
>     -name ${NAME} \
>     -hda ${FAKEDISK} \
>     -boot c
> qemu-system-x86_64 \
>     -cpu core2duo -smp 2 -m 512 \
>     -vga std \
>     -vnc :${INSTANCE} -monitor stdio \
>     -localtime -usb -usbdevice mouse \
>     -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
>     -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
>     -name ${NAME} \
>     -hda ${FAKEDISK} \
>     -boot c
> # just in case
> /usr/sbin/brctl delif br0 tap${INSTANCE}
>
> The bridging and taps all worked before. The CPU is a core i7 950, I've got 12GB of RAM, and I'm going nuts trying to debug this. Since it sometimes works, I wonder if there is some uninitialized variable that sometimes is set so I get lucky but usually is set where things crash. I don't want to place blame, I just want to get it working. Any hints?
>
> I'm not subscribed, but the page at http://www.linux-kvm.org/page/Lists,_IRC said it's ok to send a message anyway. Please cc: me so I get a copy, or if I need to join the list please tell me.
>
> I compile it all from source (similar to linux from scratch) so there is no upstream distro to go ask for help. Since everything else works, I suspect something strange in qemu_kvm. I did google a lot but found nothing helpful. The ISO image used works on real hardware, and uses the same kernel and userland. The isolinux shows the menu and works great, but when it is time to boot the kernel I get the crash. The kernel modules kvm and kvm_intel are loaded when I try to start qemu_kvm.
>
> The /var/log/messages just shows this:
>
> Jun 29 00:05:47 banpuk kernel: [20299.236926] qemu-system-x86[31375]: segfault at 14 ip 08086a64 sp 5601e180 error 4 in qemu-system-x86_64[8048000+256000]
>
> The /var/log/syslog shows this:
>
> Jun 29 00:06:00 banpuk kernel: [20312.302498] kvm: 31383: cpu0 unhandled wrmsr: 0x198 data 0
> Jun 29 00:06:00 banpuk kernel: [20312.302606] kvm: 31383: cpu1 unhandled wrmsr: 0x198 data 0
>
> JGH
Re: random crash in post_kvm_run()
On 6/29/10, Avi Kivity a...@redhat.com wrote:
> On 06/28/2010 08:28 PM, BuraphaLinux Server wrote:
>> Hello,
>>
>> I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:
>>
>>     pthread_mutex_lock(qemu_mutex);
>>
>> The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:
>
> Can you downgrade the kernel to a known good one to see which component causes the failure?

Thank you for your suggestion. Changing only the kernel back to 2.6.32.14 and changing nothing else, the qemu/kvm works well. However, I want to run the newer kernel to get all the fixes in newer kernels. But for the short term, at least I can run my KVMs.

Since I want to move to the newer kernel, I would like to keep working to resolve my problem. Should I send the two kernel configs, or bootup logs, or put them somewhere for download, or... ?

JGH
Re: random crash in post_kvm_run()
On 7/1/10, Anthony Liguori anth...@codemonkey.ws wrote:
> On 06/28/2010 12:28 PM, BuraphaLinux Server wrote:
>> Hello,
>>
>> I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:
>>
>>     pthread_mutex_lock(qemu_mutex);
>>
>> The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:
>>
>> (gdb) bt
>> #0 post_kvm_run (kvm=0x84cde04, env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:566
>> #1 0x08086ccf in kvm_run (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:619
>> #2 0x080882d0 in kvm_cpu_exec (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1238
>> #3 0x08088cf6 in kvm_main_loop_cpu (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1495
>> #4 0x08088e72 in ap_main_loop (_env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1541
>> #5 0x55598690 in start_thread () from /lib/libpthread.so.0
>> #6 0x55a8ca7e in clone () from /lib/libc.so.6
>> (gdb) list
>> 561 in /tmp/qemu-kvm-201006282359/qemu-kvm.c
>> (gdb) print qemu_mutex
>> $1 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' repeats 23 times, __align = 0}
>> (gdb)
>>
>> I rebuilt the kernel, then glibc, then the entire graphics stack, then qemu_kvm to try and be sure I have no problems about headers. All my other software works, but qemu_kvm does not. About 1 time in 10 it will actually run fine, but the other times it will crash as shown. I use a dedicated LV for this. I have a 32bit userland with a 64bit kernel.
>
> This is an unusual configuration. Is it possible to try a 64-bit userland?
>
> Regards,
>
> Anthony Liguori

Not at this time, since I haven't ever built a 64bit userland with my 32bit userland yet. The purpose of getting things working in KVM (for me) is largely to support a 'safe' place to experiment with creating such a 64bit userland. Unfortunately I've hit this snag. However, thank you for the suggestion.

JGH
Re: random crash in post_kvm_run()
On 06/28/2010 08:28 PM, BuraphaLinux Server wrote:
> Hello,
>
> I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:
>
>     pthread_mutex_lock(qemu_mutex);
>
> The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:

Can you downgrade the kernel to a known good one to see which component causes the failure?

--
error compiling committee.c: too many arguments to function
Re: random crash in post_kvm_run()
On Monday, June 28, 2010 12:28:52 pm BuraphaLinux Server wrote:

Hello, I have tried qemu_kvm 0.12.4 release and also git from about 1/2 an hour ago. In both cases, I crash in the post_kvm_run() function on the line about:

pthread_mutex_lock(qemu_mutex);

The command I use to run qemu worked great with glibc-2.11.1, linux-2.6.32.14, and gcc-4.4.3, but I have upgraded to glibc-2.11.2, linux-2.6.34, and gcc-4.4.4 and get this:

(gdb) bt
#0 post_kvm_run (kvm=0x84cde04, env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:566
#1 0x08086ccf in kvm_run (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:619
#2 0x080882d0 in kvm_cpu_exec (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1238
#3 0x08088cf6 in kvm_main_loop_cpu (env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1495
#4 0x08088e72 in ap_main_loop (_env=0x84e7168) at /tmp/qemu-kvm-201006282359/qemu-kvm.c:1541
#5 0x55598690 in start_thread () from /lib/libpthread.so.0
#6 0x55a8ca7e in clone () from /lib/libc.so.6
(gdb) list
561 in /tmp/qemu-kvm-201006282359/qemu-kvm.c
(gdb) print qemu_mutex
$1 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {__spins = 0, __list = {__next = 0x0}}}, __size = '\000' <repeats 23 times>, __align = 0}
(gdb)

I rebuilt the kernel, then glibc, then the entire graphics stack, then qemu_kvm to try and be sure I have no problems about headers. All my other software works, but qemu_kvm does not. About 1 time in 10 it will actually run fine, but the other times it will crash as shown. I use a dedicated LV for this. I have a 32bit userland with a 64bit kernel. Here is the script I use:

#! /sbin/bash
INSTANCE=0
NAME=VM${INSTANCE}
FAKEDISK=/dev/mapper/vmland-vmdisk${INSTANCE}
((MACNO=22+INSTANCE))
ulimit -S -c unlimited
echo qemu-system-x86_64 \
 -cpu core2duo -smp 2 -m 512 \
 -vga std \
 -vnc :${INSTANCE} -monitor stdio \
 -localtime -usb -usbdevice mouse \
 -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
 -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
 -name ${NAME} \
 -hda ${FAKEDISK} \
 -boot c
qemu-system-x86_64 \
 -cpu core2duo -smp 2 -m 512 \

try without -cpu core2duo

 -vga std \
 -vnc :${INSTANCE} -monitor stdio \
 -localtime -usb -usbdevice mouse \
 -net nic,vlan=0,model=rtl8139,macaddr=DE:AD:BE:EF:25:${MACNO} \
 -net tap,ifname=tap${INSTANCE},script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
 -name ${NAME} \
 -hda ${FAKEDISK} \
 -boot c
# just in case
/usr/sbin/brctl delif br0 tap${INSTANCE}

The bridging and taps all worked before. The CPU is a core i7 950, I've got 12GB of RAM, and I'm going nuts trying to debug this. Since it sometimes works, I wonder if there is some uninitialized variable that sometimes is set so I get lucky but usually is set where things crash. I don't want to place blame, I just want to get it working. Any hints?

I'm not subscribed, but the page at http://www.linux-kvm.org/page/Lists,_IRC said it's ok to send a message anyway. Please cc: me so I get a copy, or if I need to join the list please tell me. I compile it all from source (similar to linux from scratch) so there is no upstream distro to go ask for help. Since everything else works, I suspect something strange in qemu_kvm. I did google a lot but found nothing helpful.

The ISO image used works on real hardware, and uses the same kernel and userland. The isolinux shows the menu and works great, but when it is time to boot the kernel I get the crash. The kernel modules kvm and kvm_intel are loaded when I try to start qemu_kvm.
The /var/log/messages just shows this:

Jun 29 00:05:47 banpuk kernel: [20299.236926] qemu-system-x86[31375]: segfault at 14 ip 08086a64 sp 5601e180 error 4 in qemu-system-x86_64[8048000+256000]

The /var/log/syslog shows this:

Jun 29 00:06:00 banpuk kernel: [20312.302498] kvm: 31383: cpu0 unhandled wrmsr: 0x198 data 0
Jun 29 00:06:00 banpuk kernel: [20312.302606] kvm: 31383: cpu1 unhandled wrmsr: 0x198 data 0

JGH
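For anyone reading the kernel log line above, the "segfault at ... error N" record can be decoded mechanically. The following is only a rough sketch added for illustration, not something from the original report: the function name decode_segfault and the regular expression are assumptions made for this example. It relies on the standard x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode) and on the kernel printing these fields in hexadecimal.

```python
import re

# x86 page-fault error-code bits, as reported in "error N" by the kernel.
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access, 1: write access
PF_USER = 0x4     # 0: fault in kernel mode, 1: fault in user mode

# Hypothetical helper pattern: matches the fields of a "segfault at" record.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return the faulting address, ip, sp and decoded error bits."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("no segfault record found in line")
    err = int(match.group("err"), 16)
    return {
        "addr": int(match.group("addr"), 16),
        "ip": int(match.group("ip"), 16),
        "sp": int(match.group("sp"), 16),
        "page_present": bool(err & PF_PRESENT),
        "write": bool(err & PF_WRITE),
        "user_mode": bool(err & PF_USER),
    }


# The record quoted in this report.
SAMPLE = (
    "qemu-system-x86[31375]: segfault at 14 ip 08086a64 sp 5601e180 "
    "error 4 in qemu-system-x86_64[8048000+256000]"
)

if __name__ == "__main__":
    for key, value in decode_segfault(SAMPLE).items():
        print(key, value)
```

Read this way, "error 4" is a user-mode read of a page that is not mapped, at address 0x14. That appears consistent with the %gs:0x14 stack-guard load identified elsewhere in this thread, if the %gs base had been reset to zero.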