Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
Dave Hansen wrote:
> On Wed, 2008-03-26 at 18:58 +0200, Avi Kivity wrote:
>> Dave Hansen wrote:
>>> On Wed, 2008-03-26 at 11:50 +0200, Avi Kivity wrote:
>>>> Dave Hansen wrote:
>>>>> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.
>>>> This is fixed in 2.6.25-rc7.
>>> I just updated to -rc7 and re-tested. Same symptoms:
>> Bad. Which kvm userspace are you running?
> ~/src/kvm-userspace$ git describe
> kvm-63-118-g52be1a1

I dug out my i386 install and tried it. Doesn't reproduce for me on either kvm.git or -rc7. Do you have a working setup that we can bisect?

--
error compiling committee.c: too many arguments to function

-
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
Avi Kivity wrote:
> Dave Hansen wrote:
>> On Wed, 2008-03-26 at 18:58 +0200, Avi Kivity wrote:
>>> Dave Hansen wrote:
>>>> On Wed, 2008-03-26 at 11:50 +0200, Avi Kivity wrote:
>>>>> Dave Hansen wrote:
>>>>>> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.
> [...]

btw, is this with >= 4GB RAM on the host?

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
Dave Hansen wrote:
> On Thu, 2008-03-27 at 12:10 +0200, Avi Kivity wrote:
>> btw, is this with >= 4GB RAM on the host?
> Well, are you asking whether I have PAE on or not? :)

No, I'm asking whether there is a possibility of address truncation :) PAE by itself doesn't affect kvm much, as it always runs the guest in pae mode.

Can you try running with mem=2000M or something?

--
error compiling committee.c: too many arguments to function
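As a rough illustration of the address-truncation concern raised above (not the actual KVM code path, which this thread does not identify), the sketch below shows what happens when a physical address above 4GB is stored in a 32-bit variable. The helper names are hypothetical. With mem=2000M the top of usable RAM sits at about 0x7d000000, so every physical address fits in 32 bits and this class of bug cannot trigger.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a physical address,
 * which is what happens when a 64-bit value is assigned to a 32-bit type. */
uint32_t truncate_phys_addr(uint64_t phys)
{
    return (uint32_t)phys;
}

/* Returns 1 if the address survives a round trip through 32 bits,
 * 0 if the upper bits would be lost. */
int fits_in_32_bits(uint64_t phys)
{
    return (uint64_t)truncate_phys_addr(phys) == phys;
}
```

For example, an address of 0x100001000 (just above 4GB) would silently become 0x1000, pointing at unrelated memory, which is one way corruption could show up in an unrelated process.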
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
On Thu, 2008-03-27 at 11:36 +0200, Avi Kivity wrote:
> I dug out my i386 install and tried it. Doesn't reproduce for me on either kvm.git or -rc7. Do you have a working setup that we can bisect?

I don't really have a working revision to bisect against. I'm not sure that it ever worked. It's also on my actual laptop, so it's a bit of a pain to get any other work done while I'm bisecting. :)

I'll move the Windows image over to another machine today and see if I can reproduce elsewhere. I'll also check some older versions of KVM to see if any of those work. If I do that, should I keep the kvm userspace, modules and BIOSes all synchronized from each version that I test?

-- Dave
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
On Thu, 2008-03-27 at 17:53 +0200, Avi Kivity wrote:
> Dave Hansen wrote:
>> On Thu, 2008-03-27 at 11:36 +0200, Avi Kivity wrote:
>>> I dug out my i386 install and tried it. Doesn't reproduce for me on either kvm.git or -rc7. Do you have a working setup that we can bisect?
>> I don't really have a working revision to bisect against. I'm not sure that it ever worked.
> I'm fairly sure Windows works on kvm...

Oh, I didn't mean to imply that Windows doesn't work, just that the particular perverted way in which I'm poking it may have never worked. :)

> How did you generate the image?

The original install was done in a kqemu-accelerated host.

>> It's also on my actual laptop, so it's a bit of a pain to get any other work done while I'm bisecting. :) I'll move the Windows image over to another machine today and see if I can reproduce elsewhere. I'll also check some older versions of KVM to see if any of those work. If I do that, should I keep the kvm userspace, modules and BIOSes all synchronized from each version that I test?
> You can keep the userspace (qemu + bios) fixed and change the kernel, or vice versa.

-- Dave
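The bisection discussed above is, at its core, a binary search over an ordered list of revisions of one component (say, the kernel) while the other (qemu + bios) is held fixed. A minimal sketch follows, assuming a known-good oldest revision and a known-bad newest one, which is exactly what is still missing in this thread. The function names and the sample predicate are hypothetical stand-ins for "build this revision, boot it, and see whether the oops appears".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if revision index `rev` shows the bug. */
typedef int (*is_bad_fn)(size_t rev);

/* Revisions are indexed 0..n-1, oldest to newest. Revision 0 is assumed
 * known good and revision n-1 known bad. Returns the index of the first
 * bad revision, testing O(log n) revisions along the way. */
size_t first_bad_revision(size_t n, is_bad_fn is_bad)
{
    size_t good = 0;
    size_t bad;

    assert(n >= 2);
    bad = n - 1;
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;   /* bug already present: look earlier */
        else
            good = mid;  /* still working: look later */
    }
    return bad;
}

/* Hypothetical stand-in for a real test run: the bug appears at revision 5. */
int bad_from_rev_5(size_t rev)
{
    return rev >= 5;
}
```

Holding one side fixed matters because the search assumes a single transition from good to bad; changing userspace and kernel together at each step could introduce more than one.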
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
On Thu, 2008-03-27 at 16:59 +0200, Avi Kivity wrote:
> Dave Hansen wrote:
>> On Thu, 2008-03-27 at 12:10 +0200, Avi Kivity wrote:
>>> btw, is this with >= 4GB RAM on the host?
>> Well, are you asking whether I have PAE on or not? :)
> No, I'm asking whether there is a possibility of address truncation :) PAE by itself doesn't affect kvm much, as it always runs the guest in pae mode.
> Can you try running with mem=2000M or something?

Oh, sure. I'll give that a shot.

-- Dave
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
Dave Hansen wrote:
> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.

This is fixed in 2.6.25-rc7.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
Dave Hansen wrote:
> On Wed, 2008-03-26 at 11:50 +0200, Avi Kivity wrote:
>> Dave Hansen wrote:
>>> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.
>> This is fixed in 2.6.25-rc7.
> I just updated to -rc7 and re-tested. Same symptoms:

Bad. Which kvm userspace are you running?

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
On Wed, 2008-03-26 at 18:58 +0200, Avi Kivity wrote:
> Dave Hansen wrote:
>> On Wed, 2008-03-26 at 11:50 +0200, Avi Kivity wrote:
>>> Dave Hansen wrote:
>>>> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.
>>> This is fixed in 2.6.25-rc7.
>> I just updated to -rc7 and re-tested. Same symptoms:
> Bad. Which kvm userspace are you running?

~/src/kvm-userspace$ git describe
kvm-63-118-g52be1a1

-- Dave
Re: [kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
On Wed, 2008-03-26 at 11:50 +0200, Avi Kivity wrote:
> Dave Hansen wrote:
>> I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.
> This is fixed in 2.6.25-rc7.

I just updated to -rc7 and re-tested. Same symptoms:

[ 751.033545] BUG: unable to handle kernel paging request at 0096b848
[ 751.040082] IP: [c01a0636] d_instantiate+0x26/0x50
[ 751.048065] Oops: 0002 [#1] SMP
[ 751.052057] Modules linked in: kvm_intel kvm nls_iso8859_1 vfat fat rfcomm l2cap tun ppdev acpi_cpufreq cpufreq_ondemand cpe
[ 751.052057]
[ 751.052057] Pid: 8743, comm: evolution Not tainted (2.6.25-rc7 #146)
[ 751.052057] EIP: 0060:[c01a0636] EFLAGS: 00210286 CPU: 0
[ 751.052057] EIP is at d_instantiate+0x26/0x50
[ 751.052057] EAX: 0096b844 EBX: e65d7d48 ECX: EDX: e65d7d60
[ 751.052057] ESI: e67a7d00 EDI: e67a7cc0 EBP: e802ce48 ESP: e802ce3c
[ 751.052057] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 751.052057] Process evolution (pid: 8743, ti=e802c000 task=f3c8ce00 task.ti=e802c000)
[ 751.052057] Stack: e65d7d48 f4c191f8 e802ce60 c01e2fa4 e67a7cc0 f4c191f8 e65d7d48
[ 751.052057]        e660c280 e802ce80 c01e30c3 8180 e67a7cc0 c03b4a00 e660c280
[ 751.052057]        8180 e802cea0 c0197964 e802cf24 c03b4a00 e67a7cc0 e660c280 e802cf24
[ 751.052057] Call Trace:
[ 751.052057] [c01e2fa4] ? ext3_add_nondir+0x34/0x60
[ 751.052057] [c01e30c3] ? ext3_create+0xf3/0x100
[ 751.052057] [c0197964] ? vfs_create+0x74/0x100
[ 751.052057] [c0197c8f] ? open_namei_create+0x4f/0xa0
[ 751.052057] [c01981f3] ? open_namei+0x513/0x560
[ 751.052057] [c018db2c] ? do_filp_open+0x2c/0x60
[ 751.052057] [c018dd29] ? get_unused_fd_flags+0x39/0xd0
[ 751.052057] [c018dec4] ? do_sys_open+0x54/0xe0
[ 751.052057] [c018df6c] ? sys_open+0x1c/0x20
[ 751.052057] [c0104e2c] ? sysenter_past_esp+0x6d/0xa5
[ 751.052057] [c039] ? quirk_vt8235_acpi+0x90/0xa0
[ 751.052057] ===
[ 751.052057] Code: 27 00 00 00 00 55 89 e5 57 89 c7 56 8d 70 40 53 89 d3 39 70 40 75 37 b8 40 15 4e c0 e8 14 d1 1f 00 85 db
[ 751.052057] EIP: [c01a0636] d_instantiate+0x26/0x50 SS:ESP 0068:e802ce3c
[ 751.052103] ---[ end trace 514c1de750400319 ]---

-- Dave
[kvm-devel] kvm causing memory corruption? ~2.6.25-rc6
I was getting some kvm userspace crashes trying to run a Windows guest. So, I decided to try a recent kernel (2.6.25-rc6-00333-ga4083c9) with the kvm kernel code that shipped with that kernel.

I've had some lockups doing similar things over the last month or two, but figured it was something really stupid I was doing, and never really connected the dots. Now, I've hooked up a serial console and reproduced it with a fresh boot and not much else going on at all on the machine.

Machine is a Thinkpad T61. .config is here:

http://sr71.net/~dave/linux/config-2.6.25-rc6-00333-ga4083c9

To trigger it, I first run kvm and see an error (-no-kvm works fine, btw):

$ ~/src/kvm-userspace/qemu/x86_64-softmmu/qemu-system-x86_64 -hda ~/projects/qemu/windows-xp-base-runme.img
kvm_run: Cannot allocate memory
kvm_run returned -12

Then, run it again. I usually get an oops. But, the weird part is that the oops isn't *in* kvm. It's in some other part of the kernel and in some *OTHER* process. One in bash is below. That's what leads me to believe it is memory corruption. The machine also becomes increasingly unstable after the original oops so there's definitely collateral damage.

$ addr2line -e vmlinux c01795e4
/home/dave/kernels/linux-2.6.git/mm/filemap.c:1327

int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	int error;
	struct file *file = vma->vm_file;
	struct address_space *mapping = file->f_mapping;
	struct file_ra_state *ra = &file->f_ra;
HERE--->	struct inode *inode = mapping->host;

Which is a line of code that literally hasn't been touched since the beginning of time (in git terms :).
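For reference, the "kvm_run returned -12" message above lines up with the "Cannot allocate memory" text printed just before it, assuming kvm_run follows the usual Linux convention of returning a negated errno value. The small sketch below decodes such a return value; the helper names are hypothetical and are not part of kvm-userspace.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: convert a negative-errno style return value
 * (e.g. -12) into the positive errno code, or 0 if there was no error. */
int errno_from_return(int ret)
{
    return ret < 0 ? -ret : 0;
}

/* Hypothetical helper: human-readable text for such a return value.
 * The exact wording comes from the C library's strerror(). */
const char *describe_return(int ret)
{
    return strerror(errno_from_return(ret));
}
```

On Linux, errno 12 is ENOMEM, so describe_return(-12) yields the C library's out-of-memory message.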
Full oops is below:

[ 435.057922] BUG: unable to handle kernel NULL pointer dereference at 0048
[ 435.067275] IP: [c01795e4] filemap_fault+0x34/0x310
[ 435.072815] *pdpt = 2a4a7001 *pde =
[ 435.081272] Oops: [#2] SMP
[ 435.084812] Modules linked in: nls_iso8859_1 vfat fat rfcomm l2cap tun ppdev acpi_cpufreq cpufreq_ondemand cpufreq_conservative cpufreq_stats freq_table cpufreq_userspace cpufreq_powersave sbs container sbshc af_packet sbp2 lp loop usb_storage arc4 ecb crypto_blkcipher pcmcia usbhid libusual hid snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_timer snd_seq_device joydev iwl4965 snd serio_raw mac80211 yenta_socket parport_pc sdhci uhci_hcd ehci_hcd ricoh_mmc ohci1394 rsrc_nonstatic soundcore cfg80211 parport psmouse mmc_core ieee1394 pcmcia_core usbcore snd_page_alloc e1000 button thinkpad_acpi nvram evdev thermal processor fan fuse
[ 435.084812]
[ 435.084812] Pid: 7691, comm: bash Tainted: G D (2.6.25-rc6-00333-ga4083c9 #144)
[ 435.084812] EIP: 0060:[c01795e4] EFLAGS: 00010286 CPU: 0
[ 435.084812] EIP is at filemap_fault+0x34/0x310
[ 435.084812] EAX: ef83bf48 EBX: 0012 ECX: EDX: ef83c7e8
[ 435.084812] ESI: c04cc248 EDI: EBP: ef96ee40 ESP: ef96ee00
[ 435.084812] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 435.084812] Process bash (pid: 7691, ti=ef96e000 task=ef8a2e00 task.ti=ef96e000)
[ 435.084812] Stack: ef96ee2c c0130cbc ef96ee28 c01870bb ef96ee28
[ 435.084812]        ef83bf48 ef83c7e8 ef83bf00 ef96ee9c ea49f7e8 0012 c04cc248
[ 435.084812]        ef96eeb8 c018ab57 8001 0001 0001 eacb6314
[ 435.084812] Call Trace:
[ 435.084812] [c0130cbc] ? kmap_atomic_prot+0x12c/0x150
[ 435.084812] [c01870bb] ? vm_normal_page+0x2b/0xa0
[ 435.084812] [c018ab57] ? __do_fault+0x67/0x4e0
[ 435.084812] [c01a8a70] ? pipe_read+0x1f0/0x290
[ 435.084812] [c018b03d] ? do_linear_fault+0x6d/0x80
[ 435.084812] [c018b570] ? handle_mm_fault+0x1c0/0x4d0
[ 435.084812] [c014d58e] ? do_sigaction+0x16e/0x190
[ 435.084812] [c03b3419] ? do_page_fault+0x169/0x4d0
[ 435.084812] [c01a38b9] ? fput+0x19/0x20
[ 435.084812] [c03b32b0] ? do_page_fault+0x0/0x4d0
[ 435.084812] [c03b187a] ? error_code+0x72/0x78
[ 435.084812] [c03b] ? wait_for_completion_killable+0x10/0x30
[ 435.084812] ===
[ 435.084812] Code: 89 45 f0 89 55 ec 8b 40 4c 89 45 e8 8b 50 7c 83 c0 48 89 45 e0 89 55 e4 8b 0a c7 45 d8 00 00 00 00 c7 45 d4 00 00 00 00 89 4d dc 8b 49 48 89 f6 8d bc 27 00 00 00 00 89 c8 8b 7d dc 8b 5f 40 8b
[ 435.084812] EIP: [c01795e4] filemap_fault+0x34/0x310 SS:ESP 0068:ef96ee00
[ 435.084870] ---[ end trace addcd60623916614 ]---

~/src/kvm-userspace$ git describe
kvm-63-118-g52be1a1

/proc/cpuinfo:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no