Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
* Bastian Blank [EMAIL PROTECTED] [2008-10-30 10:50:03]:
> I committed a workaround, to be exact an update for a workaround. I was
> no longer able to trigger that under load. Please test the snapshots[1]
> tomorrow (2.6.26-10~snapshot.12362 or higher).

Now I've tested the workaround, and I'm sorry to say that it doesn't help.

[0.00] Linux version 2.6.26-1-xen-amd64 (Debian 2.6.26-10~snapshot.12362) ([EMAIL PROTECTED]) (gcc version 4.1.3 20080623 (prerelease) (Debian 4.1.2-23+1)) #1 SMP Fri Oct 31 03:53:45 UTC 2008
[ 200.424526] Eeek! page_mapcount(page) went negative! (-1)
[ 200.424637] page pfn = 4
[ 200.424937] page-flags = 0
[ 200.424937] page-count = 0
[ 200.424937] page-mapping =
[ 200.424937] vma-vm_ops = 0x0
[ 200.424937] [ cut here ]
[ 200.424937] kernel BUG at mm/rmap.c:673!
[ 200.424937] invalid opcode: [1] SMP
[ 200.425856] CPU 0
[ 200.425856] Modules linked in: bridge netloop video output ac battery microcode firmware_class nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc ipv6 xfs reiserfs ext2 sha256_generic aes_x86_64 aes_generic cbc dm_crypt crypto_blkcipher raid456 async_xor async_memcpy async_tx xor loop iTCO_wdt serio_raw pcspkr i2c_i801 psmouse rng_core i2c_core container shpchp pci_hotplug button i3000_edac edac_core evdev joydev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod raid1 md_mod ide_cd_mod cdrom ide_pci_generic usbhid hid ff_memless piix ata_piix ide_core sd_mod floppy ata_generic ehci_hcd uhci_hcd sata_sil24 libata 3w_9xxx dock e1000e scsi_mod thermal processor fan thermal_sys
[ 200.429852] Pid: 6332, comm: a.out Not tainted 2.6.26-1-xen-amd64 #1
[ 200.429852] RIP: e030:[8027c550] [8027c550] page_remove_rmap+0xfb/0x117
[ 200.429852] RSP: e02b:880071aabdc8 EFLAGS: 00010246
[ 200.429852] RAX: RBX: 88000235a0e0 RCX: 80501fc8
[ 200.429852] RDX: ff5f7000 RSI: 0001 RDI: 80501fc0
[ 200.429852] RBP: 880071a577c8 R08: 80501fb0 R09: 880001b48f08
[ 200.429852] R10: 880071aaba58 R11: 00015382 R12: 88000235a0e0
[ 200.429852] R13: 88c010d8 R14: 88007d426840 R15: 880002384048
[ 200.429852] FS: 7f41ff3e36e0() GS:80539000() knlGS:
[ 200.429852] CS: e033 DS: ES:
[ 200.429852] DR0: DR1: DR2:
[ 200.429852] DR3: DR6: 0ff0 DR7: 0400
[ 200.429852] Process a.out (pid: 6332, threadinfo 880071aaa000, task 880071ad8500)
[ 200.429852] Stack: 8800717f08d0 4800 00322061b000 80273239
[ 200.429852] 0206 880071aabec8
[ 200.429852] 880071a577c8 880071aabed0 0003628e
[ 200.429852] Call Trace:
[ 200.429852] [80273239] ? unmap_vmas+0x744/0xa49
[ 200.429852] [80278567] ? exit_mmap+0x7b/0xf7
[ 200.429852] [8022a73d] ? mmput+0x2c/0xc0
[ 200.429852] [8022fef8] ? do_exit+0x25a/0x6ce
[ 200.429852] [80230412] ? do_group_exit+0xa6/0xdc
[ 200.429852] [8020b528] ? system_call+0x68/0x6d
[ 200.429852] [8020b4c0] ? system_call+0x0/0x6d
[ 200.429852]
[ 200.429852]
[ 200.429852] Code: 80 e8 18 0c fd ff 48 8b 85 90 00 00 00 48 85 c0 74 19 48 8b 40 20 48 85 c0 74 10 48 8b 70 58 48 c7 c7 e1 52 4b 80 e8 f3 0b fd ff 0f 0b eb fe 8b 77 18 41 58 5b 5d 83 e6 01 f7 de 83 c6 04 e9 64
[ 200.429852] RIP [8027c550] page_remove_rmap+0xfb/0x117
[ 200.429852] RSP 880071aabdc8
[ 200.439529] ---[ end trace 10482cbe68c8d062 ]---
[ 200.439619] Fixing recursive fault but reboot is needed!

My ugly test program crashed directly on this one as well.

Best Regards,
/LM

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
On Fri, Oct 31, 2008 at 08:49:42AM +0100, Lars Michael Jogback wrote:
> * Bastian Blank [EMAIL PROTECTED] [2008-10-30 10:50:03]:
> > I committed a workaround, to be exact an update for a workaround. I was
> > no longer able to trigger that under load. Please test the snapshots[1]
> > tomorrow (2.6.26-10~snapshot.12362 or higher).
> Now I've tested the workaround, and I'm sorry to say that it doesn't help.

Okay, so I have no further options except removing the kernel.

Bastian

--
... bacteriological warfare ... hard to believe we were once foolish enough to play around with that.
-- McCoy, The Omega Glory, stardate unknown
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
* Bastian Blank [EMAIL PROTECTED] [2008-10-31 12:47:34]:
> On Fri, Oct 31, 2008 at 08:49:42AM +0100, Lars Michael Jogback wrote:
> > * Bastian Blank [EMAIL PROTECTED] [2008-10-30 10:50:03]:
> > > I committed a workaround, to be exact an update for a workaround. I was
> > > no longer able to trigger that under load. Please test the snapshots[1]
> > > tomorrow (2.6.26-10~snapshot.12362 or higher).
> > Now I've tested the workaround, and I'm sorry to say that it doesn't help.
> Okay, so I have no further options except removing the kernel.

Do you know if there is somewhere to download the OpenSUSE kernel, to test whether the error is present in that too? If it is, there might be more kernel hackers who have ideas about what the problem might be.

Best Regards,
/LM
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
On Fri, Oct 31, 2008 at 12:54:14PM +0100, Lars Michael Jogback wrote:
> * Bastian Blank [EMAIL PROTECTED] [2008-10-31 12:47:34]:
> > Okay, so I have no further options except removing the kernel.
> Do you know if there is somewhere to download the OpenSUSE kernel, to test
> whether the error is present in that too?

The SuSE kernels can be found at http://ftp.suse.com/pub/projects/kernel/kotd/HEAD/x86_64/. But as the next SLES will ship .27, they don't really care.

Bastian

--
Pain is a thing of the mind. The mind can be controlled.
-- Spock, Operation -- Annihilate! stardate 3287.2
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
Bastian Blank wrote:
> On Fri, Oct 31, 2008 at 12:54:14PM +0100, Lars Michael Jogback wrote:
> > * Bastian Blank [EMAIL PROTECTED] [2008-10-31 12:47:34]:
> > > Okay, so I have no further options except removing the kernel.
> > Do you know if there is somewhere to download the OpenSUSE kernel, to test
> > whether the error is present in that too?
> The SuSE kernels can be found at
> http://ftp.suse.com/pub/projects/kernel/kotd/HEAD/x86_64/. But as the next
> SLES will ship .27, they don't really care.

A related bug report I found when searching for this error two weeks ago is an Ubuntu bug report in Launchpad [1]. It has a syslog excerpt attached [2] showing the use of kernel 2.6.27 (kernel BUG at [..] linux-2.6.27/mm/rmap.c:662).

There's also [3] on ubuntuforums, reported about 16 hours ago: same kernel, same error, etc.

So I guess whatever is broken, it's in our 2.6.27 as well.

Hans van Kranenburg

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/252977
[2] http://launchpadlibrarian.net/17573131/syslog-2009-09-12.txt
[3] http://ubuntuforums.org/showthread.php?t=964133
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
On Tue, Oct 28, 2008 at 11:49:16AM +0100, Lars Michael Jogback wrote:
> There seem to be some problems on the amd64 architecture. After approx.
> 10-20 hours of uptime, the Dom0 crashes (even if there is no DomU running)
> with the following error:

I committed a workaround, to be exact an update for a workaround. I was no longer able to trigger that under load. Please test the snapshots[1] tomorrow (2.6.26-10~snapshot.12362 or higher).

Bastian

[1]: http://wiki.debian.org/DebianKernel, sid branch

--
Emotions are alien to me. I'm a scientist.
-- Spock, This Side of Paradise, stardate 3417.3
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
Bastian Blank [EMAIL PROTECTED] wrote:
> On Tue, Oct 28, 2008 at 11:49:16AM +0100, Lars Michael Jogback wrote:
> > There seem to be some problems on the amd64 architecture. After approx.
> > 10-20 hours of uptime, the Dom0 crashes (even if there is no DomU running)
> > with the following error:
> I committed a workaround, to be exact an update for a workaround. I was
> no longer able to trigger that under load. Please test the snapshots[1]
> tomorrow (2.6.26-10~snapshot.12362 or higher).

Oct 30 15:29:21 fw kernel: [0.00] Linux version 2.6.26-1-xen-amd64 (Debian 2.6.26-10~snapshot.12361) ([EMAIL PROTECTED]) (gcc version 4.1.3 20080623 (prerelease) (Debian 4.1.2-23+1)) #1 SMP Thu Oct 30 03:57:26 UTC 2008
Oct 30 15:32:16 fw kernel: [ 281.498595] Eeek! page_mapcount(page) went negative! (-1)
Oct 30 15:32:16 fw kernel: [ 281.498610] page pfn = 7
Oct 30 15:32:16 fw kernel: [ 281.498612] page-flags = 868
Oct 30 15:32:16 fw kernel: [ 281.498615] page-count = 2
Oct 30 15:32:16 fw kernel: [ 281.498618] page-mapping = 880007bd0890
Oct 30 15:32:16 fw kernel: [ 281.498646] vma-vm_ops = 0x0
Oct 30 15:32:16 fw kernel: [ 281.498662] [ cut here ]
Oct 30 15:32:16 fw kernel: [ 281.498665] kernel BUG at mm/rmap.c:673!
Oct 30 15:32:16 fw kernel: [ 281.498667] invalid opcode: [1] SMP
Oct 30 15:32:16 fw kernel: [ 281.498671] CPU 0
Oct 30 15:32:16 fw kernel: [ 281.498674] Modules linked in: tun ppp_deflate bsd_comp xfrm_user xfrm4_tunnel tunnel4 ipcomp esp4 aead ah4 iptable_raw xt_comment xt_policy ipt_ULOG ipt_TTL ipt_ttl ipt_REJECT ipt_REDIRECT ipt_recent ipt_NETMAP ipt_MASQUERADE ipt_LOG ipt_ECN ipt_ecn ipt_CLUSTERIP ipt_ah ipt_addrtype nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_tcpmss xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_NFLOG xt_multiport xt_MARK xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_conntrack xt_CONNMARK xt_connmark xt_CLASSIFY xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_mangle nfnetlink iptable_filter ip_tables x_tables ppp_async crc_ccitt ppp_generic slhc ipv6 deflate zlib_deflate zlib_inflate ctr twofish twofish_common camellia serpent blowfish des_generic cbc aes_x86_64 aes_generic xcbc sha256_generic sha1_generic crypto_null af_key arc4 ecb crypto_blkcipher rt61pci crc_itu_t rt2x00pci rt2x00lib firmware_class rfkill led_class input_polldev mac80211 cfg80211 eeprom_93cx6 evdev joydev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod thermal_sys
Oct 30 15:32:16 fw kernel: [ 281.498796] Pid: 2892, comm: aptitude Not tainted 2.6.26-1-xen-amd64 #1
Oct 30 15:32:16 fw kernel: [ 281.498799] RIP: e030:[8027c550] [8027c550] page_remove_rmap+0xfb/0x117
Oct 30 15:32:16 fw kernel: [ 281.498807] RSP: e02b:880003381bf8 EFLAGS: 00010246
Oct 30 15:32:16 fw kernel: [ 281.498810] RAX: RBX: 880001d25188 RCX: bdbd25f0
Oct 30 15:32:16 fw kernel: [ 281.498813] RDX: ff5f7000 RSI: 0001 RDI: 805aaab0
Oct 30 15:32:16 fw kernel: [ 281.498816] RBP: 880006974138 R08: 8800063ff1f0 R09: 880001e92501
Oct 30 15:32:16 fw kernel: [ 281.498819] R10: 0024 R11: 00300028 R12: 880001d25188
Oct 30 15:32:16 fw kernel: [ 281.498822] R13: 880006860a78 R14: 880006b9adc0 R15: 880001e92510
Oct 30 15:32:16 fw kernel: [ 281.498828] FS: 7fe2eca526f0() GS:80539000() knlGS:
Oct 30 15:32:16 fw kernel: [ 281.498831] CS: e033 DS: ES:
Oct 30 15:32:16 fw kernel: [ 281.498834] DR0: DR1: DR2:
Oct 30 15:32:16 fw kernel: [ 281.498837] DR3: DR6: 0ff0 DR7: 0400
Oct 30 15:32:16 fw kernel: [ 281.498840] Process aptitude (pid: 2892, threadinfo 88000338, task 88000621e280)
Oct 30 15:32:16 fw kernel: [ 281.498843] Stack: 71007500 0274f000 80273239
Oct 30 15:32:16 fw kernel: [ 281.498849] 88000321 880003381cf8
Oct 30 15:32:16 fw kernel: [ 281.498853] 880006974138 880003381d00 000a2f2a
Oct 30 15:32:16 fw kernel: [ 281.498857] Call Trace:
Oct 30 15:32:16 fw kernel: [ 281.498870] [80273239] ? unmap_vmas+0x744/0xa49
Oct 30 15:32:16 fw kernel: [ 281.498910] [80278567] ? exit_mmap+0x7b/0xf7
Oct 30 15:32:16 fw kernel: [ 281.498921] [8022a73d] ? mmput+0x2c/0xc0
Oct 30 15:32:16 fw kernel: [ 281.498929] [8022fef8] ? do_exit+0x25a/0x6ce
Oct 30 15:32:16 fw
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
On Thu, Oct 30, 2008 at 05:29:42PM +0100, Jean Charles Delepine wrote:
> Bastian Blank [EMAIL PROTECTED] wrote:
> > Please test the snapshots[1] tomorrow (2.6.26-10~snapshot.12362 or higher).
> Oct 30 15:29:21 fw kernel: [0.00] Linux version 2.6.26-1-xen-amd64 (Debian 2.6.26-10~snapshot.12361) ([EMAIL PROTECTED]) (gcc version 4.1.3 20080623 (prerelease) (Debian 4.1.2-23+1)) #1 SMP Thu Oct 30 03:57:26 UTC 2008

12361 < 12362!

Bastian

--
Even historians fail to learn from history -- they repeat the same mistakes.
-- John Gill, Patterns of Force, stardate 2534.7
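[Editorial sketch] The mismatch pointed out above (a kernel from snapshot 12361 was tested, while the workaround needs 12362 or higher) can be checked mechanically from the "Linux version" boot banner. The following Python snippet is only an illustration under stated assumptions: the helper names `snapshot_revision` and `is_new_enough` are hypothetical and not part of any Debian tool, and the regular expression assumes the `~snapshot.NNNNN` suffix seen in the banners quoted in this thread.

```python
import re

# Hypothetical helper, not part of any Debian tooling. It assumes the
# "~snapshot.NNNNN" suffix that appears in the boot banners quoted in this thread.
SNAPSHOT_RE = re.compile(r"~snapshot\.(\d+)")


def snapshot_revision(banner):
    """Return the snapshot revision as an int, or None if the banner has none."""
    match = SNAPSHOT_RE.search(banner)
    return int(match.group(1)) if match else None


def is_new_enough(banner, minimum=12362):
    """True if the banner carries a snapshot revision of at least `minimum`."""
    revision = snapshot_revision(banner)
    return revision is not None and revision >= minimum


# Shortened copies of the two banners that appear in the thread.
tested = "Linux version 2.6.26-1-xen-amd64 (Debian 2.6.26-10~snapshot.12361)"
wanted = "Linux version 2.6.26-1-xen-amd64 (Debian 2.6.26-10~snapshot.12362)"

print(is_new_enough(tested))  # False: 12361 is older than 12362
print(is_new_enough(wanted))  # True
```

On a running system the banner could be taken from the first line of the kernel log; a banner without a snapshot suffix (a regular package build) yields None and is reported as not new enough, which is a deliberate, conservative choice in this sketch.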
Bug#503821: linux-image-2.6.26-1-xen-amd64: Kernel crash in Dom0 (Eeek! page_mapcount(page) went negative! (-1))
Package: linux-image-2.6.26-1-xen-amd64
Version: 2.6.26-9
Severity: grave
Justification: renders package unusable

I'm trying to run Xen in Lenny with the new 2.6.26 kernel as Dom0. There seem to be some problems on the amd64 architecture. After approx. 10-20 hours of uptime, the Dom0 crashes (even if there is no DomU running) with the following error:

[50079.669383] Eeek! page_mapcount(page) went negative! (-1)
[50079.669383] page pfn = 5
[50079.669383] page-flags = 0
[50079.669383] page-count = 0
[50079.669383] page-mapping =
[50079.669383] vma-vm_ops = 0x0
[50079.669383] [ cut here ]
[50079.669383] kernel BUG at mm/rmap.c:673!
[50079.669383] invalid opcode: [1] SMP
[50079.669383] CPU 0
[50079.669383] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables bridge netloop video output ac battery microcode firmware_class nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc ipv6 xfs reiserfs ext2 sha256_generic aes_x86_64 aes_generic cbc dm_crypt crypto_blkcipher raid456 async_xor async_memcpy async_tx xor loop iTCO_wdt serio_raw i2c_i801 psmouse pcspkr i2c_core rng_core container button i3000_edac edac_core shpchp pci_hotplug evdev joydev ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod raid1 md_mod ide_cd_mod cdrom ide_pci_generic usbhid hid ff_memless piix ide_core ata_piix sd_mod floppy ata_generic ehci_hcd uhci_hcd sata_sil24 e1000e libata dock 3w_9xxx scsi_mod thermal processor fan thermal_sys
[50079.669383] Pid: 9197, comm: mutt Not tainted 2.6.26-1-xen-amd64 #1
[50079.669383] RIP: e030:[8027c550] [8027c550] page_remove_rmap+0xfb/0x117
[50079.669383] RSP: e02b:880074601dc8 EFLAGS: 00010246
[50079.669383] RAX: RBX: 880002359118 RCX: 51510001509d
[50079.669383] RDX: ff5f7000 RSI: 0001 RDI: 805aaab0
[50079.669383] RBP: 8800746d1918 R08: 0023 R09: 880074601800
[50079.669383] R10: R11: 014221337ed7 R12: 880002359118
[50079.669383] R13: 880014e61320 R14: 88007ff34b80 R15: 8800027eb548
[50079.669383] FS: 7f2b26ff4700() GS:80539000() knlGS:
[50079.669383] CS: e033 DS: ES:
[50079.669383] DR0: DR1: DR2:
[50079.669383] DR3: DR6: 0ff0 DR7: 0400
[50079.669383] Process mutt (pid: 9197, threadinfo 88007460, task 8800745929f0)
[50079.669383] Stack: 880014e610e8 5100 05e64000 80273239
[50079.669383] 88007475c000 880074601ec8
[50079.669383] 8800746d1918 880074601ed0 003b9000
[50079.669383] Call Trace:
[50079.669383] [80273239] ? unmap_vmas+0x744/0xa49
[50079.669383] [80278567] ? exit_mmap+0x7b/0xf7
[50079.669383] [8022a73d] ? mmput+0x2c/0xc0
[50079.669383] [8022fef8] ? do_exit+0x25a/0x6ce
[50079.669383] [80230412] ? do_group_exit+0xa6/0xdc
[50079.669383] [8020b528] ? system_call+0x68/0x6d
[50079.669383] [8020b4c0] ? system_call+0x0/0x6d
[50079.669383]
[50079.669383]
[50079.669383] Code: 80 e8 18 0c fd ff 48 8b 85 90 00 00 00 48 85 c0 74 19 48 8b 40 20 48 85 c0 74 10 48 8b 70 58 48 c7 c7 e1 52 4b 80 e8 f3 0b fd ff 0f 0b eb fe 8b 77 18 41 58 5b 5d 83 e6 01 f7 de 83 c6 04 e9 64
[50079.669383] RIP [8027c550] page_remove_rmap+0xfb/0x117
[50079.669383] RSP 880074601dc8
[50079.673388] ---[ end trace c445527cbda75056 ]---
[50079.673479] Fixing recursive fault but reboot is needed!

The same system is stable when running linux-image-2.6.26-1-amd64. I've also successfully run linux-image-2.6.26-1-xen-686 using the i386 architecture on the same hardware, so it seems to happen only in the Xen variants on the amd64 architecture.

I have no reliable way of forcing the error to occur. The best way I've found so far to trigger this crash is to do a couple of kernel recompilations, but sometimes I can do a couple of compile runs without a crash.

The hardware in this box is a Supermicro PDSME+ motherboard with an E6600 Core2Duo and 8 GB of RAM (ECC). I've run memtest86+ for over 24 hours with no problems reported.

I'm not sure if I should tag this as grave or critical, but I feel that it's currently impossible to run a Xen system on AMD64 with Debian Lenny.

-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.26-1-xen-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-xen-amd64 depends on:
ii initramfs-tools 0.92j tools for generating an initramfs
ii linux-modules-2.6.26-1-xen-am 2.6.26-9 Linux 2.6.26 modules on AMD64

linux-image-2.6.26-1-xen-amd64