> > > > > > I ended up with corrupted ext3 (see sparc.jpg) and working now to
> > > > > > restore it.
> > > > >
> > > > > > Up and running again. I managed to get some more info about
> > > > > > yesterday's crash from syslog:
> > > >
> > > > I've seen similar corruptions with an Intel SSD drive on an MPT Fusion
> > > > SAS interface, which also goes through scsi like your sym case here.
> > > >
> > > > But I'm pretty sure I also saw them with 2.6.28, were you able to run
> > > > 2.6.28 cleanly?
> > >
> > > > The thing is, the last time I ran this sparc was somewhere in the
> > > > 2.6.27~2.6.28 window, and I didn't see anything like it at that time.
> > > > I'll give 2.6.28 a try, but it might be hard to trigger as I don't
> > > > know what exactly caused it. Also, yesterday the very same kernel
> > > > (f3b8436a) worked under I/O load for hours just fine.
> > > > I'll leave the box busy on 2.6.28 for a few days and see what happens.
> >
> > This is vanilla 2.6.28 and it shows memory corruption problems as well.
> > After a few hours I got these:
> >
> > (see my comments at the end of mail)
> >
> > Jan 22 15:49:27 localhost kernel:
> > =============================================================================
> > Jan 22 15:49:27 localhost kernel: BUG tsb_16KB: Object padding overwritten
> > Jan 22 15:49:27 localhost kernel:
> > -----------------------------------------------------------------------------
> > Jan 22 15:49:27 localhost kernel:
> > Jan 22 15:49:27 localhost kernel: INFO:
> > 0xfffff800bdb7fcc0-0xfffff800bdb7fcff. First byte 0x0 instead of 0x5a
> > Jan 22 15:49:27 localhost kernel: INFO: Allocated in tsb_grow+0x88/0x440
> > age=19349783 cpu=0 pid=3212
> > Jan 22 15:49:27 localhost kernel: INFO: Freed in tsb_grow+0x2f0/0x440
> > age=19810433 cpu=0 pid=2823
> > Jan 22 15:49:27 localhost kernel: INFO: Slab 0x0000000202392680 objects=1
> > used=1 fp=0x0000000000000000 flags=0x2083
> > Jan 22 15:49:27 localhost kernel: INFO: Object 0xfffff800bdb78000 @offset=0
> > fp=0x0000000000000000
> > Jan 22 15:49:27 localhost kernel:
> > Jan 22 15:49:27 localhost kernel: Object 0xfffff800bdb78000: 00 00 40 00
> > 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b ..@.....kkkkkkkk
> [...]
> > Jan 22 15:49:28 localhost kernel: Padding 0xfffff800bdb7fff0: 5a 5a 5a 5a
> > 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
> > Jan 22 15:49:28 localhost kernel: Call Trace:
> > Jan 22 15:49:28 localhost kernel: [00000000004bde04]
> > print_trailer+0xc4/0x160
> > Jan 22 15:49:28 localhost kernel: [00000000004be4a8]
> > check_bytes_and_report+0xa8/0xe0
> > Jan 22 15:49:28 localhost kernel: [00000000004be6e0]
> > check_object+0x200/0x260
> > Jan 22 15:49:28 localhost kernel: [00000000004bee0c]
> > __slab_free+0x2ac/0x440
> > Jan 22 15:49:28 localhost kernel: [00000000004c1ce8]
> > kmem_cache_free+0x88/0xe0
> > Jan 22 15:49:28 localhost kernel: [00000000004448e0]
> > destroy_context+0x20/0xa0
> > Jan 22 15:49:28 localhost kernel: [000000000045a260] __mmdrop+0x80/0xc0
> > Jan 22 15:49:28 localhost kernel: [00000000004552e8]
> > finish_task_switch+0x88/0xc0
> > Jan 22 15:49:28 localhost kernel: [00000000006bf874]
> > switch_to_pc+0x88/0x4d4
> > Jan 22 15:49:28 localhost kernel: [00000000006c11a4]
> > schedule_hrtimeout_range+0xc4/0xe0
> > Jan 22 15:49:28 localhost kernel: [00000000004d3338] do_select+0x3b8/0x4c0
> > Jan 22 15:49:28 localhost kernel: [00000000004fb300]
> > compat_core_sys_select+0x160/0x200
> > Jan 22 15:49:28 localhost kernel: [00000000004fce60]
> > compat_sys_select+0x20/0x100
> > Jan 22 15:49:28 localhost kernel: [0000000000406254]
> > linux_sparc_syscall32+0x34/0x40
> > Jan 22 15:49:28 localhost kernel: FIX tsb_16KB: Restoring
> > 0xfffff800bdb7fcc0-0xfffff800bdb7fcff=0x5a
> > Jan 22 15:49:28 localhost kernel:
> > Jan 22 15:55:40 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:02:21 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:09:02 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:15:43 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:22:24 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:29:05 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:35:46 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:42:27 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:45:40 localhost rc-scripts: WARNING: sshd has already been
> > started.
> > Jan 22 16:48:50 localhost kernel:
> > =============================================================================
> > Jan 22 16:48:50 localhost kernel: BUG kmalloc-64: Redzone overwritten
> > Jan 22 16:48:50 localhost kernel:
> > -----------------------------------------------------------------------------
> > Jan 22 16:48:50 localhost kernel:
> > Jan 22 16:48:50 localhost kernel: INFO:
> > 0xfffff800bdd526a0-0xfffff800bdd526a7. First byte 0x0 instead of 0xbb
> > Jan 22 16:48:50 localhost kernel: INFO: Freed in
> > free_rb_tree_fname+0x48/0xc0 age=104208 cpu=0 pid=5636
> > Jan 22 16:48:50 localhost kernel: INFO: Slab 0x0000000202397f60 objects=60
> > used=59 fp=0xfffff800bdd52660 flags=0x00c3
> > Jan 22 16:48:50 localhost kernel: INFO: Object 0xfffff800bdd52660
> > @offset=1632 fp=0x0000000000000000
> > Jan 22 16:48:50 localhost kernel:
> > Jan 22 16:48:50 localhost kernel: Bytes b4 0xfffff800bdd52650: 00 00 00 00
> > 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
> > Jan 22 16:48:50 localhost kernel: Object 0xfffff800bdd52660: 6b 6b 6b 6b
> > 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> > Jan 22 16:48:50 localhost kernel: Object 0xfffff800bdd52670: 6b 6b 6b 6b
> > 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
> > Jan 22 16:48:50 localhost kernel: Object 0xfffff800bdd52680: 00 00 00 00
> > 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > Jan 22 16:48:50 localhost kernel: Object 0xfffff800bdd52690: 00 00 00 00
> > 00 00 00 00 00 00 00 00 00 00 00 00 ................
> > Jan 22 16:48:50 localhost kernel: Redzone 0xfffff800bdd526a0: 00 00 00 00
> > 00 00 00 00 ........
> > Jan 22 16:48:50 localhost kernel: Padding 0xfffff800bdd526e0: 5a 5a 5a 5a
> > 5a 5a 5a 5a ZZZZZZZZ
> > Jan 22 16:48:50 localhost kernel: Call Trace:
> > Jan 22 16:48:50 localhost kernel: [00000000004bde04]
> > print_trailer+0xc4/0x160
> > Jan 22 16:48:50 localhost kernel: [00000000004be4a8]
> > check_bytes_and_report+0xa8/0xe0
> > Jan 22 16:48:50 localhost kernel: [00000000004be528]
> > check_object+0x48/0x260
> > Jan 22 16:48:50 localhost kernel: [00000000004bfadc]
> > __slab_alloc+0x59c/0x720
> > Jan 22 16:48:50 localhost kernel: [00000000004bfef4] __kmalloc+0xf4/0x120
> > Jan 22 16:48:50 localhost kernel: [0000000000519258]
> > ext3_htree_store_dirent+0x18/0x160
> > Jan 22 16:48:50 localhost kernel: [00000000005232d0]
> > htree_dirblock_to_tree+0x170/0x1e0
> > Jan 22 16:48:50 localhost kernel: [0000000000523394]
> > ext3_htree_fill_tree+0x54/0x260
> > Jan 22 16:48:50 localhost kernel: [0000000000519900]
> > ext3_readdir+0x560/0x660
> > Jan 22 16:48:50 localhost kernel: [00000000004d22f8] vfs_readdir+0x78/0xc0
> > Jan 22 16:48:50 localhost kernel: [00000000004d2370]
> > sys_getdents64+0x30/0xa0
> > Jan 22 16:48:50 localhost kernel: [0000000000406254]
> > linux_sparc_syscall32+0x34/0x40
> > Jan 22 16:48:50 localhost kernel: FIX kmalloc-64: Restoring
> > 0xfffff800bdd526a0-0xfffff800bdd526a7=0xbb
> > Jan 22 16:48:50 localhost kernel:
> > Jan 22 16:48:50 localhost kernel: FIX kmalloc-64: Marking all objects used
> > Jan 22 16:49:09 localhost init: Id "s0" respawning too fast: disabled for 5
> > minutes
> > Jan 22 16:49:14 localhost kernel: SysRq : Emergency Sync
> > Jan 22 16:49:14 localhost kernel: Emergency Sync complete
> >
> > I left the box doing some compilation etc. Somewhere around 16:45 I
> > couldn't log into it via ssh, so I logged into the box on a few
> > terminals and started typing something, and the terminals started
> > freezing one after another. At the end the buzzer was making a constant
> > sound and then the keyboard stopped working. At that point I considered
> > the box dead, so I restarted it manually. I'll push it some more to see
> > if/what else pops out.
>
> Another one:
>
> =============================================================================
> BUG nfs_inode_cache: Redzone overwritten
> -----------------------------------------------------------------------------
>
> INFO: 0xfffff800affe7b78-0xfffff800affe7b7f. First byte 0x30 instead of 0xcc
> INFO: Allocated in nfs_alloc_inode+0x10/0x40 age=11324173 cpu=0 pid=4471
> INFO: Slab 0x00000002020ffa00 objects=22 used=22 fp=0x0000000000000000
> flags=0x2083
> INFO: Object 0xfffff800affe7620 @offset=30240 fp=0x0000000000000000
>
> Bytes b4 0xfffff800affe7610: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a
> ........ZZZZZZZZ
> Object 0xfffff800affe7620: 00 00 00 00 01 f4 c4 89 00 24 01 00 07 01 9a 44
> .....ôÄ..$.....D
> Object 0xfffff800affe7630: 6b 01 00 00 00 00 58 c4 5a 3f 85 9e 41 3f ae b8
> k.....XÄZ?..A?®¸
> Object 0xfffff800affe7640: 1d 46 52 42 55 be 89 c4 f4 01 b8 cd d9 ad 5a 5a
> .FRBU¾.Äô.¸ÍÙZZ
> Object 0xfffff800affe7650: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe7660: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe7670: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe7680: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe7690: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe76a0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZZZZZZZZZ
> Object 0xfffff800affe76b0: 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 02
> ................
> Object 0xfffff800affe76c0: 00 00 00 01 00 02 8a bd 00 00 00 00 00 00 75 30
> .......½......u0
> Object 0xfffff800affe76d0: 00 00 00 01 00 02 88 b7 00 00 00 00 00 00 00 01
> .......·........
> Object 0xfffff800affe76e0: 00 00 00 00 00 00 36 b8 5a 5a 5a 5a 5a 5a 5a 5a
> ......6¸ZZZZZZZZ
> Object 0xfffff800affe76f0: 00 00 00 00 00 00 00 00 ff ff f8 00 af fe 76 f8
> ........ÿÿø.¯þvø
> Object 0xfffff800affe7700: ff ff f8 00 af fe 76 f8 ff ff f8 00 af fe 77 08
> ÿÿø.¯þvøÿÿø.¯þw.
> Object 0xfffff800affe7710: ff ff f8 00 af fe 77 08 00 00 00 00 00 00 00 00
> ÿÿø.¯þw.........
> Object 0xfffff800affe7720: 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7730: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7740: ff ff f8 00 af fe 77 40 ff ff f8 00 af fe 77 40
> ÿÿø.¯þw@ÿÿø.¯þw@
> Object 0xfffff800affe7750: 00 00 00 01 5a 5a 5a 5a 00 00 00 00 00 00 00 00
> ....ZZZZ........
> Object 0xfffff800affe7760: 00 5a 5a 5a 5a 5a 5a 5a de ad 4e ad ff ff ff ff
> .ZZZZZZZÞNÿÿÿÿ
> Object 0xfffff800affe7770: ff ff ff ff ff ff ff ff 00 00 00 00 00 92 a6 30
> ÿÿÿÿÿÿÿÿ......¦0
> Object 0xfffff800affe7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 75 b3 c8
> .............u³È
> Object 0xfffff800affe7790: ff ff f8 00 af fe 77 90 ff ff f8 00 af fe 77 90
> ÿÿø.¯þw.ÿÿø.¯þw.
> Object 0xfffff800affe77a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe77b0: 00 00 00 00 00 10 01 00 00 00 00 00 00 20 02 00
> ................
> Object 0xfffff800affe77c0: ff ff f8 00 af fe 77 c0 ff ff f8 00 af fe 77 c0
> ÿÿø.¯þwÀÿÿø.¯þwÀ
> Object 0xfffff800affe77d0: ff ff f8 00 af fe 77 d0 ff ff f8 00 af fe 77 d0
> ÿÿø.¯þwÐÿÿø.¯þwÐ
> Object 0xfffff800affe77e0: 00 00 00 00 01 f4 c4 89 00 00 00 00 00 00 00 02
> .....ôÄ.........
> Object 0xfffff800affe77f0: 00 00 03 e8 00 00 03 e8 00 00 00 00 00 00 00 00
> ...è...è........
> Object 0xfffff800affe7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 00
> ................
> Object 0xfffff800affe7810: 00 00 00 00 48 ea 74 ac 00 00 00 00 00 00 00 00
> ....Hêt¬........
> Object 0xfffff800affe7820: 00 00 00 00 49 76 fe 74 00 00 00 00 00 00 00 00
> ....Ivþt........
> Object 0xfffff800affe7830: 00 00 00 00 49 76 fe 74 00 00 00 00 00 00 00 00
> ....Ivþt........
> Object 0xfffff800affe7840: 00 00 00 12 00 00 00 00 00 00 00 00 00 00 00 08
> ................
> Object 0xfffff800affe7850: 00 00 41 ed 00 00 00 00 00 00 00 00 00 00 00 00
> ..Aí............
> Object 0xfffff800affe7860: de ad 4e ad ff ff ff ff ff ff ff ff ff ff ff ff
> ÞNÿÿÿÿÿÿÿÿÿÿÿÿ
> Object 0xfffff800affe7870: 00 00 00 00 00 7b be 88 00 00 00 00 00 00 00 00
> .....{¾.........
> Object 0xfffff800affe7880: 00 00 00 00 00 76 4b f0 00 00 00 01 00 00 00 00
> .....vKð........
> Object 0xfffff800affe7890: 00 00 00 00 00 00 00 00 de ad 4e ad ff ff ff ff
> ........ÞNÿÿÿÿ
> Object 0xfffff800affe78a0: ff ff ff ff ff ff ff ff 00 00 00 00 00 92 a6 60
> ÿÿÿÿÿÿÿÿ......¦`
> Object 0xfffff800affe78b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 75 b4 c0
> .............u´À
> Object 0xfffff800affe78c0: ff ff f8 00 af fe 78 c0 ff ff f8 00 af fe 78 c0
> ÿÿø.¯þxÀÿÿø.¯þxÀ
> Object 0xfffff800affe78d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe78e0: ff ff f8 00 af fe 78 88 00 00 00 00 00 7b be 98
> ÿÿø.¯þx......{¾.
> Object 0xfffff800affe78f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 76 4c 70
> .............vLp
> Object 0xfffff800affe7900: 00 69 7a 38 00 69 78 d8 00 00 00 00 00 00 00 00
> .iz8.ixØ........
> Object 0xfffff800affe7910: 00 69 78 f8 00 00 00 00 00 5f 38 d0 00 71 dc c0
> .ixø....._8Ð.qÜÀ
> Object 0xfffff800affe7920: 00 69 79 38 00 71 dc e0 00 00 00 00 00 00 00 00
> .iy8.qÜà........
> Object 0xfffff800affe7930: 00 69 79 18 00 00 00 02 00 5f 38 d0 00 69 79 18
> .iy......_8Ð.iy.
> Object 0xfffff800affe7940: ff ff f8 00 af fe 79 38 00 00 00 00 00 7b be a0
> ÿÿø.¯þy8.....{¾.
> Object 0xfffff800affe7950: 00 00 00 00 00 00 00 00 00 00 00 00 00 76 4c 50
> .............vLP
> Object 0xfffff800affe7960: 00 00 00 00 00 6d 5e 38 00 00 00 00 00 6d 5c a8
> .....m^8.....m\¨
> Object 0xfffff800affe7970: ff ff f8 00 bd 84 5b 18 00 00 00 00 00 00 00 00
> ÿÿø.½.[.........
> Object 0xfffff800affe7980: ff ff f8 00 af fe 79 88 ff ff f8 00 af fe 77 a0
> ÿÿø.¯þy.ÿÿø.¯þw.
> Object 0xfffff800affe7990: 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe79a0: 00 00 00 00 00 00 00 00 de ad 4e ad ff ff ff ff
> ........ÞNÿÿÿÿ
> Object 0xfffff800affe79b0: ff ff ff ff ff ff ff ff 00 00 00 00 00 e7 1d 70
> ÿÿÿÿÿÿÿÿ.....ç.p
> Object 0xfffff800affe79c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 76 4c d0
> .............vLÐ
> Object 0xfffff800affe79d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe79e0: 00 01 00 01 00 00 00 00 ff ff f8 00 af fe 79 e8
> ........ÿÿø.¯þyè
> Object 0xfffff800affe79f0: ff ff f8 00 af fe 79 e8 00 00 00 00 00 00 00 00
> ÿÿø.¯þyè........
> Object 0xfffff800affe7a00: de ad 4e ad ff ff ff ff ff ff ff ff ff ff ff ff
> ÞNÿÿÿÿÿÿÿÿÿÿÿÿ
> Object 0xfffff800affe7a10: 00 00 00 00 00 e7 1d 68 00 00 00 00 00 00 00 00
> .....ç.h........
> Object 0xfffff800affe7a20: 00 00 00 00 00 76 4c f0 00 00 00 00 00 00 00 00
> .....vLð........
> Object 0xfffff800affe7a30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7a40: 00 00 00 00 00 e7 1c c8 00 00 00 00 00 12 00 d2
> .....ç.È.......Ò
> Object 0xfffff800affe7a50: 00 00 00 00 00 7b 46 40 00 00 00 00 00 00 00 00
> .....{F@........
> Object 0xfffff800affe7a60: de ad 4e ad ff ff ff ff ff ff ff ff ff ff ff ff
> ÞNÿÿÿÿÿÿÿÿÿÿÿÿ
> Object 0xfffff800affe7a70: 00 00 00 00 00 e7 1d 60 00 00 00 00 00 00 00 00
> .....ç.`........
> Object 0xfffff800affe7a80: 00 00 00 00 00 76 4d 10 ff ff f8 00 af fe 7a 88
> .....vM.ÿÿø.¯þz.
> Object 0xfffff800affe7a90: ff ff f8 00 af fe 7a 88 00 00 00 00 00 00 00 00
> ÿÿø.¯þz.........
> Object 0xfffff800affe7aa0: ff ff f8 00 af fe 7a a0 ff ff f8 00 af fe 7a a0
> ÿÿø.¯þz.ÿÿø.¯þz.
> Object 0xfffff800affe7ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7ad0: ff ff f8 00 af fe 7a d0 ff ff f8 00 af fe 7a d0
> ÿÿø.¯þzÐÿÿø.¯þzÐ
> Object 0xfffff800affe7ae0: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00
> ................
> Object 0xfffff800affe7af0: de ad 4e ad ff ff ff ff ff ff ff ff ff ff ff ff
> ÞNÿÿÿÿÿÿÿÿÿÿÿÿ
> Object 0xfffff800affe7b00: 00 00 00 00 00 92 a6 60 00 00 00 00 00 00 00 00
> ......¦`........
> Object 0xfffff800affe7b10: 00 00 00 00 00 75 b4 c0 ff ff f8 00 af fe 7b 18
> .....u´Àÿÿø.¯þ{.
> Object 0xfffff800affe7b20: ff ff f8 00 af fe 7b 18 00 00 00 00 00 00 00 00
> ÿÿø.¯þ{.........
> Object 0xfffff800affe7b30: 00 00 00 00 00 00 00 00 ff ff f8 00 af fe 7a e0
> ........ÿÿø.¯þzà
> Object 0xfffff800affe7b40: 00 00 00 00 00 06 02 00 30 7f fc 00 00 00 c0 00
> ........0.ü...À.
> Object 0xfffff800affe7b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40
> ...............@
> Object 0xfffff800affe7b60: 00 00 01 0c 00 00 00 09 00 00 00 24 00 00 00 07
> ...........$....
> Object 0xfffff800affe7b70: 00 00 00 00 00 06 02 00
> ........
> Redzone 0xfffff800affe7b78: 30 7f fc 00 00 00 40 00
> 0.ü...@.
> Padding 0xfffff800affe7bb8: 5a 5a 5a 5a 5a 5a 5a 5a
> ZZZZZZZZ
> Call Trace:
> [00000000004bde04] print_trailer+0xc4/0x160
> [00000000004be4a8] check_bytes_and_report+0xa8/0xe0
> [00000000004be528] check_object+0x48/0x260
> [00000000004bee0c] __slab_free+0x2ac/0x440
> [00000000004c1ce8] kmem_cache_free+0x88/0xe0
> [00000000005456ec] nfs_destroy_inode+0xc/0x20
> [00000000004d75cc] destroy_inode+0x2c/0x60
> [00000000004d7cec] dispose_list+0xac/0x100
> [00000000004d8614] shrink_icache_memory+0x1d4/0x2e0
> [00000000004a29cc] shrink_slab+0x16c/0x200
> [00000000004a2df8] kswapd+0x398/0x5a0
> [000000000047381c] kthread+0x3c/0x80
> [0000000000427010] kernel_thread+0x30/0x60
> [0000000000473b10] kthreadd+0x170/0x200
> FIX nfs_inode_cache: Restoring 0xfffff800affe7b78-0xfffff800affe7b7f=0xcc
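For anyone reading along, here is a small sketch that puts the three SLUB reports above side by side. The marker values are the ones from include/linux/poison.h; the addresses are copied from the reports. The names `SLUB_MARKERS`, `REPORTS` and `summarize` are made up for this illustration and are not kernel interfaces.

```python
# SLUB debug marker bytes, as defined in include/linux/poison.h.
SLUB_MARKERS = {
    0x5A: "POISON_INUSE (padding / uninitialised object)",
    0x6B: "POISON_FREE (freed object)",
    0xA5: "POISON_END (last byte of a poisoned object)",
    0xBB: "SLUB_RED_INACTIVE (redzone of a free object)",
    0xCC: "SLUB_RED_ACTIVE (redzone of an allocated object)",
}

# (cache, object address, first bad byte, last bad byte, expected marker),
# copied from the "INFO:" lines of the three reports.
REPORTS = [
    ("tsb_16KB", 0xFFFFF800BDB78000,
     0xFFFFF800BDB7FCC0, 0xFFFFF800BDB7FCFF, 0x5A),
    ("kmalloc-64", 0xFFFFF800BDD52660,
     0xFFFFF800BDD526A0, 0xFFFFF800BDD526A7, 0xBB),
    ("nfs_inode_cache", 0xFFFFF800AFFE7620,
     0xFFFFF800AFFE7B78, 0xFFFFF800AFFE7B7F, 0xCC),
]


def summarize(report):
    """Return where the overwritten range sits relative to the object start."""
    cache, obj, first, last, expected = report
    return {
        "cache": cache,
        "offset": first - obj,          # bytes from object start to first bad byte
        "length": last - first + 1,     # size of the overwritten range
        "marker": SLUB_MARKERS[expected],
    }


if __name__ == "__main__":
    for report in REPORTS:
        s = summarize(report)
        print(f'{s["cache"]}: {s["length"]} bytes at object+{s["offset"]:#x}, '
              f'expected {s["marker"]}')
```

For kmalloc-64 the bad range starts at offset 64, i.e. the first byte past the object, and for nfs_inode_cache it starts at offset 0x558. In the tsb_16KB case the range lies well past the 16KB object, in the slab padding.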
And ... boom :)
Unable to handle kernel NULL pointer dereference
tsk->{mm,active_mm}->context = 00000000000004b2
tsk->{mm,active_mm}->pgd = fffff8009848e000
\|/ ____ \|/
"@'/ .. \`@"
/_| \__/ |_\
\__U_/
kswapd0(156): Oops [#1]
TSTATE: 0000009980f09605 TPC: 0000000000480094 TNPC: 0000000000480098 Y:
00000000 Not tainted
TPC: <__lock_acquire+0x34/0xb00>
g0: fffff800bf759090 g1: 000000000000000f g2: 0000000000000001 g3:
fffff800bf348000
g4: fffff800bf331b00 g5: fffff800bf6d6000 g6: fffff800bf348000 g7:
0000000000000050
o0: 0000000000002000 o1: 00000000004a0cac o2: fff8000000000000 o3:
fffff800bf331b00
o4: 0000000000000000 o5: 000000000099a760 sp: fffff800bf34b061 ret_pc:
00000000004a4564
RPC: <__dec_zone_state+0x4/0xa0>
l0: 0000000000000494 l1: fffff80080002000 l2: 0000000000000000 l3:
0000080000000000
l4: fffff800bf331b00 l5: 00000000007c0c00 l6: 0000000000800000 l7:
00000000007a6000
i0: 00000000000000e8 i1: 0000000000000000 i2: 0000000000000000 i3:
0000000000000000
i4: 0000000000000001 i5: 0000000000000000 i6: fffff800bf34b121 i7:
0000000000481c04
I7: <lock_acquire+0x44/0x60>
Caller[0000000000481c04]: lock_acquire+0x44/0x60
Caller[00000000006c25e4]: _spin_lock+0x24/0x40
Caller[00000000004e5f94]: remove_inode_buffers+0x34/0xc0
Caller[00000000004d863c]: shrink_icache_memory+0x1fc/0x2e0
Caller[00000000004a29cc]: shrink_slab+0x16c/0x200
Caller[00000000004a2df8]: kswapd+0x398/0x5a0
Caller[000000000047381c]: kthread+0x3c/0x80
Caller[0000000000427010]: kernel_thread+0x30/0x60
Caller[0000000000473b10]: kthreadd+0x170/0x200
Instruction DUMP: 01000000 2ace408c c25e0000 <e65e2008> 22c4c089 c25e0000
f20524b8 80a6602f 184000da
note: kswapd0[156] exited with preempt_count 1
BUG: sleeping function called from invalid context at
/home/mako/linux/lkt/sources/linux-2.6/kernel/nsproxy.c:217
in_atomic(): 1, irqs_disabled(): 0, pid: 156, name: kswapd0
INFO: lockdep is turned off.
Call Trace:
[0000000000453654] __might_sleep+0xd4/0x120
[000000000047814c] switch_task_namespaces+0xc/0x60
[00000000004781a8] exit_task_namespaces+0x8/0x20
[000000000046010c] do_exit+0x4ec/0x8a0
[0000000000429190] die_if_kernel+0x150/0x300
[0000000000445030] unhandled_fault+0x70/0xe0
[00000000004452f8] do_sparc64_fault+0x1d8/0x5c0
[000000000040796c] sparc64_realfault_common+0x10/0x20
[0000000000480094] __lock_acquire+0x34/0xb00
[0000000000481c04] lock_acquire+0x44/0x60
[00000000006c25e4] _spin_lock+0x24/0x40
[00000000004e5f94] remove_inode_buffers+0x34/0xc0
[00000000004d863c] shrink_icache_memory+0x1fc/0x2e0
[00000000004a29cc] shrink_slab+0x16c/0x200
[00000000004a2df8] kswapd+0x398/0x5a0
[000000000047381c] kthread+0x3c/0x80
So (in reverse order) with gdb we get some more information:
(gdb) l *shrink_icache_memory+0x1fc
0x4d863c is in shrink_icache_memory
(/home/mako/linux/lkt/sources/linux-2.6/fs/inode.c:430).
425 continue;
426 }
427 if (inode_has_buffers(inode) || inode->i_data.nrpages) {
428 __iget(inode);
429 spin_unlock(&inode_lock);
430 if (remove_inode_buffers(inode))
431 reap +=
invalidate_mapping_pages(&inode->i_data,
432 0, -1);
433 iput(inode);
434 spin_lock(&inode_lock);
(gdb) l *remove_inode_buffers+0x34
0x4e5f94 is in remove_inode_buffers
(/home/mako/linux/lkt/sources/linux-2.6/fs/buffer.c:898).
893 if (inode_has_buffers(inode)) {
894 struct address_space *mapping = &inode->i_data;
895 struct list_head *list = &mapping->private_list;
896 struct address_space *buffer_mapping =
mapping->assoc_mapping;
897
898 spin_lock(&buffer_mapping->private_lock);
899 while (!list_empty(list)) {
900 struct buffer_head *bh = BH_ENTRY(list->next);
901 if (buffer_dirty(bh)) {
902 ret = 0;
(gdb) l *_spin_lock+0x24
0x6c25e4 is in _spin_lock
(/home/mako/linux/lkt/sources/linux-2.6/kernel/spinlock.c:180).
175 EXPORT_SYMBOL(_write_lock_bh);
176
177 void __lockfunc _spin_lock(spinlock_t *lock)
178 {
179 preempt_disable();
180 spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
181 LOCK_CONTENDED(lock, _raw_spin_trylock, _raw_spin_lock);
182 }
183
184 EXPORT_SYMBOL(_spin_lock);
(gdb) l *lock_acquire+0x44
0x481c04 is in lock_acquire
(/home/mako/linux/lkt/sources/linux-2.6/kernel/lockdep.c:2941).
2936
2937 raw_local_irq_save(flags);
2938 check_flags(flags);
2939
2940 current->lockdep_recursion = 1;
2941 __lock_acquire(lock, subclass, trylock, read, check,
2942 irqs_disabled_flags(flags), nest_lock, ip);
2943 current->lockdep_recursion = 0;
2944 raw_local_irq_restore(flags);
2945 }
(gdb) l *__lock_acquire+0x34
0x480094 is in __lock_acquire
(/home/mako/linux/lkt/sources/linux-2.6/kernel/lockdep.c:2546).
2541 printk("turning off the locking correctness
validator.\n");
2542 return 0;
2543 }
2544
2545 if (!subclass)
2546 class = lock->class_cache; <---- boom
2547 /*
2548 * Not cached yet or subclass?
2549 */
2550 if (unlikely(!class)) {
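As a cross-check of the gdb result, the faulting instruction can be decoded by hand. The sketch below assumes that the word in angle brackets in the "Instruction DUMP" line (e65e2008) is the instruction at TPC, and it only handles the SPARC V9 format-3 load layout (op, rd, op3, rs1, i, simm13). The helper name `decode_format3` is invented for this example.

```python
# SPARC integer register numbering: 0-7 globals, 8-15 outs, 16-23 locals, 24-31 ins.
REG_NAMES = [f"%{bank}{n}" for bank in "goli" for n in range(8)]

# Only the one load opcode needed here; op3 0x0b with op == 3 is LDX.
LOAD_OP3 = {0x0B: "ldx"}


def decode_format3(word):
    """Split a 32-bit SPARC V9 format-3 instruction word into its fields."""
    fields = {
        "op": (word >> 30) & 0x3,
        "rd": REG_NAMES[(word >> 25) & 0x1F],
        "op3": (word >> 19) & 0x3F,
        "rs1": REG_NAMES[(word >> 14) & 0x1F],
        "i": (word >> 13) & 0x1,
    }
    if fields["i"]:
        simm13 = word & 0x1FFF
        if simm13 & 0x1000:             # sign-extend the 13-bit immediate
            simm13 -= 0x2000
        fields["simm13"] = simm13
    else:
        fields["rs2"] = REG_NAMES[word & 0x1F]
    return fields


if __name__ == "__main__":
    insn = decode_format3(0xE65E2008)   # the <...> word from the oops
    mnemonic = LOAD_OP3.get(insn["op3"], "?")
    print(f'{mnemonic} [{insn["rs1"]} + {insn["simm13"]}], {insn["rd"]}')

    i0 = 0xE8                           # i0 value from the register dump
    print(f'effective address: {i0 + insn["simm13"]:#x}')
```

This comes out as a load from %i0 + 8 with i0 = 0xe8, so the access is to address 0xf0. That matches reading class_cache (the second pointer in struct lockdep_map) through a lock pointer of 0xe8. It would also be consistent with buffer_mapping in remove_inode_buffers() being NULL, with 0xe8 being the offset of private_lock.dep_map inside struct address_space, but I have not verified that offset.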
Hm... and there are processes hanging in uninterruptible sleep:
This would be the bash in which I ran echo 3 > /proc/sys/vm/drop_caches
bash D 00000000004d84a0 0 17499 1
Call Trace:
[00000000006c0a70] mutex_lock_nested+0x110/0x320
[00000000004d84a0] shrink_icache_memory+0x60/0x2e0
[00000000004a29cc] shrink_slab+0x16c/0x200
[00000000004e2a6c] drop_caches_sysctl_handler+0x4c/0x220
[000000000050f998] proc_sys_call_handler+0x78/0xa0
[000000000050f9d4] proc_sys_write+0x14/0x40
[00000000004c4bec] vfs_write+0x6c/0x120
[00000000004c506c] sys_write+0x2c/0x60
[0000000000406254] linux_sparc_syscall32+0x34/0x40
And this one would be the sparc-unknown-gnu-linux-gcc compiler running from
emerge world.
sparc-unknown D 00000000004d84a0 0 16529 1
Call Trace:
[00000000006c0a70] mutex_lock_nested+0x110/0x320
[00000000004d84a0] shrink_icache_memory+0x60/0x2e0
[00000000004a29cc] shrink_slab+0x16c/0x200
[00000000004a3224] try_to_free_pages+0x224/0x360
[000000000049ad54] __alloc_pages_internal+0x194/0x440
[00000000004bfb9c] __slab_alloc+0x65c/0x720
[00000000004bffbc] kmem_cache_alloc+0x9c/0xc0
[00000000004449e8] tsb_grow+0x88/0x440
[00000000004455dc] do_sparc64_fault+0x4bc/0x5c0
[000000000040796c] sparc64_realfault_common+0x10/0x20
Both hang in the same place, and it looks like fallout from the NULL pointer
dereference: the thread that oopsed (kswapd0) exited while still holding
iprune_mutex, so nobody else can take it any more.
# cat /proc/17499/wchan
shrink_icache_memory
# cat /proc/16529/wchan
shrink_icache_memory
# cat /proc/17499/stat
17499 (bash) D 1 17499 17499 0 -1 4194560 776 2910 0 7 5 294 22 97 20 0 1 0
1878410 3874816 267 18446744073709551615 65536 872296 4289715792 4289713784
4158566936 0 0 3293188 2072526587 5080224 0 0 20 0 0 0 0 0 0
# cat /proc/16529/stat
16529 (sparc-unknown-l) D 1 21719 4519 34818 4519 4196608 701 0 0 0 2 10 0 0 20
0 1 0 1966196 5201920 385 18446744073709551615 65536 404580 4292198832
4292198184 1880286944 0 0 16777216 0 5080224 0 0 20 2 0 0 0 0 0
The wchan field (field 35) in both stat lines is 5080224 -> 0x4D84A0
and that falls somewhere within shrink_icache_memory()
# grep shrink_icache /proc/kallsyms -A1
00000000004d8440 t shrink_icache_memory
00000000004d8720 T inode_init_once
offset is 0x4D84A0 - 0x4d8440 = 0x60
and that points to:
(gdb) l *shrink_icache_memory+0x60
0x4d84a0 is in shrink_icache_memory
(/home/mako/linux/lkt/sources/linux-2.6/fs/inode.c:413).
408 LIST_HEAD(freeable);
409 int nr_pruned = 0;
410 int nr_scanned;
411 unsigned long reap = 0;
412
413 mutex_lock(&iprune_mutex); <---- here
414 spin_lock(&inode_lock);
415 for (nr_scanned = 0; nr_scanned < nr_to_scan; nr_scanned++) {
416 struct inode *inode;
417
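The manual wchan lookup above can also be scripted. This is only a sketch: the stat lines and the kallsyms excerpt are pasted in from this mail rather than read from /proc, and `wchan_from_stat` and `resolve` are names invented for the example.

```python
# /proc/<pid>/stat lines for the two hung tasks, copied from above.
STAT_LINES = [
    "17499 (bash) D 1 17499 17499 0 -1 4194560 776 2910 0 7 5 294 22 97 20 0 "
    "1 0 1878410 3874816 267 18446744073709551615 65536 872296 4289715792 "
    "4289713784 4158566936 0 0 3293188 2072526587 5080224 0 0 20 0 0 0 0 0 0",
    "16529 (sparc-unknown-l) D 1 21719 4519 34818 4519 4196608 701 0 0 0 2 10 "
    "0 0 20 0 1 0 1966196 5201920 385 18446744073709551615 65536 404580 "
    "4292198832 4292198184 1880286944 0 0 16777216 0 5080224 0 0 20 2 0 0 0 0 0",
]

# Output of: grep shrink_icache /proc/kallsyms -A1
KALLSYMS = """\
00000000004d8440 t shrink_icache_memory
00000000004d8720 T inode_init_once
"""


def wchan_from_stat(line):
    """Return the wchan value (field 35) from a /proc/<pid>/stat line."""
    # comm (field 2) may contain spaces, so split after the closing ')'.
    rest = line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 35 is at index 32.
    return int(rest[32])


def resolve(addr, kallsyms_text):
    """Map an address to (symbol, offset) using kallsyms-style text."""
    symbols = []
    for row in kallsyms_text.splitlines():
        parts = row.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    best = None
    for start, name in sorted(symbols):
        if start > addr:
            break
        best = (name, addr - start)     # last symbol starting at or below addr
    return best


if __name__ == "__main__":
    for line in STAT_LINES:
        wchan = wchan_from_stat(line)
        name, offset = resolve(wchan, KALLSYMS)
        print(f"{line.split()[0]}: wchan={wchan:#x} -> {name}+{offset:#x}")
```

Both tasks resolve to shrink_icache_memory+0x60, the same address gdb maps to the mutex_lock(&iprune_mutex) line.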
I'm not an expert, but it seems that the corruption hits random memory areas,
and thus the system dies in many different wonderful ways ;) Although this
might be just a coincidence.
Hope that helps,
Mariusz