On Wed, May 30, 2007 at 09:09:45AM -0700, Andrew Morton wrote: > On Wed, 30 May 2007 19:01:09 +0300 (EEST) Tero Roponen <[EMAIL PROTECTED]> > wrote: > > > On Wed, 30 May 2007, Andrew Morton wrote: > > > > > On Wed, 30 May 2007 15:02:49 +0300 (EEST) Tero Roponen <[EMAIL > > > PROTECTED]> wrote: > > > > > > > On Wed, 30 May 2007, Pekka Enberg wrote: > > > > > > > > > On 5/30/07, Tero Roponen <[EMAIL PROTECTED]> wrote: > > > > > > Hmmm, I just found something interesting. In 2.6.21.3 the /sbin/init > > > > > > gets corrupted when I watch the video! > > > > > > > > > > > > $ cp /sbin/init init.before > > > > > > $ mplayer kiwi.flv > > > > > > $ cp /sbin/init init.after > > > > > > > > > > > > The sha1sums are here: > > > > > > > > > > > > 52c8d643057619cbe137b8e69d4709ce3bdd832d init.after > > > > > > 8efc7864a5b535a9e336fa82e9d7f112f3d956c1 init.before > > > > > > > > > > > > It seems that something corrupts memory somewhere... > > > > > > > > > > To debug this a bit further: > > > > > > > > > > $ od -a -t x1 -v init.after > init.after.dump > > > > > $ od -a -t x1 -v init.before > init.before.dump > > > > > $ diff -u init.before.dump init.after.dump | less > > > > > > > > > > -0011340 nul nul nul e9 f0 fe ff ff ff % < soh enq bs h > > > > > 80 > > > > > - 00 00 00 e9 f0 fe ff ff ff 25 3c 01 05 08 > > > > > 68 80 > > > > > +0010000 y ack nul nul y ack nul nul y ack nul nul y ack nul > > > > > nul > > > > > + 79 06 00 00 79 06 00 00 79 06 00 00 79 06 > > > > > 00 00 > > > > > +0010020 y ack nul nul y ack nul nul y ack nul nul y ack nul > > > > > nul > > > > > + 79 06 00 00 79 06 00 00 79 06 00 00 79 06 > > > > > 00 00 > > > > > +0011340 y ack nul nul y ack nul nul ff % < soh enq bs h > > > > > 80 > > > > > + 79 06 00 00 79 06 00 00 ff 25 3c 01 05 08 > > > > > 68 80 > > > > > > > > > > The file at offset 0010000 - 0011348 is overwritten with the byte > > > > > pattern 79 06 00 00. > > > > > > > > > > Do you see anything in the logs or is this a silent corruption? Did > > > > > you see this corruption with 2.6.19 or 2.6.22-rc3? > > > > > > > > > > > > > I recompiled 2.6.22-rc3 and booted it with slub_debug. Now I can't oops > > > > the kernel, but ./slab_info -v gives me a warning: > > > > > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > neofb: no support for 32bpp > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1024x768) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1152x864) larger than the LCD panel (800x600) > > > > Mode (1024x1024) larger than the LCD panel (800x600) > > > > Mode (1024x1024) larger than the LCD panel (800x600) > > > > Mode (1024x1024) larger than the LCD panel (800x600) > > > > Mode (1024x1024) larger than the LCD panel (800x600) > > > > Mode (1280x1024) larger than the LCD panel (800x600) > > > > Mode (1280x1024) larger than the LCD panel (800x600) > > > > Mode (1280x1024) larger than the LCD panel (800x600) > > > > Mode (1280x1024) larger than the LCD panel (800x600) > > > > *** SLUB kmalloc-1024: Redzone [EMAIL PROTECTED] slab 0xc10217c0 > > > > offset=2144 flags=0x80004082 inuse=7 freelist=0x00000000 > > > > Bytes b4 0xc10be850: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a > > > > ........ZZZZZZZZ > > > > Object 0xc10be860: 00 00 00 00 00 20 00 00 20 03 00 00 58 02 00 00 > > > > ............X... > > > > Object 0xc10be870: 20 03 00 00 58 02 00 00 00 00 00 00 00 00 00 00 > > > > ....X........... > > > > Object 0xc10be880: 10 00 00 00 00 00 00 00 0b 00 00 00 05 00 00 00 > > > > ................ > > > > Object 0xc10be890: 00 00 00 00 05 00 00 00 06 00 00 00 00 00 00 00 > > > > ................ > > > > Object 0xc10be8a0: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 > > > > ................ > > > > Object 0xc10be8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > > > ................ > > > > Object 0xc10be8c0: ff ff ff ff ff ff ff ff 00 00 00 00 a8 61 00 00 > > > > ÿÿÿÿÿÿÿÿ....¨a.. > > > > Object 0xc10be8d0: 58 00 00 00 28 00 00 00 17 00 00 00 01 00 00 00 > > > > X...(........... > > > > Redzone 0xc10bec60: 4d 6b 00 00 > > > > Mk.. > > > > FreePointer 0xc10bec64 -> 0x00006b4d > > > > Last alloc: 0x6b4d jiffies_ago=4294923792 cpu=27469 pid=27469 > > > > Last free : 0x6b4d jiffies_ago=4294923792 cpu=27469 pid=27469 > > > > Filler 0xc10bec88: 4d 6b 00 00 4d 6b 00 00 > > > > Mk..Mk.. > > > > [<c013f717>] check_object+0x64/0x23d > > > > [<c0141371>] validate_slab+0xff/0x12a > > > > [<c01413aa>] validate_slab_slab+0xe/0x51 > > > > [<c0141488>] validate_store+0x9b/0xe8 > > > > [<c01343d1>] __handle_mm_fault+0x370/0x68b > > > > [<c01413ed>] validate_store+0x0/0xe8 > > > > [<c013eaa6>] slab_attr_store+0x1e/0x22 > > > > [<c016e470>] sysfs_write_file+0xad/0xd6 > > > > [<c016e3c3>] sysfs_write_file+0x0/0xd6 > > > > [<c0143341>] vfs_write+0x8a/0x10c > > > > [<c01437d7>] sys_write+0x41/0x67 > > > > [<c01022c2>] sysenter_past_esp+0x5f/0x85 > > > > ======================= > > > > @@@ SLUB kmalloc-1024: Restoring redzone (0xcc) from > > > > 0xc10bec60-0xc10bec63 > > > > > > > > > > So something did an overwrite of a 1024-byte kmalloc. Unfortunately that > > > overwrite seems to have trashed our last-alloc info, so we don't know who > > > allocated that memory. Darn. > > > > > > Does the problem go away if you disable CONFIG_SLUB and enable > > > CONFIG_SLAB? > > > > > > > > > > Hi, > > > > after some trial and error I found a simple way to trigger the > > corruption: > > > > [EMAIL PROTECTED] ~]# ./slabinfo -v > > [EMAIL PROTECTED] ~]# ./oops > > [EMAIL PROTECTED] ~]# ./slabinfo -v > > Whoa. Impressed. > > > *** SLUB kmalloc-1024: Redzone [EMAIL PROTECTED] slab 0xc10217c0 > > offset=2144 flags=0x80004082 inuse=7 freelist=0x00000000 > > Bytes b4 0xc10be850: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a > > ........ZZZZZZZZ > > Object 0xc10be860: 00 00 00 00 00 20 00 00 20 03 00 00 58 02 00 00 > > ............X... > > Object 0xc10be870: 20 03 00 00 58 02 00 00 00 00 00 00 00 00 00 00 > > ....X........... > > Object 0xc10be880: 18 00 00 00 00 00 00 00 10 00 00 00 08 00 00 00 > > ................ > > Object 0xc10be890: 00 00 00 00 08 00 00 00 08 00 00 00 00 00 00 00 > > ................ > > Object 0xc10be8a0: 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 > > ................ > > Object 0xc10be8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > ................ > > Object 0xc10be8c0: ff ff ff ff ff ff ff ff 00 00 00 00 a8 61 00 00 > > ÿÿÿÿÿÿÿÿ....¨a.. > > Object 0xc10be8d0: 58 00 00 00 28 00 00 00 17 00 00 00 01 00 00 00 > > X...(........... > > Redzone 0xc10bec60: 6b 6b 6b 00 > > kkk. > > FreePointer 0xc10bec64 -> 0x006b6b6b > > Last alloc: 0x6b6b6b jiffies_ago=4287907122 cpu=7039851 pid=7039851 > > Last free : 0x6b6b6b jiffies_ago=4287907122 cpu=7039851 pid=7039851 > > Filler 0xc10bec88: 6b 6b 6b 00 6b 6b 6b 00 > > kkk.kkk. > > [<c013f717>] check_object+0x64/0x23d > > [<c0141371>] validate_slab+0xff/0x12a > > [<c01413aa>] validate_slab_slab+0xe/0x51 > > [<c0141488>] validate_store+0x9b/0xe8 > > [<c01343d1>] __handle_mm_fault+0x370/0x68b > > [<c01413ed>] validate_store+0x0/0xe8 > > [<c013eaa6>] slab_attr_store+0x1e/0x22 > > [<c016e470>] sysfs_write_file+0xad/0xd6 > > [<c016e3c3>] sysfs_write_file+0x0/0xd6 > > [<c0143341>] vfs_write+0x8a/0x10c > > [<c01437d7>] sys_write+0x41/0x67 > > [<c01022c2>] sysenter_past_esp+0x5f/0x85 > > ======================= > > @@@ SLUB kmalloc-1024: Restoring redzone (0xcc) from 0xc10bec60-0xc10bec63 > > > > [EMAIL PROTECTED] ~]# cat oops.c > > #include <sys/ioctl.h> > > #include <stdio.h> > > #include <linux/fb.h> > > #include <fcntl.h> > > > > int main(void) > > { > > struct fb_var_screeninfo fbinfo; > > int fd = open("/dev/fb0", O_RDWR); > > if (fd < 0) > > return 1; > > > > /* Get screeninfo */ > > ioctl(fd, FBIOGET_VSCREENINFO, &fbinfo); > > > > /* Change depth from current 16 to 24. */ > > fbinfo.bits_per_pixel = 24; > > ioctl(fd, FBIOPUT_VSCREENINFO, &fbinfo); > > > > return 0; > > } > > > > So this seems to be a framebuffer error. > > > > cc's added ;) > > Thanks. > > Tony, this is with SLUB enabled, which might be detecting a > hitherto-undetected bug. > > Config is at http://userweb.kernel.org/~akpm/config-tero.txt
Two suspicious things for me: 1) --- a/drivers/video/neofb.c +++ b/drivers/video/neofb.c @@ -1295,7 +1295,7 @@ static int neofb_setcolreg(u_int regno, outb(blue >> 10, 0x3c9); break; case 16: - ((u32 *) fb->pseudo_palette)[regno] = + ((u16 *) fb->pseudo_palette)[regno] = ((red & 0xf800)) | ((green & 0xfc00) >> 5) | ((blue & 0xf800) >> 11); break; 2) palette in neofb_par is "u32 palette[16];" which is 4x16 = 64 bytes. struct fb_info::pseudo_palette is assigned to it in neo_alloc_fb_info(). Yet, we check at the beginning of neofb_setcolreg() for color map length which neofb advertises as 256 which seems too many. printk()s showing "regno" at the beginning of neofb_setcolreg() welcome. Alexey, who only knows how to spell framebuffer and a bit. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/