Re: Oops while running fs_racer test on a POWER6 box against latest git
On 2010-07-09 08:57, divya wrote: > On Friday 02 July 2010 12:16 PM, divya wrote: >> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote: >>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote: While running fs_racer test from LTP on a POWER6 box against latest git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following warning followed by multiple oops. >>> I created a Bugzilla entry at >>> https://bugzilla.kernel.org/show_bug.cgi?id=16324 >>> for your bug report, please add your address to the CC list in there, >>> thanks! >>> >>> >> Here I find a cleaner back trace while running fs_racer test from LTP >> on a POWER6 >> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242) >> >> Badness at kernel/mutex-debug.c:64 >> BUG: key (null) not in .data! >> NIP: c00be9e8 LR: c00be9cc CTR: >> REGS: c0010bb176f0 TRAP: 0700 Not tainted >> (2.6.35-rc3-git5-autotest) >> BUG: key 01d8 not in .data! >> BUG: key 01e0 not in .data! >> BUG: key 01e8 not in .data! >> MSR: 80029032 >> Unable to handle kernel paging request for data at address 0x0028 >> Faulting instruction address: 0xc03ad0ec >> Oops: Kernel access of bad area, sig: 11 [#1] >> SMP NR_CPUS=1024 NUMA pSeries >> last sysfs file: >> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map >> Page fault in user mode with in_atomic() = 1 mm = c0010943e600 >> Modules linked in: >> NIP = fff9e98fc40 MSR = 80004001d032 >> ipv6 fuse loop >> Unable to handle kernel paging request for unknown fault >> dm_mod >> Faulting instruction address: 0xc008d0f4 >> sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic >> scsi_transport_srp scsi_tgt scsi_mod >> NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0 >> REGS: c00109b4f610 TRAP: 0300 Not tainted >> (2.6.35-rc3-git5-autotest) >> MSR: 80009032 CR: 88004484 XER: 0001 >> DAR: 0028, DSISR: 4001 >> TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19 >> GPR00: 8013 c00109b4f890 c0d3d798 >> 0028 >> GPR04: >> 0001 >> GPR08: 0028 c0189f2c >> c00109a98600 >> 
GPR12: 24004424 cf602f80 41ff 0001
>> GPR16: 0002 c0010d8304c0 c00109b4fb44
>> GPR20: c0010df77908 f000 0001 41ff
>> GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
>> GPR28: 0028 0040 c0ca7b38 c0189f2c
>> NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
>> LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
>> Call Trace:
>> [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
>> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
>> [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
>> [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
>> [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
>> [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
>> [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
>> [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
>> Instruction dump:
>> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
>> 3800 7c691b78 980d0214 800d0008<7d601829> 2c0b 40c20010 7c00192d
>> Oops: Weird page fault, sig: 11 [#2]
>>
>> Please let me know if this back trace would help in analyzing further.
>> Meanwhile I shall do a git bisect and send the inputs.
>>
>> Thanks
>> Divya
>>
> Hi All,
>
> From the git bisect, seems like the commit
> 57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
> issue.

CC'ing Nick and Al.

--
Jens Axboe

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: Oops while running fs_racer test on a POWER6 box against latest git
On Fri, Jul 09, 2010 at 09:34:16AM +0200, Jens Axboe wrote: > On 2010-07-09 08:57, divya wrote: > > On Friday 02 July 2010 12:16 PM, divya wrote: > >> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote: > >>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote: > While running fs_racer test from LTP on a POWER6 box against latest > git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the > following > warning followed by multiple oops. > > >>> I created a Bugzilla entry at > >>> https://bugzilla.kernel.org/show_bug.cgi?id=16324 > >>> for your bug report, please add your address to the CC list in there, > >>> thanks! > >>> > >>> > >> Here I find a cleaner back trace while running fs_racer test from LTP > >> on a POWER6 > >> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242) > >> > >> Badness at kernel/mutex-debug.c:64 > >> BUG: key (null) not in .data! > >> NIP: c00be9e8 LR: c00be9cc CTR: > >> REGS: c0010bb176f0 TRAP: 0700 Not tainted > >> (2.6.35-rc3-git5-autotest) > >> BUG: key 01d8 not in .data! > >> BUG: key 01e0 not in .data! > >> BUG: key 01e8 not in .data! 
> >> MSR: 80029032 > >> Unable to handle kernel paging request for data at address 0x0028 > >> Faulting instruction address: 0xc03ad0ec > >> Oops: Kernel access of bad area, sig: 11 [#1] > >> SMP NR_CPUS=1024 NUMA pSeries > >> last sysfs file: > >> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map > >> Page fault in user mode with in_atomic() = 1 mm = c0010943e600 > >> Modules linked in: > >> NIP = fff9e98fc40 MSR = 80004001d032 > >> ipv6 fuse loop > >> Unable to handle kernel paging request for unknown fault > >> dm_mod > >> Faulting instruction address: 0xc008d0f4 > >> sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic > >> scsi_transport_srp scsi_tgt scsi_mod > >> NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0 > >> REGS: c00109b4f610 TRAP: 0300 Not tainted > >> (2.6.35-rc3-git5-autotest) > >> MSR: 80009032 CR: 88004484 XER: 0001 > >> DAR: 0028, DSISR: 4001 > >> TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19 > >> GPR00: 8013 c00109b4f890 c0d3d798 > >> 0028 > >> GPR04: > >> 0001 > >> GPR08: 0028 c0189f2c >> c00109a98600 > >> GPR12: 24004424 cf602f80 41ff > >> 0001 > >> GPR16: 0002 c0010d8304c0 c00109b4fb44 > >> > >> GPR20: c0010df77908 f000 0001 > >> 41ff > >> GPR24: c0010df77758 c00109fa1800 c0010df77908 > >> c000ff236600 > >> GPR28: 0028 0040 c0ca7b38 > >> c0189f2c > >> NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48 > >> LR [c064c3b0] ._raw_spin_lock+0x50/0xa4 > >> Call Trace: > >> [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 > >> (unreliable) > >> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4 > >> [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70 > >> [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438 > >> [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160 > >> [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114 > >> [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30 > >> [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40 > >> Instruction dump: > >> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 > >> 3800 7c691b78 
980d0214 800d0008<7d601829> 2c0b 40c20010 7c00192d
> >> Oops: Weird page fault, sig: 11 [#2]
> >>
> >> Please let me know if this back trace would help in analyzing further.
> >> Meanwhile I shall do a git bisect and send the inputs.

The call stack for Badness at kernel/mutex-debug.c:64 (or whatever
explodes first) would be handy. This one still seems jumbled.

What spinlock is in the trace? inode_lock? That would indicate some
random corruption or breakage in the lock debugging.

> >>
> >> Thanks
> >> Divya
> >>
> > Hi All,
> >
> > From the git bisect, seems like the commit
> > 57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
> > issue.

Call me blind but I can't see the problem. Are you sure this commit
breaks it?
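Since the dump quoted above is interleaved with output from other CPUs, one way to get a readable list of frames is to extract only the lines that look like call-trace entries. The following is a minimal Python sketch under that assumption; the regular expression, the `Frame` tuple and the `extract_frames` helper are made up for illustration and are not part of any kernel tooling.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    """One call-trace entry: symbol name, return address, offset, function size."""
    symbol: str
    address: int
    offset: int
    size: int


# Matches entries such as "[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4".
# The leading dot on ppc64 function symbols is optional and is dropped.
FRAME_RE = re.compile(
    r"\[([0-9a-f]+)\]\s+\[([0-9a-f]+)\]\s+\.?([A-Za-z_]\w*)"
    r"\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def extract_frames(text: str) -> List[Frame]:
    """Return every symbolized call-trace frame found in text, in order."""
    frames = []
    for match in FRAME_RE.finditer(text):
        _stack, addr, symbol, offset, size = match.groups()
        frames.append(Frame(symbol, int(addr, 16), int(offset, 16), int(size, 16)))
    return frames


if __name__ == "__main__":
    quoted = (
        "> >> [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 > >> (unreliable) "
        "> >> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4"
    )
    for frame in extract_frames(quoted):
        print(f"{frame.symbol}+{frame.offset:#x}/{frame.size:#x}")
```

Entries without a symbol (a bare address marked unreliable, for example) are skipped, and quote markers between frames do not matter because the pattern is searched anywhere in the text.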
Re: Oops while running fs_racer test on a POWER6 box against latest git
On Friday 02 July 2010 12:16 PM, divya wrote: On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote: On środa, 30 czerwca 2010 o 13:22:27 divya wrote: While running fs_racer test from LTP on a POWER6 box against latest git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following warning followed by multiple oops. I created a Bugzilla entry at https://bugzilla.kernel.org/show_bug.cgi?id=16324 for your bug report, please add your address to the CC list in there, thanks! Here I find a cleaner back trace while running fs_racer test from LTP on a POWER6 box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242) Badness at kernel/mutex-debug.c:64 BUG: key (null) not in .data! NIP: c00be9e8 LR: c00be9cc CTR: REGS: c0010bb176f0 TRAP: 0700 Not tainted (2.6.35-rc3-git5-autotest) BUG: key 01d8 not in .data! BUG: key 01e0 not in .data! BUG: key 01e8 not in .data! MSR: 80029032 Unable to handle kernel paging request for data at address 0x0028 Faulting instruction address: 0xc03ad0ec Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=1024 NUMA pSeries last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map Page fault in user mode with in_atomic() = 1 mm = c0010943e600 Modules linked in: NIP = fff9e98fc40 MSR = 80004001d032 ipv6 fuse loop Unable to handle kernel paging request for unknown fault dm_mod Faulting instruction address: 0xc008d0f4 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0 REGS: c00109b4f610 TRAP: 0300 Not tainted (2.6.35-rc3-git5-autotest) MSR: 80009032 CR: 88004484 XER: 0001 DAR: 0028, DSISR: 4001 TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19 GPR00: 8013 c00109b4f890 c0d3d798 0028 GPR04: 0001 GPR08: 0028 c0189f2c c00109a98600 GPR12: 24004424 cf602f80 41ff 0001 GPR16: 0002 c0010d8304c0 c00109b4fb44 GPR20: c0010df77908 f000 0001 41ff GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600 GPR28: 0028 0040 
c0ca7b38 c0189f2c
NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d0008<7d601829> 2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Please let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya

Hi All,

From the git bisect, it seems that commit
57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above issue.

Thanks
Divya
Re: Oops while running fs_racer test on a POWER6 box against latest git
On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote: On środa, 30 czerwca 2010 o 13:22:27 divya wrote: While running fs_racer test from LTP on a POWER6 box against latest git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following warning followed by multiple oops. I created a Bugzilla entry at https://bugzilla.kernel.org/show_bug.cgi?id=16324 for your bug report, please add your address to the CC list in there, thanks! Here I find a cleaner back trace while running fs_racer test from LTP on a POWER6 box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242) Badness at kernel/mutex-debug.c:64 BUG: key (null) not in .data! NIP: c00be9e8 LR: c00be9cc CTR: REGS: c0010bb176f0 TRAP: 0700 Not tainted (2.6.35-rc3-git5-autotest) BUG: key 01d8 not in .data! BUG: key 01e0 not in .data! BUG: key 01e8 not in .data! MSR: 80029032 Unable to handle kernel paging request for data at address 0x0028 Faulting instruction address: 0xc03ad0ec Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=1024 NUMA pSeries last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map Page fault in user mode with in_atomic() = 1 mm = c0010943e600 Modules linked in: NIP = fff9e98fc40 MSR = 80004001d032 ipv6 fuse loop Unable to handle kernel paging request for unknown fault dm_mod Faulting instruction address: 0xc008d0f4 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0 REGS: c00109b4f610 TRAP: 0300 Not tainted (2.6.35-rc3-git5-autotest) MSR: 80009032 CR: 88004484 XER: 0001 DAR: 0028, DSISR: 4001 TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19 GPR00: 8013 c00109b4f890 c0d3d798 0028 GPR04: 0001 GPR08: 0028 c0189f2c c00109a98600 GPR12: 24004424 cf602f80 41ff 0001 GPR16: 0002 c0010d8304c0 c00109b4fb44 GPR20: c0010df77908 f000 0001 41ff GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600 GPR28: 0028 0040 c0ca7b38 c0189f2c NIP [c03ad0ec] 
.do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d0008<7d601829> 2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Please let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya
Re: Oops while running fs_racer test on a POWER6 box against latest git
In message <20100701105907.gk22...@laptop> you wrote: > On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote: > > > While running fs_racer test from LTP on a POWER6 box against latest git(2 .6.3 > > 5-rc3-git4 - commitid 984bc9601f64fd) > > > came across the following warning followed by multiple oops. > > > > > > [ cut here ] > > > > > > Badness at kernel/mutex-debug.c:64 > > > NIP: c00be9e8 LR: c00be9cc CTR: > > > REGS: c0010be8f6f0 TRAP: 0700 Not tainted (2.6.35-rc3-git4-autotes t) > > > MSR: 80029032CR: 24224422 XER: 0012 > > > TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: > > 2 > > > GPR00: c0010be8f970 c0d3d798 000 1 > > > GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 000 0 > > > GPR08: c43042f0 c16534e8 017a c0c29a1 c > > > GPR12: 28228424 cf600500 c0010be8fc40 200 0 > > > GPR16: f000 c00109c73000 c0010be8fc30 0001044 2 > > > GPR20: 01b6 c0010dd1225 0 > > > GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd1221 0 > > > GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa7 0 > > > NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130 > > > LR [c00be9cc] .mutex_remove_waiter+0x88/0x130 > > > Call Trace: > > > [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable) > > > [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430 > > > Instruction dump: > > > e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018 > > > e93e8008 8009 2f80 409e0008<0fe0> e93e8000 8009 2f8 0 > > > Unable to handle kernel paging request for unknown fault > > > Faulting instruction address: 0xc008d0f4 > > > Oops: Kernel access of bad area, sig: 7 [#1] > > > SMP NR_CPUS=1024 NUMA > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > pSeries > > > last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_ma p > > > Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg > > > sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod > > > NIP: c008d0f4 LR: c008d0d0 CTR: > > > REGS: c0010978f900 TRAP: 0600 Tainted: 
GW(2.6.35-rc3-gi t4-a > > utotest) > > > MSR: 80009032 > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > > EE,ME,IR,DR>CR: 24022442 XER: 0012 > > > DAR: c0648f54, DSISR: 4001 > > > TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: > > 10 > > > GPR00: 4000 c0010978fb80 c0d3d798 000 1 > > > GPR04: c083539e c1610228 c54c688 0 > > > GPR08: 06a5 c0648f54 0007 049b000 0 > > > GPR12: cf601900 fff f > > > GPR16: 4b7dc520 c0010978fea 0 > > > GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd 0 > > > GPR24: 01200011 c0e1c0a8 c0648ed 4 > > > GPR28: c001096e4900 c0ca0458 c0010725d40 0 > > > NIP [c008d0f4] .copy_process+0x310/0xf40 > > > LR [c008d0d0] .copy_process+0x2ec/0xf40 > > > Call Trace: > > > [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliab le) > > > [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc > > > [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70 > > > [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc > > > Instruction dump: > > > 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080 > > > 78004800 6042 901f0014 38004000<7d6048a8> 7d6b0078 7d6049ad 40c2fff 4 > > > > > > Kernel version 2.6.34-rc3-git3 works fine. > > > > Should this read 2.6.35-rc3-git3? > > > > If so, there's only about 20 commits in: > > 5904b3b81d2516..984bc9601f64fd > > > > The likely fs related candidates are from Christoph and Nick Piggin > > (added to CC) > > > > No commits relating to POWER6 or PPC. > > Not sure what's happening here. The first warning looks like some mutex > corruption, but it doesn't have a stack trace (these are 2 seperate > dumps, right? ie. the copy_process stack doesn't relate to the mutex > warning?) So I don't have much idea. 
>
> If it is reproducible, can you try getting a better stack trace, or
> better
Re: Oops while running fs_racer test on a POWER6 box against latest git
On Wednesday, 30 June 2010 at 13:22:27 divya wrote:
> While running fs_racer test from LTP on a POWER6 box against latest
> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
> warning followed by multiple oops.
>

I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!

--
Maciej Rutecki
http://www.maciek.unixy.pl
Re: Oops while running fs_racer test on a POWER6 box against latest git
On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote: > > While running fs_racer test from LTP on a POWER6 box against latest > > git(2.6.3 > 5-rc3-git4 - commitid 984bc9601f64fd) > > came across the following warning followed by multiple oops. > > > > [ cut here ] > > > > Badness at kernel/mutex-debug.c:64 > > NIP: c00be9e8 LR: c00be9cc CTR: > > REGS: c0010be8f6f0 TRAP: 0700 Not tainted (2.6.35-rc3-git4-autotest) > > MSR: 80029032CR: 24224422 XER: 0012 > > TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 > > CPU: > 2 > > GPR00: c0010be8f970 c0d3d798 0001 > > GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 > > GPR08: c43042f0 c16534e8 017a c0c29a1c > > GPR12: 28228424 cf600500 c0010be8fc40 2000 > > GPR16: f000 c00109c73000 c0010be8fc30 00010442 > > GPR20: 01b6 c0010dd12250 > > GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210 > > GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70 > > NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130 > > LR [c00be9cc] .mutex_remove_waiter+0x88/0x130 > > Call Trace: > > [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable) > > [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430 > > Instruction dump: > > e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018 > > e93e8008 8009 2f80 409e0008<0fe0> e93e8000 8009 2f80 > > Unable to handle kernel paging request for unknown fault > > Faulting instruction address: 0xc008d0f4 > > Oops: Kernel access of bad area, sig: 7 [#1] > > SMP NR_CPUS=1024 NUMA > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > pSeries > > last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map > > Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg > > sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod > > NIP: c008d0f4 LR: c008d0d0 CTR: > > REGS: c0010978f900 TRAP: 0600 Tainted: GW > > (2.6.35-rc3-git4-a > utotest) > > MSR: 80009032 > > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > > Unrecoverable FP Unavailable 
Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > EE,ME,IR,DR>CR: 24022442 XER: 0012
> > DAR: c0648f54, DSISR: 4001
> > TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
> > GPR00: 4000 c0010978fb80 c0d3d798 0001
> > GPR04: c083539e c1610228 c54c6880
> > GPR08: 06a5 c0648f54 0007 049b
> > GPR12: cf601900
> > GPR16: 4b7dc520 c0010978fea0
> > GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
> > GPR24: 01200011 c0e1c0a8 c0648ed4
> > GPR28: c001096e4900 c0ca0458 c0010725d400
> > NIP [c008d0f4] .copy_process+0x310/0xf40
> > LR [c008d0d0] .copy_process+0x2ec/0xf40
> > Call Trace:
> > [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
> > [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> > [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> > [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> > Instruction dump:
> > 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
> > 78004800 6042 901f0014 38004000<7d6048a8> 7d6b0078 7d6049ad 40c2fff4
> >
> > Kernel version 2.6.34-rc3-git3 works fine.
>
> Should this read 2.6.35-rc3-git3?
>
> If so, there are only about 20 commits in:
> 5904b3b81d2516..984bc9601f64fd
>
> The likely fs related candidates are from Christoph and Nick Piggin
> (added to CC)
>
> No commits relating to POWER6 or PPC.

Not sure what's happening here. The first warning looks like some mutex
corruption, but it doesn't have a stack trace (these are 2 separate
dumps, right? i.e. the copy_process stack doesn't relate to the mutex
warning?) So I don't have much idea.

If it is reproducible, can you try getting a better stack trace, or
better yet, even bisecting if there is just a small window?

Thanks,
Nick
Re: Oops while running fs_racer test on a POWER6 box against latest git
> While running fs_racer test from LTP on a POWER6 box against latest git(2.6.3 5-rc3-git4 - commitid 984bc9601f64fd) > came across the following warning followed by multiple oops. > > [ cut here ] > > Badness at kernel/mutex-debug.c:64 > NIP: c00be9e8 LR: c00be9cc CTR: > REGS: c0010be8f6f0 TRAP: 0700 Not tainted (2.6.35-rc3-git4-autotest) > MSR: 80029032CR: 24224422 XER: 0012 > TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2 > GPR00: c0010be8f970 c0d3d798 0001 > GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 > GPR08: c43042f0 c16534e8 017a c0c29a1c > GPR12: 28228424 cf600500 c0010be8fc40 2000 > GPR16: f000 c00109c73000 c0010be8fc30 00010442 > GPR20: 01b6 c0010dd12250 > GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210 > GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70 > NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130 > LR [c00be9cc] .mutex_remove_waiter+0x88/0x130 > Call Trace: > [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable) > [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430 > Instruction dump: > e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018 > e93e8008 8009 2f80 409e0008<0fe0> e93e8000 8009 2f80 > Unable to handle kernel paging request for unknown fault > Faulting instruction address: 0xc008d0f4 > Oops: Kernel access of bad area, sig: 7 [#1] > SMP NR_CPUS=1024 NUMA > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > pSeries > last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map > Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg > sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod > NIP: c008d0f4 LR: c008d0d0 CTR: > REGS: c0010978f900 TRAP: 0600 Tainted: GW(2.6.35-rc3-git4-a utotest) > MSR: 80009032 > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > Unrecoverable FP Unavailable Exception 800 at c0648ed4 > Unrecoverable FP 
Unavailable Exception 800 at c0648ed4
> EE,ME,IR,DR>CR: 24022442 XER: 0012
> DAR: c0648f54, DSISR: 4001
> TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
> GPR00: 4000 c0010978fb80 c0d3d798 0001
> GPR04: c083539e c1610228 c54c6880
> GPR08: 06a5 c0648f54 0007 049b
> GPR12: cf601900
> GPR16: 4b7dc520 c0010978fea0
> GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
> GPR24: 01200011 c0e1c0a8 c0648ed4
> GPR28: c001096e4900 c0ca0458 c0010725d400
> NIP [c008d0f4] .copy_process+0x310/0xf40
> LR [c008d0d0] .copy_process+0x2ec/0xf40
> Call Trace:
> [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
> [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> Instruction dump:
> 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
> 78004800 6042 901f0014 38004000<7d6048a8> 7d6b0078 7d6049ad 40c2fff4
>
> Kernel version 2.6.34-rc3-git3 works fine.

Should this read 2.6.35-rc3-git3?

If so, there are only about 20 commits in:
5904b3b81d2516..984bc9601f64fd

The likely fs related candidates are from Christoph and Nick Piggin
(added to CC)

No commits relating to POWER6 or PPC.

Mikey
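With only about 20 commits in the suspect range, a bisect should need just a handful of test builds. A rough Python estimate follows, assuming each test halves the remaining candidates; `bisect_steps` is a hypothetical helper name, and a real git bisect over history with merges may differ by a step or so.

```python
import math


def bisect_steps(n_commits: int) -> int:
    """Estimate how many test builds a binary search over n_commits needs."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    # Each test halves the remaining candidates, so ceil(log2(n)) rounds suffice.
    return math.ceil(math.log2(n_commits))


if __name__ == "__main__":
    # Roughly the size of the 5904b3b81d2516..984bc9601f64fd window.
    print(bisect_steps(20))
```

For a window this small the cost is dominated by the kernel build and the fs_racer run per step, not by the number of steps.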
Oops while running fs_racer test on a POWER6 box against latest git
While running fs_racer test from LTP on a POWER6 box against latest git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following warning followed by multiple oops. [ cut here ] Badness at kernel/mutex-debug.c:64 NIP: c00be9e8 LR: c00be9cc CTR: REGS: c0010be8f6f0 TRAP: 0700 Not tainted (2.6.35-rc3-git4-autotest) MSR: 80029032CR: 24224422 XER: 0012 TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2 GPR00: c0010be8f970 c0d3d798 0001 GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 GPR08: c43042f0 c16534e8 017a c0c29a1c GPR12: 28228424 cf600500 c0010be8fc40 2000 GPR16: f000 c00109c73000 c0010be8fc30 00010442 GPR20: 01b6 c0010dd12250 GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210 GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70 NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130 LR [c00be9cc] .mutex_remove_waiter+0x88/0x130 Call Trace: [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable) [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430 Instruction dump: e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018 e93e8008 8009 2f80 409e0008<0fe0> e93e8000 8009 2f80 Unable to handle kernel paging request for unknown fault Faulting instruction address: 0xc008d0f4 Oops: Kernel access of bad area, sig: 7 [#1] SMP NR_CPUS=1024 NUMA Unrecoverable FP Unavailable Exception 800 at c0648ed4 pSeries last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod NIP: c008d0f4 LR: c008d0d0 CTR: REGS: c0010978f900 TRAP: 0600 Tainted: GW (2.6.35-rc3-git4-autotest) MSR: 80009032 Unrecoverable FP Unavailable Exception 800 at c0648ed4 Unrecoverable FP Unavailable Exception 800 at c0648ed4 Unrecoverable FP Unavailable Exception 800 at c0648ed4 Unrecoverable FP Unavailable Exception 800 at c0648ed4 Unrecoverable FP Unavailable Exception 800 at c0648ed4 EE,ME,IR,DR>CR: 24022442 XER: 0012 DAR: c0648f54, DSISR: 
4001
TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
GPR00: 4000 c0010978fb80 c0d3d798 0001
GPR04: c083539e c1610228 c54c6880
GPR08: 06a5 c0648f54 0007 049b
GPR12: cf601900
GPR16: 4b7dc520 c0010978fea0
GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
GPR24: 01200011 c0e1c0a8 c0648ed4
GPR28: c001096e4900 c0ca0458 c0010725d400
NIP [c008d0f4] .copy_process+0x310/0xf40
LR [c008d0d0] .copy_process+0x2ec/0xf40
Call Trace:
[c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
[c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
[c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
[c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
Instruction dump:
419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
78004800 6042 901f0014 38004000<7d6048a8> 7d6b0078 7d6049ad 40c2fff4

Kernel version 2.6.34-rc3-git3 works fine.

Thanks
Divya

Using 007dfade bytes for initrd buffer
Please wait, loading kernel...
Allocated 0180 bytes for kernel @ 01e0
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded 007dfade @ 0360
OF stdout device is: /vdevice/v...@3000
Preparing to boot Linux version 2.6.35-rc3-git4-autotest (r...@p55alp2) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Wed Jun 30 08:47:11 IST 2010
Max number of cores passed to firmware: 0x0200
Calling ibm,client-architecture-support... not implemented
command line: root=/dev/sda5 IDENT=1277868480
memory layout at init:
memory_limit : (16 MB aligned)
alloc_bottom : 03de
alloc_top: 1000
alloc_top_hi : 0001f000
rmo_top : 1000
ram_top : 0001f000
instantiating rtas at 0x0f6a... done
boot cpu hw idx
starting cpu hw idx 0002... done
starting cpu hw idx 0004... done
starting cpu hw idx 0006... done
starting cpu hw idx 0008... done
starting cpu hw idx 000a... done
starting cpu hw idx 00