Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread Jens Axboe
On 2010-07-09 08:57, divya wrote:
> On Friday 02 July 2010 12:16 PM, divya wrote:
>> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
>>> On Wednesday, 30 June 2010 at 13:22:27, divya wrote:
>>>> While running fs_racer test from LTP on a POWER6 box against latest
>>>> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the
>>>> following warning followed by multiple oops.

>>> I created a Bugzilla entry at
>>> https://bugzilla.kernel.org/show_bug.cgi?id=16324
>>> for your bug report, please add your address to the CC list in there, 
>>> thanks!
>>>
>>>
>> Here I find a cleaner back trace while running fs_racer test from LTP 
>> on a POWER6
>> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
>>
>> Badness at kernel/mutex-debug.c:64
>> BUG: key (null) not in .data!
>> NIP: c00be9e8 LR: c00be9cc CTR: 
>> REGS: c0010bb176f0 TRAP: 0700   Not tainted  
>> (2.6.35-rc3-git5-autotest)
>> BUG: key 01d8 not in .data!
>> BUG: key 01e0 not in .data!
>> BUG: key 01e8 not in .data!
>> MSR: 80029032
>> Unable to handle kernel paging request for data at address 0x0028
>> Faulting instruction address: 0xc03ad0ec
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> SMP NR_CPUS=1024 NUMA pSeries
>> last sysfs file: 
>> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
>> Page fault in user mode with in_atomic() = 1 mm = c0010943e600
>> Modules linked in:
>> NIP = fff9e98fc40  MSR = 80004001d032
>>  ipv6 fuse loop
>> Unable to handle kernel paging request for unknown fault
>>  dm_mod
>> Faulting instruction address: 0xc008d0f4
>>  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
>> scsi_transport_srp scsi_tgt scsi_mod
>> NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
>> REGS: c00109b4f610 TRAP: 0300   Not tainted  
>> (2.6.35-rc3-git5-autotest)
>> MSR: 80009032   CR: 88004484  XER: 0001
>> DAR: 0028, DSISR: 4001
>> TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
>> GPR00: 8013 c00109b4f890 c0d3d798 0028
>> GPR04:    0001
>> GPR08:  0028 c0189f2c c00109a98600
>> GPR12: 24004424 cf602f80 41ff 0001
>> GPR16: 0002 c0010d8304c0 c00109b4fb44
>> GPR20: c0010df77908 f000 0001 41ff
>> GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
>> GPR28: 0028 0040 c0ca7b38 c0189f2c
>> NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
>> LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
>> Call Trace:
>> [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
>> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
>> [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
>> [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
>> [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
>> [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
>> [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
>> [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
>> Instruction dump:
>> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
>> 3800 7c691b78 980d0214 800d0008<7d601829>  2c0b 40c20010 7c00192d
>> Oops: Weird page fault, sig: 11 [#2]
>>
>> Pls let me know if this back trace would help in analyzing further.
>> Meanwhile I shall do a git bisect and send the inputs.
>>
>> Thanks
>> Divya
>>
>>
>>
> Hi All,
> 
>  From the git bisect, it seems that commit
>  57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
>  issue.

CC'ing Nick and Al.

-- 
Jens Axboe

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread Nick Piggin
On Fri, Jul 09, 2010 at 09:34:16AM +0200, Jens Axboe wrote:
> On 2010-07-09 08:57, divya wrote:
> > On Friday 02 July 2010 12:16 PM, divya wrote:
> >> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
> >>> On Wednesday, 30 June 2010 at 13:22:27, divya wrote:
> >>>> While running fs_racer test from LTP on a POWER6 box against latest
> >>>> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the
> >>>> following warning followed by multiple oops.
> 
> >>> I created a Bugzilla entry at
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=16324
> >>> for your bug report, please add your address to the CC list in there, 
> >>> thanks!
> >>>
> >>>
> >> Here I find a cleaner back trace while running fs_racer test from LTP 
> >> on a POWER6
> >> box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
> >>
> >> Badness at kernel/mutex-debug.c:64
> >> BUG: key (null) not in .data!
> >> NIP: c00be9e8 LR: c00be9cc CTR: 
> >> REGS: c0010bb176f0 TRAP: 0700   Not tainted  
> >> (2.6.35-rc3-git5-autotest)
> >> BUG: key 01d8 not in .data!
> >> BUG: key 01e0 not in .data!
> >> BUG: key 01e8 not in .data!
> >> MSR: 80029032
> >> Unable to handle kernel paging request for data at address 0x0028
> >> Faulting instruction address: 0xc03ad0ec
> >> Oops: Kernel access of bad area, sig: 11 [#1]
> >> SMP NR_CPUS=1024 NUMA pSeries
> >> last sysfs file: 
> >> /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> >> Page fault in user mode with in_atomic() = 1 mm = c0010943e600
> >> Modules linked in:
> >> NIP = fff9e98fc40  MSR = 80004001d032
> >>  ipv6 fuse loop
> >> Unable to handle kernel paging request for unknown fault
> >>  dm_mod
> >> Faulting instruction address: 0xc008d0f4
> >>  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
> >> scsi_transport_srp scsi_tgt scsi_mod
> >> NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
> >> REGS: c00109b4f610 TRAP: 0300   Not tainted  
> >> (2.6.35-rc3-git5-autotest)
> >> MSR: 80009032   CR: 88004484  XER: 0001
> >> DAR: 0028, DSISR: 4001
> >> TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
> >> GPR00: 8013 c00109b4f890 c0d3d798 0028
> >> GPR04:    0001
> >> GPR08:  0028 c0189f2c c00109a98600
> >> GPR12: 24004424 cf602f80 41ff 0001
> >> GPR16: 0002 c0010d8304c0 c00109b4fb44
> >> GPR20: c0010df77908 f000 0001 41ff
> >> GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
> >> GPR28: 0028 0040 c0ca7b38 c0189f2c
> >> NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
> >> LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
> >> Call Trace:
> >> [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
> >> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
> >> [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
> >> [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
> >> [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
> >> [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
> >> [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
> >> [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
> >> Instruction dump:
> >> eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
> >> 3800 7c691b78 980d0214 800d0008<7d601829>  2c0b 40c20010 7c00192d
> >> Oops: Weird page fault, sig: 11 [#2]
> >>
> >> Pls let me know if this back trace would help in analyzing further.
> >> Meanwhile I shall do a git bisect and send the inputs.

The call stack for Badness at kernel/mutex-debug.c:64 (or whatever
explodes first) would be handy.  This one seems jumbled still. What
spinlock is in the trace? inode_lock?  That would indicate some random
corruption or breakage in the lock debugging.
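
Because the console output of several CPUs is interleaved in the log above, one way to make the call stack easier to read is to pull out only the lines that look like call-trace frames. The following is a minimal Python sketch under that assumption; the helper name extract_frames and the regular expression are illustrative and not part of any kernel tooling, and the pattern only targets the "[stack] [address] symbol+0xoff/0xsize" frame layout seen in this thread.

```python
import re

# Matches PowerPC call-trace frames as they appear in this thread, e.g.
#   [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
# Frames without a symbol (a bare address) are deliberately not matched.
FRAME_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]+)\]\s+\[(?P<ip>[0-9a-f]+)\]\s+"
    r"(?P<sym>\.?[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def extract_frames(log_text):
    """Return (symbol, offset) pairs for every call-trace frame found.

    re.search is used, so leading mail quote markers such as "> >> "
    are simply skipped.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            symbol = match.group("sym").lstrip(".")  # drop the ppc64 dot prefix
            frames.append((symbol, int(match.group("off"), 16)))
    return frames


if __name__ == "__main__":
    sample = (
        "> >> [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4\n"
        "> >> [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70\n"
    )
    for symbol, offset in extract_frames(sample):
        print(f"{symbol}+{offset:#x}")
```

This only filters frames; it cannot tell which CPU or which oops a frame belongs to, so the interleaving itself still has to be untangled by hand.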

> >>
> >> Thanks
> >> Divya
> >>
> >>
> >>
> > Hi All,
> > 
> >  From the git bisect, it seems that commit
> >  57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
> >  issue.

Call me blind but I can't see the problem. Are you sure this commit
breaks it?


Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread divya

On Friday 02 July 2010 12:16 PM, divya wrote:

On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:

On Wednesday, 30 June 2010 at 13:22:27, divya wrote:

While running fs_racer test from LTP on a POWER6 box against latest
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the
following warning followed by multiple oops.


I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, 
thanks!



Here I find a cleaner back trace while running fs_racer test from LTP 
on a POWER6

box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)

Badness at kernel/mutex-debug.c:64
BUG: key (null) not in .data!
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010bb176f0 TRAP: 0700   Not tainted  
(2.6.35-rc3-git5-autotest)

BUG: key 01d8 not in .data!
BUG: key 01e0 not in .data!
BUG: key 01e8 not in .data!
MSR: 80029032
Unable to handle kernel paging request for data at address 0x0028
Faulting instruction address: 0xc03ad0ec
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: 
/sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map

Page fault in user mode with in_atomic() = 1 mm = c0010943e600
Modules linked in:
NIP = fff9e98fc40  MSR = 80004001d032
 ipv6 fuse loop
Unable to handle kernel paging request for unknown fault
 dm_mod
Faulting instruction address: 0xc008d0f4
 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
scsi_transport_srp scsi_tgt scsi_mod

NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
REGS: c00109b4f610 TRAP: 0300   Not tainted  
(2.6.35-rc3-git5-autotest)

MSR: 80009032   CR: 88004484  XER: 0001
DAR: 0028, DSISR: 4001
TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
GPR00: 8013 c00109b4f890 c0d3d798 0028
GPR04:    0001
GPR08:  0028 c0189f2c c00109a98600
GPR12: 24004424 cf602f80 41ff 0001
GPR16: 0002 c0010d8304c0 c00109b4fb44
GPR20: c0010df77908 f000 0001 41ff
GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
GPR28: 0028 0040 c0ca7b38 c0189f2c

NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)

[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d0008<7d601829>  2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Pls let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya




Hi All,

From the git bisect, it seems that commit
57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above issue.

Thanks
Divya



Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-02 Thread divya

On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:

On Wednesday, 30 June 2010 at 13:22:27, divya wrote:
   

While running fs_racer test from LTP on a POWER6 box against latest
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
warning followed by multiple oops.

 

I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!


   

Here I find a cleaner back trace while running fs_racer test from LTP on a 
POWER6
box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)

Badness at kernel/mutex-debug.c:64
BUG: key (null) not in .data!
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010bb176f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git5-autotest)
BUG: key 01d8 not in .data!
BUG: key 01e0 not in .data!
BUG: key 01e8 not in .data!
MSR: 80029032
Unable to handle kernel paging request for data at address 0x0028
Faulting instruction address: 0xc03ad0ec
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Page fault in user mode with in_atomic() = 1 mm = c0010943e600
Modules linked in:
NIP = fff9e98fc40  MSR = 80004001d032
 ipv6 fuse loop
Unable to handle kernel paging request for unknown fault
 dm_mod
Faulting instruction address: 0xc008d0f4
 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp 
scsi_tgt scsi_mod
NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
REGS: c00109b4f610 TRAP: 0300   Not tainted  (2.6.35-rc3-git5-autotest)
MSR: 80009032   CR: 88004484  XER: 0001
DAR: 0028, DSISR: 4001
TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
GPR00: 8013 c00109b4f890 c0d3d798 0028
GPR04:    0001
GPR08:  0028 c0189f2c c00109a98600
GPR12: 24004424 cf602f80 41ff 0001
GPR16: 0002 c0010d8304c0 c00109b4fb44 
GPR20: c0010df77908 f000 0001 41ff
GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
GPR28: 0028 0040 c0ca7b38 c0189f2c
NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d0008<7d601829>  2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Pls let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya




Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Michael Neuling
In message <20100701105907.gk22...@laptop> you wrote:
> On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote:
> > > While running fs_racer test from LTP on a POWER6 box against latest
> > > git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
> > > came across the following warning followed by multiple oops.
> > > 
> > > [ cut here ]
> > > 
> > > Badness at kernel/mutex-debug.c:64
> > > NIP: c00be9e8 LR: c00be9cc CTR: 
> > > REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
> > > MSR: 80029032CR: 24224422  XER: 0012
> > > TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
> > > GPR00:  c0010be8f970 c0d3d798 0001
> > > GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 0000
> > > GPR08: c43042f0 c16534e8 017a c0c29a1c
> > > GPR12: 28228424 cf600500 c0010be8fc40 2000
> > > GPR16: f000 c00109c73000 c0010be8fc30 00010442
> > > GPR20:   01b6 c0010dd12250
> > > GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
> > > GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
> > > NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
> > > LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
> > > Call Trace:
> > > [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
> > > [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
> > > Instruction dump:
> > > e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
> > > e93e8008 8009 2f80 409e0008<0fe0>   e93e8000 8009 2f80
> > > Unable to handle kernel paging request for unknown fault
> > > Faulting instruction address: 0xc008d0f4
> > > Oops: Kernel access of bad area, sig: 7 [#1]
> > > SMP NR_CPUS=1024 NUMA
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > pSeries
> > > last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> > > Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
> > > sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
> > > NIP: c008d0f4 LR: c008d0d0 CTR: 
> > > REGS: c0010978f900 TRAP: 0600   Tainted: GW(2.6.35-rc3-git4-autotest)
> > > MSR: 80009032
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > > EE,ME,IR,DR>CR: 24022442  XER: 0012
> > > DAR: c0648f54, DSISR: 4001
> > > TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
> > > GPR00: 4000 c0010978fb80 c0d3d798 0001
> > > GPR04: c083539e c1610228  c54c6880
> > > GPR08: 06a5 c0648f54 0007 049b0000
> > > GPR12:  cf601900  ffff
> > > GPR16: 4b7dc520   c0010978fea0
> > > GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
> > > GPR24:  01200011 c0e1c0a8 c0648ed4
> > > GPR28:  c001096e4900 c0ca0458 c0010725d400
> > > NIP [c008d0f4] .copy_process+0x310/0xf40
> > > LR [c008d0d0] .copy_process+0x2ec/0xf40
> > > Call Trace:
> > > [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
> > > [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> > > [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> > > [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> > > Instruction dump:
> > > 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
> > > 78004800 6042 901f0014 38004000<7d6048a8>   7d6b0078 7d6049ad 40c2fff4
> > > 
> > > Kernel version 2.6.34-rc3-git3 works fine.
> > 
> > Should this read 2.6.35-rc3-git3?
> > 
> > If so, there's only about 20 commits in:
> > 5904b3b81d2516..984bc9601f64fd
> > 
> > The likely fs related candidates are from Christoph and Nick Piggin
> > (added to CC)
> > 
> > No commits relating to POWER6 or PPC.
> 
> Not sure what's happening here. The first warning looks like some mutex
> corruption, but it doesn't have a stack trace (these are 2 separate
> dumps, right? i.e. the copy_process stack doesn't relate to the mutex
> warning?), so I don't have much idea.
> 
> If it is reproducible, can you try getting a better stack trace, or
> better 

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Maciej Rutecki
On Wednesday, 30 June 2010 at 13:22:27, divya wrote:
> While running fs_racer test from LTP on a POWER6 box against latest
> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
> warning followed by multiple oops.
> 

I created a Bugzilla entry at 
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!


-- 
Maciej Rutecki
http://www.maciek.unixy.pl

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Nick Piggin
On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote:
> > While running fs_racer test from LTP on a POWER6 box against latest
> > git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
> > came across the following warning followed by multiple oops.
> > 
> > [ cut here ]
> > 
> > Badness at kernel/mutex-debug.c:64
> > NIP: c00be9e8 LR: c00be9cc CTR: 
> > REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
> > MSR: 80029032CR: 24224422  XER: 0012
> > TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
> > GPR00:  c0010be8f970 c0d3d798 0001
> > GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
> > GPR08: c43042f0 c16534e8 017a c0c29a1c
> > GPR12: 28228424 cf600500 c0010be8fc40 2000
> > GPR16: f000 c00109c73000 c0010be8fc30 00010442
> > GPR20:   01b6 c0010dd12250
> > GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
> > GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
> > NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
> > LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
> > Call Trace:
> > [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
> > [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
> > Instruction dump:
> > e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
> > e93e8008 8009 2f80 409e0008<0fe0>   e93e8000 8009 2f80
> > Unable to handle kernel paging request for unknown fault
> > Faulting instruction address: 0xc008d0f4
> > Oops: Kernel access of bad area, sig: 7 [#1]
> > SMP NR_CPUS=1024 NUMA
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > pSeries
> > last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> > Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
> > sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
> > NIP: c008d0f4 LR: c008d0d0 CTR: 
> > REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
> > MSR: 80009032
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > Unrecoverable FP Unavailable Exception 800 at c0648ed4
> > EE,ME,IR,DR>CR: 24022442  XER: 0012
> > DAR: c0648f54, DSISR: 4001
> > TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
> > GPR00: 4000 c0010978fb80 c0d3d798 0001
> > GPR04: c083539e c1610228  c54c6880
> > GPR08: 06a5 c0648f54 0007 049b
> > GPR12:  cf601900  
> > GPR16: 4b7dc520   c0010978fea0
> > GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
> > GPR24:  01200011 c0e1c0a8 c0648ed4
> > GPR28:  c001096e4900 c0ca0458 c0010725d400
> > NIP [c008d0f4] .copy_process+0x310/0xf40
> > LR [c008d0d0] .copy_process+0x2ec/0xf40
> > Call Trace:
> > [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
> > [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> > [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> > [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> > Instruction dump:
> > 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
> > 78004800 6042 901f0014 38004000<7d6048a8>   7d6b0078 7d6049ad 40c2fff4
> > 
> > Kernel version 2.6.34-rc3-git3 works fine.
> 
> Should this read 2.6.35-rc3-git3?
> 
> If so, there's only about 20 commits in:
> 5904b3b81d2516..984bc9601f64fd
> 
> The likely fs related candidates are from Christoph and Nick Piggin
> (added to CC)
> 
> No commits relating to POWER6 or PPC.

Not sure what's happening here. The first warning looks like some mutex
corruption, but it doesn't have a stack trace (these are 2 separate
dumps, right? i.e. the copy_process stack doesn't relate to the mutex
warning?), so I don't have much idea.

If it is reproducible, can you try getting a better stack trace, or
better yet, even bisecting if there is just a small window?
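
For a sense of how small such a window is: a bisection halves the candidate range on every build-and-test round, so the number of rounds grows only with the base-2 logarithm of the commit count. The Python sketch below is a rough estimate only; the helper name bisect_steps is hypothetical, and the figure of about 20 commits is taken from the range quoted earlier in this thread (5904b3b81d2516..984bc9601f64fd). The real count can differ by a step depending on the shape of the history.

```python
import math


def bisect_steps(n_commits):
    """Approximate number of test rounds needed to isolate one bad commit
    among n_commits candidates by repeated halving."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    return math.ceil(math.log2(n_commits))


if __name__ == "__main__":
    # "about 20 commits" is the estimate quoted in the thread.
    print(bisect_steps(20))
```

Under that estimate, roughly five kernel builds and fs_racer runs would be enough to narrow the range down to a single commit.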

Thanks,
Nick



Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-06-30 Thread Michael Neuling
> While running fs_racer test from LTP on a POWER6 box against latest
> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
> came across the following warning followed by multiple oops.
> 
> [ cut here ]
> 
> Badness at kernel/mutex-debug.c:64
> NIP: c00be9e8 LR: c00be9cc CTR: 
> REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
> MSR: 80029032CR: 24224422  XER: 0012
> TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
> GPR00:  c0010be8f970 c0d3d798 0001
> GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
> GPR08: c43042f0 c16534e8 017a c0c29a1c
> GPR12: 28228424 cf600500 c0010be8fc40 2000
> GPR16: f000 c00109c73000 c0010be8fc30 00010442
> GPR20:   01b6 c0010dd12250
> GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
> GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
> NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
> LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
> Call Trace:
> [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
> [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
> Instruction dump:
> e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
> e93e8008 8009 2f80 409e0008<0fe0>   e93e8000 8009 2f80
> Unable to handle kernel paging request for unknown fault
> Faulting instruction address: 0xc008d0f4
> Oops: Kernel access of bad area, sig: 7 [#1]
> SMP NR_CPUS=1024 NUMA
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> pSeries
> last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
> Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
> sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
> NIP: c008d0f4 LR: c008d0d0 CTR: 
> REGS: c0010978f900 TRAP: 0600   Tainted: GW(2.6.35-rc3-git4-autotest)
> MSR: 80009032
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> Unrecoverable FP Unavailable Exception 800 at c0648ed4
> EE,ME,IR,DR>CR: 24022442  XER: 0012
> DAR: c0648f54, DSISR: 4001
> TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
> GPR00: 4000 c0010978fb80 c0d3d798 0001
> GPR04: c083539e c1610228  c54c6880
> GPR08: 06a5 c0648f54 0007 049b
> GPR12:  cf601900  
> GPR16: 4b7dc520   c0010978fea0
> GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
> GPR24:  01200011 c0e1c0a8 c0648ed4
> GPR28:  c001096e4900 c0ca0458 c0010725d400
> NIP [c008d0f4] .copy_process+0x310/0xf40
> LR [c008d0d0] .copy_process+0x2ec/0xf40
> Call Trace:
> [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
> [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> Instruction dump:
> 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
> 78004800 6042 901f0014 38004000<7d6048a8>   7d6b0078 7d6049ad 40c2fff4
> 
> Kernel version 2.6.34-rc3-git3 works fine.

Should this read 2.6.35-rc3-git3?

If so, there's only about 20 commits in:
5904b3b81d2516..984bc9601f64fd

The likely fs related candidates are from Christoph and Nick Piggin
(added to CC)

No commits relating to POWER6 or PPC.

Mikey


Oops while running fs_racer test on a POWER6 box against latest git

2010-06-30 Thread divya

While running fs_racer test from LTP on a POWER6 box against latest 
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
came across the following warning followed by multiple oops.

[ cut here ]

Badness at kernel/mutex-debug.c:64
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
MSR: 80029032CR: 24224422  XER: 0012
TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
GPR00:  c0010be8f970 c0d3d798 0001
GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
GPR08: c43042f0 c16534e8 017a c0c29a1c
GPR12: 28228424 cf600500 c0010be8fc40 2000
GPR16: f000 c00109c73000 c0010be8fc30 00010442
GPR20:   01b6 c0010dd12250
GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
Call Trace:
[c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
[c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
Instruction dump:
e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
e93e8008 8009 2f80 409e0008<0fe0>   e93e8000 8009 2f80
Unable to handle kernel paging request for unknown fault
Faulting instruction address: 0xc008d0f4
Oops: Kernel access of bad area, sig: 7 [#1]
SMP NR_CPUS=1024 NUMA
Unrecoverable FP Unavailable Exception 800 at c0648ed4
pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
NIP: c008d0f4 LR: c008d0d0 CTR: 
REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
MSR: 80009032
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
EE,ME,IR,DR>CR: 24022442  XER: 0012
DAR: c0648f54, DSISR: 4001
TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
GPR00: 4000 c0010978fb80 c0d3d798 0001
GPR04: c083539e c1610228  c54c6880
GPR08: 06a5 c0648f54 0007 049b
GPR12:  cf601900  
GPR16: 4b7dc520   c0010978fea0
GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
GPR24:  01200011 c0e1c0a8 c0648ed4
GPR28:  c001096e4900 c0ca0458 c0010725d400
NIP [c008d0f4] .copy_process+0x310/0xf40
LR [c008d0d0] .copy_process+0x2ec/0xf40
Call Trace:
[c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
[c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
[c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
[c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
Instruction dump:
419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
78004800 6042 901f0014 38004000<7d6048a8>   7d6b0078 7d6049ad 40c2fff4

Kernel version 2.6.34-rc3-git3 works fine.

Thanks
Divya


Using 007dfade bytes for initrd buffer
Please wait, loading kernel...
Allocated 0180 bytes for kernel @ 01e0
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded 007dfade @ 0360
OF stdout device is: /vdevice/v...@3000
Preparing to boot Linux version 2.6.35-rc3-git4-autotest (r...@p55alp2) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Wed Jun 30 08:47:11 IST 2010
Max number of cores passed to firmware: 0x0200
Calling ibm,client-architecture-support... not implemented
command line: root=/dev/sda5 IDENT=1277868480
memory layout at init:
  memory_limit :  (16 MB aligned)
  alloc_bottom : 03de
  alloc_top: 1000
  alloc_top_hi : 0001f000
  rmo_top  : 1000
  ram_top  : 0001f000
instantiating rtas at 0x0f6a... done
boot cpu hw idx 
starting cpu hw idx 0002... done
starting cpu hw idx 0004... done
starting cpu hw idx 0006... done
starting cpu hw idx 0008... done
starting cpu hw idx 000a... done
starting cpu hw idx 00