Re: 2.6.24-rc5-mm1 - SCSI/blkdev probing hang

2007-12-24 Thread Rik van Riel
On Thu, 20 Dec 2007 13:22:12 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Thu, 20 Dec 2007 15:57:45 -0500
> Rik van Riel <[EMAIL PROTECTED]> wrote:
> 
> > 2.6.24-rc5-mm1 seems to have a hang related to the SCSI or block
> > device probing code.

> It could be a scsi problem, or it could be all the kobject changes in
> Greg's driver tree.  Or a combination of the two.
> 
> Don't know, sorry.

Whatever it was, it's gone now.

2.6.24-rc6-mm1 boots on my system.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-21 Thread Andrew Morton
On Fri, 21 Dec 2007 22:51:45 +0100 Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:

> > Here's a test patch:
> 
> Tested on 2.6.23 and 2.6.24-rc5-mm1. The patch fixes the bug.
> 
> Thanks a lot to both of you.

Thank you for testing -mm (especially on sparc64) and for reporting
the bug and for testing the fix.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-21 Thread Mariusz Kozlowski
Hello,

> > > [  145.128915] TSTATE: 004411009603 TPC: 005119ac TNPC: 005119b0 Y: Not tainted
> > > [  145.128940] TPC: <kpagecount_read+0x94/0xe0>
> > 
> > My suspicion at this point is that with certain RAM layouts, simply
> > iterating over PFN's is simply not working out.
> 
> That was my original suspicion, which is why I asked Mariusz to
> effectively comment out the actual PFN lookup up-thread. I didn't send
> him a patch to do that, so I guess my instructions on how to hack it
> may have been misunderstood.

No. I just made a trivial mistake :-/ Sorry for confusion. I guess I need to
verify things three times before sending an email next time.
  
> > pfn_to_page() seems to be doing no range checking, and with sparsemem
> > vmemmap, which sparc64 always uses, this can be problematic.
> > 
> > It just blindly goes "vmemmap + pfn" which is asking for trouble, in
> > particular when the physical RAM layout really is sparse.
> > 
> > Maybe it's enough to add a pfn_valid() check here?  If pfn_valid()
> > means there is a vmemmap translation setup for that page struct too,
> > it would work.
> 
> Here's a test patch:

Tested on 2.6.23 and 2.6.24-rc5-mm1. The patch fixes the bug.

Thanks a lot to both of you.

Mariusz


Re: 2.6.24-rc5-mm1

2007-12-21 Thread Jason Wessel
Andrew Morton wrote:
> On Thu, 20 Dec 2007 10:55:51 -0600
> Jason Wessel <[EMAIL PROTECTED]> wrote:
>
>   
>> Andrew Morton wrote:
>> 
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
>>>
>>> - git-kgdb.patch is still dropped for the same reason
>>>
>>>  
[snip]  Regarding the merge output targeting -mm
> Conflicts with the arm, ia64, mips, sh and driver trees (at least).  I
> fixed most of them but gave up on sh, where there has been major code
> motion.
>
>   

Andrew,

Given the churn in patches I think the best approach is to put kgdb
after you have cut a -mm1 so it can go in -mm2 or as a fix or however
you would like to manage it.  The churn should be a whole lot less once
the new kgdb arch support gets merged.

I updated the for_mm branch to be against 2.6.24-rc5-mm1, and it will
merge cleanly.  I can update again once the next mm branch is available.


Thanks,
Jason.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 19:06:55 -0600

> @@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
>  		return -EIO;
>  
>  	while (count > 0) {
> -		ppage = pfn_to_page(pfn++);
> +		ppage = 0;
> +		if (pfn_valid(pfn))
> +			ppage = pfn_to_page(pfn);
> +		pfn++;
>  		if (!ppage)
>  			pcount = 0;
>  		else

Yes that should work, please use "NULL" in the final
version of the patch instead of "0" so that sparse is
happy.

Thanks.
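
For readers following the review comment, here is a rough user-space sketch of
the loop shape with NULL in place of 0. It is not the final kernel patch: the
model_* names, the fixed-size model_map table and the model_read_counts()
helper are all hypothetical stand-ins, and only the ordering of the
pfn_valid()-style check, the lookup and the increment is taken from the quoted
hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_page {
	unsigned long count;	/* stand-in for a page's map count */
};

#define MODEL_MAX_PFN 8UL

/* One slot per PFN; a NULL slot models a hole in physical memory. */
static struct model_page *model_map[MODEL_MAX_PFN];

/* Plays the role of pfn_valid(): is there page metadata for this PFN? */
static int model_pfn_valid(unsigned long pfn)
{
	return pfn < MODEL_MAX_PFN && model_map[pfn] != NULL;
}

/* Plays the role of pfn_to_page(); only safe after model_pfn_valid(). */
static struct model_page *model_pfn_to_page(unsigned long pfn)
{
	return model_map[pfn];
}

/* Same control flow as the patched loop, with NULL instead of 0. */
size_t model_read_counts(unsigned long pfn, unsigned long *out, size_t n)
{
	struct model_page *ppage;
	size_t done = 0;

	while (n > 0) {
		ppage = NULL;		/* a pointer, so NULL rather than 0 */
		if (model_pfn_valid(pfn))
			ppage = model_pfn_to_page(pfn);
		pfn++;
		out[done++] = ppage ? ppage->count : 0;
		n--;
	}
	return done;
}
```

Calling model_read_counts() over a range that includes holes writes 0 for
every PFN without metadata and never dereferences an unchecked pointer.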


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Matt Mackall
On Thu, Dec 20, 2007 at 04:17:26PM -0800, David Miller wrote:
> From: Mariusz Kozlowski <[EMAIL PROTECTED]>
> Date: Thu, 20 Dec 2007 20:47:55 +0100
> 
> > [  145.128915] TSTATE: 004411009603 TPC: 005119ac TNPC: 005119b0 Y: Not tainted
> > [  145.128940] TPC: <kpagecount_read+0x94/0xe0>
> 
> My suspicion at this point is that with certain RAM layouts, simply
> iterating over PFN's is simply not working out.

That was my original suspicion, which is why I asked Mariusz to
effectively comment out the actual PFN lookup up-thread. I didn't send
him a patch to do that, so I guess my instructions on how to hack it
may have been misunderstood.
 
> pfn_to_page() seems to be doing no range checking, and with sparsemem
> vmemmap, which sparc64 always uses, this can be problematic.
> 
> It just blindly goes "vmemmap + pfn" which is asking for trouble, in
> particular when the physical RAM layout really is sparse.
> 
> Maybe it's enough to add a pfn_valid() check here?  If pfn_valid()
> means there is a vmemmap translation setup for that page struct too,
> it would work.

Here's a test patch:

Index: mm/fs/proc/proc_misc.c
===
--- mm.orig/fs/proc/proc_misc.c 2007-12-20 19:04:35.0 -0600
+++ mm/fs/proc/proc_misc.c  2007-12-20 19:06:01.0 -0600
@@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
 		return -EIO;
 
 	while (count > 0) {
-		ppage = pfn_to_page(pfn++);
+		ppage = 0;
+		if (pfn_valid(pfn))
+			ppage = pfn_to_page(pfn);
+		pfn++;
 		if (!ppage)
 			pcount = 0;
 		else
@@ -773,7 +776,10 @@ static ssize_t kpageflags_read(struct fi
 		return -EIO;
 
 	while (count > 0) {
-		ppage = pfn_to_page(pfn++);
+		ppage = 0;
+		if (pfn_valid(pfn))
+			ppage = pfn_to_page(pfn);
+		pfn++;
 		if (!ppage)
 			kflags = 0;
 		else


-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Mariusz Kozlowski <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 20:47:55 +0100

> [  145.128915] TSTATE: 004411009603 TPC: 005119ac TNPC: 005119b0 Y: Not tainted
> [  145.128940] TPC: <kpagecount_read+0x94/0xe0>

My suspicion at this point is that with certain RAM layouts, simply
iterating over PFN's is simply not working out.

pfn_to_page() seems to be doing no range checking, and with sparsemem
vmemmap, which sparc64 always uses, this can be problematic.

It just blindly goes "vmemmap + pfn" which is asking for trouble, in
particular when the physical RAM layout really is sparse.

Maybe it's enough to add a pfn_valid() check here?  If pfn_valid()
means there is a vmemmap translation setup for that page struct too,
it would work.
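
The concern above can be illustrated with a small user-space model. This is
only a sketch under stated assumptions, not the sparsemem implementation: the
section size, the model_sections table and the model_* function names are
invented for illustration. The point it tries to show is that page metadata
exists only for populated sections, so a lookup has to test for a hole before
doing the "base + pfn" arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a sparse memory map; not kernel code. */
#define MODEL_PAGES_PER_SECTION 4UL
#define MODEL_NR_SECTIONS       4UL

struct model_page {
	int count;	/* stand-in for per-page metadata */
};

/* One slot per section; NULL marks a hole with no metadata behind it. */
static struct model_page *model_sections[MODEL_NR_SECTIONS];

/* Plays the role of pfn_valid(): is this PFN backed by metadata at all? */
int model_pfn_valid(unsigned long pfn)
{
	unsigned long section = pfn / MODEL_PAGES_PER_SECTION;

	return section < MODEL_NR_SECTIONS && model_sections[section] != NULL;
}

/* Checked lookup: returns NULL for a hole instead of a bogus address. */
struct model_page *model_lookup_page(unsigned long pfn)
{
	if (!model_pfn_valid(pfn))
		return NULL;
	return &model_sections[pfn / MODEL_PAGES_PER_SECTION]
			      [pfn % MODEL_PAGES_PER_SECTION];
}
```

With sections 0 and 2 populated and section 1 left empty, a walk over
consecutive PFNs gets NULL for every PFN in the hole and for anything past the
last section, which is the behaviour the proposed pfn_valid() check is meant
to provide.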


Re: 2.6.24-rc5-mm1

2007-12-20 Thread Andrew Morton
On Thu, 20 Dec 2007 10:55:51 -0600
Jason Wessel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
> >
> > - If something goes wrong with a PCI device's probing or initialisation, try
> >   reverting pci-disable-decoding-during-sizing-of-bars.patch.
> >
> > - git-sched was dropped due to breaking suspend-to-RAM.
> >
> > - git-block has been restored after having had a few problems
> >
> > - git-newsetup.patch was dropped due to conflicts with git-x86
> >
> > - git-perfmon.patch is still dropped for the same reason
> >
> > - git-kgdb.patch is still dropped for the same reason
> >
> >   
> Andrew,
> 
> I re-based the for_mm branch at:
> http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=shortlog;h=for_mm
> against the git-x86/mm branch from the x86-git tree. If there are other
> patch trees I need to pull in and patch against to allow for kgdb to be
> included into -mm please let me know.

The x86 merge worked OK.


Here's what it looks like:

patching file Documentation/DocBook/Makefile
Hunk #1 FAILED at 11.
1 out of 1 hunk FAILED -- saving rejects to file Documentation/DocBook/Makefile.rej
patching file Documentation/DocBook/kgdb.tmpl
patching file Documentation/kernel-parameters.txt
Hunk #1 succeeded at 816 (offset 7 lines).
patching file MAINTAINERS
Hunk #1 succeeded at 2279 (offset 52 lines).
patching file Makefile
patching file arch/arm/kernel/Makefile
patching file arch/arm/kernel/kgdb-jmp.S
patching file arch/arm/kernel/kgdb.c
patching file arch/arm/kernel/setup.c
patching file arch/arm/kernel/traps.c
patching file arch/arm/mach-ixp2000/core.c
patching file arch/arm/mach-ixp2000/ixdp2x01.c
patching file arch/arm/mach-ixp4xx/coyote-setup.c
patching file arch/arm/mach-ixp4xx/ixdp425-setup.c
patching file arch/arm/mach-omap1/serial.c
patching file arch/arm/mach-omap2/serial.c
patching file arch/arm/mach-pnx4008/core.c
patching file arch/arm/mach-pxa/Makefile
Hunk #1 FAILED at 43.
1 out of 1 hunk FAILED -- saving rejects to file arch/arm/mach-pxa/Makefile.rej
patching file arch/arm/mach-pxa/kgdb-serial.c
patching file arch/arm/mach-versatile/core.c
patching file arch/arm/mm/extable.c
patching file arch/ia64/kernel/Makefile
patching file arch/ia64/kernel/kgdb-jmp.S
patching file arch/ia64/kernel/kgdb.c
patching file arch/ia64/kernel/smp.c
patching file arch/ia64/kernel/traps.c
Hunk #1 FAILED at 155.
1 out of 1 hunk FAILED -- saving rejects to file arch/ia64/kernel/traps.c.rej
patching file arch/ia64/mm/extable.c
patching file arch/ia64/mm/fault.c
patching file arch/mips/Kconfig
Hunk #2 succeeded at 323 (offset -6 lines).
Hunk #4 succeeded at 419 (offset -7 lines).
Hunk #5 succeeded at 531 (offset 21 lines).
Hunk #6 succeeded at 608 (offset -21 lines).
Hunk #7 succeeded at 670 (offset 21 lines).
Hunk #8 succeeded at 914 (offset -24 lines).
patching file arch/mips/Kconfig.debug
patching file arch/mips/au1000/common/Makefile
patching file arch/mips/au1000/common/dbg_io.c
patching file arch/mips/basler/excite/Makefile
patching file arch/mips/basler/excite/excite_dbg_io.c
patching file arch/mips/basler/excite/excite_irq.c
patching file arch/mips/basler/excite/excite_setup.c
patching file arch/mips/jmr3927/rbhma3100/Makefile
patching file arch/mips/jmr3927/rbhma3100/kgdb_io.c
patching file arch/mips/kernel/Makefile
patching file arch/mips/kernel/gdb-low.S
patching file arch/mips/kernel/gdb-stub.c
patching file arch/mips/kernel/irq.c
patching file arch/mips/kernel/kgdb-jmp.c
patching file arch/mips/kernel/kgdb-setjmp.S
patching file arch/mips/kernel/kgdb.c
patching file arch/mips/kernel/kgdb_handler.S
patching file arch/mips/kernel/traps.c
patching file arch/mips/mips-boards/atlas/Makefile
patching file arch/mips/mips-boards/atlas/atlas_gdb.c
patching file arch/mips/mips-boards/atlas/atlas_setup.c
patching file arch/mips/mips-boards/generic/Makefile
patching file arch/mips/mips-boards/generic/gdb_hook.c
patching file arch/mips/mips-boards/generic/init.c
patching file arch/mips/mips-boards/malta/malta_setup.c
patching file arch/mips/mm/extable.c
patching file arch/mips/pci/fixup-atlas.c
patching file arch/mips/philips/pnx8550/common/Makefile
patching file arch/mips/philips/pnx8550/common/gdb_hook.c
patching file arch/mips/philips/pnx8550/common/setup.c
patching file arch/mips/pmc-sierra/yosemite/Makefile
patching file arch/mips/pmc-sierra/yosemite/dbg_io.c
patching file arch/mips/pmc-sierra/yosemite/irq.c
patching file arch/mips/sgi-ip22/ip22-setup.c
patching file arch/mips/sgi-ip27/Makefile
patching file arch/mips/sgi-ip27/ip27-dbgio.c
patching file arch/mips/sibyte/bcm1480/irq.c
patching file arch/mips/sibyte/cfe/setup.c
Hunk #3 succeeded at 298 (offset -3 lines).
patching file arch/mips/sibyte/sb1250/irq.c
patching file arch/mips/sibyte/sb1250/kgdb_sibyte.c
patching file arch/mips/sibyte/swarm/Makefile
patching file arch/mips/sibyte/swarm/dbg_io.c
patching file arch/mips/tx4927/common/Makefile

Re: 2.6.24-rc5-mm1 - SCSI/blkdev probing hang

2007-12-20 Thread Andrew Morton
On Thu, 20 Dec 2007 15:57:45 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:

> 2.6.24-rc5-mm1 seems to have a hang related to the SCSI or block
> device probing code.
> 
> This is on a dual quad-core x86-64 system with megaraid_sas controller.
> 
> scsi 0:2:0:0: Direct-Access DELL PERC 5/i 1.03 PQ: 0 ANSI: 5
> general protection fault:  [1] SMP 
> last sysfs file: /sys/class/firmware/timeout
> CPU 7 
> Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
> Pid: 678, comm: scsi_scan_0 Not tainted 2.6.24-rc5-mm1 #1
> RIP: 0010:[]  [] mark_lock+0x1b/0x472

Could be that someone passed a garbage pointer into lockdep.

> RSP: 0018:81043ba29c20  EFLAGS: 00010002
> RAX: 0010 RBX: 81043b9ee8f0 RCX: 81043b9ee804
> RDX: 6b6b6b6b6b6b6b6b RSI: 81043b9ee8f0 RDI: 81043b9ee000
> RBP: 81043b9ee000 R08: 0002 R09: 
> R10: 81129055 R11: 000281128c8d R12: 0004
> R13: 0001 R14: 0002 R15: 81043e508028
> FS:  () GS:81043e4e6a28() knlGS:
> CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: 00361969afa0 CR3: 00201000 CR4: 06e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process scsi_scan_0 (pid: 678, threadinfo 81043ba28000, task 81043b9ee000)
> Stack:  81043b9ee8f0 6b6b6b6b6b6b6b6b 81043b9ee000<6>ata1.00: ATAPI: 
> HL-DT-STCD-RW/DVD-ROM GCC-T10N, A102, max UDMA/33
>  81059139
>  3ba29c50 0002  81058623
>  81043b504660 0246 81043e508028 81043b504660
> Call Trace:
>  [] __lock_acquire+0x4d7/0xc8e
>  [] mark_held_locks+0x49/0x67
>  [] lock_acquire+0x5a/0x73
>  [] kobject_add+0xca/0x194
>  [] mutex_lock_nested+0x2a1/0x2b0
>  [] _spin_lock+0x26/0x52
>  [] kobject_add+0xca/0x194
>  [] device_add+0x9a/0x56e
>  [] :scsi_mod:scsi_alloc_target+0x2cd/0x343
>  [] :scsi_mod:__scsi_scan_target+0x66/0x5c6
>  [] trace_hardirqs_on+0x115/0x138
>  [] :scsi_mod:scsi_scan_channel+0x45/0x70
>  [] :scsi_mod:scsi_scan_host_selected+0xd5/0x110
> ata1.00: configured for UDMA/33
> ata2: port disabled. ignoring.
>  [] :scsi_mod:do_scan_async+0x0/0x152
>  [] :scsi_mod:do_scan_async+0x14/0x152
>  [] :scsi_mod:do_scan_async+0x0/0x152
>  [] kthread+0x47/0x73
>  [] trace_hardirqs_on_thunk+0x35/0x3a
>  [] child_rip+0xa/0x12
>  [] restore_args+0x0/0x30
>  [] menu_reflect+0x0/0x75
>  [] kthreadd+0x115/0x13a
>  [] kthread+0x0/0x73
>  [] child_rip+0x0/0x12
> 

It could be a scsi problem, or it could be all the kobject changes in
Greg's driver tree.  Or a combination of the two.

Don't know, sorry.
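
One observation that is not made in the thread itself: the 6b6b6b6b6b6b6b6b
value in RDX and on the stack matches the 0x6b byte that the kernel's slab
debugging writes over freed objects (POISON_FREE), which would be consistent
with a pointer being read out of already-freed memory. The sketch below is a
user-space illustration of that pattern only; the model_* names are
hypothetical and nothing here is taken from the SCSI or kobject code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same byte value as the kernel's POISON_FREE; the rest is a model. */
#define MODEL_POISON_FREE 0x6b

struct model_obj {
	void *next;	/* a pointer field that someone may read too late */
	int   id;
};

/* Models a debugging allocator's free path: overwrite the whole object. */
void model_poison_on_free(void *obj, size_t size)
{
	memset(obj, MODEL_POISON_FREE, size);
}

/* Returns 1 if every byte of the given field holds the poison value. */
int model_field_is_poisoned(const void *field, size_t size)
{
	const unsigned char *bytes = field;
	size_t i;

	for (i = 0; i < size; i++) {
		if (bytes[i] != MODEL_POISON_FREE)
			return 0;
	}
	return size > 0;
}
```

On a 64-bit machine a pointer field read after model_poison_on_free() has all
eight bytes set to 0x6b, which is the pattern visible in the register dump.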



Re: 2.6.24-rc5-mm1 - SCSI/blkdev probing hang

2007-12-20 Thread Rik van Riel
On Thu, 13 Dec 2007 02:40:50 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> 

2.6.24-rc5-mm1 seems to have a hang related to the SCSI or block
device probing code.

This is on a dual quad-core x86-64 system with megaraid_sas controller.

scsi 0:2:0:0: Direct-Access DELL PERC 5/i 1.03 PQ: 0 ANSI: 5
general protection fault:  [1] SMP 
last sysfs file: /sys/class/firmware/timeout
CPU 7 
Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 678, comm: scsi_scan_0 Not tainted 2.6.24-rc5-mm1 #1
RIP: 0010:[]  [] mark_lock+0x1b/0x472
RSP: 0018:81043ba29c20  EFLAGS: 00010002
RAX: 0010 RBX: 81043b9ee8f0 RCX: 81043b9ee804
RDX: 6b6b6b6b6b6b6b6b RSI: 81043b9ee8f0 RDI: 81043b9ee000
RBP: 81043b9ee000 R08: 0002 R09: 
R10: 81129055 R11: 000281128c8d R12: 0004
R13: 0001 R14: 0002 R15: 81043e508028
FS:  () GS:81043e4e6a28() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 00361969afa0 CR3: 00201000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process scsi_scan_0 (pid: 678, threadinfo 81043ba28000, task 81043b9ee000)
Stack:  81043b9ee8f0 6b6b6b6b6b6b6b6b 81043b9ee000<6>ata1.00: ATAPI: 
HL-DT-STCD-RW/DVD-ROM GCC-T10N, A102, max UDMA/33
 81059139
 3ba29c50 0002  81058623
 81043b504660 0246 81043e508028 81043b504660
Call Trace:
 [] __lock_acquire+0x4d7/0xc8e
 [] mark_held_locks+0x49/0x67
 [] lock_acquire+0x5a/0x73
 [] kobject_add+0xca/0x194
 [] mutex_lock_nested+0x2a1/0x2b0
 [] _spin_lock+0x26/0x52
 [] kobject_add+0xca/0x194
 [] device_add+0x9a/0x56e
 [] :scsi_mod:scsi_alloc_target+0x2cd/0x343
 [] :scsi_mod:__scsi_scan_target+0x66/0x5c6
 [] trace_hardirqs_on+0x115/0x138
 [] :scsi_mod:scsi_scan_channel+0x45/0x70
 [] :scsi_mod:scsi_scan_host_selected+0xd5/0x110
ata1.00: configured for UDMA/33
ata2: port disabled. ignoring.
 [] :scsi_mod:do_scan_async+0x0/0x152
 [] :scsi_mod:do_scan_async+0x14/0x152
 [] :scsi_mod:do_scan_async+0x0/0x152
 [] kthread+0x47/0x73
 [] trace_hardirqs_on_thunk+0x35/0x3a
 [] child_rip+0xa/0x12
 [] restore_args+0x0/0x30
 [] menu_reflect+0x0/0x75
 [] kthreadd+0x115/0x13a
 [] kthread+0x0/0x73
 [] child_rip+0x0/0x12


Code: 48 85 42 30 0f 85 2e 04 00 00 f0 ff 0d 2c ce 34 00 79 0d f3 
RIP  [] mark_lock+0x1b/0x472
 RSP 
general protection fault:  [2] SMP 
last sysfs file: /sys/class/firmware/timeout
CPU 3 
Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 743, comm: insmod Tainted: G  D 2.6.24-rc5-mm1 #1
RIP: 0010:[]  [] __list_add+0x2b/0x5b
RSP: :81043b4319c8  EFLAGS: 00010246
RAX: 6b6b6b6b6b6b6b6b RBX: 81043bec4a68 RCX: 
RDX: 6b6b6b6b6b6b6b6b RSI: 81043e508000 RDI: 81043bec4a78
RBP: 81043ba794b0 R08: 0002 R09: 
R10: 81129055 R11: 8102093a R12: 81043bec4aa8
R13: fffe R14:  R15: 81043ba79090
FS:  7fc3239ae6f0() GS:81043fc01d48() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0036196d5140 CR3: 00043bb4c000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process insmod (pid: 743, threadinfo 81043b43, task 81043b42e000)
Stack:  81043ba79090 81129066 81043ba79098 81043bec48b8
 81043bec4aa8 81043bec48b8  811a318c
 81043ba79098 81043ba79300 81043bec4a68 81043ba79098
Call Trace:
 [] kobject_add+0xdb/0x194
 [] device_add+0x9a/0x56e
 [] :scsi_mod:scsi_alloc_target+0x2cd/0x343
 [] :scsi_mod:__scsi_add_device+0x5b/0xd9
 [] :libata:ata_scsi_scan_host+0xa8/0x28b
 [] :libata:ata_host_register+0x256/0x280
 [] :libata:ata_pci_init_one+0x231/0x285
 [] :ata_piix:piix_init_one+0x512/0x53d
 [] native_sched_clock+0x47/0x70
 [] _spin_unlock+0x17/0x20
 [] pci_device_probe+0xb3/0xfd
 [] driver_probe_device+0xee/0x16b
 [] __driver_attach+0x90/0xcc
 [] __driver_attach+0x0/0xcc
 [] __driver_attach+0x0/0xcc
 [] bus_for_each_dev+0x47/0x72
 [] bus_add_driver+0xc4/0x20b
 [] driver_register+0x59/0xcd
 [] __pci_register_driver+0x57/0x8b
 [] :ata_piix:piix_init+0x1e/0x32
 [] sys_init_module+0x15e5/0x173b
 [] system_call+0x7e/0x83


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Mariusz Kozlowski
Hello, 

> > > Actually, you may only need these two:
> > > 
> > > > maps4-add-proc-kpagecount-interface.patch
> > > > maps4-add-proc-kpageflags-interface.patch
> > 
> > Yes these two were enough, and exporting fs/proc/base.c's
> > mem_lseek().
> > 
> > As hard as I try, I can't reproduce this at all.  I tried
> > both on my workstation and my niagara boxes.
> 
> That's good to know, I was having a very hard time imagining how the
> kpagecount code could be going south.
>  
> > It must be other needle in the 30MB+ -mm haystack. :-(

I'm afraid you are wrong. Earlier kernels are affected as well. On reading your
mail I was thinking of applying those two patches to 2.6.24-rc5 and doing a
bisection on the rest of the -mm series. Unfortunately, clean 2.6.24-rc5 with
these two patches is affected as well (new processes stuck in D state etc.). So
I tried vanilla 2.6.23 patched with these two patches (and the mem_lseek export
from fs/proc/base.c). Now at least I got a trace produced by
'cat /proc/kpagecount', which you can find below. Also, in spite of the oops,
the box doesn't lock up (as it does with -mm) and is still usable.

[  126.060976] TSTATE: 009980009603 TPC: 00428a84 TNPC: 00428a88 Y: Not tainted
[  126.063486] TPC: 
[  126.065986] g0: 0009 g1: 04804000 g2: 000f g3: 007204c0
[  126.068636] g4: 007244c0 g5: f8007f878000 g6: 007204c0 g7: 00724958
[  126.071232] o0: 0001 o1: 007204c8 o2: 0001 o3: 
[  126.073924] o4: 6000 o5: 0078f140 sp: 007239b1 ret_pc: 00428a78
[  126.076569] RPC: 
[  126.079185] l0: 0072 l1: 0002 l2: 0001 l3: 0075d400
[  126.081934] l4: 0075d400 l5: f80080015b10 l6: f80080005b08 l7: 0001
[  126.084637] i0: 0001 i1: 00720094 i2:  i3: 
[  126.087375] i4: 007204c0 i5: 0002 i6: 00723a71 i7: 00665a24
[  126.090135] I7: 
[  145.121228] Unable to handle kernel NULL pointer dereference
[  145.124515] tsk->{mm,active_mm}->context = 0d41
[  145.127778] tsk->{mm,active_mm}->pgd = f800bd8d2000
[  145.127801]   \|/  \|/
[  145.127808]   "@'/ .. \`@"
[  145.127815]   /_| \__/ |_\
[  145.127821]  \__U_/
[  145.127831] cat(3111): Oops [#1]
[  145.127849] 
[  145.127853] =
[  145.127861] [ INFO: inconsistent lock state ]
[  145.127873] 2.6.23 #1
[  145.127880] -
[  145.127891] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[  145.127906] cat/3111 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  145.127918]  (regdump_lock){+...}, at: [<004281d0>] __show_regs+0x18/0x320
[  145.127951] {in-hardirq-W} state was registered at:
[  145.127960]   [<00669780>] _spin_lock+0x28/0x40
[  145.127983]   [<004281d0>] __show_regs+0x18/0x320
[  145.128000]   [<004284e4>] show_regs+0xc/0x20
[  145.128016]   [<005ac9d8>] sysrq_handle_showregs+0x20/0x40
[  145.128041]   [<005ac7fc>] __handle_sysrq+0x84/0x160
[  145.128060]   [<005ac8f8>] handle_sysrq+0x20/0x40
[  145.128078]   [<005a4f08>] kbd_event+0x670/0xb60
[  145.128110]   [<005ea0c0>] input_event+0x1e8/0x560
[  145.128140]   [<005efa2c>] sunkbd_interrupt+0x114/0x140
[  145.128167]   [<005e6270>] serio_interrupt+0x38/0xa0
[  145.128186]   [<005b2e58>] sunsu_kbd_ms_interrupt+0xa0/0x140
[  145.128212]   [<0049f6f8>] handle_IRQ_event+0x20/0x80
[  145.128251]   [<0049f808>] __do_IRQ+0xb0/0x140
[  145.128268]   [<0042f48c>] handler_irq+0x94/0xc0
[  145.128306]   [<00426f30>] sunos_sys_table+0x560/0x728
[  145.128324]   [<00428a78>] cpu_idle+0x20/0xe0
[  145.128341]   [<00665a24>] rest_init+0x6c/0x80
[  145.128375]   [<0076ec24>] start_kernel+0x2ec/0x340
[  145.128405]   [<0066599c>] tlb_fixup_done+0xa0/0xbc
[  145.128425]   [<>] 0x8
[  145.128443] irq event stamp: 1209
[  145.128451] hardirqs last  enabled at (1209): [<00404b74>] __handle_softirq_continue+0x20/0x24
[  145.128480] hardirqs last disabled at (1207): [<00474494>] __do_softirq+0xbc/0x140
[  145.128506] softirqs last  enabled at (1208): [<004744dc>] __do_softirq+0x104/0x140
[  145.128526] softirqs last disabled at (1203): [<004745a0>] do_softirq+0x88/0xa0
[  145.128546] 
[  145.128551] other info that might help us debug this:
[  145.128562] no locks held by cat/3111.
[  145.128570] 
[  145.128574] stack backtrace:
[  145.128582] Call Trace:
[  145.128590]  [004907a0] print_usage_bug+0x148/0x160
[  145.128624]  [004917f4] mark_lock+0x6dc/0x780
[  145.128641]  [0049286c] __lock_acquire+0x734/0x12a0
[  145.128659]  

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Matt Mackall
On Thu, Dec 20, 2007 at 04:53:59AM -0800, David Miller wrote:
> From: Matt Mackall <[EMAIL PROTECTED]>
> Date: Mon, 17 Dec 2007 08:55:54 -0600
> 
> > On Sun, Dec 16, 2007 at 10:39:17PM -0800, Andrew Morton wrote:
> > Actually, you may only need these two:
> > 
> > > maps4-add-proc-kpagecount-interface.patch
> > > maps4-add-proc-kpageflags-interface.patch
> 
> Yes these two were enough, and exporting fs/proc/base.c's
> mem_lseek().
> 
> As hard as I try, I can't reproduce this at all.  I tried
> both on my workstation and my niagara boxes.

That's good to know, I was having a very hard time imagining how the
kpagecount code could be going south.
 
> It must be other needle in the 30MB+ -mm haystack. :-(

Have we seen a config for the broken machine? Perhaps that'll help us
make a guess..

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1

2007-12-20 Thread Jason Wessel
Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
>
> - If something goes wrong with a PCI device's probing or initialisation, try
>   reverting pci-disable-decoding-during-sizing-of-bars.patch.
>
> - git-sched was dropped due to breaking suspend-to-RAM.
>
> - git-block has been restored after having had a few problems
>
> - git-newsetup.patch was dropped due to conflicts with git-x86
>
> - git-perfmon.patch is still dropped for the same reason
>
> - git-kgdb.patch is still dropped for the same reason
>
>   
Andrew,

I re-based the for_mm branch at:
http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=shortlog;h=for_mm
against the git-x86/mm branch from the x86-git tree. If there are other
patch trees I need to pull in and patch against to allow for kgdb to be
included into -mm please let me know.

I would like to submit another review request for kgdb into the mainline
as well as resolve the issues with the -mm tree + kgdb.

Thanks,
Jason.


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-20 Thread Rafael J. Wysocki
On Thursday, 20 of December 2007, Miles Lane wrote:
> On Dec 19, 2007 8:31 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> >
> > On Thursday, 20 of December 2007, Miles Lane wrote:
> > > On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > > > > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> > > > >
> > > > > > > > We could reexport drain_local_pages() again but then I do not understand
> > > > > > > > why we would only drain the pages of this processor and not of all other
> > > > > > > > processors as well. It seems that software suspend's intent was to flush
> > > > > > > > them all, right?
> > > > > > >
> > > > > > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > > > > > disabled.
> > > > >
> > > > > Ok so the others are flush. Here is a patch to re-export
> > > > > drain_local_pages() again and use it for software suspend:
> > > > >
> > > > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > > > >
> > > > > ---
> > > > >  include/linux/gfp.h |1 +
> > > > >  kernel/power/snapshot.c |2 +-
> > > > >  mm/page_alloc.c |2 +-
> > > > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > > > >
> > > > > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > > > > ===
> > > > > > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > > > > > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > > > > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> > > > >
> > > > >   printk(KERN_INFO "PM: Creating hibernation image: \n");
> > > > >
> > > > > - drain_all_pages();
> > > > > + drain_local_pages(NULL);
> > > > >   nr_pages = count_data_pages();
> > > > >   nr_highmem = count_highmem_pages();
> > > > > >   printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
> > > >
> > > > You've omitted the second instance, right before the copy_data_pages()
> > > > call.
> > > >
> > >
> > > I guess I will wait for a revised patch.
> >
> > There's a fix from Andrew on top of this one in -mm:
> > http://marc.info/?l=linux-mm-commits&m=119810866812965&w=2
> >
> >
> >
> > > > > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > > > > ===
> > > > > > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > > > > > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > > > > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > > > >  /*
> > > > > >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > > > >   */
> > > > > -static void drain_local_pages(void *arg)
> > > > > +void drain_local_pages(void *arg)
> > > > >  {
> > > > >   drain_pages(smp_processor_id());
> > > > >  }
> > > > > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > > > > ===
> > > > > > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > > > > > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > > > > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > > > >  void page_alloc_init(void);
> > > > >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > > > >  void drain_all_pages(void);
> > > > > +void drain_local_pages(void *dummy);
> > > > >
> > > > >  #endif /* __LINUX_GFP_H */
> > > >
> >
> 
> I applied Christoph and Andrew's patches and recompiled.  I suspended
> to disk and to ram several times and all looks good.

OK, thanks for testing!

Rafael
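
The point made in the quoted exchange (only one CPU is still running when the hibernation image is created, so draining that CPU's lists is enough) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every model_* name is hypothetical, a per-CPU page list is reduced to a plain counter, and taking a CPU down is assumed to flush its list, as the thread says ("the others are flush").

```c
#include <assert.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for a per-CPU page list: just a page count. */
struct model_pcp {
	int count;	/* pages cached on this CPU's list */
	int online;	/* non-zero while the CPU is running */
};

static struct model_pcp model_pcps[MODEL_NR_CPUS];
int model_buddy_free;	/* pages handed back to the (modelled) buddy allocator */

/* Put every CPU online with pages_per_cpu pages cached on its list. */
void model_reset(int pages_per_cpu)
{
	int cpu;

	assert(pages_per_cpu >= 0);
	model_buddy_free = 0;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_pcps[cpu].count = pages_per_cpu;
		model_pcps[cpu].online = 1;
	}
}

/* Spill one CPU's cached pages back into the buddy allocator. */
void model_drain_pages(int cpu)
{
	model_buddy_free += model_pcps[cpu].count;
	model_pcps[cpu].count = 0;
}

/* Models drain_local_pages(): only the calling CPU's list is touched. */
void model_drain_local_pages(int this_cpu)
{
	model_drain_pages(this_cpu);
}

/* Assumption from the thread: a CPU's list is flushed when it goes down. */
void model_cpu_down(int cpu)
{
	model_drain_pages(cpu);
	model_pcps[cpu].online = 0;
}

/* Total number of pages still sitting on any per-CPU list. */
int model_pages_on_pcp_lists(void)
{
	int cpu, total = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += model_pcps[cpu].count;
	return total;
}
```

In this model, once CPUs 1 to 3 have gone down, a single model_drain_local_pages(0) leaves no pages on any per-CPU list, which is the situation swsusp_save() is described as being in.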


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Mon, 17 Dec 2007 08:55:54 -0600

> On Sun, Dec 16, 2007 at 10:39:17PM -0800, Andrew Morton wrote:
> Actually, you may only need these two:
> 
> > maps4-add-proc-kpagecount-interface.patch
> > maps4-add-proc-kpageflags-interface.patch

Yes these two were enough, and exporting fs/proc/base.c's
mem_lseek().

As hard as I try, I can't reproduce this at all.  I tried
both on my workstation and my niagara boxes.

It must be other needle in the 30MB+ -mm haystack. :-(



Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Matt Mackall
On Thu, Dec 20, 2007 at 04:53:59AM -0800, David Miller wrote:
> From: Matt Mackall <[EMAIL PROTECTED]>
> Date: Mon, 17 Dec 2007 08:55:54 -0600
>
> > On Sun, Dec 16, 2007 at 10:39:17PM -0800, Andrew Morton wrote:
> > Actually, you may only need these two:
> >
> > > maps4-add-proc-kpagecount-interface.patch
> > > maps4-add-proc-kpageflags-interface.patch
>
> Yes these two were enough, and exporting fs/proc/base.c's
> mem_lseek().
>
> As hard as I try, I can't reproduce this at all.  I tried
> both on my workstation and my niagara boxes.

That's good to know, I was having a very hard time imagining how the
kpagecount code could be going south.

> It must be other needle in the 30MB+ -mm haystack. :-(

Have we seen a config for the broken machine? Perhaps that'll help us
make a guess..

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Mariusz Kozlowski
Hello, 

> > > Actually, you may only need these two:
> > >
> > > > maps4-add-proc-kpagecount-interface.patch
> > > > maps4-add-proc-kpageflags-interface.patch
> >
> > Yes these two were enough, and exporting fs/proc/base.c's
> > mem_lseek().
> >
> > As hard as I try, I can't reproduce this at all.  I tried
> > both on my workstation and my niagara boxes.
>
> That's good to know, I was having a very hard time imagining how the
> kpagecount code could be going south.
>
> > It must be other needle in the 30MB+ -mm haystack. :-(

I'm afraid you are wrong. Earlier kernels are affected as well. On reading your
mail I was thinking of applying those two patches to 2.6.24-rc5 and bisecting
the rest of the -mm series. Unfortunately, clean 2.6.24-rc5 with these two
patches is affected as well (new processes stuck in D state etc). So I tried
vanilla 2.6.23 with these two patches applied (and the mem_lseek export from
fs/proc/base.c). Now at least I got a trace produced by 'cat /proc/kpagecount',
which you can find below. Also, in spite of the oops, the box doesn't get
locked up (as it does with -mm) and is still usable.

[  126.060976] TSTATE: 009980009603 TPC: 00428a84 TNPC: 00428a88 Y: Not tainted
[  126.063486] TPC: cpu_idle+0x2c/0xe0
[  126.065986] g0: 0009 g1: 04804000 g2: 000f g3: 007204c0
[  126.068636] g4: 007244c0 g5: f8007f878000 g6: 007204c0 g7: 00724958
[  126.071232] o0: 0001 o1: 007204c8 o2: 0001 o3: 
[  126.073924] o4: 6000 o5: 0078f140 sp: 007239b1 ret_pc: 00428a78
[  126.076569] RPC: cpu_idle+0x20/0xe0
[  126.079185] l0: 0072 l1: 0002 l2: 0001 l3: 0075d400
[  126.081934] l4: 0075d400 l5: f80080015b10 l6: f80080005b08 l7: 0001
[  126.084637] i0: 0001 i1: 00720094 i2:  i3: 
[  126.087375] i4: 007204c0 i5: 0002 i6: 00723a71 i7: 00665a24
[  126.090135] I7: rest_init+0x6c/0x80
[  145.121228] Unable to handle kernel NULL pointer dereference
[  145.124515] tsk->{mm,active_mm}->context = 0d41
[  145.127778] tsk->{mm,active_mm}->pgd = f800bd8d2000
[  145.127801]   \|/  \|/
[  145.127808]   @'/ .. \`@
[  145.127815]   /_| \__/ |_\
[  145.127821]  \__U_/
[  145.127831] cat(3111): Oops [#1]
[  145.127849] 
[  145.127853] =
[  145.127861] [ INFO: inconsistent lock state ]
[  145.127873] 2.6.23 #1
[  145.127880] -
[  145.127891] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[  145.127906] cat/3111 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  145.127918]  (regdump_lock){+...}, at: [004281d0] __show_regs+0x18/0x320
[  145.127951] {in-hardirq-W} state was registered at:
[  145.127960]   [00669780] _spin_lock+0x28/0x40
[  145.127983]   [004281d0] __show_regs+0x18/0x320
[  145.128000]   [004284e4] show_regs+0xc/0x20
[  145.128016]   [005ac9d8] sysrq_handle_showregs+0x20/0x40
[  145.128041]   [005ac7fc] __handle_sysrq+0x84/0x160
[  145.128060]   [005ac8f8] handle_sysrq+0x20/0x40
[  145.128078]   [005a4f08] kbd_event+0x670/0xb60
[  145.128110]   [005ea0c0] input_event+0x1e8/0x560
[  145.128140]   [005efa2c] sunkbd_interrupt+0x114/0x140
[  145.128167]   [005e6270] serio_interrupt+0x38/0xa0
[  145.128186]   [005b2e58] sunsu_kbd_ms_interrupt+0xa0/0x140
[  145.128212]   [0049f6f8] handle_IRQ_event+0x20/0x80
[  145.128251]   [0049f808] __do_IRQ+0xb0/0x140
[  145.128268]   [0042f48c] handler_irq+0x94/0xc0
[  145.128306]   [00426f30] sunos_sys_table+0x560/0x728
[  145.128324]   [00428a78] cpu_idle+0x20/0xe0
[  145.128341]   [00665a24] rest_init+0x6c/0x80
[  145.128375]   [0076ec24] start_kernel+0x2ec/0x340
[  145.128405]   [0066599c] tlb_fixup_done+0xa0/0xbc
[  145.128425]   [] 0x8
[  145.128443] irq event stamp: 1209
[  145.128451] hardirqs last  enabled at (1209): [00404b74] __handle_softirq_continue+0x20/0x24
[  145.128480] hardirqs last disabled at (1207): [00474494] __do_softirq+0xbc/0x140
[  145.128506] softirqs last  enabled at (1208): [004744dc] __do_softirq+0x104/0x140
[  145.128526] softirqs last disabled at (1203): [004745a0] do_softirq+0x88/0xa0
[  145.128546] 
[  145.128551] other info that might help us debug this:
[  145.128562] no locks held by cat/3111.
[  145.128570] 
[  145.128574] stack backtrace:
[  145.128582] Call Trace:
[  145.128590]  [004907a0] print_usage_bug+0x148/0x160
[  145.128624]  [004917f4] mark_lock+0x6dc/0x780
[  145.128641]  [0049286c] __lock_acquire+0x734/0x12a0
[  145.128659]  [00493430] lock_acquire+0x58/0x80
[  

Re: 2.6.24-rc5-mm1 - SCSI/blkdev probing hang

2007-12-20 Thread Rik van Riel
On Thu, 13 Dec 2007 02:40:50 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

 

2.6.24-rc5-mm1 seems to have a hang related to the SCSI or block
device probing code.

This is on a dual quad-core x86-64 system with megaraid_sas controller.

scsi 0:2:0:0: Direct-Access DELL PERC 5/i 1.03 PQ: 0 ANSI: 5
general protection fault:  [1] SMP 
last sysfs file: /sys/class/firmware/timeout
CPU 7 
Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp 
megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 678, comm: scsi_scan_0 Not tainted 2.6.24-rc5-mm1 #1
RIP: 0010:[81058183]  [81058183] mark_lock+0x1b/0x472
RSP: 0018:81043ba29c20  EFLAGS: 00010002
RAX: 0010 RBX: 81043b9ee8f0 RCX: 81043b9ee804
RDX: 6b6b6b6b6b6b6b6b RSI: 81043b9ee8f0 RDI: 81043b9ee000
RBP: 81043b9ee000 R08: 0002 R09: 
R10: 81129055 R11: 000281128c8d R12: 0004
R13: 0001 R14: 0002 R15: 81043e508028
FS:  () GS:81043e4e6a28() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 00361969afa0 CR3: 00201000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process scsi_scan_0 (pid: 678, threadinfo 81043ba28000, task 
81043b9ee000)
Stack:  81043b9ee8f0 6b6b6b6b6b6b6b6b 81043b9ee0006ata1.00: ATAPI: 
HL-DT-STCD-RW/DVD-ROM GCC-T10N, A102, max UDMA/33
 81059139
 3ba29c50 0002  81058623
 81043b504660 0246 81043e508028 81043b504660
Call Trace:
 [81059139] __lock_acquire+0x4d7/0xc8e
 [81058623] mark_held_locks+0x49/0x67
 [81059ce2] lock_acquire+0x5a/0x73
 [81129055] kobject_add+0xca/0x194
 [8126d56c] mutex_lock_nested+0x2a1/0x2b0
 [8126e997] _spin_lock+0x26/0x52
 [81129055] kobject_add+0xca/0x194
 [811a318c] device_add+0x9a/0x56e
 [8805c327] :scsi_mod:scsi_alloc_target+0x2cd/0x343
 [8805c492] :scsi_mod:__scsi_scan_target+0x66/0x5c6
 [810587f0] trace_hardirqs_on+0x115/0x138
 [8805ca37] :scsi_mod:scsi_scan_channel+0x45/0x70
 [8805cb37] :scsi_mod:scsi_scan_host_selected+0xd5/0x110
ata1.00: configured for UDMA/33
ata2: port disabled. ignoring.
 [8805cbe5] :scsi_mod:do_scan_async+0x0/0x152
 [8805cbf9] :scsi_mod:do_scan_async+0x14/0x152
 [8805cbe5] :scsi_mod:do_scan_async+0x0/0x152
 [8104d4e8] kthread+0x47/0x73
 [8126e418] trace_hardirqs_on_thunk+0x35/0x3a
 [8100cee8] child_rip+0xa/0x12
 [8100c5ff] restore_args+0x0/0x30
 [811e0908] menu_reflect+0x0/0x75
 [8104d371] kthreadd+0x115/0x13a
 [8104d4a1] kthread+0x0/0x73
 [8100cede] child_rip+0x0/0x12


Code: 48 85 42 30 0f 85 2e 04 00 00 f0 ff 0d 2c ce 34 00 79 0d f3 
RIP  [81058183] mark_lock+0x1b/0x472
 RSP 81043ba29c20
general protection fault:  [2] SMP 
last sysfs file: /sys/class/firmware/timeout
CPU 3 
Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod shpchp 
megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 743, comm: insmod Tainted: G  D 2.6.24-rc5-mm1 #1
RIP: 0010:[81130cca]  [81130cca] __list_add+0x2b/0x5b
RSP: :81043b4319c8  EFLAGS: 00010246
RAX: 6b6b6b6b6b6b6b6b RBX: 81043bec4a68 RCX: 
RDX: 6b6b6b6b6b6b6b6b RSI: 81043e508000 RDI: 81043bec4a78
RBP: 81043ba794b0 R08: 0002 R09: 
R10: 81129055 R11: 8102093a R12: 81043bec4aa8
R13: fffe R14:  R15: 81043ba79090
FS:  7fc3239ae6f0() GS:81043fc01d48() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0036196d5140 CR3: 00043bb4c000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process insmod (pid: 743, threadinfo 81043b43, task 81043b42e000)
Stack:  81043ba79090 81129066 81043ba79098 81043bec48b8
 81043bec4aa8 81043bec48b8  811a318c
 81043ba79098 81043ba79300 81043bec4a68 81043ba79098
Call Trace:
 [81129066] kobject_add+0xdb/0x194
 [811a318c] device_add+0x9a/0x56e
 [8805c327] :scsi_mod:scsi_alloc_target+0x2cd/0x343
 [8805cd92] :scsi_mod:__scsi_add_device+0x5b/0xd9
 [880c78d4] :libata:ata_scsi_scan_host+0xa8/0x28b
 [880c4698] :libata:ata_host_register+0x256/0x280
 [880c9bfd] :libata:ata_pci_init_one+0x231/0x285
 [880e38cc] :ata_piix:piix_init_one+0x512/0x53d
 [81012f31] native_sched_clock+0x47/0x70
 [8126e8ae] 

Re: 2.6.24-rc5-mm1 - SCSI/blkdev probing hang

2007-12-20 Thread Andrew Morton
On Thu, 20 Dec 2007 15:57:45 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:

> 2.6.24-rc5-mm1 seems to have a hang related to the SCSI or block
> device probing code.
>
> This is on a dual quad-core x86-64 system with megaraid_sas controller.
>
> scsi 0:2:0:0: Direct-Access DELL PERC 5/i 1.03 PQ: 0 ANSI: 5
> general protection fault:  [1] SMP
> last sysfs file: /sys/class/firmware/timeout
> CPU 7
> Modules linked in: ata_piix libata dm_snapshot dm_zero dm_mirror dm_mod
> shpchp megaraid_sas sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd
> ehci_hcd
> Pid: 678, comm: scsi_scan_0 Not tainted 2.6.24-rc5-mm1 #1
> RIP: 0010:[81058183]  [81058183] mark_lock+0x1b/0x472

Could be that someone passed a garbage pointer into lockdep.

> RSP: 0018:81043ba29c20  EFLAGS: 00010002
> RAX: 0010 RBX: 81043b9ee8f0 RCX: 81043b9ee804
> RDX: 6b6b6b6b6b6b6b6b RSI: 81043b9ee8f0 RDI: 81043b9ee000
> RBP: 81043b9ee000 R08: 0002 R09: 
> R10: 81129055 R11: 000281128c8d R12: 0004
> R13: 0001 R14: 0002 R15: 81043e508028
> FS:  () GS:81043e4e6a28() knlGS:
> CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: 00361969afa0 CR3: 00201000 CR4: 06e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process scsi_scan_0 (pid: 678, threadinfo 81043ba28000, task
> 81043b9ee000)
> Stack:  81043b9ee8f0 6b6b6b6b6b6b6b6b 81043b9ee0006ata1.00: ATAPI:
> HL-DT-STCD-RW/DVD-ROM GCC-T10N, A102, max UDMA/33
>  81059139
>  3ba29c50 0002  81058623
>  81043b504660 0246 81043e508028 81043b504660
> Call Trace:
>  [81059139] __lock_acquire+0x4d7/0xc8e
>  [81058623] mark_held_locks+0x49/0x67
>  [81059ce2] lock_acquire+0x5a/0x73
>  [81129055] kobject_add+0xca/0x194
>  [8126d56c] mutex_lock_nested+0x2a1/0x2b0
>  [8126e997] _spin_lock+0x26/0x52
>  [81129055] kobject_add+0xca/0x194
>  [811a318c] device_add+0x9a/0x56e
>  [8805c327] :scsi_mod:scsi_alloc_target+0x2cd/0x343
>  [8805c492] :scsi_mod:__scsi_scan_target+0x66/0x5c6
>  [810587f0] trace_hardirqs_on+0x115/0x138
>  [8805ca37] :scsi_mod:scsi_scan_channel+0x45/0x70
>  [8805cb37] :scsi_mod:scsi_scan_host_selected+0xd5/0x110
> ata1.00: configured for UDMA/33
> ata2: port disabled. ignoring.
>  [8805cbe5] :scsi_mod:do_scan_async+0x0/0x152
>  [8805cbf9] :scsi_mod:do_scan_async+0x14/0x152
>  [8805cbe5] :scsi_mod:do_scan_async+0x0/0x152
>  [8104d4e8] kthread+0x47/0x73
>  [8126e418] trace_hardirqs_on_thunk+0x35/0x3a
>  [8100cee8] child_rip+0xa/0x12
>  [8100c5ff] restore_args+0x0/0x30
>  [811e0908] menu_reflect+0x0/0x75
>  [8104d371] kthreadd+0x115/0x13a
>  [8104d4a1] kthread+0x0/0x73
>  [8100cede] child_rip+0x0/0x12
>

It could be a scsi problem, or it could be all the kobject changes in
Greg's driver tree.  Or a combination of the two.

Don't know, sorry.
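
One detail in the quoted register dump may help narrow this down: RDX holds 6b6b6b6b6b6b6b6b, and 0x6b is the byte the kernel's slab poisoning (POISON_FREE in include/linux/poison.h) writes over freed objects when slab debugging is enabled. That would be consistent with the "garbage pointer" theory above, in the form of a pointer loaded from already-freed memory, though the trace alone does not prove it. A minimal sketch of that check follows; the function and macro names are hypothetical, and only the 0x6b value comes from the kernel.

```c
#include <assert.h>	/* for callers that want to assert on the result */
#include <stdint.h>

/* Same value as the kernel's POISON_FREE; the MODEL_ name is ours. */
#define MODEL_POISON_FREE 0x6b

/*
 * Return 1 if every byte of a 64-bit register value is the free-poison
 * byte, which suggests the value was read from freed slab memory.
 * Return 0 otherwise.
 */
int looks_like_freed_slab_word(uint64_t value)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (((value >> (8 * i)) & 0xff) != MODEL_POISON_FREE)
			return 0;
	}
	return 1;
}
```

Applied to the dump, RDX would match while RBX (81043b9ee8f0 as shown) would not.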



Re: 2.6.24-rc5-mm1

2007-12-20 Thread Andrew Morton
On Thu, 20 Dec 2007 10:55:51 -0600
Jason Wessel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
> >
> > - If something goes wrong with a PCI device's probing or initialisation, try
> >   reverting pci-disable-decoding-during-sizing-of-bars.patch.
> >
> > - git-sched was dropped due to breaking suspend-to-RAM.
> >
> > - git-block has been restored after having had a few problems
> >
> > - git-newsetup.patch was dropped due to conflicts with git-x86
> >
> > - git-perfmon.patch is still dropped for the same reason
> >
> > - git-kgdb.patch is still dropped for the same reason
> >
>
> Andrew,
>
> I re-based the for_mm branch at:
> http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=shortlog;h=for_mm
> against the git-x86/mm branch from the x86-git tree. If there are other
> patch trees I need to pull in and patch against to allow for kgdb to be
> included into -mm please let me know.

The x86 merge worked OK.


Here's what it looks like:

patching file Documentation/DocBook/Makefile
Hunk #1 FAILED at 11.
1 out of 1 hunk FAILED -- saving rejects to file Documentation/DocBook/Makefile.rej
patching file Documentation/DocBook/kgdb.tmpl
patching file Documentation/kernel-parameters.txt
Hunk #1 succeeded at 816 (offset 7 lines).
patching file MAINTAINERS
Hunk #1 succeeded at 2279 (offset 52 lines).
patching file Makefile
patching file arch/arm/kernel/Makefile
patching file arch/arm/kernel/kgdb-jmp.S
patching file arch/arm/kernel/kgdb.c
patching file arch/arm/kernel/setup.c
patching file arch/arm/kernel/traps.c
patching file arch/arm/mach-ixp2000/core.c
patching file arch/arm/mach-ixp2000/ixdp2x01.c
patching file arch/arm/mach-ixp4xx/coyote-setup.c
patching file arch/arm/mach-ixp4xx/ixdp425-setup.c
patching file arch/arm/mach-omap1/serial.c
patching file arch/arm/mach-omap2/serial.c
patching file arch/arm/mach-pnx4008/core.c
patching file arch/arm/mach-pxa/Makefile
Hunk #1 FAILED at 43.
1 out of 1 hunk FAILED -- saving rejects to file arch/arm/mach-pxa/Makefile.rej
patching file arch/arm/mach-pxa/kgdb-serial.c
patching file arch/arm/mach-versatile/core.c
patching file arch/arm/mm/extable.c
patching file arch/ia64/kernel/Makefile
patching file arch/ia64/kernel/kgdb-jmp.S
patching file arch/ia64/kernel/kgdb.c
patching file arch/ia64/kernel/smp.c
patching file arch/ia64/kernel/traps.c
Hunk #1 FAILED at 155.
1 out of 1 hunk FAILED -- saving rejects to file arch/ia64/kernel/traps.c.rej
patching file arch/ia64/mm/extable.c
patching file arch/ia64/mm/fault.c
patching file arch/mips/Kconfig
Hunk #2 succeeded at 323 (offset -6 lines).
Hunk #4 succeeded at 419 (offset -7 lines).
Hunk #5 succeeded at 531 (offset 21 lines).
Hunk #6 succeeded at 608 (offset -21 lines).
Hunk #7 succeeded at 670 (offset 21 lines).
Hunk #8 succeeded at 914 (offset -24 lines).
patching file arch/mips/Kconfig.debug
patching file arch/mips/au1000/common/Makefile
patching file arch/mips/au1000/common/dbg_io.c
patching file arch/mips/basler/excite/Makefile
patching file arch/mips/basler/excite/excite_dbg_io.c
patching file arch/mips/basler/excite/excite_irq.c
patching file arch/mips/basler/excite/excite_setup.c
patching file arch/mips/jmr3927/rbhma3100/Makefile
patching file arch/mips/jmr3927/rbhma3100/kgdb_io.c
patching file arch/mips/kernel/Makefile
patching file arch/mips/kernel/gdb-low.S
patching file arch/mips/kernel/gdb-stub.c
patching file arch/mips/kernel/irq.c
patching file arch/mips/kernel/kgdb-jmp.c
patching file arch/mips/kernel/kgdb-setjmp.S
patching file arch/mips/kernel/kgdb.c
patching file arch/mips/kernel/kgdb_handler.S
patching file arch/mips/kernel/traps.c
patching file arch/mips/mips-boards/atlas/Makefile
patching file arch/mips/mips-boards/atlas/atlas_gdb.c
patching file arch/mips/mips-boards/atlas/atlas_setup.c
patching file arch/mips/mips-boards/generic/Makefile
patching file arch/mips/mips-boards/generic/gdb_hook.c
patching file arch/mips/mips-boards/generic/init.c
patching file arch/mips/mips-boards/malta/malta_setup.c
patching file arch/mips/mm/extable.c
patching file arch/mips/pci/fixup-atlas.c
patching file arch/mips/philips/pnx8550/common/Makefile
patching file arch/mips/philips/pnx8550/common/gdb_hook.c
patching file arch/mips/philips/pnx8550/common/setup.c
patching file arch/mips/pmc-sierra/yosemite/Makefile
patching file arch/mips/pmc-sierra/yosemite/dbg_io.c
patching file arch/mips/pmc-sierra/yosemite/irq.c
patching file arch/mips/sgi-ip22/ip22-setup.c
patching file arch/mips/sgi-ip27/Makefile
patching file arch/mips/sgi-ip27/ip27-dbgio.c
patching file arch/mips/sibyte/bcm1480/irq.c
patching file arch/mips/sibyte/cfe/setup.c
Hunk #3 succeeded at 298 (offset -3 lines).
patching file arch/mips/sibyte/sb1250/irq.c
patching file arch/mips/sibyte/sb1250/kgdb_sibyte.c
patching file arch/mips/sibyte/swarm/Makefile
patching file arch/mips/sibyte/swarm/dbg_io.c
patching file arch/mips/tx4927/common/Makefile
Hunk #1 succeeded at 9 with fuzz 1.
patching 

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Mariusz Kozlowski <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 20:47:55 +0100

> [  145.128915] TSTATE: 004411009603 TPC: 005119ac TNPC: 005119b0 Y: Not tainted
> [  145.128940] TPC: kpagecount_read+0x94/0xe0

My suspicion at this point is that with certain RAM layouts, simply
iterating over PFN's is simply not working out.

pfn_to_page() seems to be doing no range checking, and with sparsemem
vmemmap, which sparc64 always uses, this can be problematic.

It just blindly goes vmemmap + pfn which is asking for trouble, in
particular when the physical RAM layout really is sparse.

Maybe it's enough to add a pfn_valid() check here?  If pfn_valid()
means there is a vmemmap translation setup for that page struct too,
it would work.
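
The idea above (validate each PFN before turning it into a struct page pointer, and report zero for holes) can be sketched as a small userspace model. Nothing here is kernel code: the model_* names, the two-bank layout with a hole at PFNs 4 to 11, and the page_model type are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct page_model {
	uint64_t count;
};

#define MODEL_MAX_PFN 16UL

/* Assumed sparse layout: PFNs 0-3 and 12-15 are backed, 4-11 are a hole. */
static struct page_model model_bank0[4] = { {1}, {2}, {3}, {4} };
static struct page_model model_bank1[4] = { {5}, {6}, {7}, {8} };

/* Models pfn_valid(): is there a page structure behind this PFN? */
int model_pfn_valid(unsigned long pfn)
{
	return pfn < 4 || (pfn >= 12 && pfn < MODEL_MAX_PFN);
}

/*
 * Models pfn_to_page(): no range checking at all.
 * Must only be called for a PFN that model_pfn_valid() accepted.
 */
struct page_model *model_pfn_to_page(unsigned long pfn)
{
	return pfn < 4 ? &model_bank0[pfn] : &model_bank1[pfn - 12];
}

/*
 * Copy up to n page counts starting at pfn into out[], writing 0 for
 * PFNs that fall into a hole. Returns the number of entries written.
 */
size_t model_kpagecount_read(unsigned long pfn, size_t n, uint64_t *out)
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < n && pfn < MODEL_MAX_PFN; i++, pfn++) {
		struct page_model *ppage = NULL;

		if (model_pfn_valid(pfn))
			ppage = model_pfn_to_page(pfn);
		out[i] = ppage ? ppage->count : 0;
	}
	return i;
}
```

Reading all 16 PFNs in this model yields the stored counts for the two banks and zeros for the hole, instead of dereferencing whatever happens to sit behind an unbacked PFN. Whether pfn_valid() gives the same guarantee for the vmemmap on sparc64 is exactly the open question raised above.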


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread Matt Mackall
On Thu, Dec 20, 2007 at 04:17:26PM -0800, David Miller wrote:
> From: Mariusz Kozlowski <[EMAIL PROTECTED]>
> Date: Thu, 20 Dec 2007 20:47:55 +0100
>
> > [  145.128915] TSTATE: 004411009603 TPC: 005119ac TNPC: 005119b0 Y: Not tainted
> > [  145.128940] TPC: kpagecount_read+0x94/0xe0
>
> My suspicion at this point is that with certain RAM layouts, simply
> iterating over PFN's is simply not working out.

That was my original suspicion, which is why I asked Mariusz to
effectively comment out the actual PFN lookup up-thread. I didn't send
him a patch to do that, so I guess my instructions on how to hack it
may have been misunderstood.

> pfn_to_page() seems to be doing no range checking, and with sparsemem
> vmemmap, which sparc64 always uses, this can be problematic.
>
> It just blindly goes vmemmap + pfn which is asking for trouble, in
> particular when the physical RAM layout really is sparse.
>
> Maybe it's enough to add a pfn_valid() check here?  If pfn_valid()
> means there is a vmemmap translation setup for that page struct too,
> it would work.

Here's a test patch:

Index: mm/fs/proc/proc_misc.c
===
--- mm.orig/fs/proc/proc_misc.c	2007-12-20 19:04:35.0 -0600
+++ mm/fs/proc/proc_misc.c	2007-12-20 19:06:01.0 -0600
@@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
 		return -EIO;
 
 	while (count > 0) {
-		ppage = pfn_to_page(pfn++);
+		ppage = 0;
+		if (pfn_valid(pfn))
+			ppage = pfn_to_page(pfn);
+		pfn++;
 		if (!ppage)
 			pcount = 0;
 		else
@@ -773,7 +776,10 @@ static ssize_t kpageflags_read(struct fi
 		return -EIO;
 
 	while (count > 0) {
-		ppage = pfn_to_page(pfn++);
+		ppage = 0;
+		if (pfn_valid(pfn))
+			ppage = pfn_to_page(pfn);
+		pfn++;
 		if (!ppage)
 			kflags = 0;
 		else


-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-20 Thread David Miller
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 19:06:55 -0600

> @@ -707,7 +707,10 @@ static ssize_t kpagecount_read(struct fi
>  		return -EIO;
>  
>  	while (count > 0) {
> -		ppage = pfn_to_page(pfn++);
> +		ppage = 0;
> +		if (pfn_valid(pfn))
> +			ppage = pfn_to_page(pfn);
> +		pfn++;
>  		if (!ppage)
>  			pcount = 0;
>  		else

Yes that should work, please use NULL in the final
version of the patch instead of 0 so that sparse is
happy.

Thanks.


Re: 2.6.24-rc5-mm1

2007-12-19 Thread Dave Young
On Dec 20, 2007 11:34 AM, Alan Stern <[EMAIL PROTECTED]> wrote:
> Note carefully.  This:
>
> > > > 2. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > > > disk it reports the part 2 and mount the partition as rw
>
> contradicts this:
>
> > > > 3. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > > > disk it just mount the partition as ro with nothing more messages.

Oh, sorry, that's a typo. It should be 2.6.24-rc5-mm1.

>
> So which is correct?
>
> > Hi, Alan
> >
> > I'm sure about my post.
>
> But your post contradicts itself.  It can't be correct.
>
> > I'm not so familiar with usb.
> > It looks weird.  Seems that my device will be firstly recognized as a
> > mp3 player and then a usb storage, so the system will report part 1 &
> > part 2 under previous  kernels.
>
> I think those "part 2" messages aren't caused by the kernel at all, but
> instead by some program running on your computer.  You could try
> booting into single-user mode and see if the behavior changes.

I have no doubt about it. Under OS X, plugging in this device pops up a
dialog (I don't remember the content); after I press OK the disk icon goes
away, and then the disk is remounted.

>
> Also there's no question -- the device does behave strangely.  It
> shouldn't change the write-protect setting all by itself.

Yes, I think so too.
>
> Alan Stern
>
>


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Miles Lane
On Dec 19, 2007 8:31 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
>
> On Thursday, 20 of December 2007, Miles Lane wrote:
> > On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> >
> > > On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > > > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> > > >
> > > > > > We could reexport drain_local_pages() again but then I do not understand
> > > > > > why we would only drain the pages of this processor and not of all other
> > > > > > processors as well. It seems that software suspend intend was to flush
> > > > > > them all right?
> > > > >
> > > > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > > > disabled.
> > > >
> > > > Ok so the others are flush. Here is a patch to re-export
> > > > drain_local_pages() again and use it for software suspend:
> > > >
> > > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > > >
> > > > ---
> > > >  include/linux/gfp.h |1 +
> > > >  kernel/power/snapshot.c |2 +-
> > > >  mm/page_alloc.c |2 +-
> > > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > > > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > > > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> > > >
> > > >   printk(KERN_INFO "PM: Creating hibernation image: \n");
> > > >
> > > > - drain_all_pages();
> > > > + drain_local_pages(NULL);
> > > >   nr_pages = count_data_pages();
> > > >   nr_highmem = count_highmem_pages();
> > > >   printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages +
> > > nr_highmem);
> > >
> > > You've omitted the second instance, right before the copy_data_pages()
> > > call.
> > >
> >
> > I guess I will wait for a revised patch.
>
> Andrew has a fix on top of this one in -mm:
> http://marc.info/?l=linux-mm-commits&m=119810866812965&w=2
>
>
>
> > > > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > > > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > > > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > > >  /*
> > > >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > > >   */
> > > > -static void drain_local_pages(void *arg)
> > > > +void drain_local_pages(void *arg)
> > > >  {
> > > >   drain_pages(smp_processor_id());
> > > >  }
> > > > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > > > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > > > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > > >  void page_alloc_init(void);
> > > >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > > >  void drain_all_pages(void);
> > > > +void drain_local_pages(void *dummy);
> > > >
> > > >  #endif /* __LINUX_GFP_H */
> > >
>

I applied Christoph and Andrew's patches and recompiled.  I suspended
to disk and to ram several times and all looks good.

   Miles


Re: 2.6.24-rc5-mm1

2007-12-19 Thread Alan Stern
Note carefully.  This:

> > > 2. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> > > the disk it reports part 2 and mounts the partition as rw

contradicts this:

> > > 3. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> > > the disk it just mounts the partition as ro with no further messages.

So which is correct?

> Hi, Alan
> 
> I'm sure about my post.

But your post contradicts itself.  It can't be correct.

> I'm not so familiar with USB.
> It looks weird.  It seems that my device is first recognized as an
> mp3 player and then as USB storage, so the system reports part 1 &
> part 2 under previous kernels.

I think those "part 2" messages aren't caused by the kernel at all, but
instead by some program running on your computer.  You could try
booting into single-user mode and see if the behavior changes.

Also there's no question -- the device does behave strangely.  It
shouldn't change the write-protect setting all by itself.

Alan Stern



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Thursday, 20 of December 2007, Miles Lane wrote:
> On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> 
> > On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> > >
> > > > > We could reexport drain_local_pages() again but then I do not understand
> > > > > why we would only drain the pages of this processor and not of all other
> > > > > processors as well. It seems that software suspend intend was to flush
> > > > > them all right?
> > > >
> > > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > > disabled.
> > >
> > > Ok so the others are flush. Here is a patch to re-export
> > > drain_local_pages() again and use it for software suspend:
> > >
> > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > >
> > > ---
> > >  include/linux/gfp.h |1 +
> > >  kernel/power/snapshot.c |2 +-
> > >  mm/page_alloc.c |2 +-
> > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> > >
> > >   printk(KERN_INFO "PM: Creating hibernation image: \n");
> > >
> > > - drain_all_pages();
> > > + drain_local_pages(NULL);
> > >   nr_pages = count_data_pages();
> > >   nr_highmem = count_highmem_pages();
> > >   printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages +
> > nr_highmem);
> >
> > You've omitted the second instance, right before the copy_data_pages()
> > call.
> >
> 
> I guess I will wait for a revised patch.

Andrew has a fix on top of this one in -mm:
http://marc.info/?l=linux-mm-commits&m=119810866812965&w=2


> > > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > >  /*
> > >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > >   */
> > > -static void drain_local_pages(void *arg)
> > > +void drain_local_pages(void *arg)
> > >  {
> > >   drain_pages(smp_processor_id());
> > >  }
> > > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > >  void page_alloc_init(void);
> > >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > >  void drain_all_pages(void);
> > > +void drain_local_pages(void *dummy);
> > >
> > >  #endif /* __LINUX_GFP_H */
> >


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Miles Lane
On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
>
> On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> >
> > > > We could reexport drain_local_pages() again but then I do not understand
> > > > why we would only drain the pages of this processor and not of all other
> > > > processors as well. It seems that software suspend intend was to flush
> > > > them all right?
> > >
> > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > disabled.
> >
> > Ok so the others are flush. Here is a patch to re-export
> > drain_local_pages() again and use it for software suspend:
> >
> > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> >
> > ---
> >  include/linux/gfp.h |1 +
> >  kernel/power/snapshot.c |2 +-
> >  mm/page_alloc.c |2 +-
> >  3 files changed, 3 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > ===
> > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> >
> >   printk(KERN_INFO "PM: Creating hibernation image: \n");
> >
> > - drain_all_pages();
> > + drain_local_pages(NULL);
> >   nr_pages = count_data_pages();
> >   nr_highmem = count_highmem_pages();
> >   printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + 
> > nr_highmem);
>
> You've omitted the second instance, right before the copy_data_pages() call.

I will wait for a revised patch and then test.
(Sorry for the duplicate message.  I am resending because I
accidentally sent an HTML message the first time.  Whoops.)

> > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > ===
> > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> >  /*
> >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> >   */
> > -static void drain_local_pages(void *arg)
> > +void drain_local_pages(void *arg)
> >  {
> >   drain_pages(smp_processor_id());
> >  }
> > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > ===
> > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> >  void page_alloc_init(void);
> >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> >  void drain_all_pages(void);
> > +void drain_local_pages(void *dummy);
> >
> >  #endif /* __LINUX_GFP_H */
>


Re: 2.6.24-rc5-mm1

2007-12-19 Thread Dave Young
On Dec 20, 2007 12:07 AM, Alan Stern <[EMAIL PROTECTED]> wrote:
>
> On Wed, 19 Dec 2007, Dave Young wrote:
>
> > I tested on another machine with kernel 2.6.24-rc2, and the result is
> > different again.
> > Here is the result:
> >
> > 1. on 2.6.24-rc2, when I plug in the player the kernel reports the
> > messages below:
> >
> > usb-storage: waiting for device to settle before scanning
> > /*[lets mark the below part as part 1]*/
> > scsi 0:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is on
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is on
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> >  sda: sda1
> > /*[lets mark the below part as part 2]*/
> > sd 0:0:0:0: [sda] Attached SCSI removable disk
> > usb-storage: device scan complete
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> >  sda: sda1
>
> This is not normal.  When you plug in a storage device you should get
> all of the messages in your part 1 plus the first two lines in your
> part 2, but not the rest of part 2.
>
> > 2. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> > the disk it reports part 2 and mounts the partition as rw
> >
> > 3. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> > the disk it just mounts the partition as ro with no further messages.
>
> You must have a typo there.  Those can't both be true for 2.6.24-rc5.
> In fact you shouldn't see part 2 at all.
>
> Here's what I get when I plug in a USB mass-storage device under
> 2.6.24-rc5:
>
> [   87.903014] usb-storage: device found at 2
> [   87.909570] scsi 0:0:0:0: Direct-Access  Memorex TD 2B1.09 
> PQ: 0 ANSI: 0 CCS
> [   87.913144] usb-storage: device scan complete
> [   88.804031] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
> [   88.805507] sd 0:0:0:0: [sda] Write Protect is off
> [   88.805577] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> [   88.805639] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [   88.809526] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
> [   88.810421] sd 0:0:0:0: [sda] Write Protect is off
> [   88.810488] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> [   88.810575] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [   88.810641]  sda: sda1
> [   88.812450] sd 0:0:0:0: [sda] Attached SCSI removable disk
> [   89.041014] sd 0:0:0:0: Attached scsi generic sg0 type 0
>
> Mounting the disk produces no extra output at all.  I get the same
> result under 2.6.23 and earlier operating systems.  You should see
> approximately the same thing.

Hi, Alan

I'm sure about my post. I'm not so familiar with USB.
It looks weird.  It seems that my device is first recognized as an
mp3 player and then as USB storage, so the system reports part 1 &
part 2 under previous kernels.

Regards
dave


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Thursday, 20 of December 2007, Christoph Lameter wrote:
> On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> 
> > > We could reexport drain_local_pages() again but then I do not understand 
> > > why we would only drain the pages of this processor and not of all other
> > > processors as well. It seems that software suspend intend was to flush 
> > > them all right?
> > 
> > Well, not exactly.  We are on one CPU at this point, the others have been
> > disabled.
> 
> Ok so the others are flush. Here is a patch to re-export 
> drain_local_pages() again and use it for software suspend:
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> 
> ---
>  include/linux/gfp.h |1 +
>  kernel/power/snapshot.c |2 +-
>  mm/page_alloc.c |2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> ===
> --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
>  
>   printk(KERN_INFO "PM: Creating hibernation image: \n");
>  
> - drain_all_pages();
> + drain_local_pages(NULL);
>   nr_pages = count_data_pages();
>   nr_highmem = count_highmem_pages();
>   printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);

You've omitted the second instance, right before the copy_data_pages() call.

> Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> ===
> --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
>  /*
>   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
>   */
> -static void drain_local_pages(void *arg)
> +void drain_local_pages(void *arg)
>  {
>   drain_pages(smp_processor_id());
>  }
> Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> ===
> --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
>  void page_alloc_init(void);
>  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
>  void drain_all_pages(void);
> +void drain_local_pages(void *dummy);
>  
>  #endif /* __LINUX_GFP_H */


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Christoph Lameter
On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:

> > We could reexport drain_local_pages() again but then I do not understand 
> > why we would only drain the pages of this processor and not of all other
> > processors as well. It seems that software suspend intend was to flush 
> > them all right?
> 
> Well, not exactly.  We are on one CPU at this point, the others have been
> disabled.

Ok, so the others are flushed. Here is a patch to re-export
drain_local_pages() and use it for software suspend:

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 include/linux/gfp.h |1 +
 kernel/power/snapshot.c |2 +-
 mm/page_alloc.c |2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
===
--- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c	2007-12-19 11:59:25.233961700 -0800
+++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c	2007-12-19 15:16:34.179661929 -0800
@@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
 
 	printk(KERN_INFO "PM: Creating hibernation image: \n");
 
-	drain_all_pages();
+	drain_local_pages(NULL);
 	nr_pages = count_data_pages();
 	nr_highmem = count_highmem_pages();
 	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
===
--- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c	2007-12-19 12:01:00.630421258 -0800
+++ linux-2.6.24-rc5-mm1/mm/page_alloc.c	2007-12-19 15:12:19.850545818 -0800
@@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
 /*
  * Spill all of this CPU's per-cpu pages back into the buddy allocator.
  */
-static void drain_local_pages(void *arg)
+void drain_local_pages(void *arg)
 {
 	drain_pages(smp_processor_id());
 }
Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
===
--- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h	2007-12-19 15:13:51.926950065 -0800
+++ linux-2.6.24-rc5-mm1/include/linux/gfp.h	2007-12-19 15:16:11.951564369 -0800
@@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
 void page_alloc_init(void);
 void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
 void drain_all_pages(void);
+void drain_local_pages(void *dummy);
 
 #endif /* __LINUX_GFP_H */
 


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Wednesday, 19 of December 2007, Christoph Lameter wrote:
> On Wed, 19 Dec 2007, Daniel Walker wrote:
> 
> > > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > > interrupts so the rest of the resume process has interrupt enable
> > > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > > warning due to the above..
> > > 
> > > Not sure if this has been found already, or not?
> 
> Hmmm... It will unconditionally enable interrupts regardless of how we call
> this. We could explicitly save and restore interrupts in
> swsusp_save() I guess. Why is swsusp_save() disabling interrupts?

Actually, it's called with interrupts disabled, because its job is to create
the hibernation image.  At this point everything is off except for the CPU
running swsusp_save().
 
> > > Should drain_all_pages() really be drain_local_pages() ?
> > 
> > It looks like it was drain_local_pages, but the following patch
> > 
> > page-allocator-clean-up-pcp-draining-functions.patch
> > 
> > Changes that in -mm .. I added Christoph Lameter to the CC since it's
> > his patch ..
> 
> We could reexport drain_local_pages() again but then I do not understand 
> why we would only drain the pages of this processor and not of all other
> processors as well. It seems that software suspend intend was to flush 
> them all right?

Well, not exactly.  We are on one CPU at this point, the others have been
disabled.


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Christoph Lameter
On Wed, 19 Dec 2007, Daniel Walker wrote:

> > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > interrupts so the rest of the resume process has interrupt enable
> > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > warning due to the above..
> > 
> > Not sure if this has been found already, or not?

Hmmm... It will unconditionally enable interrupts regardless of how we call
this. We could explicitly save and restore interrupts in
swsusp_save() I guess. Why is swsusp_save() disabling interrupts?

> > Should drain_all_pages() really be drain_local_pages() ?
> 
> It looks like it was drain_local_pages, but the following patch
> 
> page-allocator-clean-up-pcp-draining-functions.patch
> 
> Changes that in -mm .. I added Christoph Lameter to the CC since it's
> his patch ..

We could reexport drain_local_pages() again but then I do not understand
why we would only drain the pages of this processor and not of all other
processors as well. It seems that the intent of software suspend was to
flush them all, right?




Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Daniel Walker
On Wed, 2007-12-19 at 10:42 -0800, Daniel Walker wrote:
> On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> > [   11.827653] PM: Creating hibernation image:
> > [   11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
> > [   11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
> > [   11.827665]  [] show_trace_log_lvl+0x12/0x25
> > [   11.827673]  [] show_trace+0xd/0x10
> > [   11.827677]  [] dump_stack+0x57/0x5f
> > [   11.827681]  [] native_smp_call_function_mask+0x41/0x126
> > [   11.827686]  [] smp_call_function+0x18/0x1f
> > [   11.827690]  [] on_each_cpu+0x12/0x40
> > [   11.827695]  [] drain_all_pages+0x13/0x16
> > [   11.827700]  [] swsusp_save+0x18/0x46b
> > [   11.827705]  [] swsusp_arch_suspend+0x2a/0x2c
> > [   11.827710]  [] hibernate+0xba/0x16e
> > [   11.827714]  [] state_store+0x45/0xac
> > [   11.827717]  [] kobj_attr_store+0x1a/0x22
> > [   11.827722]  [] sysfs_write_file+0xb8/0xe3
> > [   11.827726]  [] vfs_write+0xa4/0x120
> > [   11.827731]  [] sys_write+0x3b/0x60
> > [   11.827734]  [] sysenter_past_esp+0x6b/0xc1
> > [   11.827738]  ===
> ...
> > [   15.624993] =
> > [   15.624995] [ INFO: inconsistent lock state ]
> > [   15.624998] 2.6.24-rc5-mm1 #8
> > [   15.624999] -
> > [   15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> 
> It looks like the swsusp_save() calls drain_all_pages() , which calls
> on_each_cpu() .. On return on_each_cpu() unconditionally enables
> interrupts so the rest of the resume process has interrupt enable
> (which , it looks like, shouldn't happen) and then you get the lockdep()
> warning due to the above..
> 
> Not sure if this has been found already, or not?
> 
> Should drain_all_pages() really be drain_local_pages() ?

It looks like it was drain_local_pages, but the following patch

page-allocator-clean-up-pcp-draining-functions.patch

Changes that in -mm .. I added Christoph Lameter to the CC since it's
his patch ..

Daniel



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Daniel Walker
On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> [   11.827653] PM: Creating hibernation image:
> [   11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
> [   11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
> [   11.827665]  [] show_trace_log_lvl+0x12/0x25
> [   11.827673]  [] show_trace+0xd/0x10
> [   11.827677]  [] dump_stack+0x57/0x5f
> [   11.827681]  [] native_smp_call_function_mask+0x41/0x126
> [   11.827686]  [] smp_call_function+0x18/0x1f
> [   11.827690]  [] on_each_cpu+0x12/0x40
> [   11.827695]  [] drain_all_pages+0x13/0x16
> [   11.827700]  [] swsusp_save+0x18/0x46b
> [   11.827705]  [] swsusp_arch_suspend+0x2a/0x2c
> [   11.827710]  [] hibernate+0xba/0x16e
> [   11.827714]  [] state_store+0x45/0xac
> [   11.827717]  [] kobj_attr_store+0x1a/0x22
> [   11.827722]  [] sysfs_write_file+0xb8/0xe3
> [   11.827726]  [] vfs_write+0xa4/0x120
> [   11.827731]  [] sys_write+0x3b/0x60
> [   11.827734]  [] sysenter_past_esp+0x6b/0xc1
> [   11.827738]  ===
...
> [   15.624993] =
> [   15.624995] [ INFO: inconsistent lock state ]
> [   15.624998] 2.6.24-rc5-mm1 #8
> [   15.624999] -
> [   15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.

It looks like swsusp_save() calls drain_all_pages(), which calls
on_each_cpu(). On return, on_each_cpu() unconditionally enables
interrupts, so the rest of the resume process runs with interrupts enabled
(which, it looks like, shouldn't happen), and then you get the lockdep
warning above.

Not sure if this has been found already, or not?

Should drain_all_pages() really be drain_local_pages() ?

Daniel
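
A toy user-space model of the behaviour described in this message may
help. It assumes, as the analysis here says, that on_each_cpu() runs the
callback locally between a plain disable/enable pair. The irqs_enabled
flag, on_each_cpu_model() and call_with_irqs_saved() are hypothetical
names for illustration and are not kernel code; the model also ignores
the cross-CPU call entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one flag stands in for this CPU's interrupt-enable state. */
static bool irqs_enabled = true;

static void local_irq_disable(void) { irqs_enabled = false; }
static void local_irq_enable(void)  { irqs_enabled = true; }

/* Stand-in for the real drain; it only counts how often it was called. */
static int drained;
static void drain_local_pages(void *arg)
{
	(void)arg;
	drained++;
}

/* Disable/enable pair: the caller's previous interrupt state is lost. */
static void on_each_cpu_model(void (*func)(void *), void *info)
{
	assert(func != NULL);
	local_irq_disable();
	func(info);
	local_irq_enable();	/* enables even if the caller had interrupts off */
}

/* Save/restore variant: the caller's interrupt state survives the call. */
static void call_with_irqs_saved(void (*func)(void *), void *info)
{
	bool flags = irqs_enabled;

	local_irq_disable();
	func(info);
	irqs_enabled = flags;
}
```

In this model, entering on_each_cpu_model() with interrupts off returns
with them on, while a direct drain_local_pages(NULL) call or the
save/restore variant leaves the caller's state alone.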



Re: 2.6.24-rc5-mm1

2007-12-19 Thread Alan Stern
On Wed, 19 Dec 2007, Dave Young wrote:

> I tested on another machine with kernel 2.6.24-rc2, and the result is
> different again.
> Here is the result:
> 
> 1. on 2.6.24-rc2, when I plug in the player the kernel reports the messages below:
> 
> usb-storage: waiting for device to settle before scanning
> /*[lets mark the below part as part 1]*/
> scsi 0:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> sd 0:0:0:0: [sda] Write Protect is on
> sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> sd 0:0:0:0: [sda] Assuming drive cache: write through
> sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> sd 0:0:0:0: [sda] Write Protect is on
> sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> sd 0:0:0:0: [sda] Assuming drive cache: write through
>  sda: sda1
> /*[lets mark the below part as part 2]*/
> sd 0:0:0:0: [sda] Attached SCSI removable disk
> usb-storage: device scan complete
> sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> sd 0:0:0:0: [sda] Assuming drive cache: write through
> sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> sd 0:0:0:0: [sda] Assuming drive cache: write through
>  sda: sda1

This is not normal.  When you plug in a storage device you should get 
all of the messages in your part 1 plus the first two lines in your 
part 2, but not the rest of part 2.
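
As a side note, the difference between part 1 and part 2 of the quoted
log shows up in the Mode Sense bytes: 03 00 80 00 versus 03 00 00 00.
In the 4-byte MODE SENSE(6) header the third byte is the device-specific
parameter, and for direct-access devices bit 7 of that byte is the
write-protect flag. A small sketch of that check follows; the function
name is made up for illustration and this is not the sd driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MODE SENSE(6) header layout: mode data length, medium type,
 * device-specific parameter, block descriptor length.
 * Bit 7 of the device-specific parameter is WP for direct-access devices.
 */
static bool mode_sense_write_protected(const uint8_t hdr[4])
{
	assert(hdr != NULL);
	return (hdr[2] & 0x80) != 0;
}
```

Fed the two headers from the log, this reports write protect on for
03 00 80 00 and off for 03 00 00 00, matching the "Write Protect is
on/off" lines printed next to them.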

> 2. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> the disk it reports part 2 and mounts the partition as rw
> 
> 3. on 2.6.24-rc5 the kernel reports only part 1; after I try to mount
> the disk it just mounts the partition as ro with no further messages.

You must have a typo there.  Those can't both be true for 2.6.24-rc5.  
In fact you shouldn't see part 2 at all.

Here's what I get when I plug in a USB mass-storage device under 
2.6.24-rc5:

[   87.903014] usb-storage: device found at 2
[   87.909570] scsi 0:0:0:0: Direct-Access  Memorex TD 2B1.09 PQ: 0 ANSI: 0 CCS
[   87.913144] usb-storage: device scan complete
[   88.804031] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
[   88.805507] sd 0:0:0:0: [sda] Write Protect is off
[   88.805577] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[   88.805639] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   88.809526] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
[   88.810421] sd 0:0:0:0: [sda] Write Protect is off
[   88.810488] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[   88.810575] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   88.810641]  sda: sda1
[   88.812450] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   89.041014] sd 0:0:0:0: Attached scsi generic sg0 type 0

Mounting the disk produces no extra output at all.  I get the same 
result under 2.6.23 and earlier operating systems.  You should see 
approximately the same thing.

Alan Stern



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Miles Lane
I discovered that I can use IMAP with GMail now, so I can send messages 
using Thunderbird and avoid the line wrapping problem.


I tried doing a series:  suspend-to-disk, suspend-to-ram and suspend-to-disk
Here is the result:

[   11.827653] PM: Creating hibernation image:
[   11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
[   11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
[   11.827665]  [c0107d55] show_trace_log_lvl+0x12/0x25
[   11.827673]  [c010848a] show_trace+0xd/0x10
[   11.827677]  [c0108763] dump_stack+0x57/0x5f
[   11.827681]  [c0117db4] native_smp_call_function_mask+0x41/0x126
[   11.827686]  [c01192d9] smp_call_function+0x18/0x1f
[   11.827690]  [c012c624] on_each_cpu+0x12/0x40
[   11.827695]  [c0166ece] drain_all_pages+0x13/0x16
[   11.827700]  [c014f7b3] swsusp_save+0x18/0x46b
[   11.827705]  [c03103fa] swsusp_arch_suspend+0x2a/0x2c
[   11.827710]  [c014e7d8] hibernate+0xba/0x16e
[   11.827714]  [c014d56b] state_store+0x45/0xac
[   11.827717]  [c01ffe95] kobj_attr_store+0x1a/0x22
[   11.827722]  [c01b92c7] sysfs_write_file+0xb8/0xe3
[   11.827726]  [c01837eb] vfs_write+0xa4/0x120
[   11.827731]  [c0183d5e] sys_write+0x3b/0x60
[   11.827734]  [c0106bae] sysenter_past_esp+0x6b/0xc1
[   11.827738]  ===
[   11.920363] PM: Need to copy 124108 pages
[   11.920368] PM: Normal pages needed: 46468 + 1024 + 40, available pages: 182806
[   15.623893] PM: Hibernation image created (124108 pages copied)
[   15.624618] Intel machine check architecture supported.
[   15.624625] Intel machine check reporting enabled on CPU#0.
[   15.624992]
[   15.624993] =
[   15.624995] [ INFO: inconsistent lock state ]
[   15.624998] 2.6.24-rc5-mm1 #8
[   15.624999] -
[   15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
[   15.625005] pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1] takes:
[   15.625007]  (&cpu_base->lock_key){++..}, at: [c013c453] retrigger_next_event+0x63/0x9f
[   15.625017] {in-hardirq-W} state was registered at:
[   15.625019]   [c0145432] __lock_acquire+0x408/0xbf4
[   15.625025]   [c0145c94] lock_acquire+0x76/0x9d
[   15.625029]   [c039aa08] _spin_lock+0x19/0x28
[   15.625035]   [c013cd92] hrtimer_interrupt+0x72/0x1b0
[   15.625039]   [c011a2b7] smp_apic_timer_interrupt+0x69/0x7c
[   15.625045]   [c010] apic_timer_interrupt+0x33/0x38
[   15.625050]   [c01054b5] mwait_idle+0x1b/0x1d
[   15.625054]   [c01055e9] cpu_idle+0xb3/0xd4
[   15.625058]   [c03986c5] rest_init+0x49/0x4b
[   15.625062]   [c04f696d] start_kernel+0x357/0x35f
[   15.625069]   [] 0x0
[   15.625082]   [] 0x
[   15.625087] irq event stamp: 1182359
[   15.625089] hardirqs last  enabled at (1182359): [c0106cb3] restore_nocheck+0x12/0x15
[   15.625094] hardirqs last disabled at (1182358): [c010776d] apic_timer_interrupt+0x29/0x38
[   15.625098] softirqs last  enabled at (933018): [c0137d89] __rcu_offline_cpu+0x32/0x62
[   15.625104] softirqs last disabled at (933016): [c039aa22] _spin_lock_bh+0xb/0x2d
[   15.625109]
[   15.625110] other info that might help us debug this:
[   15.625112] 2 locks held by pm-hibernate/9940:
[   15.625114]  #0:  (&buffer->mutex){--..}, at: [c01b9234] sysfs_write_file+0x25/0xe3
[   15.625121]  #1:  (pm_mutex){--..}, at: [c014e72e] hibernate+0x10/0x16e
[   15.625127]
[   15.625128] stack backtrace:
[   15.625131] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
[   15.625133]  [c0107d55] show_trace_log_lvl+0x12/0x25
[   15.625138]  [c010848a] show_trace+0xd/0x10
[   15.625141]  [c0108763] dump_stack+0x57/0x5f
[   15.625144]  [c0143e45] print_usage_bug+0x10a/0x117
[   15.625148]  [c01447de] mark_lock+0x1e7/0x3fe
[   15.625152]  [c014549f] __lock_acquire+0x475/0xbf4
[   15.625156]  [c0145c94] lock_acquire+0x76/0x9d
[   15.625159]  [c039aa08] _spin_lock+0x19/0x28
[   15.625163]  [c013c453] retrigger_next_event+0x63/0x9f
[   15.625167]  [c013caf7] hres_timers_resume+0x4d/0x4f
[   15.625170]  [c013eed1] timekeeping_resume+0x117/0x11e
[   15.625175]  [c027b2ba] __sysdev_resume+0x14/0x34
[   15.625179]  [c027b752] sysdev_resume+0x21/0x57
[   15.625183]  [c027f426] device_power_up+0x8/0xf
[   15.625188]  [c014e6e7] hibernation_snapshot+0x13c/0x173
[   15.625192]  [c014e7d8] hibernate+0xba/0x16e
[   15.625195]  [c014d56b] state_store+0x45/0xac
[   15.625199]  [c01ffe95] kobj_attr_store+0x1a/0x22
[   15.625203]  [c01b92c7] sysfs_write_file+0xb8/0xe3
[   15.625207]  [c01837eb] vfs_write+0xa4/0x120
[   15.625211]  [c0183d5e] sys_write+0x3b/0x60
[   15.625214]  [c0106bae] sysenter_past_esp+0x6b/0xc1
[   15.625217]  ===
[   15.625242] agpgart-intel :00:00.0: EARLY resume
...
[   15.624618] Intel machine check architecture supported.
[   15.624625] Intel machine check reporting enabled on CPU#0.
[   15.624992]
[   15.624993] =
[   15.624995] [ INFO: inconsistent lock 

Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Daniel Walker
On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> [   11.827653] PM: Creating hibernation image:
> [   11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
> [   11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
> [   11.827665]  [c0107d55] show_trace_log_lvl+0x12/0x25
> [   11.827673]  [c010848a] show_trace+0xd/0x10
> [   11.827677]  [c0108763] dump_stack+0x57/0x5f
> [   11.827681]  [c0117db4] native_smp_call_function_mask+0x41/0x126
> [   11.827686]  [c01192d9] smp_call_function+0x18/0x1f
> [   11.827690]  [c012c624] on_each_cpu+0x12/0x40
> [   11.827695]  [c0166ece] drain_all_pages+0x13/0x16
> [   11.827700]  [c014f7b3] swsusp_save+0x18/0x46b
> [   11.827705]  [c03103fa] swsusp_arch_suspend+0x2a/0x2c
> [   11.827710]  [c014e7d8] hibernate+0xba/0x16e
> [   11.827714]  [c014d56b] state_store+0x45/0xac
> [   11.827717]  [c01ffe95] kobj_attr_store+0x1a/0x22
> [   11.827722]  [c01b92c7] sysfs_write_file+0xb8/0xe3
> [   11.827726]  [c01837eb] vfs_write+0xa4/0x120
> [   11.827731]  [c0183d5e] sys_write+0x3b/0x60
> [   11.827734]  [c0106bae] sysenter_past_esp+0x6b/0xc1
> [   11.827738]  ===
...
> [   15.624993] =
> [   15.624995] [ INFO: inconsistent lock state ]
> [   15.624998] 2.6.24-rc5-mm1 #8
> [   15.624999] -
> [   15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.

It looks like swsusp_save() calls drain_all_pages(), which calls
on_each_cpu(). On return, on_each_cpu() unconditionally enables
interrupts, so the rest of the resume process runs with interrupts enabled
(which, it looks like, shouldn't happen), and then you get the lockdep
warning shown above.

Not sure if this has been found already, or not?

Should drain_all_pages() really be drain_local_pages() ?
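
For readers unfamiliar with the report format, here is a minimal user-space
sketch of the kind of check behind "inconsistent {in-hardirq-W} ->
{hardirq-on-W} usage". It is not lockdep's implementation; the struct and
function names are hypothetical. The idea it models: a lock that has been
taken from hard-interrupt context must never also be taken with hard
interrupts enabled, because an interrupt arriving while the lock is held
could then spin on it forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of per-lock-class usage tracking. */
struct lock_class_model {
	bool used_in_hardirq;       /* taken from hard-interrupt context */
	bool used_with_hardirqs_on; /* taken while hard interrupts were enabled */
};

/* Record one acquisition; return false once the usage is inconsistent. */
static bool model_acquire(struct lock_class_model *c,
			  bool in_hardirq, bool hardirqs_enabled)
{
	if (in_hardirq)
		c->used_in_hardirq = true;
	else if (hardirqs_enabled)
		c->used_with_hardirqs_on = true;

	return !(c->used_in_hardirq && c->used_with_hardirqs_on);
}

/* Replay the two acquisitions named in the report. */
static bool replay_report(bool resume_path_irqs_enabled)
{
	struct lock_class_model base_lock = { false, false };

	/* hrtimer_interrupt() takes the lock in hardirq context. */
	bool ok = model_acquire(&base_lock, true, false);
	assert(ok);

	/* retrigger_next_event() takes it later on the resume path. */
	return model_acquire(&base_lock, false, resume_path_irqs_enabled);
}
```

In this model, replay_report(false) stays consistent, while
replay_report(true) reports the inconsistency, which matches the situation
described here: the resume path reached the lock with interrupts enabled.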

Daniel



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Daniel Walker
On Wed, 2007-12-19 at 10:42 -0800, Daniel Walker wrote:
> On Wed, 2007-12-19 at 10:06 -0500, Miles Lane wrote:
> > [   11.827653] PM: Creating hibernation image:
> > [   11.827658] WARNING: at arch/x86/kernel/smp_32.c:561 native_smp_call_function_mask()
> > [   11.827661] Pid: 9940, comm: pm-hibernate Not tainted 2.6.24-rc5-mm1 #8
> > [   11.827665]  [c0107d55] show_trace_log_lvl+0x12/0x25
> > [   11.827673]  [c010848a] show_trace+0xd/0x10
> > [   11.827677]  [c0108763] dump_stack+0x57/0x5f
> > [   11.827681]  [c0117db4] native_smp_call_function_mask+0x41/0x126
> > [   11.827686]  [c01192d9] smp_call_function+0x18/0x1f
> > [   11.827690]  [c012c624] on_each_cpu+0x12/0x40
> > [   11.827695]  [c0166ece] drain_all_pages+0x13/0x16
> > [   11.827700]  [c014f7b3] swsusp_save+0x18/0x46b
> > [   11.827705]  [c03103fa] swsusp_arch_suspend+0x2a/0x2c
> > [   11.827710]  [c014e7d8] hibernate+0xba/0x16e
> > [   11.827714]  [c014d56b] state_store+0x45/0xac
> > [   11.827717]  [c01ffe95] kobj_attr_store+0x1a/0x22
> > [   11.827722]  [c01b92c7] sysfs_write_file+0xb8/0xe3
> > [   11.827726]  [c01837eb] vfs_write+0xa4/0x120
> > [   11.827731]  [c0183d5e] sys_write+0x3b/0x60
> > [   11.827734]  [c0106bae] sysenter_past_esp+0x6b/0xc1
> > [   11.827738]  ===
> ...
> > [   15.624993] =
> > [   15.624995] [ INFO: inconsistent lock state ]
> > [   15.624998] 2.6.24-rc5-mm1 #8
> > [   15.624999] -
> > [   15.625001] inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
>
> It looks like the swsusp_save() calls drain_all_pages() , which calls
> on_each_cpu() .. On return on_each_cpu() unconditionally enables
> interrupts so the rest of the resume process has interrupt enable
> (which , it looks like, shouldn't happen) and then you get the lockdep()
> warning due to the above..
>
> Not sure if this has been found already, or not?
>
> Should drain_all_pages() really be drain_local_pages() ?

It looks like it was drain_local_pages(), but the following patch

page-allocator-clean-up-pcp-draining-functions.patch

changes that in -mm. I added Christoph Lameter to the CC since it's
his patch.

Daniel



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Christoph Lameter
On Wed, 19 Dec 2007, Daniel Walker wrote:

> > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > interrupts so the rest of the resume process has interrupt enable
> > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > warning due to the above..
> >
> > Not sure if this has been found already, or not?

Hmmm... It will unconditionally enable interrupts regardless of how we call
this. We could explicitly save and restore interrupts in
swsusp_save(), I guess. Why is swsusp_save() disabling interrupts?
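
To make the difference concrete, here is a small user-space sketch
contrasting the two shapes discussed in this exchange: a helper that
re-enables interrupts unconditionally on exit, and one that saves and
restores the caller's state. It only models the interrupt flag with a
boolean; every name in it is hypothetical and none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the CPU interrupt-enable flag. */
static bool irqs_enabled = true;

static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }

/* Disable interrupts and hand back the previous state. */
static bool model_irq_save(void)
{
	bool old = irqs_enabled;
	irqs_enabled = false;
	return old;
}

static void model_irq_restore(bool old) { irqs_enabled = old; }

/* A callback that expects to run with interrupts off. */
static void expects_irqs_off(void *info)
{
	(void)info;
	assert(!irqs_enabled);
}

/* Shape described in the thread: disable, call, enable unconditionally. */
static void call_unconditional(void (*func)(void *), void *info)
{
	model_irq_disable();
	func(info);
	model_irq_enable();
}

/* Alternative: remember the caller's state and put it back afterwards. */
static void call_save_restore(void (*func)(void *), void *info)
{
	bool flags = model_irq_save();
	func(info);
	model_irq_restore(flags);
}

/* Interrupt state seen by a caller that entered with interrupts off. */
static bool state_after(void (*call)(void (*)(void *), void *))
{
	model_irq_disable();
	call(expects_irqs_off, NULL);
	return irqs_enabled;
}
```

Under this model, state_after(call_unconditional) comes back with
interrupts enabled even though the caller had them off, while
state_after(call_save_restore) leaves them off.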

> > Should drain_all_pages() really be drain_local_pages() ?
>
> It looks like it was drain_local_pages, but the following patch
>
> page-allocator-clean-up-pcp-draining-functions.patch
>
> Changes that in -mm .. I added Christoph Lameter to the CC since it's
> his patch ..

We could re-export drain_local_pages() again, but then I do not understand
why we would drain only the pages of this processor and not those of all
the other processors as well. It seems that the intent of software suspend
was to flush them all, right?




Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Wednesday, 19 of December 2007, Christoph Lameter wrote:
> On Wed, 19 Dec 2007, Daniel Walker wrote:
>
> > > It looks like the swsusp_save() calls drain_all_pages() , which calls
> > > on_each_cpu() .. On return on_each_cpu() unconditionally enables
> > > interrupts so the rest of the resume process has interrupt enable
> > > (which , it looks like, shouldn't happen) and then you get the lockdep()
> > > warning due to the above..
> > >
> > > Not sure if this has been found already, or not?
>
> Hmmm... It will unconditionally enable interrupts regardless how we call
> this. We could explicity save and restore interrrupts in
> swsusp_save() I guess. Why is swsusp_save() disabling interrupts?

Actually, it's called with interrupts disabled, because its job is to create
the hibernation image.  At this point everything is off except for the CPU
running swsusp_save().
 
> > > Should drain_all_pages() really be drain_local_pages() ?
> >
> > It looks like it was drain_local_pages, but the following patch
> >
> > page-allocator-clean-up-pcp-draining-functions.patch
> >
> > Changes that in -mm .. I added Christoph Lameter to the CC since it's
> > his patch ..
>
> We could reexport drain_local_pages() again but then I do not understand
> why we would only drain the pages of this processor and not of all other
> processors as well. It seems that software suspend intend was to flush
> them all right?

Well, not exactly.  We are on one CPU at this point, the others have been
disabled.
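
A toy model may help show why draining only the local CPU can be enough at
this point. The sketch below is user-space C with hypothetical names, and it
rests on one loudly stated assumption: that a CPU's per-CPU page list is
drained when that CPU is taken offline. Under that assumption, once only one
CPU is left running, draining its list empties every per-CPU list without
any cross-CPU call.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4

static int pcp_count[MODEL_NR_CPUS]; /* pages cached on each CPU's list */
static int buddy_free;               /* pages held by the shared allocator */
static int current_cpu;              /* the CPU this code "runs" on */

/* Move one CPU's cached pages back to the shared allocator. */
static void model_drain_pages(int cpu)
{
	buddy_free += pcp_count[cpu];
	pcp_count[cpu] = 0;
}

/* Drain only the CPU we are running on; no cross-CPU call needed. */
static void model_drain_local_pages(void)
{
	model_drain_pages(current_cpu);
}

/* ASSUMPTION: taking a CPU offline drains its per-CPU list. */
static void model_cpu_offline(int cpu)
{
	model_drain_pages(cpu);
}

/* Total pages still sitting on per-CPU lists. */
static int model_cached_pages(void)
{
	int total = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += pcp_count[cpu];
	return total;
}

/* Offline every CPU but 0, then drain locally, as at snapshot time. */
static int model_hibernate_drain(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		pcp_count[cpu] = 10 * (cpu + 1);
	buddy_free = 0;
	current_cpu = 0;

	for (int cpu = 1; cpu < MODEL_NR_CPUS; cpu++)
		model_cpu_offline(cpu);

	model_drain_local_pages();
	assert(model_cached_pages() == 0);
	return buddy_free;
}
```

If the offline assumption did not hold, a local drain would leave pages
stranded on the other CPUs' lists, and the snapshot would see fewer free
pages than actually exist.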


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Christoph Lameter
On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:

> > We could reexport drain_local_pages() again but then I do not understand
> > why we would only drain the pages of this processor and not of all other
> > processors as well. It seems that software suspend intend was to flush
> > them all right?
>
> Well, not exactly.  We are on one CPU at this point, the others have been
> disabled.

OK, so the others are already flushed. Here is a patch to re-export
drain_local_pages() and use it for software suspend:

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/gfp.h     |    1 +
 kernel/power/snapshot.c |    2 +-
 mm/page_alloc.c         |    2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
===
--- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c	2007-12-19 11:59:25.233961700 -0800
+++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c	2007-12-19 15:16:34.179661929 -0800
@@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
 
 	printk(KERN_INFO "PM: Creating hibernation image: \n");
 
-	drain_all_pages();
+	drain_local_pages(NULL);
 	nr_pages = count_data_pages();
 	nr_highmem = count_highmem_pages();
 	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
===
--- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c	2007-12-19 12:01:00.630421258 -0800
+++ linux-2.6.24-rc5-mm1/mm/page_alloc.c	2007-12-19 15:12:19.850545818 -0800
@@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
 /*
  * Spill all of this CPU's per-cpu pages back into the buddy allocator.
  */
-static void drain_local_pages(void *arg)
+void drain_local_pages(void *arg)
 {
 	drain_pages(smp_processor_id());
 }
Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
===
--- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h	2007-12-19 15:13:51.926950065 -0800
+++ linux-2.6.24-rc5-mm1/include/linux/gfp.h	2007-12-19 15:16:11.951564369 -0800
@@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
 void page_alloc_init(void);
 void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
 void drain_all_pages(void);
+void drain_local_pages(void *dummy);
 
 #endif /* __LINUX_GFP_H */


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Thursday, 20 of December 2007, Christoph Lameter wrote:
> On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
>
> > > We could reexport drain_local_pages() again but then I do not understand
> > > why we would only drain the pages of this processor and not of all other
> > > processors as well. It seems that software suspend intend was to flush
> > > them all right?
> >
> > Well, not exactly.  We are on one CPU at this point, the others have been
> > disabled.
>
> Ok so the others are flush. Here is a patch to re-export
> drain_local_pages() again and use it for software suspend:
>
> Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
>
> ---
>  include/linux/gfp.h     |    1 +
>  kernel/power/snapshot.c |    2 +-
>  mm/page_alloc.c         |    2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> ===
> --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c	2007-12-19 11:59:25.233961700 -0800
> +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c	2007-12-19 15:16:34.179661929 -0800
> @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
>
>  	printk(KERN_INFO "PM: Creating hibernation image: \n");
>
> -	drain_all_pages();
> +	drain_local_pages(NULL);
>  	nr_pages = count_data_pages();
>  	nr_highmem = count_highmem_pages();
>  	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);

You've omitted the second instance, right before the copy_data_pages() call.

> Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> ===
> --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c	2007-12-19 12:01:00.630421258 -0800
> +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c	2007-12-19 15:12:19.850545818 -0800
> @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
>  /*
>   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
>   */
> -static void drain_local_pages(void *arg)
> +void drain_local_pages(void *arg)
>  {
>  	drain_pages(smp_processor_id());
>  }
> Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> ===
> --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h	2007-12-19 15:13:51.926950065 -0800
> +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h	2007-12-19 15:16:11.951564369 -0800
> @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
>  void page_alloc_init(void);
>  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
>  void drain_all_pages(void);
> +void drain_local_pages(void *dummy);
>
>  #endif /* __LINUX_GFP_H */


Re: 2.6.24-rc5-mm1

2007-12-19 Thread Dave Young
On Dec 20, 2007 12:07 AM, Alan Stern [EMAIL PROTECTED] wrote:

> On Wed, 19 Dec 2007, Dave Young wrote:
>
> > I tested on another machine with kernel 2.6.24-rc2. And the result is
> > diffrent again.
> > Here is the result:
> >
> > 1. on 2.6.24-rc2, when I plugin the player the kernel reports below messages:
> >
> > usb-storage: waiting for device to settle before scanning
> > /*[lets mark the below part as part 1]*/
> > scsi 0:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is on
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is on
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> >  sda: sda1
> > /*[lets mark the below part as part 2]*/
> > sd 0:0:0:0: [sda] Attached SCSI removable disk
> > usb-storage: device scan complete
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> >  sda: sda1
>
> This is not normal.  When you plug in a storage device you should get
> all of the messages in your part 1 plus the first two lines in your
> part 2, but not the rest of part 2.
>
> > 2. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > disk it reports the part 2 and mount the partition as rw
> >
> > 3. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > disk it just mount the partition as ro with nothing more messages.
>
> You must have a typo there.  Those can't both be true for 2.6.24-rc5.
> In fact you shouldn't see part 2 at all.
>
> Here's what I get when I plug in a USB mass-storage device under
> 2.6.24-rc5:
>
> [   87.903014] usb-storage: device found at 2
> [   87.909570] scsi 0:0:0:0: Direct-Access  Memorex TD 2B1.09 PQ: 0 ANSI: 0 CCS
> [   87.913144] usb-storage: device scan complete
> [   88.804031] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
> [   88.805507] sd 0:0:0:0: [sda] Write Protect is off
> [   88.805577] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> [   88.805639] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [   88.809526] sd 0:0:0:0: [sda] 243712 512-byte hardware sectors (125 MB)
> [   88.810421] sd 0:0:0:0: [sda] Write Protect is off
> [   88.810488] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> [   88.810575] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [   88.810641]  sda: sda1
> [   88.812450] sd 0:0:0:0: [sda] Attached SCSI removable disk
> [   89.041014] sd 0:0:0:0: Attached scsi generic sg0 type 0
>
> Mounting the disk produces no extra output at all.  I get the same
> result under 2.6.23 and earlier operating systems.  You should see
> approximately the same thing.

Hi, Alan

I'm sure about my post. I'm not so familiar with USB.
It looks weird.  It seems that my device is first recognized as an
mp3 player and then as USB storage, so the system reports part 1 and
part 2 under previous kernels.

Regards
dave


Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Miles Lane
On Dec 19, 2007 7:09 PM, Rafael J. Wysocki [EMAIL PROTECTED] wrote:

> On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> >
> > > > We could reexport drain_local_pages() again but then I do not understand
> > > > why we would only drain the pages of this processor and not of all other
> > > > processors as well. It seems that software suspend intend was to flush
> > > > them all right?
> > >
> > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > disabled.
> >
> > Ok so the others are flush. Here is a patch to re-export
> > drain_local_pages() again and use it for software suspend:
> >
> > Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
> >
> > ---
> >  include/linux/gfp.h     |    1 +
> >  kernel/power/snapshot.c |    2 +-
> >  mm/page_alloc.c         |    2 +-
> >  3 files changed, 3 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > ===
> > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c	2007-12-19 11:59:25.233961700 -0800
> > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c	2007-12-19 15:16:34.179661929 -0800
> > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> >
> >  	printk(KERN_INFO "PM: Creating hibernation image: \n");
> >
> > -	drain_all_pages();
> > +	drain_local_pages(NULL);
> >  	nr_pages = count_data_pages();
> >  	nr_highmem = count_highmem_pages();
> >  	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
>
> You've omitted the second instance, right before the copy_data_pages() call.

I will wait for a revised patch and then test.
(Sorry for the duplicate message.  I am resending because I
accidentally sent an HTML
message the first time.  Whoops.)

  Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
  ===
  --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 
  -0800
  +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 
  -0800
  @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
   /*
* Spill all of this CPU's per-cpu pages back into the buddy allocator.
*/
  -static void drain_local_pages(void *arg)
  +void drain_local_pages(void *arg)
   {
drain_pages(smp_processor_id());
   }
  Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
  ===
  --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 
  15:13:51.926950065 -0800
  +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 
  -0800
  @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
   void page_alloc_init(void);
   void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
   void drain_all_pages(void);
  +void drain_local_pages(void *dummy);
 
   #endif /* __LINUX_GFP_H */



Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} - {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Rafael J. Wysocki
On Thursday, 20 of December 2007, Miles Lane wrote:
> On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
>
> > On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> > >
> > > > > We could reexport drain_local_pages() again but then I do not understand
> > > > > why we would only drain the pages of this processor and not of all other
> > > > > processors as well. It seems that software suspend intend was to flush
> > > > > them all right?
> > > >
> > > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > > disabled.
> > >
> > > Ok so the others are flush. Here is a patch to re-export
> > > drain_local_pages() again and use it for software suspend:
> > >
> > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > >
> > > ---
> > >  include/linux/gfp.h     |    1 +
> > >  kernel/power/snapshot.c |    2 +-
> > >  mm/page_alloc.c         |    2 +-
> > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> > >
> > >  	printk(KERN_INFO "PM: Creating hibernation image: \n");
> > >
> > > -	drain_all_pages();
> > > +	drain_local_pages(NULL);
> > >  	nr_pages = count_data_pages();
> > >  	nr_highmem = count_highmem_pages();
> > >  	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
> >
> > You've omitted the second instance, right before the copy_data_pages()
> > call.
>
> I guess I will wait for a revised patch.

There's a fix from Andrew on top of this one in -mm:
http://marc.info/?l=linux-mm-commits&m=119810866812965&w=2

> > > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > >  /*
> > >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > >   */
> > > -static void drain_local_pages(void *arg)
> > > +void drain_local_pages(void *arg)
> > >  {
> > >  	drain_pages(smp_processor_id());
> > >  }
> > > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > > ===
> > > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > >  void page_alloc_init(void);
> > >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > >  void drain_all_pages(void);
> > > +void drain_local_pages(void *dummy);
> > >
> > >  #endif /* __LINUX_GFP_H */
 


Re: 2.6.24-rc5-mm1

2007-12-19 Thread Alan Stern
Note carefully.  This:

> > 2. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > disk it reports the part 2 and mount the partition as rw

contradicts this:

> > 3. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > disk it just mount the partition as ro with nothing more messages.

So which is correct?

> Hi, Alan
>
> I'm sure about my post.

But your post contradicts itself.  It can't be correct.

> I'm not so familiar with USB.
> It looks weird.  It seems that my device is first recognized as an
> mp3 player and then as USB storage, so the system reports part 1 and
> part 2 under previous kernels.

I think those part 2 messages aren't caused by the kernel at all, but
instead by some program running on your computer.  You could try
booting into single-user mode and see if the behavior changes.

Also there's no question -- the device does behave strangely.  It
shouldn't change the write-protect setting all by itself.
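
For what it's worth, the "Write Protect is on/off" lines that sd logs for this device line up with the third byte of the accompanying "Mode Sense" dump (80 when on, 00 when off). A minimal userspace sketch of that decoding follows, assuming those four bytes are a MODE SENSE(6) parameter header; the function name is made up for illustration and this is not the sd driver's code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace sketch: decode the write-protect flag from the four header
 * bytes that sd prints as "Mode Sense: xx xx xx xx".
 * Assumption: MODE SENSE(6) header layout, where byte 2 is the
 * device-specific parameter and bit 7 of that byte is the WP bit.
 */
static bool mode_sense_write_protected(const uint8_t hdr[4])
{
	return (hdr[2] & 0x80) != 0;
}
```

Feeding it the two dumps from the logs (03 00 80 00 and 03 00 00 00) gives "protected" and "not protected" respectively, which matches what the kernel printed.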

Alan Stern



Re: 2.6.24-rc5-mm1

2007-12-19 Thread Dave Young
On Dec 20, 2007 11:34 AM, Alan Stern [EMAIL PROTECTED] wrote:
> Note carefully.  This:
>
> > > 2. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > > disk it reports the part 2 and mount the partition as rw
>
> contradicts this:
>
> > > 3. on 2.6.24-rc5 kernel reports only the part 1, after try mount the
> > > disk it just mount the partition as ro with nothing more messages.

Oh, sorry. It's a typo; it should be 2.6.24-rc5-mm1.

> So which is correct?
>
> > Hi, Alan
> >
> > I'm sure about my post.
>
> But your post contradicts itself.  It can't be correct.
>
> > I'm not so familiar with USB.
> > It looks weird.  It seems that my device is first recognized as an
> > mp3 player and then as USB storage, so the system reports part 1 and
> > part 2 under previous kernels.
>
> I think those part 2 messages aren't caused by the kernel at all, but
> instead by some program running on your computer.  You could try
> booting into single-user mode and see if the behavior changes.

No doubt about that for me. Under OS X, plugging in this device pops
up a dialog (I don't remember the content); after pressing OK the disk
icon goes away, and then the disk is mounted again.

> Also there's no question -- the device does behave strangely.  It
> shouldn't change the write-protect setting all by itself.

Yes, I think so too.

> Alan Stern




Re: 2.6.24-rc5-mm1 -- inconsistent {in-hardirq-W} - {hardirq-on-W} usage -- pm-hibernate/9940 [HC0[0]:SC0[0]:HE1:SE1]

2007-12-19 Thread Miles Lane
On Dec 19, 2007 8:31 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
>
> On Thursday, 20 of December 2007, Miles Lane wrote:
> > On Dec 19, 2007 7:09 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> >
> > > On Thursday, 20 of December 2007, Christoph Lameter wrote:
> > > > On Thu, 20 Dec 2007, Rafael J. Wysocki wrote:
> > > >
> > > > > > We could reexport drain_local_pages() again but then I do not understand
> > > > > > why we would only drain the pages of this processor and not of all other
> > > > > > processors as well. It seems that software suspend intend was to flush
> > > > > > them all right?
> > > > >
> > > > > Well, not exactly.  We are on one CPU at this point, the others have been
> > > > > disabled.
> > > >
> > > > Ok so the others are flush. Here is a patch to re-export
> > > > drain_local_pages() again and use it for software suspend:
> > > >
> > > > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > > >
> > > > ---
> > > >  include/linux/gfp.h     |    1 +
> > > >  kernel/power/snapshot.c |    2 +-
> > > >  mm/page_alloc.c         |    2 +-
> > > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > Index: linux-2.6.24-rc5-mm1/kernel/power/snapshot.c
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/kernel/power/snapshot.c 2007-12-19 11:59:25.233961700 -0800
> > > > +++ linux-2.6.24-rc5-mm1/kernel/power/snapshot.c  2007-12-19 15:16:34.179661929 -0800
> > > > @@ -1203,7 +1203,7 @@ asmlinkage int swsusp_save(void)
> > > >
> > > >  	printk(KERN_INFO "PM: Creating hibernation image: \n");
> > > >
> > > > -	drain_all_pages();
> > > > +	drain_local_pages(NULL);
> > > >  	nr_pages = count_data_pages();
> > > >  	nr_highmem = count_highmem_pages();
> > > >  	printk(KERN_INFO "PM: Need to copy %u pages\n", nr_pages + nr_highmem);
> > >
> > > You've omitted the second instance, right before the copy_data_pages()
> > > call.
> >
> > I guess I will wait for a revised patch.
>
> There's a fix from Andrew on top of this one in -mm:
> http://marc.info/?l=linux-mm-commits&m=119810866812965&w=2
>
> > > > Index: linux-2.6.24-rc5-mm1/mm/page_alloc.c
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/mm/page_alloc.c 2007-12-19 12:01:00.630421258 -0800
> > > > +++ linux-2.6.24-rc5-mm1/mm/page_alloc.c  2007-12-19 15:12:19.850545818 -0800
> > > > @@ -930,7 +930,7 @@ static void drain_pages(unsigned int cpu
> > > >  /*
> > > >   * Spill all of this CPU's per-cpu pages back into the buddy allocator.
> > > >   */
> > > > -static void drain_local_pages(void *arg)
> > > > +void drain_local_pages(void *arg)
> > > >  {
> > > >  	drain_pages(smp_processor_id());
> > > >  }
> > > > Index: linux-2.6.24-rc5-mm1/include/linux/gfp.h
> > > > ===
> > > > --- linux-2.6.24-rc5-mm1.orig/include/linux/gfp.h 2007-12-19 15:13:51.926950065 -0800
> > > > +++ linux-2.6.24-rc5-mm1/include/linux/gfp.h  2007-12-19 15:16:11.951564369 -0800
> > > > @@ -229,5 +229,6 @@ extern void FASTCALL(free_cold_page(stru
> > > >  void page_alloc_init(void);
> > > >  void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp);
> > > >  void drain_all_pages(void);
> > > > +void drain_local_pages(void *dummy);
> > > >
> > > >  #endif /* __LINUX_GFP_H */
  


I applied Christoph and Andrew's patches and recompiled.  I suspended
to disk and to ram several times and all looks good.

   Miles
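
As a rough illustration of the drain_local_pages() versus drain_all_pages() distinction in the quoted discussion, here is a toy userspace model. The array, the counter and the *_sketch names are all assumptions made for the example, not kernel code; it only shows that draining the local CPU is enough when every other CPU's list is already empty:

```c
#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this model */

/* Toy stand-ins for the per-CPU page lists and the buddy allocator. */
static unsigned int pcp_count[NR_CPUS_SKETCH];
static unsigned int buddy_free;

/* Return one CPU's cached pages to the shared pool. */
static void drain_pages_sketch(unsigned int cpu)
{
	buddy_free += pcp_count[cpu];
	pcp_count[cpu] = 0;
}

/* Drain only the calling CPU; 'this_cpu' stands in for smp_processor_id(). */
static void drain_local_pages_sketch(unsigned int this_cpu)
{
	drain_pages_sketch(this_cpu);
}

/* Drain every CPU; this toy simply loops over all of them. */
static void drain_all_pages_sketch(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		drain_pages_sketch(cpu);
}
```

In the model, the local variant leaves other CPUs' lists untouched, so it is only equivalent to the "all" variant if those lists were flushed earlier, which is the situation Rafael and Christoph describe for the hibernation path.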


Re: 2.6.24-rc5-mm1

2007-12-18 Thread Dave Young
On Dec 17, 2007 9:14 AM, Dave Young <[EMAIL PROTECTED]> wrote:
> On Dec 14, 2007 11:44 PM, Alan Stern <[EMAIL PROTECTED]> wrote:
> > On Fri, 14 Dec 2007, Dave Young wrote:
> >
> > > Hi,
> > > The behaviour of my mp3 player (also act as usb-storage device) seems
> > > changed from rc5 to rc5-mm1.
> >
> > This can't be considered a bug, right?
>
> I'm not sure.
>
>
> > It's just that the player
> > changed from one slightly non-standard behavior to a different slightly
> > non-standard behavior.
> >
> >
> > > :
> > > =
> > > usb 1-7: new high speed USB device using ehci_hcd and address 7
> > > usb 1-7: configuration #1 chosen from 1 choice
> > > scsi4 : SCSI emulation for USB Mass Storage devices
> > > usb-storage: device found at 7
> > > usb-storage: waiting for device to settle before scanning
> > > scsi 4:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 4:0:0:0: [sdb] Write Protect is on
> > > sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 4:0:0:0: [sdb] Write Protect is on
> > > sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > >  sdb: sdb1
> > > sd 4:0:0:0: [sdb] Attached SCSI removable disk
> > > sd 4:0:0:0: Attached scsi generic sg1 type 0
> > > usb-storage: device scan complete
> > >
> > > ==
> > > try mount it (or just blockdev --rereadpt), then write protect become off:
> > > ==
> > >
> > > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 4:0:0:0: [sdb] Write Protect is off
> > > sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
> > > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 4:0:0:0: [sdb] Write Protect is off
> > > sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
> > > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > >  sdb: sdb1
> >
> > This output won't appear if you simply mount the device.  So how do you
> > know that mounting turns off write protect?
>
> This can be observed by eye:
> dmesg -> mount -> dmesg
>
> >
> > > But under rc5-mm1, after the mount command is executed, it is just
> > > mounted as a read-only partition without setting write-protect to off.
> > >
> > > I tried "blockdev --rereadpt"; it does set write-protect to off, as on
> > > the rc5 kernel.
> > >
> > > Below is the output of dmesg under rc5-mm1
> > > ==
> > > usb 1-8: new high speed USB device using ehci_hcd and address 6
> > > usb 1-8: configuration #1 chosen from 1 choice
> > > scsi3 : SCSI emulation for USB Mass Storage devices
> > > usb-storage: device found at 6
> > > usb-storage: waiting for device to settle before scanning
> > > scsi 3:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > > sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 3:0:0:0: [sdb] Write Protect is on
> > > sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > > sd 3:0:0:0: [sdb] Assuming drive cache: write through
> > > sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > > sd 3:0:0:0: [sdb] Write Protect is on
> > > sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > > sd 3:0:0:0: [sdb] Assuming drive cache: write through
> > >  sdb: sdb1
> >
> > This looks exactly the same as the output above (except for various
> > port, device, and bus numbers).
>
> Yes, but it lacks the "Write Protect is off" lines and the ones that follow.
>
> >
> > If you turn on CONFIG_USB_STORAGE_DEBUG for both kernels and compare
> > the dmesg output for the mount command, that might highlight the
> > difference.
>
> Ok, I will test that once I have time, thanks.
>

There's no useful information with DEBUG on.

I tested on another machine with kernel 2.6.24-rc2, and the result is
different again.
Here is the result:

1. On 2.6.24-rc2, when I plug in the player, the kernel reports the messages below:

usb-storage: waiting for device to settle before scanning
/*[lets mark the below part as part 1]*/
scsi 0:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is on
sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is on
sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1
/*[lets mark the below part as part 2]*/
sd 0:0:0:0: [sda] Attached SCSI removable disk
usb-storage: device scan complete
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 245248 512-byte 

Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Jeff Dike
On Tue, Dec 18, 2007 at 06:04:58PM -0800, Andrew Morton wrote:
> Nobody seems to look after hppfs.  I'll resend the fat and hostfs patches to
> maintainers for a review, please.

It's mine - I'll take a look at it.

Jeff

-- 
Work email - jdike at linux dot intel dot com


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Andrew Morton
On Wed, 19 Dec 2007 01:22:21 +0000 David Howells <[EMAIL PROTECTED]> wrote:

> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > - inode = ERR_PTR(ret);
> > > + return NULL;
> > >   } else {
> > >   unlock_new_inode(inode);
> > >   }
> > > 
> > 
> > Yup.
> 
> Nope.  The correct fix is to make the various callers use IS_ERR() to check
> the result of this function rather than checking for a NULL return.
> 
> > David, this is concerning.  More such error-path bugs in that code will take
> > years and years to get found and fixed.
> 
> Yes, I know.  I've looked over the patches several times, however I know there
> may be bugs in there because I may have made assumptions about what I've
> written that cause me to overlook things.  It's a danger of checking your own
> code:-(
> 
> > The best way to eliminate them is a line-by-line re-review of the patchset.
> 
> And ideally by someone other than me.  Some of them have been reviewed by
> other people, but I'm not sure that all have.
> 
> However, I've just had another look through.  ISOFS appears to be the only one
> in which I'd missed updating the callers.  I've sent you a patch for it.
> 
> Note that I expressed reservations about three filesystems in the cover note
> (FAT, HPPFS and HOSTFS), but nothing seems to have come of it.
> 

Nobody seems to look after hppfs.  I'll resend the fat and hostfs patches to
maintainers for a review, please.


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Dave Young
On Dec 19, 2007 9:22 AM, David Howells <[EMAIL PROTECTED]> wrote:
> Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> > > -   inode = ERR_PTR(ret);
> > > +   return NULL;
> > > } else {
> > > unlock_new_inode(inode);
> > > }
> > >
> >
> > Yup.
>
> Nope.  The correct fix is to make the various callers use IS_ERR() to check
> the result of this function rather than checking for a NULL return.
>
> > David, this is concerning.  More such error-path bugs in that code will take
> > years and years to get found and fixed.
>
> Yes, I know.  I've looked over the patches several times, however I know there
> may be bugs in there because I may have made assumptions about what I've
> written that cause me to overlook things.  It's a danger of checking your own
> code:-(
>
> > The best way to eliminate them is a line-by-line re-review of the patchset.
>
> And ideally by someone other than me.  Some of them have been reviewed by
> other people, but I'm not sure that all have.
>
> However, I've just had another look through.  ISOFS appears to be the only one
> in which I'd missed updating the callers.  I've sent you a patch for it.
>
> Note that I expressed reservations about three filesystems in the cover note
> (FAT, HPPFS and HOSTFS), but nothing seems to have come of it.
>
Hi,

The oops is in iput(). I used 'return NULL' because I didn't want to
change the behaviour of iput() in fs/inode.c.

Regards
dave


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread David Howells
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > -   inode = ERR_PTR(ret);
> > +   return NULL;
> > } else {
> > unlock_new_inode(inode);
> > }
> > 
> 
> Yup.

Nope.  The correct fix is to make the various callers use IS_ERR() to check
the result of this function rather than checking for a NULL return.
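
To make the point about IS_ERR() concrete, here is a userspace sketch of the error-pointer convention. The three helpers imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); lookup_sketch() and the two callers are hypothetical names invented for this example and are not the isofs code:

```c
#include <errno.h>
#include <stddef.h>

/* Userspace re-creation of the kernel's error-pointer helpers (sketch). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical lookup that reports failure as an encoded errno. */
static void *lookup_sketch(int fail)
{
	static int object;
	return fail ? ERR_PTR(-EIO) : (void *)&object;
}

/* Buggy caller: an error pointer is not NULL, so failure looks like success. */
static int caller_null_check(int fail)
{
	void *p = lookup_sketch(fail);
	return p ? 0 : -EIO;
}

/* Fixed caller: test with IS_ERR() and pass the encoded errno along. */
static int caller_is_err_check(int fail)
{
	void *p = lookup_sketch(fail);
	return IS_ERR(p) ? (int)PTR_ERR(p) : 0;
}
```

The NULL-checking caller treats the failed lookup as a valid pointer and would go on to dereference it, which is the class of bug being discussed; the IS_ERR() caller gets -EIO back.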

> David, this is concerning.  More such error-path bugs in that code will take
> years and years to get found and fixed.

Yes, I know.  I've looked over the patches several times, however I know there
may be bugs in there because I may have made assumptions about what I've
written that cause me to overlook things.  It's a danger of checking your own
code:-(

> The best way to eliminate them is a line-by-line re-review of the patchset.

And ideally by someone other than me.  Some of them have been reviewed by
other people, but I'm not sure that all have.

However, I've just had another look through.  ISOFS appears to be the only one
in which I'd missed updating the callers.  I've sent you a patch for it.

Note that I expressed reservations about three filesystems in the cover note
(FAT, HPPFS and HOSTFS), but nothing seems to have come of it.

David


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Valdis . Kletnieks
(Adding Dave Howells, his name is on 
iget-stop-isofs-from-using-read_inode.patch)

On Tue, 18 Dec 2007 10:37:32 +0800, Dave Young said:

> > I don't mind it failing the mount, but the oops seems excessive.  I suspect
> > that *somewhere* in that stack trace, we're wanting something like a
> > 
> > if (!foo_ptr)
> > return -EIO;
> > 
> > but I admit not being competent enough to decide where that should be.
> > 
> 
> Hi,
> Could you please try the below patch:
> 
> Signed-off-by: Dave Young <[EMAIL PROTECTED]> 
> 
> ---
> fs/isofs/inode.c |2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

With that patch applied, I took the ISO image (which I ended up reading on
another machine and copying over the net to get a complete usable image),
and dd'ed just the first 150M into another file, and tried to loopback mount
it.  And I got:

# mount -o ro,loop /path/to/cd.partial.image /mnt/loop
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

And my dmesg says:

[   33.622073] ISO 9660 Extensions: Microsoft Joliet Level 3
[   33.622125] attempt to access beyond end of device
[   33.622129] loop0: rw=0, want=1284500, limit=30
[   33.622133] ISOFS: unable to read i-node block
Here is where we would oops before - now it errors out more reasonably:
[   33.622140] ISOFS: changing to secondary root
[   33.622148] attempt to access beyond end of device
[   33.622151] loop0: rw=0, want=1284508, limit=30
[   33.622155] ISOFS: unable to read i-node block
[   33.622159] isofs_fill_super: get root inode failed

So that fixes *this* bug. I looked in the -rc5-mm1 broken-out/, saw
the vast multitudes of 'iget-stop-foofs-from-using' patches, and decided
that somebody else will probably have to audit them for sanity.

In the iget-* series, there's some 184 'return ERR_PTR(-EFOO);' for some FOO,
and 3 other uses:

% grep ERR_PTR iget* | grep -v return
iget-stop-isofs-from-using-read_inode.patch:+   inode = ERR_PTR(ret);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:+parent = ERR_PTR(-ENOMEM);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:-parent = ERR_PTR(-EACCES);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:-parent = ERR_PTR(-ENOMEM);

isofs is the only place we don't return a constant 'ERR_PTR(-EFOO)', but
cast somebody else's return value.  I wish I understood what that tells us. ;)






Re: 2.6.24-rc5-mm1 - IPv6 throws section mismatches.

2007-12-18 Thread Daniel Lezcano

[EMAIL PROTECTED] wrote:

On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/


git-net.patch (I'm guessing one of Daniel's commits, but not sure which one)
causes some complaints:

  LD  vmlinux.o
  MODPOST vmlinux.o
WARNING: vmlinux.o(.init.text+0x2263f): Section mismatch: reference to .exit.text:tcpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22644): Section mismatch: reference to .exit.text:udplitev6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22649): Section mismatch: reference to .exit.text:udpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22658): Section mismatch: reference to .exit.text:addrconf_cleanup (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x226bc): Section mismatch: reference to .exit.text:rawv6_exit (between 'inet6_init' and 'ac6_proc_init')

Looks like the problem is that tcpv6_exit and friends are called from
net/ipv6/af_inet6.c:inet6_init() - which is declared as:

static int __init inet6_init(void)

I can see how calling an __exit from an __init would be Bad Juju...
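
One common way to avoid this kind of mismatch is to keep the shared cleanup in an unannotated helper and let both the init error path and the exit hook call it. The sketch below only illustrates that pattern and is not the patch that went into net-2.6.25; SKETCH_INIT, SKETCH_EXIT and the proto_* names are placeholders standing in for __init, __exit and the real IPv6 functions:

```c
/*
 * Sketch only. In the kernel, __init and __exit place code in sections
 * that can be discarded, which is why modpost warns when init code
 * refers to exit code. The stand-ins are empty so this builds in userspace.
 */
#define SKETCH_INIT	/* stands in for the kernel's __init */
#define SKETCH_EXIT	/* stands in for the kernel's __exit */

static int proto_registered;	/* hypothetical bit of subsystem state */

/* Plain helper: safe to call from both the init error path and exit. */
static void proto_unregister_sketch(void)
{
	proto_registered = 0;
}

/* Exit hook: only this thin wrapper carries the exit annotation. */
static void SKETCH_EXIT proto_exit_sketch(void)
{
	proto_unregister_sketch();
}

/* Init hook: unwinds through the plain helper, never through the exit hook. */
static int SKETCH_INIT proto_init_sketch(int later_step_fails)
{
	proto_registered = 1;
	if (later_step_fails) {
		proto_unregister_sketch();
		return -1;
	}
	return 0;
}
```

With this layout nothing in the init section references the exit section, so there would be nothing for modpost to complain about.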



Yep, thanks Valdis for pointing that out.

I sent a patch that fixes this to DaveM several days ago, and he applied
it to the latest net-2.6.25.


--
Unless otherwise indicated above:
Compagnie IBM France
Registered office: Tour Descartes, 2, avenue Gambetta, La Défense 5, 92400 Courbevoie
RCS Nanterre 552 118 465
Legal form: S.A.S.
Share capital: 542,737,118 €
SIREN/SIRET: 552 118 465 02430


Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock

2007-12-18 Thread Johannes Berg

> Sorry.  GMail doesn't support sending unwrapped text, as far as I can
> tell.  I will send the log segment to you as an attachment.  Also,
> when I sent my .config inline to Andrew recently, it tripped his spam
> filter.  I'll attach it as well.

Thanks. This is a bug in iwlwifi.

The problem is actually another case where my workqueue debugging with
lockdep is triggering a warning :))

Here's the thing:

iwl3945_cancel_deferred_work does 

cancel_delayed_work_sync(&priv->init_alive_start);

(which is the "(&(&priv->init_alive_start)->work)" lock)

but it is called from within a locked section of
mutex_lock(&priv->mutex); (locked from iwl3945_pci_suspend)

On the other hand, the task that runs from the init_alive_start
workqueue is iwl3945_bg_init_alive_start() which will lock the same
mutex.

So the deadlock condition is that you can be in
cancel_delayed_work_sync() above while the mutex is locked, and be
waiting for iwl3945_bg_init_alive_start() which tries to lock the
mutex.
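
The dependency cycle described above can be modelled in a few lines of userspace C. This is only a toy in the spirit of what lockdep reports, not lockdep's algorithm; LOCK_MUTEX stands for the driver mutex, LOCK_WORK for the init_alive_start work item, and all names are made up for the sketch:

```c
#include <stdbool.h>

enum { LOCK_MUTEX, LOCK_WORK, NR_LOCKS };	/* hypothetical lock classes */

/* dep[a][b] is true once "b was acquired while a was held" has been seen. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired 'taken' while holding 'held'". Returns true when the
 * new edge closes a cycle, i.e. a possible circular locking dependency.
 */
static bool record_dependency(int held, int taken)
{
	bool seen[NR_LOCKS] = { false };
	bool cycle = reachable(taken, held, seen);

	dep[held][taken] = true;
	return cycle;
}
```

The suspend path contributes the edge mutex -> work (it waits for the work item while holding the mutex) and the work handler contributes work -> mutex, so recording the second edge reports a cycle.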

johannes




Re: 2.6.24-rc5-mm1 -- INFO: possible circular locking dependency detected -- pm-suspend/5800 is trying to acquire lock

2007-12-18 Thread Johannes Berg

On Tue, 2007-12-18 at 09:03 -0500, Miles Lane wrote:
> I have only seen this happen once, and cannot reproduce it.  I'll keep
> trying, though.
> 
> Dec 16 22:10:48 syntropy kernel: [  231.718023]
> ===

Do you have a version that isn't line-wrapped before I try to unwrap it?

Thanks,
johannes




Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Valdis . Kletnieks
(Adding Dave Howells, his name is on 
iget-stop-isofs-from-using-read_inode.patch)

On Tue, 18 Dec 2007 10:37:32 +0800, Dave Young said:

  I don't mind it failing the mount, but the oops seems excessive.  I suspect
  that *somewhere* in that stack trace, we're wanting something like a
  
  if (!foo_ptr)
  return -EIO;
  
  but I admit not being competent enough to decide where that should be.
  
 
 Hi,
 Could you please try the below patch:
 
 Signed-off-by: Dave Young [EMAIL PROTECTED] 
 
 ---
 fs/isofs/inode.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

With that patch applied, I took the ISO image (which I ended up reading on
another machine and copying over the net to get a complete usable image),
and dd'ed just the first 150M into another file, and tried to loopback mount
it.  And I got:

# mount -o ro,loop /path/to/cd.partial.image /mnt/loop
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

And my dmesg says:

[   33.622073] ISO 9660 Extensions: Microsoft Joliet Level 3
[   33.622125] attempt to access beyond end of device
[   33.622129] loop0: rw=0, want=1284500, limit=30
[   33.622133] ISOFS: unable to read i-node block
Here is where we would oops before - now it errors out more reasonably:
[   33.622140] ISOFS: changing to secondary root
[   33.622148] attempt to access beyond end of device
[   33.622151] loop0: rw=0, want=1284508, limit=30
[   33.622155] ISOFS: unable to read i-node block
[   33.622159] isofs_fill_super: get root inode failed

So that fixes *this* bug. I looked in the -rc5-mm1 broken-out/, saw
the vast multitudes of 'iget-stop-foofs-from-using' patches, and decided
that somebody else will probably have to audit them for sanity.

In the iget-* series, there's some 184 'return ERR_PTR(-EFOO);' for some FOO,
and 3 other uses:

% grep ERR_PTR iget* | grep -v return
iget-stop-isofs-from-using-read_inode.patch:+   inode = ERR_PTR(ret);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:+parent = ERR_PTR(-ENOMEM);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:-parent = ERR_PTR(-EACCES);
iget-stop-jfs-from-using-iget-and-read_inode-try.patch:-parent = ERR_PTR(-ENOMEM);

isofs is the only place we don't return a constant 'ERR_PTR(-EFOO)', but
cast somebody else's return value.  I wish I understood what that tells us. ;)






Re: 2.6.24-rc5-mm1 - IPv6 throws section mismatches.

2007-12-18 Thread Daniel Lezcano

[EMAIL PROTECTED] wrote:

On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/


git-net.patch (I'm guessing one of Daniel's commits, but not sure which one)
causes some complaints:

  LD  vmlinux.o
  MODPOST vmlinux.o
WARNING: vmlinux.o(.init.text+0x2263f): Section mismatch: reference to .exit.text:tcpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22644): Section mismatch: reference to .exit.text:udplitev6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22649): Section mismatch: reference to .exit.text:udpv6_exit (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x22658): Section mismatch: reference to .exit.text:addrconf_cleanup (between 'inet6_init' and 'ac6_proc_init')
WARNING: vmlinux.o(.init.text+0x226bc): Section mismatch: reference to .exit.text:rawv6_exit (between 'inet6_init' and 'ac6_proc_init')

Looks like the problem is that tcpv6_exit and friends are called from
net/ipv6/af_inet6.c:inet6_init() - which is declared as:

static int __init inet6_init(void)

I can see how calling an __exit from an __init would be Bad Juju...
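
For readers unfamiliar with the warning: code marked __exit may be discarded
when it is built into the kernel, so an __init function cannot safely call it
on its error path. Below is a minimal userspace sketch of one common way to
structure this, assuming the cleanup helpers are simply left unannotated. The
INIT_TEXT/EXIT_TEXT macros and every *_sketch name are hypothetical stand-ins,
and this is not the patch that went into net-2.6.25.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's __init and __exit annotations. */
#define INIT_TEXT
#define EXIT_TEXT

static int registered;	/* number of protocols currently registered */

static int tcp6_init_sketch(void)
{
	registered++;
	return 0;
}

static int udp6_init_sketch(int fail)
{
	if (fail)
		return -1;	/* simulated registration failure */
	registered++;
	return 0;
}

/*
 * Deliberately NOT marked EXIT_TEXT: the init error path below calls
 * these, so they must stay available after exit code is discarded.
 */
static void tcp6_cleanup_sketch(void) { registered--; }
static void udp6_cleanup_sketch(void) { registered--; }

static int INIT_TEXT proto_init_sketch(int fail_udp)
{
	int err;

	err = tcp6_init_sketch();
	if (err)
		goto out;
	err = udp6_init_sketch(fail_udp);
	if (err)
		goto out_tcp;
	return 0;

out_tcp:
	tcp6_cleanup_sketch();	/* unwind what already succeeded */
out:
	return err;
}

/* Only the top-level exit function carries the exit annotation. */
static void EXIT_TEXT proto_exit_sketch(void)
{
	udp6_cleanup_sketch();
	tcp6_cleanup_sketch();
}
```

The point of the sketch is only the annotation placement: anything reachable
from the init error path stays out of the exit section.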



Yep, thanks Valdis for pointing that out.

I sent a patch that fixes this to DaveM several days ago, and he applied
it to the latest net-2.6.25.




Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread David Howells
Andrew Morton [EMAIL PROTECTED] wrote:

> > -               inode = ERR_PTR(ret);
> > +               return NULL;
> >         } else {
> >                 unlock_new_inode(inode);
> >         }
> >
>
> Yup.

Nope.  The correct fix is to make the various callers use IS_ERR() to check
the result of this function rather than checking for a NULL return.
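
A minimal userspace sketch of the convention described here may help: the
lookup function encodes a negative errno in the pointer, and the caller tests
it with an IS_ERR()-style check instead of comparing against NULL. The helpers
err_ptr(), ptr_err() and is_err() imitate the kernel's ERR_PTR(), PTR_ERR() and
IS_ERR(); the *_sketch names are hypothetical, and this is not the ISOFS patch
mentioned in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct inode_sketch {
	int ino;
};

static struct inode_sketch root_sketch = { 1 };

/* Stand-in for an iget-style lookup: never returns NULL. */
static struct inode_sketch *iget_sketch(int fail)
{
	if (fail)
		return err_ptr(-EIO);
	return &root_sketch;
}

/* Caller side: a NULL test would miss the error, is_err() does not. */
static int fill_super_sketch(int fail)
{
	struct inode_sketch *inode = iget_sketch(fail);

	if (is_err(inode))
		return (int)ptr_err(inode);
	return inode->ino > 0 ? 0 : -EINVAL;
}
```

Note that iget_sketch(1) is non-NULL, which is why a caller that only checks
for NULL would go on to use the bogus pointer.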

> David, this is concerning.  More such error-path bugs in that code will take
> years and years to get found and fixed.

Yes, I know.  I've looked over the patches several times, however I know there
may be bugs in there because I may have made assumptions about what I've
written that cause me to overlook things.  It's a danger of checking your own
code:-(

> The best way to eliminate them is a line-by-line re-review of the patchset.

And ideally by someone other than me.  Some of them have been reviewed by
other people, but I'm not sure that all have.

However, I've just had another look through.  ISOFS appears to be the only one
in which I'd missed updating the callers.  I've sent you a patch for it.

Note that I expressed reservations about three filesystems in the cover note
(FAT, HPPFS and HOSTFS), but nothing seems to have come of it.

David


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Dave Young
On Dec 19, 2007 9:22 AM, David Howells [EMAIL PROTECTED] wrote:
> Andrew Morton [EMAIL PROTECTED] wrote:
>
> > > -               inode = ERR_PTR(ret);
> > > +               return NULL;
> > >         } else {
> > >                 unlock_new_inode(inode);
> > >         }
> > >
> >
> > Yup.
>
> Nope.  The correct fix is to make the various callers use IS_ERR() to check
> the result of this function rather than checking for a NULL return.
>
> > David, this is concerning.  More such error-path bugs in that code will take
> > years and years to get found and fixed.
>
> Yes, I know.  I've looked over the patches several times, however I know there
> may be bugs in there because I may have made assumptions about what I've
> written that cause me to overlook things.  It's a danger of checking your own
> code:-(
>
> > The best way to eliminate them is a line-by-line re-review of the patchset.
>
> And ideally by someone other than me.  Some of them have been reviewed by
> other people, but I'm not sure that all have.
>
> However, I've just had another look through.  ISOFS appears to be the only one
> in which I'd missed updating the callers.  I've sent you a patch for it.
>
> Note that I expressed reservations about three filesystems in the cover note
> (FAT, HPPFS and HOSTFS), but nothing seems to have come of it.

Hi,

The oops is in iput().  I used 'return NULL' because I didn't want to
change the behaviour of iput() in fs/inode.c.

Regards
dave


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Andrew Morton
On Wed, 19 Dec 2007 01:22:21 + David Howells [EMAIL PROTECTED] wrote:

> Andrew Morton [EMAIL PROTECTED] wrote:
>
> > > -               inode = ERR_PTR(ret);
> > > +               return NULL;
> > >         } else {
> > >                 unlock_new_inode(inode);
> > >         }
> > >
> >
> > Yup.
>
> Nope.  The correct fix is to make the various callers use IS_ERR() to check
> the result of this function rather than checking for a NULL return.
>
> > David, this is concerning.  More such error-path bugs in that code will take
> > years and years to get found and fixed.
>
> Yes, I know.  I've looked over the patches several times, however I know there
> may be bugs in there because I may have made assumptions about what I've
> written that cause me to overlook things.  It's a danger of checking your own
> code:-(
>
> > The best way to eliminate them is a line-by-line re-review of the patchset.
>
> And ideally by someone other than me.  Some of them have been reviewed by
> other people, but I'm not sure that all have.
>
> However, I've just had another look through.  ISOFS appears to be the only one
> in which I'd missed updating the callers.  I've sent you a patch for it.
>
> Note that I expressed reservations about three filesystems in the cover note
> (FAT, HPPFS and HOSTFS), but nothing seems to have come of it.
>

Nobody seems to look after hppfs.  I'll resend the fat and hostfs patches to
maintainers for a review, please.


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-18 Thread Jeff Dike
On Tue, Dec 18, 2007 at 06:04:58PM -0800, Andrew Morton wrote:
> Nobody seems to look after hppfs.  I'll resend the fat and hostfs patches to
> maintainers for a review, please.

It's mine - I'll take a look at it.

Jeff

-- 
Work email - jdike at linux dot intel dot com


Re: 2.6.24-rc5-mm1

2007-12-18 Thread Dave Young
On Dec 17, 2007 9:14 AM, Dave Young [EMAIL PROTECTED] wrote:
 On Dec 14, 2007 11:44 PM, Alan Stern [EMAIL PROTECTED] wrote:
  On Fri, 14 Dec 2007, Dave Young wrote:
 
   Hi,
   The behaviour of my mp3 player (which also acts as a usb-storage device)
   seems to have changed from rc5 to rc5-mm1.
 
  This can't be considered a bug, right?

 I'm not sure.


  It's just that the player
  changed from one slightly non-standard behavior to a different slightly
  non-standard behavior.
 
 
   dmesg output under rc5:
   =
   usb 1-7: new high speed USB device using ehci_hcd and address 7
   usb 1-7: configuration #1 chosen from 1 choice
   scsi4 : SCSI emulation for USB Mass Storage devices
   usb-storage: device found at 7
   usb-storage: waiting for device to settle before scanning
   scsi 4:0:0:0: Direct-Access   Newman mp3   PQ: 0 
   ANSI: 0 CCS
   sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 4:0:0:0: [sdb] Write Protect is on
   sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
   sd 4:0:0:0: [sdb] Assuming drive cache: write through
   sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 4:0:0:0: [sdb] Write Protect is on
   sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
   sd 4:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
   sd 4:0:0:0: [sdb] Attached SCSI removable disk
   sd 4:0:0:0: Attached scsi generic sg1 type 0
   usb-storage: device scan complete
  
   ==
   Try to mount it (or just run blockdev --rereadpt), and write protect turns off:
   ==
  
   sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 4:0:0:0: [sdb] Write Protect is off
   sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
   sd 4:0:0:0: [sdb] Assuming drive cache: write through
   sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 4:0:0:0: [sdb] Write Protect is off
   sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
   sd 4:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
 
  This output won't appear if you simply mount the device.  So how do you
  know that mounting turns off write protect?

 This can be observed by eye:
 dmesg - mount - dmesg

 
   But under rc5-mm1, after the mount command is executed, it is just
   mounted as a read-only partition without write-protect being set to off.
  
   I tried blockdev --rereadpt; it does set write-protect to off, as on the
   rc5 kernel.
  
   Below is the output of dmesg under rc5-mm1
   ==
   usb 1-8: new high speed USB device using ehci_hcd and address 6
   usb 1-8: configuration #1 chosen from 1 choice
   scsi3 : SCSI emulation for USB Mass Storage devices
   usb-storage: device found at 6
   usb-storage: waiting for device to settle before scanning
   scsi 3:0:0:0: Direct-Access   Newman mp3   PQ: 0 
   ANSI: 0 CCS
   sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 3:0:0:0: [sdb] Write Protect is on
   sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
   sd 3:0:0:0: [sdb] Assuming drive cache: write through
   sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
   sd 3:0:0:0: [sdb] Write Protect is on
   sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
   sd 3:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
 
  This looks exactly the same as the output above (except for various
  port, device, and bus numbers).

 Yes, but it lacks the 'Write Protect is off' part and the other lines.

 
  If you turn on CONFIG_USB_STORAGE_DEBUG for both kernels and compare
  the dmesg output for the mount command, that might highlight the
  difference.

 OK, I will test with it once I have time, thanks.


There's no useful information with DEBUG on.

I tested on another machine with kernel 2.6.24-rc2, and the result is
different again.
Here is the result:

1. On 2.6.24-rc2, when I plug in the player, the kernel reports the messages below:

usb-storage: waiting for device to settle before scanning
/*[lets mark the below part as part 1]*/
scsi 0:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is on
sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is on
sd 0:0:0:0: [sda] Mode Sense: 03 00 80 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1
/*[lets mark the below part as part 2]*/
sd 0:0:0:0: [sda] Attached SCSI removable disk
usb-storage: device scan complete
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 245248 512-byte hardware sectors (126 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
sd 0:0:0:0: [sda] Assuming drive cache: write through
 sda: sda1
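
The "Write Protect" lines in these logs track the third byte of the "Mode
Sense" line: 80 when protection is on, 00 when it is off. A small sketch of
that decoding follows, assuming the four bytes printed are the start of a
MODE SENSE(6) parameter header, where bit 7 of the device-specific parameter
byte is the write-protect flag for direct-access devices. The function and
array names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed MODE SENSE(6) header layout:
 *   byte 0: mode data length
 *   byte 1: medium type
 *   byte 2: device-specific parameter (bit 7 = write protect)
 *   byte 3: block descriptor length
 */
static bool mode_sense6_write_protected(const uint8_t hdr[4])
{
	return (hdr[2] & 0x80) != 0;
}

/* Header bytes copied from the log lines above. */
static const uint8_t hdr_at_plug_in[4]   = { 0x03, 0x00, 0x80, 0x00 };
static const uint8_t hdr_after_reread[4] = { 0x03, 0x00, 0x00, 0x00 };
```

Under that assumption, the device itself reports a different write-protect
state on the second read, which is consistent with Alan Stern's description of
the player as slightly non-standard.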

2. on 2.6.24-rc5 kernel reports only the part 1, after try 

Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Andrew Morton
On Tue, 18 Dec 2007 10:37:32 +0800 Dave Young <[EMAIL PROTECTED]> wrote:

> On Mon, Dec 17, 2007 at 09:07:56PM -0500, [EMAIL PROTECTED] wrote:
> > On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:
> > 
> > (Adding Al Viro to the list, he's listed as "file systems" and MAINTAINERS
> > doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
> > actually does that code?)
> > 
> > > > I try it again, and it reports it died at the same exact place, but in 
> > > > about
> > > > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I 
> > > > didn't
> > > > think that blocks read from /dev/cdrom would get cached, but OK.
> > > 
> > > It'll remain cached if something is holding the device open.
> > 
> > Does it need to be "device open", or are there other things as well? If the
> > drop_cache was hosed, that would result in the same symptoms, no?
> > 
> > > Something's holding s_umount for writing I guess.  Possibly busted error
> > > handling somewhere totally different.
> > 
> > Aha - found what was holding it - an attempt to loopback mount the truncated
> > file (before I realized it was truncated) had failed - I had gotten a 
> > 'Killed'
> > back from the mount, but I didn't realize it had pulled an actual oops:
> > 
> > Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access 
> > beyond end of device
> > Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, 
> > want=1284500, limit=314240
> > Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read 
> > i-node block
> > Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle 
> > kernel NULL pointer dereference at 010b RIP:
> > Dec 17 15:54:33 turing-police kernel: [14503.402440]  [] 
> > iput+0x11/0x80
> > ...
> > Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
> > Dec 17 15:54:33 turing-police kernel: [14503.403026]  [] 
> > isofs_fill_super+0x7e9/0xa6b
> > Dec 17 15:54:33 turing-police kernel: [14503.403045]  [] 
> > __down_write_nested+0x3d/0xa1
> > Dec 17 15:54:33 turing-police kernel: [14503.403061]  [] 
> > __down_write+0xb/0xd
> > Dec 17 15:54:33 turing-police kernel: [14503.403076]  [] 
> > sget+0x397/0x3a9
> > Dec 17 15:54:33 turing-police kernel: [14503.403090]  [] 
> > set_bdev_super+0x0/0x14
> > Dec 17 15:54:33 turing-police kernel: [14503.403106]  [] 
> > get_sb_bdev+0x109/0x157
> > Dec 17 15:54:33 turing-police kernel: [14503.403120]  [] 
> > isofs_fill_super+0x0/0xa6b
> > Dec 17 15:54:33 turing-police kernel: [14503.403138]  [] 
> > isofs_get_sb+0x13/0x15
> > Dec 17 15:54:33 turing-police kernel: [14503.403151]  [] 
> > vfs_kern_mount+0x90/0x11a
> > Dec 17 15:54:33 turing-police kernel: [14503.403167]  [] 
> > do_kern_mount+0x47/0xe3
> > Dec 17 15:54:33 turing-police kernel: [14503.403183]  [] 
> > do_mount+0x717/0x78a
> > Dec 17 15:54:33 turing-police kernel: [14503.403199]  [] 
> > _read_lock_irq+0x9/0xb
> > Dec 17 15:54:33 turing-police kernel: [14503.403212]  [] 
> > find_lock_page+0x8c/0x97
> > Dec 17 15:54:33 turing-police kernel: [14503.403227]  [] 
> > filemap_fault+0x1fa/0x3c6
> > Dec 17 15:54:33 turing-police kernel: [14503.403241]  [] 
> > unlock_page+0x2d/0x31
> > Dec 17 15:54:33 turing-police kernel: [14503.403254]  [] 
> > __do_fault+0x38d/0x3c3
> > Dec 17 15:54:33 turing-police kernel: [14503.403274]  [] 
> > handle_mm_fault+0x36d/0x6e9
> > Dec 17 15:54:33 turing-police kernel: [14503.403293]  [] 
> > __alloc_pages+0x68/0x2f6
> > Dec 17 15:54:33 turing-police kernel: [14503.403314]  [] 
> > sys_mount+0x89/0xcb
> > Dec 17 15:54:33 turing-police kernel: [14503.403328]  [] 
> > syscall_trace_enter+0x97/0x9b
> > Dec 17 15:54:33 turing-police kernel: [14503.403344]  [] 
> > tracesys+0xdc/0xe1
> > Dec 17 15:54:33 turing-police kernel: [14503.403359]
> > Dec 17 15:54:33 turing-police kernel: [14503.403366]
> > Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 
> > 00 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75
> > 
> > I don't mind it failing the mount, but the oops seems excessive.  I suspect
> > that *somewhere* in that stack trace, we're wanting something like a
> > 
> > if (!foo_ptr)
> > return -EIO;
> > 
> > but I admit not being competent enough to decide where that should be.
> > 
> 
> Hi,
> Could you please try the below patch:
> 
> Signed-off-by: Dave Young <[EMAIL PROTECTED]> 
> 
> ---
> fs/isofs/inode.c |2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff -upr linux/fs/isofs/inode.c linux.new/fs/isofs/inode.c
> --- linux/fs/isofs/inode.c2007-12-18 10:31:12.0 +0800
> +++ linux.new/fs/isofs/inode.c2007-12-18 10:31:56.0 +0800
> @@ -1414,7 +1414,7 @@ struct inode *isofs_iget(struct super_bl
>   ret = isofs_read_inode(inode);
>   if (ret < 0) {
>   iget_failed(inode);
> - inode = ERR_PTR(ret);
> + return NULL;
>   } else {
>

Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Dave Young
On Mon, Dec 17, 2007 at 09:07:56PM -0500, [EMAIL PROTECTED] wrote:
> On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:
> 
> (Adding Al Viro to the list, he's listed as "file systems" and MAINTAINERS
> doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
> actually does that code?)
> 
> > > I try it again, and it reports it died at the same exact place, but in 
> > > about
> > > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I 
> > > didn't
> > > think that blocks read from /dev/cdrom would get cached, but OK.
> > 
> > It'll remain cached if something is holding the device open.
> 
> Does it need to be "device open", or are there other things as well? If the
> drop_cache was hosed, that would result in the same symptoms, no?
> 
> > Something's holding s_umount for writing I guess.  Possibly busted error
> > handling somewhere totally different.
> 
> Aha - found what was holding it - an attempt to loopback mount the truncated
> file (before I realized it was truncated) had failed - I had gotten a 'Killed'
> back from the mount, but I didn't realize it had pulled an actual oops:
> 
> Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access beyond 
> end of device
> Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, 
> want=1284500, limit=314240
> Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read 
> i-node block
> Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle kernel 
> NULL pointer dereference at 010b RIP:
> Dec 17 15:54:33 turing-police kernel: [14503.402440]  [] 
> iput+0x11/0x80
> ...
> Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
> Dec 17 15:54:33 turing-police kernel: [14503.403026]  [] 
> isofs_fill_super+0x7e9/0xa6b
> Dec 17 15:54:33 turing-police kernel: [14503.403045]  [] 
> __down_write_nested+0x3d/0xa1
> Dec 17 15:54:33 turing-police kernel: [14503.403061]  [] 
> __down_write+0xb/0xd
> Dec 17 15:54:33 turing-police kernel: [14503.403076]  [] 
> sget+0x397/0x3a9
> Dec 17 15:54:33 turing-police kernel: [14503.403090]  [] 
> set_bdev_super+0x0/0x14
> Dec 17 15:54:33 turing-police kernel: [14503.403106]  [] 
> get_sb_bdev+0x109/0x157
> Dec 17 15:54:33 turing-police kernel: [14503.403120]  [] 
> isofs_fill_super+0x0/0xa6b
> Dec 17 15:54:33 turing-police kernel: [14503.403138]  [] 
> isofs_get_sb+0x13/0x15
> Dec 17 15:54:33 turing-police kernel: [14503.403151]  [] 
> vfs_kern_mount+0x90/0x11a
> Dec 17 15:54:33 turing-police kernel: [14503.403167]  [] 
> do_kern_mount+0x47/0xe3
> Dec 17 15:54:33 turing-police kernel: [14503.403183]  [] 
> do_mount+0x717/0x78a
> Dec 17 15:54:33 turing-police kernel: [14503.403199]  [] 
> _read_lock_irq+0x9/0xb
> Dec 17 15:54:33 turing-police kernel: [14503.403212]  [] 
> find_lock_page+0x8c/0x97
> Dec 17 15:54:33 turing-police kernel: [14503.403227]  [] 
> filemap_fault+0x1fa/0x3c6
> Dec 17 15:54:33 turing-police kernel: [14503.403241]  [] 
> unlock_page+0x2d/0x31
> Dec 17 15:54:33 turing-police kernel: [14503.403254]  [] 
> __do_fault+0x38d/0x3c3
> Dec 17 15:54:33 turing-police kernel: [14503.403274]  [] 
> handle_mm_fault+0x36d/0x6e9
> Dec 17 15:54:33 turing-police kernel: [14503.403293]  [] 
> __alloc_pages+0x68/0x2f6
> Dec 17 15:54:33 turing-police kernel: [14503.403314]  [] 
> sys_mount+0x89/0xcb
> Dec 17 15:54:33 turing-police kernel: [14503.403328]  [] 
> syscall_trace_enter+0x97/0x9b
> Dec 17 15:54:33 turing-police kernel: [14503.403344]  [] 
> tracesys+0xdc/0xe1
> Dec 17 15:54:33 turing-police kernel: [14503.403359]
> Dec 17 15:54:33 turing-police kernel: [14503.403366]
> Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 00 
> 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75
> 
> I don't mind it failing the mount, but the oops seems excessive.  I suspect
> that *somewhere* in that stack trace, we're wanting something like a
> 
>   if (!foo_ptr)
>   return -EIO;
> 
> but I admit not being competent enough to decide where that should be.
> 

Hi,
Could you please try the below patch:

Signed-off-by: Dave Young <[EMAIL PROTECTED]> 

---
fs/isofs/inode.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff -upr linux/fs/isofs/inode.c linux.new/fs/isofs/inode.c
--- linux/fs/isofs/inode.c  2007-12-18 10:31:12.0 +0800
+++ linux.new/fs/isofs/inode.c  2007-12-18 10:31:56.0 +0800
@@ -1414,7 +1414,7 @@ struct inode *isofs_iget(struct super_bl
ret = isofs_read_inode(inode);
if (ret < 0) {
iget_failed(inode);
-   inode = ERR_PTR(ret);
+   return NULL;
} else {
unlock_new_inode(inode);
}



Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Valdis . Kletnieks
On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:

(Adding Al Viro to the list, he's listed as "file systems" and MAINTAINERS
doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
actually does that code?)

> > I try it again, and it reports it died at the same exact place, but in about
> > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> > think that blocks read from /dev/cdrom would get cached, but OK.
> 
> It'll remain cached if something is holding the device open.

Does it need to be "device open", or are there other things as well? If the
drop_cache was hosed, that would result in the same symptoms, no?

> Something's holding s_umount for writing I guess.  Possibly busted error
> handling somewhere totally different.

Aha - found what was holding it - an attempt to loopback mount the truncated
file (before I realized it was truncated) had failed - I had gotten a 'Killed'
back from the mount, but I didn't realize it had pulled an actual oops:

Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access beyond end of device
Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, want=1284500, limit=314240
Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read i-node block
Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle kernel NULL pointer dereference at 010b RIP:
Dec 17 15:54:33 turing-police kernel: [14503.402440]  [] iput+0x11/0x80
...
Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
Dec 17 15:54:33 turing-police kernel: [14503.403026]  [] isofs_fill_super+0x7e9/0xa6b
Dec 17 15:54:33 turing-police kernel: [14503.403045]  [] __down_write_nested+0x3d/0xa1
Dec 17 15:54:33 turing-police kernel: [14503.403061]  [] __down_write+0xb/0xd
Dec 17 15:54:33 turing-police kernel: [14503.403076]  [] sget+0x397/0x3a9
Dec 17 15:54:33 turing-police kernel: [14503.403090]  [] set_bdev_super+0x0/0x14
Dec 17 15:54:33 turing-police kernel: [14503.403106]  [] get_sb_bdev+0x109/0x157
Dec 17 15:54:33 turing-police kernel: [14503.403120]  [] isofs_fill_super+0x0/0xa6b
Dec 17 15:54:33 turing-police kernel: [14503.403138]  [] isofs_get_sb+0x13/0x15
Dec 17 15:54:33 turing-police kernel: [14503.403151]  [] vfs_kern_mount+0x90/0x11a
Dec 17 15:54:33 turing-police kernel: [14503.403167]  [] do_kern_mount+0x47/0xe3
Dec 17 15:54:33 turing-police kernel: [14503.403183]  [] do_mount+0x717/0x78a
Dec 17 15:54:33 turing-police kernel: [14503.403199]  [] _read_lock_irq+0x9/0xb
Dec 17 15:54:33 turing-police kernel: [14503.403212]  [] find_lock_page+0x8c/0x97
Dec 17 15:54:33 turing-police kernel: [14503.403227]  [] filemap_fault+0x1fa/0x3c6
Dec 17 15:54:33 turing-police kernel: [14503.403241]  [] unlock_page+0x2d/0x31
Dec 17 15:54:33 turing-police kernel: [14503.403254]  [] __do_fault+0x38d/0x3c3
Dec 17 15:54:33 turing-police kernel: [14503.403274]  [] handle_mm_fault+0x36d/0x6e9
Dec 17 15:54:33 turing-police kernel: [14503.403293]  [] __alloc_pages+0x68/0x2f6
Dec 17 15:54:33 turing-police kernel: [14503.403314]  [] sys_mount+0x89/0xcb
Dec 17 15:54:33 turing-police kernel: [14503.403328]  [] syscall_trace_enter+0x97/0x9b
Dec 17 15:54:33 turing-police kernel: [14503.403344]  [] tracesys+0xdc/0xe1
Dec 17 15:54:33 turing-police kernel: [14503.403359]
Dec 17 15:54:33 turing-police kernel: [14503.403366]
Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 00 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75

I don't mind it failing the mount, but the oops seems excessive.  I suspect
that *somewhere* in that stack trace, we're wanting something like a

if (!foo_ptr)
return -EIO;

but I admit not being competent enough to decide where that should be.
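
The numbers in the oops hint that the pointer handed to iput() was an encoded
error rather than NULL. The first instruction in the Code: line, 48 8b 87 10
01 00 00, decodes as a load from offset 0x110 of the first argument, and the
fault address ends in 10b. A pointer holding -EIO (that is, -5) plus 0x110
wraps around to exactly 0x10b. The sketch below only reproduces that
arithmetic; the offset and the choice of -EIO are inferences from the log, not
something stated in this message, and fault_address() is a hypothetical
helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Address touched when code dereferences a struct field at
 * field_offset through a pointer that really holds a negative errno.
 * Unsigned 64-bit addition wraps modulo 2^64, as on x86-64.
 */
static uint64_t fault_address(long err, uint64_t field_offset)
{
	uint64_t bogus_ptr = (uint64_t)(int64_t)err;

	return bogus_ptr + field_offset;
}
```

With err = -EIO and field_offset = 0x110 this gives 0x10b, which the kernel
reports as a NULL pointer dereference because the address is so close to zero.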





Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Andrew Morton
On Mon, 17 Dec 2007 17:44:11 -0500
[EMAIL PROTECTED] wrote:

> On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
> 
> OK, so I'm trying to 'dd' a CD and the drive on the laptop is having issues
> reading the disk.
> 
> I try it once, and get an I/O error about 117M in - dd reports 1.7M/sec.
> 
> I try it again, and it reports it died at the same exact place, but in about
> 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> think that blocks read from /dev/cdrom would get cached, but OK.

It'll remain cached if something is holding the device open.

>  So I try
> the obviously stupid thing:
> 
> # echo 1 >| /proc/sys/vm/drop_caches
> 
> Alas, that hangs gloriously - 'echo t > /proc/sysrq-trigger' tells me:
> 
> Dec 17 17:30:02 turing-police kernel: [20235.823201] bash  D 
> 0001  5288 15123  15085
> Dec 17 17:30:02 turing-police kernel: [20235.823206]  81007ba7de28 
> 0086  
> Dec 17 17:30:02 turing-police kernel: [20235.823210]  81007bbd9000 
> 81007d70e000 81007bbd9248 0001019e3e48
> Dec 17 17:30:02 turing-police kernel: [20235.823214]  e2f36028 
> e200012b9978 e2eece48 e20001164188
> Dec 17 17:30:02 turing-police kernel: [20235.823218] Call Trace:
> Dec 17 17:30:02 turing-police kernel: [20235.823224]  [] 
> __down_read+0x87/0xa1
> Dec 17 17:30:02 turing-police kernel: [20235.823229]  [] 
> down_read+0x9/0xe
> Dec 17 17:30:02 turing-police kernel: [20235.823232]  [] 
> drop_pagecache+0x3a/0x8c
> Dec 17 17:30:02 turing-police kernel: [20235.823235]  [] 
> drop_caches_sysctl_handler+0x22/0x38
> Dec 17 17:30:02 turing-police kernel: [20235.823239]  [] 
> proc_sys_write+0x7e/0xa6
> Dec 17 17:30:02 turing-police kernel: [20235.823244]  [] 
> vfs_write+0xc7/0x170
> Dec 17 17:30:02 turing-police kernel: [20235.823248]  [] 
> sys_write+0x47/0x70
> Dec 17 17:30:02 turing-police kernel: [20235.823251]  [] 
> tracesys+0xdc/0xe1
> 

Something's holding s_umount for writing I guess.  Possibly busted error
handling somewhere totally different.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-17 Thread Mariusz Kozlowski
Hello,

> > cat /proc/kpagecount on the other hand - with the change in line 710
> > - locks the box. Sysrq works, changing consoles works, but there is
> > no "BUG: soft lockup ..." message. After a while the box becomes
> > totally unresponsive - even caps lock doesn't work, no responses to
> > ping.
> 
> Well I'm baffled. There's basically two things in that function that
> do anything interesting: pfn_to_page and put_user. access_ok is
> "return 1" on Sparc64. atomic_read is a simple read.
>
> My usual approach at this point would be to litter it with printks and
> see where its hanging.

Ok. Maybe this will help. I don't know how to compare this to the results from
yesterday (the test with ppage = NULL) - maybe I messed something up. This time
I added a bunch of printks and got these results:

This is from 'cat /proc/kpageflags' (after this the box is locked):

01
pfn:0, src:0, KPMSIZE:8
23458
ppage:0002, pfn:1

and the relevant code:

static ssize_t kpageflags_read(struct file *file, char __user *buf,
			       size_t count, loff_t *ppos)
{
	u64 __user *out = (u64 __user *)buf;
	struct page *ppage;
	unsigned long src = *ppos;
	unsigned long pfn;
	ssize_t ret = 0;
	u64 kflags, uflags;

	printk("0");

	if (!access_ok(VERIFY_WRITE, buf, count))
		return -EFAULT;

	printk("1");
	pfn = src / KPMSIZE;
	printk("\npfn:%u, src:%u, KPMSIZE:%d\n", pfn, src, KPMSIZE);
	count = min_t(unsigned long, count, (max_pfn * KPMSIZE) - src);

	printk("2");
	if (src & KPMMASK || count & KPMMASK)
		return -EIO;

	printk("3");
	while (count > 0) {
		printk("4");
		ppage = pfn_to_page(pfn++);
		printk("5");
		if (!ppage) {
			printk("6");
			kflags = 0;
			printk("7");
		} else {
			printk("8");
			printk("\nppage:%p, pfn:%u\n", ppage, pfn);
			kflags = ppage->flags; // <-- something bad happens
			printk("9");
		}

		printk("a");



This is from 'cat /proc/kpagecount' (after this the box is locked)

01
pfn:0, src:0, KPMSIZE:8
23567a
ppage:0002, pfn:1

and this is the relevant code:

static ssize_t kpagecount_read(struct file *file, char __user *buf,
			       size_t count, loff_t *ppos)
{
	u64 __user *out = (u64 __user *)buf;
	struct page *ppage;
	unsigned long src = *ppos;
	unsigned long pfn;
	ssize_t ret = 0;
	u64 pcount;

	printk("0");
	if (!access_ok(VERIFY_WRITE, buf, count))
		return -EFAULT;

	printk("1");
	pfn = src / KPMSIZE;
	printk("\npfn:%u, src:%u, KPMSIZE:%d\n", pfn, src, KPMSIZE);

	printk("2");
	count = min_t(size_t, count, (max_pfn * KPMSIZE) - src);
	printk("3");
	if (src & KPMMASK || count & KPMMASK) {
		printk("4");
		return -EIO;
	}
	printk("5");
	while (count > 0) {
		printk("6");
		ppage = pfn_to_page(pfn++);
		printk("7");
		if (!ppage) {
			printk("8");
			pcount = 0;
		} else {
			printk("a");
			printk("\nppage:%p, pfn:%u\n", ppage, pfn);
			pcount = atomic_read(&ppage->_count); // <-- something bad happens
			printk("b");
		}
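
In both traces the !ppage test is false for the very first pfn, and the box
locks up on the dereference that follows. One possible reading is that
pfn_to_page() produced a non-NULL pointer for a pfn that has no backing
struct page, in which case a NULL test cannot catch it and the pfn itself has
to be validated first. The userspace sketch below illustrates only that idea;
the memory map, the hole at pfns 0-1, and all *_sketch names are made up, and
this is not the test patch discussed elsewhere in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct page_sketch {
	uint64_t flags;
};

#define NPAGES_SKETCH 8

/* Toy memory map; entries for pfns 0 and 1 are treated as a hole. */
static struct page_sketch mem_map_sketch[NPAGES_SKETCH] = {
	{ 0 }, { 0 }, { 0x10 }, { 0x20 }, { 0x30 }, { 0x40 }, { 0x50 }, { 0x60 },
};

/* Hypothetical validity test, playing the role of a pfn_valid() check. */
static bool pfn_valid_sketch(unsigned long pfn)
{
	return pfn >= 2 && pfn < NPAGES_SKETCH;
}

/* Validate the pfn before forming or dereferencing the page pointer. */
static uint64_t kflags_for_pfn_sketch(unsigned long pfn)
{
	if (!pfn_valid_sketch(pfn))
		return 0;	/* report "no flags" for holes */
	return mem_map_sketch[pfn].flags;
}
```

The design choice is to reject the index up front, so no pointer is ever
computed for a pfn outside the populated range.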


Regards,

Mariusz


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-17 Thread Matt Mackall
On Sun, Dec 16, 2007 at 10:39:17PM -0800, Andrew Morton wrote:
> On Sun, 16 Dec 2007 20:26:11 -0800 (PST) David Miller <[EMAIL PROTECTED]> 
> wrote:
> 
> > From: Matt Mackall <[EMAIL PROTECTED]>
> > Date: Sun, 16 Dec 2007 20:11:49 -0600
> > 
> > > But as the function doesn't actually show up in your stack trace,
> > > something else is probably wrong. So I'd also try commenting out
> > > pieces of that function until it started working.
> > 
> > Some piece of state is being indirectly corrupted and this
> > is showing up later in some unrelated operation.
> > 
> > > Can someone send me this kpageflags patch under separate
> > > cover?  I'll try to figure out why it farts on sparc64.
> 
> hm, non trivial.  It's the third-from-last patch in:
> 
> maps4-add-proportional-set-size-accounting-in-smaps.patch
> maps4-rework-task_size-macros.patch
> maps4-rework-task_size-macros-mips-fix.patch
> maps4-move-is_swap_pte.patch
> maps4-introduce-a-generic-page-walker.patch
> maps4-use-pagewalker-in-clear_refs-and-smaps.patch
> maps4-simplify-interdependence-of-maps-and-smaps.patch
> maps4-move-clear_refs-code-to-task_mmuc.patch
> maps4-regroup-task_mmu-by-interface.patch
> maps4-add-proc-pid-pagemap-interface.patch

Actually, you may only need these two:

> maps4-add-proc-kpagecount-interface.patch
> maps4-add-proc-kpageflags-interface.patch

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1 compile failure: usbhid_lookup_quirk

2007-12-17 Thread Jiri Kosina
On Mon, 17 Dec 2007, Andrew Morton wrote:

> >   MODPOST 196 modules
> >   ERROR: "usbhid_lookup_quirk" [drivers/hid/usbhid/usbmouse.ko] undefined!
> >   ERROR: "usbhid_lookup_quirk" [drivers/hid/usbhid/usbkbd.ko] undefined!
> >   make[1]: *** [__modpost] Error 1
> >   make: *** [modules] Error 2
> >
> > The problem was fixed by defining CONFIG_USB_HID=m - but I think that
> > should happen automatically if it is necessary.
> >
> > .config was created by running make oldconfig against 2.6.23-rc8-mm2:
>
> Thanks.  That's coming out of git-hid.patch.

Thanks a lot for the report, I will fix that up in my tree.

By the way, please be aware that you almost certainly _do not_ want to use 
usbmouse and usbkbd drivers. Please read their Kconfig help text.

-- 
Jiri Kosina


Re: 2.6.24-rc5-mm1 compile failure: usbhid_lookup_quirk

2007-12-17 Thread Andrew Morton
On Sat, 15 Dec 2007 19:50:40 +0100 jurriaan <[EMAIL PROTECTED]> wrote:

>   MODPOST 196 modules
>   ERROR: "usbhid_lookup_quirk" [drivers/hid/usbhid/usbmouse.ko] undefined!
>   ERROR: "usbhid_lookup_quirk" [drivers/hid/usbhid/usbkbd.ko] undefined!
>   make[1]: *** [__modpost] Error 1
>   make: *** [modules] Error 2
> 
> The problem was fixed by defining CONFIG_USB_HID=m - but I think that
> should happen automatically if it is necessary.
> 
> .config was created by running make oldconfig against 2.6.23-rc8-mm2:

Thanks.  That's coming out of git-hid.patch.


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Andrew Morton
On Mon, 17 Dec 2007 17:44:11 -0500
[EMAIL PROTECTED] wrote:

> On Thu, 13 Dec 2007 02:40:50 PST, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/
>
> OK, so I'm trying to 'dd' a CD and the drive on the laptop is having issues
> reading the disk.
>
> I try it once, and get an I/O error about 117M in - dd reports 1.7M/sec.
>
> I try it again, and it reports it died at the same exact place, but in about
> 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> think that blocks read from /dev/cdrom would get cached, but OK.

It'll remain cached if something is holding the device open.

> So I try
> the obviously stupid thing:
>
> # echo 1 >| /proc/sys/vm/drop_caches
>
> Alas, that hangs gloriously - 'echo t > /proc/sysrq-trigger' tells me:
>
> Dec 17 17:30:02 turing-police kernel: [20235.823201] bash  D 0001  5288 15123  15085
> Dec 17 17:30:02 turing-police kernel: [20235.823206]  81007ba7de28 0086
> Dec 17 17:30:02 turing-police kernel: [20235.823210]  81007bbd9000 81007d70e000 81007bbd9248 0001019e3e48
> Dec 17 17:30:02 turing-police kernel: [20235.823214]  e2f36028 e200012b9978 e2eece48 e20001164188
> Dec 17 17:30:02 turing-police kernel: [20235.823218] Call Trace:
> Dec 17 17:30:02 turing-police kernel: [20235.823224]  [80523e20] __down_read+0x87/0xa1
> Dec 17 17:30:02 turing-police kernel: [20235.823229]  [8024bc13] down_read+0x9/0xe
> Dec 17 17:30:02 turing-police kernel: [20235.823232]  [802abafe] drop_pagecache+0x3a/0x8c
> Dec 17 17:30:02 turing-police kernel: [20235.823235]  [802abb72] drop_caches_sysctl_handler+0x22/0x38
> Dec 17 17:30:02 turing-police kernel: [20235.823239]  [802d2b70] proc_sys_write+0x7e/0xa6
> Dec 17 17:30:02 turing-police kernel: [20235.823244]  [8028e18c] vfs_write+0xc7/0x170
> Dec 17 17:30:02 turing-police kernel: [20235.823248]  [8028e772] sys_write+0x47/0x70
> Dec 17 17:30:02 turing-police kernel: [20235.823251]  [8020c34c] tracesys+0xdc/0xe1
>

Something's holding s_umount for writing I guess.  Possibly busted error
handling somewhere totally different.
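
For readers less familiar with reader/writer semaphores, here is a toy model of why a leaked write hold would produce exactly this picture: bash parked in down_read() from drop_pagecache(). It is not the kernel's rw_semaphore, only its admission rules, and every name in it is made up for illustration.

```c
#include <stdbool.h>

/* Toy reader/writer lock: not the kernel's rw_semaphore, just its rules. */
struct rwsem_model {
        int  readers;   /* number of current read holders */
        bool writer;    /* true while a writer holds the lock */
};

/* Readers are admitted only while no writer holds the lock. */
static bool model_down_read_trylock(struct rwsem_model *sem)
{
        if (sem->writer)
                return false;   /* a real down_read() would sleep here */
        sem->readers++;
        return true;
}

/* A writer needs the lock entirely to itself. */
static bool model_down_write_trylock(struct rwsem_model *sem)
{
        if (sem->writer || sem->readers > 0)
                return false;
        sem->writer = true;
        return true;
}

static void model_up_write(struct rwsem_model *sem)
{
        sem->writer = false;
}
```

If the task that took the write side never reaches model_up_write() (for instance because it died first), every later reader fails the first test indefinitely, which in the real kernel means an uninterruptible sleep.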


Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Valdis . Kletnieks
On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:

(Adding Al Viro to the list, he's listed as file systems and MAINTAINERS
doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
actually does that code?)

> > I try it again, and it reports it died at the same exact place, but in about
> > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> > think that blocks read from /dev/cdrom would get cached, but OK.
>
> It'll remain cached if something is holding the device open.

Does it need to be device open, or are there other things as well? If the
drop_cache was hosed, that would result in the same symptoms, no?

> Something's holding s_umount for writing I guess.  Possibly busted error
> handling somewhere totally different.

Aha - found what was holding it - an attempt to loopback mount the truncated
file (before I realized it was truncated) had failed - I had gotten a 'Killed'
back from the mount, but I didn't realize it had pulled an actual oops:

Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access beyond end of device
Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, want=1284500, limit=314240
Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read i-node block
Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle kernel NULL pointer dereference at 010b RIP:
Dec 17 15:54:33 turing-police kernel: [14503.402440]  [802a096b] iput+0x11/0x80
...
Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
Dec 17 15:54:33 turing-police kernel: [14503.403026]  [802ff73e] isofs_fill_super+0x7e9/0xa6b
Dec 17 15:54:33 turing-police kernel: [14503.403045]  [80523d28] __down_write_nested+0x3d/0xa1
Dec 17 15:54:33 turing-police kernel: [14503.403061]  [80523d97] __down_write+0xb/0xd
Dec 17 15:54:33 turing-police kernel: [14503.403076]  [8028fb63] sget+0x397/0x3a9
Dec 17 15:54:33 turing-police kernel: [14503.403090]  [8028f204] set_bdev_super+0x0/0x14
Dec 17 15:54:33 turing-police kernel: [14503.403106]  [80290301] get_sb_bdev+0x109/0x157
Dec 17 15:54:33 turing-police kernel: [14503.403120]  [802fef55] isofs_fill_super+0x0/0xa6b
Dec 17 15:54:33 turing-police kernel: [14503.403138]  [802fe2e9] isofs_get_sb+0x13/0x15
Dec 17 15:54:33 turing-police kernel: [14503.403151]  [80290075] vfs_kern_mount+0x90/0x11a
Dec 17 15:54:33 turing-police kernel: [14503.403167]  [8029015c] do_kern_mount+0x47/0xe3
Dec 17 15:54:33 turing-police kernel: [14503.403183]  [802a5012] do_mount+0x717/0x78a
Dec 17 15:54:33 turing-police kernel: [14503.403199]  [805242fc] _read_lock_irq+0x9/0xb
Dec 17 15:54:33 turing-police kernel: [14503.403212]  [8026cce0] find_lock_page+0x8c/0x97
Dec 17 15:54:33 turing-police kernel: [14503.403227]  [8026ecb6] filemap_fault+0x1fa/0x3c6
Dec 17 15:54:33 turing-police kernel: [14503.403241]  [8026cb6b] unlock_page+0x2d/0x31
Dec 17 15:54:33 turing-police kernel: [14503.403254]  [8027925c] __do_fault+0x38d/0x3c3
Dec 17 15:54:33 turing-police kernel: [14503.403274]  [8027ab68] handle_mm_fault+0x36d/0x6e9
Dec 17 15:54:33 turing-police kernel: [14503.403293]  [80271903] __alloc_pages+0x68/0x2f6
Dec 17 15:54:33 turing-police kernel: [14503.403314]  [802a510e] sys_mount+0x89/0xcb
Dec 17 15:54:33 turing-police kernel: [14503.403328]  [80214f34] syscall_trace_enter+0x97/0x9b
Dec 17 15:54:33 turing-police kernel: [14503.403344]  [8020c34c] tracesys+0xdc/0xe1
Dec 17 15:54:33 turing-police kernel: [14503.403359]
Dec 17 15:54:33 turing-police kernel: [14503.403366]
Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 00 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75

I don't mind it failing the mount, but the oops seems excessive.  I suspect
that *somewhere* in that stack trace, we're wanting something like a

        if (!foo_ptr)
                return -EIO;

but I admit not being competent enough to decide where that should be.
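
The fault address itself may be a hint about which pointer went bad. If the first bytes of the Code: line (48 8b 87 10 01 00 00) are the faulting instruction, they decode to a 64-bit load from offset 0x110 off %rdi, and 0x110 - 5 = 0x10b. That is the address a load through an error-encoded pointer for -EIO would touch, in which case a plain NULL test would not have caught it. This is an inference from the oops text only, not something confirmed in this thread. The helpers below are user-space imitations of the kernel's ERR_PTR()/IS_ERR() idea with made-up names.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095   /* same bound the kernel's IS_ERR() uses */
#define MODEL_EIO 5            /* EIO is 5 on Linux */

/* Encode a negative errno in a pointer value, as ERR_PTR() does. */
static void *model_err_ptr(long error)
{
        return (void *)(intptr_t)error;
}

/* True for pointers in the top MODEL_MAX_ERRNO bytes of the address space. */
static bool model_is_err(const void *ptr)
{
        return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Address touched by a load at `offset` from `base`; nothing is dereferenced. */
static uintptr_t model_load_address(const void *base, uintptr_t offset)
{
        return (uintptr_t)base + offset;
}
```

Here model_load_address(model_err_ptr(-MODEL_EIO), 0x110) wraps around to 0x10b, while the encoded pointer itself is non-NULL, so a check of the form `if (!foo_ptr)` passes it straight through; a test along the lines of model_is_err() is what would reject it.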





Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Dave Young
On Mon, Dec 17, 2007 at 09:07:56PM -0500, [EMAIL PROTECTED] wrote:
> On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:
>
> (Adding Al Viro to the list, he's listed as file systems and MAINTAINERS
> doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
> actually does that code?)
>
> > > I try it again, and it reports it died at the same exact place, but in about
> > > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> > > think that blocks read from /dev/cdrom would get cached, but OK.
> >
> > It'll remain cached if something is holding the device open.
>
> Does it need to be device open, or are there other things as well? If the
> drop_cache was hosed, that would result in the same symptoms, no?
>
> > Something's holding s_umount for writing I guess.  Possibly busted error
> > handling somewhere totally different.
>
> Aha - found what was holding it - an attempt to loopback mount the truncated
> file (before I realized it was truncated) had failed - I had gotten a 'Killed'
> back from the mount, but I didn't realize it had pulled an actual oops:
>
> Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access beyond end of device
> Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, want=1284500, limit=314240
> Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read i-node block
> Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle kernel NULL pointer dereference at 010b RIP:
> Dec 17 15:54:33 turing-police kernel: [14503.402440]  [802a096b] iput+0x11/0x80
> ...
> Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
> Dec 17 15:54:33 turing-police kernel: [14503.403026]  [802ff73e] isofs_fill_super+0x7e9/0xa6b
> Dec 17 15:54:33 turing-police kernel: [14503.403045]  [80523d28] __down_write_nested+0x3d/0xa1
> Dec 17 15:54:33 turing-police kernel: [14503.403061]  [80523d97] __down_write+0xb/0xd
> Dec 17 15:54:33 turing-police kernel: [14503.403076]  [8028fb63] sget+0x397/0x3a9
> Dec 17 15:54:33 turing-police kernel: [14503.403090]  [8028f204] set_bdev_super+0x0/0x14
> Dec 17 15:54:33 turing-police kernel: [14503.403106]  [80290301] get_sb_bdev+0x109/0x157
> Dec 17 15:54:33 turing-police kernel: [14503.403120]  [802fef55] isofs_fill_super+0x0/0xa6b
> Dec 17 15:54:33 turing-police kernel: [14503.403138]  [802fe2e9] isofs_get_sb+0x13/0x15
> Dec 17 15:54:33 turing-police kernel: [14503.403151]  [80290075] vfs_kern_mount+0x90/0x11a
> Dec 17 15:54:33 turing-police kernel: [14503.403167]  [8029015c] do_kern_mount+0x47/0xe3
> Dec 17 15:54:33 turing-police kernel: [14503.403183]  [802a5012] do_mount+0x717/0x78a
> Dec 17 15:54:33 turing-police kernel: [14503.403199]  [805242fc] _read_lock_irq+0x9/0xb
> Dec 17 15:54:33 turing-police kernel: [14503.403212]  [8026cce0] find_lock_page+0x8c/0x97
> Dec 17 15:54:33 turing-police kernel: [14503.403227]  [8026ecb6] filemap_fault+0x1fa/0x3c6
> Dec 17 15:54:33 turing-police kernel: [14503.403241]  [8026cb6b] unlock_page+0x2d/0x31
> Dec 17 15:54:33 turing-police kernel: [14503.403254]  [8027925c] __do_fault+0x38d/0x3c3
> Dec 17 15:54:33 turing-police kernel: [14503.403274]  [8027ab68] handle_mm_fault+0x36d/0x6e9
> Dec 17 15:54:33 turing-police kernel: [14503.403293]  [80271903] __alloc_pages+0x68/0x2f6
> Dec 17 15:54:33 turing-police kernel: [14503.403314]  [802a510e] sys_mount+0x89/0xcb
> Dec 17 15:54:33 turing-police kernel: [14503.403328]  [80214f34] syscall_trace_enter+0x97/0x9b
> Dec 17 15:54:33 turing-police kernel: [14503.403344]  [8020c34c] tracesys+0xdc/0xe1
> Dec 17 15:54:33 turing-police kernel: [14503.403359]
> Dec 17 15:54:33 turing-police kernel: [14503.403366]
> Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 00 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75
>
> I don't mind it failing the mount, but the oops seems excessive.  I suspect
> that *somewhere* in that stack trace, we're wanting something like a
>
>         if (!foo_ptr)
>                 return -EIO;
>
> but I admit not being competent enough to decide where that should be.
>

Hi,
Could you please try the below patch:

Signed-off-by: Dave Young <[EMAIL PROTECTED]>

---
 fs/isofs/inode.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -upr linux/fs/isofs/inode.c linux.new/fs/isofs/inode.c
--- linux/fs/isofs/inode.c  2007-12-18 10:31:12.0 +0800
+++ linux.new/fs/isofs/inode.c  2007-12-18 10:31:56.0 +0800
@@ -1414,7 +1414,7 @@ struct inode *isofs_iget(struct super_bl
         ret = isofs_read_inode(inode);
         if (ret < 0) {
                 iget_failed(inode);
-                inode = ERR_PTR(ret);
+                return NULL;
         } else {
                 unlock_new_inode(inode);
   

Re: 2.6.24-rc5-mm1 - wonky disk cache and CDROM behavior...

2007-12-17 Thread Andrew Morton
On Tue, 18 Dec 2007 10:37:32 +0800 Dave Young [EMAIL PROTECTED] wrote:

> On Mon, Dec 17, 2007 at 09:07:56PM -0500, [EMAIL PROTECTED] wrote:
> > On Mon, 17 Dec 2007 14:56:44 PST, Andrew Morton said:
> >
> > (Adding Al Viro to the list, he's listed as file systems and MAINTAINERS
> > doesn't list 'isofs' anyplace.  Will Al or Andrew please vector to whoever
> > actually does that code?)
> >
> > > > I try it again, and it reports it died at the same exact place, but in about
> > > > 2 seconds flat, and reports 91M/sec transfer.  OK, that's *weird*, I didn't
> > > > think that blocks read from /dev/cdrom would get cached, but OK.
> > >
> > > It'll remain cached if something is holding the device open.
> >
> > Does it need to be device open, or are there other things as well? If the
> > drop_cache was hosed, that would result in the same symptoms, no?
> >
> > > Something's holding s_umount for writing I guess.  Possibly busted error
> > > handling somewhere totally different.
> >
> > Aha - found what was holding it - an attempt to loopback mount the truncated
> > file (before I realized it was truncated) had failed - I had gotten a 'Killed'
> > back from the mount, but I didn't realize it had pulled an actual oops:
> >
> > Dec 17 15:54:33 turing-police kernel: [14503.402385] attempt to access beyond end of device
> > Dec 17 15:54:33 turing-police kernel: [14503.402391] loop1: rw=0, want=1284500, limit=314240
> > Dec 17 15:54:33 turing-police kernel: [14503.402395] ISOFS: unable to read i-node block
> > Dec 17 15:54:33 turing-police kernel: [14503.402428] Unable to handle kernel NULL pointer dereference at 010b RIP:
> > Dec 17 15:54:33 turing-police kernel: [14503.402440]  [802a096b] iput+0x11/0x80
> > ...
> > Dec 17 15:54:33 turing-police kernel: [14503.403008] Call Trace:
> > Dec 17 15:54:33 turing-police kernel: [14503.403026]  [802ff73e] isofs_fill_super+0x7e9/0xa6b
> > Dec 17 15:54:33 turing-police kernel: [14503.403045]  [80523d28] __down_write_nested+0x3d/0xa1
> > Dec 17 15:54:33 turing-police kernel: [14503.403061]  [80523d97] __down_write+0xb/0xd
> > Dec 17 15:54:33 turing-police kernel: [14503.403076]  [8028fb63] sget+0x397/0x3a9
> > Dec 17 15:54:33 turing-police kernel: [14503.403090]  [8028f204] set_bdev_super+0x0/0x14
> > Dec 17 15:54:33 turing-police kernel: [14503.403106]  [80290301] get_sb_bdev+0x109/0x157
> > Dec 17 15:54:33 turing-police kernel: [14503.403120]  [802fef55] isofs_fill_super+0x0/0xa6b
> > Dec 17 15:54:33 turing-police kernel: [14503.403138]  [802fe2e9] isofs_get_sb+0x13/0x15
> > Dec 17 15:54:33 turing-police kernel: [14503.403151]  [80290075] vfs_kern_mount+0x90/0x11a
> > Dec 17 15:54:33 turing-police kernel: [14503.403167]  [8029015c] do_kern_mount+0x47/0xe3
> > Dec 17 15:54:33 turing-police kernel: [14503.403183]  [802a5012] do_mount+0x717/0x78a
> > Dec 17 15:54:33 turing-police kernel: [14503.403199]  [805242fc] _read_lock_irq+0x9/0xb
> > Dec 17 15:54:33 turing-police kernel: [14503.403212]  [8026cce0] find_lock_page+0x8c/0x97
> > Dec 17 15:54:33 turing-police kernel: [14503.403227]  [8026ecb6] filemap_fault+0x1fa/0x3c6
> > Dec 17 15:54:33 turing-police kernel: [14503.403241]  [8026cb6b] unlock_page+0x2d/0x31
> > Dec 17 15:54:33 turing-police kernel: [14503.403254]  [8027925c] __do_fault+0x38d/0x3c3
> > Dec 17 15:54:33 turing-police kernel: [14503.403274]  [8027ab68] handle_mm_fault+0x36d/0x6e9
> > Dec 17 15:54:33 turing-police kernel: [14503.403293]  [80271903] __alloc_pages+0x68/0x2f6
> > Dec 17 15:54:33 turing-police kernel: [14503.403314]  [802a510e] sys_mount+0x89/0xcb
> > Dec 17 15:54:33 turing-police kernel: [14503.403328]  [80214f34] syscall_trace_enter+0x97/0x9b
> > Dec 17 15:54:33 turing-police kernel: [14503.403344]  [8020c34c] tracesys+0xdc/0xe1
> > Dec 17 15:54:33 turing-police kernel: [14503.403359]
> > Dec 17 15:54:33 turing-police kernel: [14503.403366]
> > Dec 17 15:54:33 turing-police kernel: [14503.403367] Code: 48 8b 87 10 01 00 00 48 83 bf 38 02 00 00 40 48 8b 40 38 75
> >
> > I don't mind it failing the mount, but the oops seems excessive.  I suspect
> > that *somewhere* in that stack trace, we're wanting something like a
> >
> >         if (!foo_ptr)
> >                 return -EIO;
> >
> > but I admit not being competent enough to decide where that should be.
> >
>
> Hi,
> Could you please try the below patch:
>
> Signed-off-by: Dave Young <[EMAIL PROTECTED]>
>
> ---
>  fs/isofs/inode.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -upr linux/fs/isofs/inode.c linux.new/fs/isofs/inode.c
> --- linux/fs/isofs/inode.c  2007-12-18 10:31:12.0 +0800
> +++ linux.new/fs/isofs/inode.c  2007-12-18 10:31:56.0 +0800
> @@ -1414,7 +1414,7 @@ struct inode *isofs_iget(struct super_bl
>          ret = isofs_read_inode(inode);
>          if (ret < 0) {
   

Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Andrew Morton
On Sun, 16 Dec 2007 20:26:11 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:

> From: Matt Mackall <[EMAIL PROTECTED]>
> Date: Sun, 16 Dec 2007 20:11:49 -0600
> 
> > But as the function doesn't actually show up in your stack trace,
> > something else is probably wrong. So I'd also try commenting out
> > pieces of that function until it started working.
> 
> Some piece of state is being indirectly corrupted and this
> is showing up later in some unrelated operation.
> 
> Can someone send me this kpageflags patch under separate
> cover?  I'll try to figure out why it farts on sparc64.

hm, non trivial.  It's the third-from-last patch in:

maps4-add-proportional-set-size-accounting-in-smaps.patch
maps4-rework-task_size-macros.patch
maps4-rework-task_size-macros-mips-fix.patch
maps4-move-is_swap_pte.patch
maps4-introduce-a-generic-page-walker.patch
maps4-use-pagewalker-in-clear_refs-and-smaps.patch
maps4-simplify-interdependence-of-maps-and-smaps.patch
maps4-move-clear_refs-code-to-task_mmuc.patch
maps4-regroup-task_mmu-by-interface.patch
maps4-add-proc-pid-pagemap-interface.patch
maps4-add-proc-kpagecount-interface.patch
maps4-add-proc-kpageflags-interface.patch
maps4-make-page-monitoring-proc-file-optional.patch
maps4-make-page-monitoring-proc-file-optional-fix.patch

from
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc5/2.6.24-rc5-mm1/broken-out

That patch series does apply OK to mainline though.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread David Miller
From: Matt Mackall <[EMAIL PROTECTED]>
Date: Sun, 16 Dec 2007 20:11:49 -0600

> But as the function doesn't actually show up in your stack trace,
> something else is probably wrong. So I'd also try commenting out
> pieces of that function until it started working.

Some piece of state is being indirectly corrupted and this
is showing up later in some unrelated operation.

Can someone send me this kpageflags patch under separate
cover?  I'll try to figure out why it farts on sparc64.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Matt Mackall
On Sun, Dec 16, 2007 at 08:10:10PM +0100, Mariusz Kozlowski wrote:
> > > Can you change line 710 of fs/proc/proc_misc.c to:
> > > 
> > >   ppage = NULL;
> > 
> > Sure.
> > 
> > > ..and see if it still breaks?
> > 
> > Yes it does - the same way as earlier. Box is locked, processes stuck in
> > D state, and after a while "BUG: soft lockup - CPU#0 stuck for 11s!".
> 
> My mistake. I ran cat /proc/kpageflags in the first place - so how
> could anything change :)
> 
> cat /proc/kpagecount on the other hand - with the change in line 710
> - locks the box. Sysrq works, changing consoles works, but there is
> no "BUG: soft lockup ..." message. After a while the box becomes
> totally unresponsive - even caps lock doesn't work, no responses to
> ping.

Well, I'm baffled. There are basically two things in that function that
do anything interesting: pfn_to_page and put_user. access_ok is
"return 1" on Sparc64. atomic_read is a simple read.

My usual approach at this point would be to litter it with printks and
see where it's hanging.

But as the function doesn't actually show up in your stack trace,
something else is probably wrong. So I'd also try commenting out
pieces of that function until it started working.
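
One thing worth keeping in mind while placing the printks: pfn_to_page() is typically plain address arithmetic, so a `!ppage` test after it may never fire for a pfn that has no struct page behind it. Whether that is what bites sparc64 here is not established in this thread. The following is a user-space model with made-up names and a made-up memory layout, just to show the difference between testing the result for NULL and testing the pfn first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct page_model { unsigned long flags; };

#define MODEL_FIRST_PFN 2UL   /* hypothetical: pfns 0 and 1 have no struct page */
#define MODEL_NR_PAGES  4UL

static struct page_model model_mem_map[MODEL_NR_PAGES] = {
        { 0x10 }, { 0x20 }, { 0x30 }, { 0x40 },
};

/* Stand-in for a pfn_valid()-style range check. */
static bool model_pfn_valid(unsigned long pfn)
{
        return pfn >= MODEL_FIRST_PFN && pfn < MODEL_FIRST_PFN + MODEL_NR_PAGES;
}

/* Flat-memory style lookup: address arithmetic only, never NULL by itself. */
static struct page_model *model_pfn_to_page(unsigned long pfn)
{
        uintptr_t base = (uintptr_t)model_mem_map;

        return (struct page_model *)(base +
                (pfn - MODEL_FIRST_PFN) * sizeof(struct page_model));
}

/* Guarded read: report 0 for pfns that have no backing entry. */
static unsigned long model_read_flags(unsigned long pfn)
{
        if (!model_pfn_valid(pfn))
                return 0;
        return model_pfn_to_page(pfn)->flags;
}
```

In this model model_pfn_to_page(0) hands back a non-NULL pointer just below the array, which must not be dereferenced; model_read_flags(0) returns 0 only because the pfn is checked before the lookup.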

-- 
Mathematics is the supreme nostalgia of our time.


Re: 2.6.24-rc5-mm1

2007-12-16 Thread Dave Young
On Dec 14, 2007 11:44 PM, Alan Stern <[EMAIL PROTECTED]> wrote:
> On Fri, 14 Dec 2007, Dave Young wrote:
>
> > Hi,
> > The behaviour of my mp3 player (which also acts as a usb-storage device)
> > seems to have changed from rc5 to rc5-mm1.
>
> This can't be considered a bug, right?

I'm not sure.

> It's just that the player
> changed from one slightly non-standard behavior to a different slightly
> non-standard behavior.
>
>
> > :
> > =
> > usb 1-7: new high speed USB device using ehci_hcd and address 7
> > usb 1-7: configuration #1 chosen from 1 choice
> > scsi4 : SCSI emulation for USB Mass Storage devices
> > usb-storage: device found at 7
> > usb-storage: waiting for device to settle before scanning
> > scsi 4:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 4:0:0:0: [sdb] Write Protect is on
> > sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 4:0:0:0: [sdb] Write Protect is on
> > sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> >  sdb: sdb1
> > sd 4:0:0:0: [sdb] Attached SCSI removable disk
> > sd 4:0:0:0: Attached scsi generic sg1 type 0
> > usb-storage: device scan complete
> >
> > ==
> > try to mount it (or just run blockdev --rereadpt), then write protect becomes off:
> > ==
> >
> > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 4:0:0:0: [sdb] Write Protect is off
> > sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
> > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> > sd 4:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 4:0:0:0: [sdb] Write Protect is off
> > sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00
> > sd 4:0:0:0: [sdb] Assuming drive cache: write through
> >  sdb: sdb1
>
> This output won't appear if you simply mount the device.  So how do you
> know that mounting turns off write protect?

This can be observed by eye:
dmesg -> mount -> dmesg
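
The two sets of log lines quoted above differ in the "Mode Sense" bytes as well as in the "Write Protect" text (03 00 80 00 versus 03 00 00 00). A minimal sketch of reading that difference follows. It assumes the four bytes are the MODE SENSE(6) parameter header and that bit 7 of the third byte is the write-protect flag for a direct-access device. The function names are hypothetical and exist only for this illustration.

```python
def parse_mode_sense_header(line):
    """Return the four header bytes from an sd 'Mode Sense:' log line."""
    hex_part = line.split("Mode Sense:", 1)[1]
    return [int(tok, 16) for tok in hex_part.split()[:4]]


def write_protected(header):
    # Assumption: header[2] is the device-specific parameter and
    # its top bit (0x80) is the write-protect (WP) flag.
    return bool(header[2] & 0x80)


# Lines copied from the rc5 logs in this thread.
before_mount = "sd 4:0:0:0: [sdb] Mode Sense: 03 00 80 00"
after_mount = "sd 4:0:0:0: [sdb] Mode Sense: 03 00 00 00"

print(write_protected(parse_mode_sense_header(before_mount)))
print(write_protected(parse_mode_sense_header(after_mount)))
```

Under that assumption, the first line decodes as write-protected and the second as writable, which agrees with the "Write Protect is on" and "Write Protect is off" messages next to them.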

>
> > But under rc5-mm1, after the mount command is executed, it is just
> > mounted as a read-only partition without setting write-protect to off.
> >
> > I tried "blockdev --rereadpt"; it does set write-protect to off, as on the
> > rc5 kernel.
> >
> > Below is the output of dmesg under rc5-mm1
> > ==
> > usb 1-8: new high speed USB device using ehci_hcd and address 6
> > usb 1-8: configuration #1 chosen from 1 choice
> > scsi3 : SCSI emulation for USB Mass Storage devices
> > usb-storage: device found at 6
> > usb-storage: waiting for device to settle before scanning
> > scsi 3:0:0:0: Direct-Access   Newman mp3   PQ: 0 ANSI: 0 CCS
> > sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 3:0:0:0: [sdb] Write Protect is on
> > sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > sd 3:0:0:0: [sdb] Assuming drive cache: write through
> > sd 3:0:0:0: [sdb] 245248 512-byte hardware sectors (126 MB)
> > sd 3:0:0:0: [sdb] Write Protect is on
> > sd 3:0:0:0: [sdb] Mode Sense: 03 00 80 00
> > sd 3:0:0:0: [sdb] Assuming drive cache: write through
> >  sdb: sdb1
>
> This looks exactly the same as the output above (except for various
> port, device, and bus numbers).

Yes, but it lacks the "Write Protect is off" line and the other lines that follow it.

>
> If you turn on CONFIG_USB_STORAGE_DEBUG for both kernels and compare
> the dmesg output for the mount command, that might highlight the
> difference.
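
One way to do the comparison suggested above is to mask the port, device, and bus numbers before diffing, so that only real differences remain. This is a rough sketch, not a tested tool. The placeholder strings, the function names, and the regular expressions are assumptions made for this example and cover only the line shapes quoted in this thread.

```python
import difflib
import re


def normalize(line):
    """Replace bus, port, host and address numbers with fixed placeholders."""
    line = re.sub(r"\busb \d+-\d+", "usb B-P", line)
    line = re.sub(r"\b(sd|scsi) \d+:\d+:\d+:\d+", r"\1 H:C:T:L", line)
    line = re.sub(r"\bscsi\d+", "scsiN", line)
    line = re.sub(r"\b(address|found at) \d+", r"\1 N", line)
    return line


def compare(old_log, new_log):
    """Return unified-diff lines between two logs after normalization."""
    old = [normalize(l) for l in old_log.splitlines()]
    new = [normalize(l) for l in new_log.splitlines()]
    return list(difflib.unified_diff(old, new, "rc5", "rc5-mm1", lineterm=""))


# A few probe-time lines taken from the two logs in this thread.
rc5 = """usb 1-7: new high speed USB device using ehci_hcd and address 7
scsi4 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 7
sd 4:0:0:0: [sdb] Write Protect is on"""

mm1 = """usb 1-8: new high speed USB device using ehci_hcd and address 6
scsi3 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 6
sd 3:0:0:0: [sdb] Write Protect is on"""

for diff_line in compare(rc5, mm1):
    print(diff_line)
```

With the sample lines shown, the loop prints nothing, because the probe-time output is the same once the numbers are masked. Lines that appear in only one log, such as a "Write Protect is off" message after mount, would show up prefixed with "-" or "+".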

OK, I will test that once I have time, thanks.

Regards
dave


Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage.

2007-12-16 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Fri, 14 Dec 2007 15:36:33 -0800

> The networking bug looks to be around sock_i_ino()'s taking of
> sk_callback_lock with softirq's enabled.  Perhaps this will fix it.

One should be suspicious of any case where write_lock is performed
on sk->sk_callback_lock in softint context.  And that's the only
way this can trigger, so this patch is wrong.

Generally, sock_orphan() and sock_graft() are the only primary
places where sk->sk_callback_lock is acquired as a writer.  And
these should be invoked only from process context.

Perhaps there is some exception to this in some specialized layer such
as SUNRPC, which are the only other spots I see potentially doing
sk->sk_callback_lock write acquires in softint context, which as
stated should not be done.

OCFS2 and ISCSI seem to be following the rules in their write lock
calls on this lock.
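
The rule described above can be illustrated with a toy model of the usage-state check behind the "inconsistent {in-softirq-W} -> {softirq-on-R} usage" report. This is only a sketch under simplifying assumptions. The conflict table is a reduced guess at the real rules, LockClass and CONFLICTS are hypothetical names, and nothing here is kernel code.

```python
# Toy model: which earlier usage states conflict with a new one for one
# rwlock class. State names mirror the report header in the subject line.
CONFLICTS = {
    "in-softirq-W": {"softirq-on-W", "softirq-on-R"},
    "in-softirq-R": {"softirq-on-W"},
    "softirq-on-W": {"in-softirq-W", "in-softirq-R"},
    "softirq-on-R": {"in-softirq-W"},
}


class LockClass:
    def __init__(self, name):
        self.name = name
        self.seen = set()  # usage states recorded so far

    def acquire(self, write, in_softirq, softirqs_enabled):
        """Record one acquisition; return a report string on conflict."""
        mode = "W" if write else "R"
        if in_softirq:
            state = "in-softirq-" + mode
        elif softirqs_enabled:
            state = "softirq-on-" + mode
        else:
            return None  # process context with softirqs off: nothing to record
        for old in sorted(self.seen):
            if old in CONFLICTS[state]:
                return "inconsistent {%s} -> {%s} usage." % (old, state)
        self.seen.add(state)
        return None


lock = LockClass("sk_callback_lock")
# A writer in softirq context, the case described above as suspicious.
print(lock.acquire(write=True, in_softirq=True, softirqs_enabled=False))
# Later, a reader in process context with softirqs still enabled.
print(lock.acquire(write=False, in_softirq=False, softirqs_enabled=True))
```

In this model the second call returns the same kind of message as the subject line. If the write acquisition never happens in softirq context, the reader with softirqs enabled raises no complaint, which matches the argument that the write side is the place to look.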


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski
Hello,

> > > > cat /proc/kpageflags on sparc64 causes the box to lock.
> > > > I can not write on any terminal - but I can issue sysrqs and switch
> > > > between consoles.
> > > > 
> > > > cat process hangs in read(3, ...
> > > 
> > > cat /proc/kpagecount produces similar symptoms. box is locked - sysrq-w sshd trace:
> > > 
> > > __down
> > > __down_interruptible
> > > kobject_get
> > > lock_kernel
> > > chrdev_open
> > > __dentry_open
> > > nameidata_to_filp
> > > open_pathname
> > > do_sys_open
> > > sparc32_open
> > > linux_sparc_syscall32
> > 
> > Perhaps this is related to sparsemem.
> > 
> > Can you change line 710 of fs/proc/proc_misc.c to:
> > 
> > ppage = NULL;
> 
> Sure.
> 
> > ..and see if it still breaks?
> 
> Yes it does - the same way as earlier. Box is locked, processes stuck in D state
> and after a while "BUG: soft lockup - CPU#0 stuck for 11s!".

My mistake. I ran cat /proc/kpageflags in the first place, so how could
anything change :)

cat /proc/kpagecount, on the other hand, with the change in line 710, locks
the box. Sysrq works and changing consoles works, but there is no
"BUG: soft lockup ..." message. After a while the box becomes totally
unresponsive: even caps lock doesn't work, and there are no responses to ping.


Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Mariusz Kozlowski
> > > cat /proc/kpageflags on sparc64 causes the box to lock.
> > > I can not write on any terminal - but I can issue sysrqs and switch
> > > between consoles.
> > > 
> > > cat process hangs in read(3, ...
> > 
> > cat /proc/kpagecount produces similar symptoms. box is locked - sysrq-w sshd trace:
> > 
> > __down
> > __down_interruptible
> > kobject_get
> > lock_kernel
> > chrdev_open
> > __dentry_open
> > nameidata_to_filp
> > open_pathname
> > do_sys_open
> > sparc32_open
> > linux_sparc_syscall32
> 
> Perhaps this is related to sparsemem.
> 
> Can you change line 710 of fs/proc/proc_misc.c to:
> 
>   ppage = NULL;

Sure.

> ..and see if it still breaks?

Yes it does - the same way as earlier. Box is locked, processes stuck in D state
and after a while "BUG: soft lockup - CPU#0 stuck for 11s!".




Re: 2.6.24-rc5-mm1: problems with cat /proc/kpageflags

2007-12-16 Thread Matt Mackall
On Sun, Dec 16, 2007 at 12:40:53PM +0100, Mariusz Kozlowski wrote:
> > cat /proc/kpageflags on sparc64 causes the box to lock.
> > I can not write on any terminal - but I can issue sysrqs and switch
> > between consoles.
> > 
> > cat process hangs in read(3, ...
> 
> cat /proc/kpagecount produces similar symptoms. box is locked - sysrq-w sshd trace:
> 
> __down
> __down_interruptible
> kobject_get
> lock_kernel
> chrdev_open
> __dentry_open
> nameidata_to_filp
> open_pathname
> do_sys_open
> sparc32_open
> linux_sparc_syscall32

Perhaps this is related to sparsemem.

Can you change line 710 of fs/proc/proc_misc.c to:

ppage = NULL;

..and see if it still breaks?

-- 
Mathematics is the supreme nostalgia of our time.

